LangChain / LangGraph

One callback. Zero changes to your agent logic.

How it works

DunetraceCallbackHandler plugs into LangChain's callback system and translates LangChain events into Dunetrace events automatically.

LangChain event                       Dunetrace event
on_chain_start (outermost)            RUN_STARTED
on_chat_model_start / on_llm_start    LLM_CALLED
on_llm_end                            LLM_RESPONDED (tokens + latency)
on_tool_start / on_agent_action       TOOL_CALLED
on_tool_end                           TOOL_RESPONDED
on_retriever_start                    RETRIEVAL_CALLED
on_retriever_end                      RETRIEVAL_RESPONDED
on_chain_end (outermost)              RUN_COMPLETED
on_*_error                            RUN_ERRORED
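
For orientation, this is the standard LangChain callback mechanism the handler is built on. A stripped-down illustration of two of the hooks (not Dunetrace's actual implementation):

from langchain_core.callbacks import BaseCallbackHandler

class EventLogger(BaseCallbackHandler):
    # fires when a model call starts; Dunetrace maps this to LLM_CALLED
    def on_llm_start(self, serialized, prompts, *, run_id, **kwargs):
        print(f"LLM_CALLED run={run_id}")

    # fires when the model returns; maps to LLM_RESPONDED
    def on_llm_end(self, response, *, run_id, **kwargs):
        print(f"LLM_RESPONDED run={run_id}")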

Install

pip install 'dunetrace[langchain]'

# plus your LangChain stack
pip install langchain-openai langgraph

Create the handler

Instantiate the handler once at startup and reuse it across all invocations. The handler is thread-safe.

from dunetrace import Dunetrace
from dunetrace.integrations.langchain import DunetraceCallbackHandler

SYSTEM_PROMPT = "You are a helpful research assistant."  # placeholder; use your agent's real prompt

dt = Dunetrace(
    endpoint="https://your-dunetrace-ingest",
    api_key="dt_live_...",
)

callback = DunetraceCallbackHandler(
    dt,
    agent_id="my-langchain-agent",   # must match the api_keys row
    system_prompt=SYSTEM_PROMPT,      # used for version hash
    model="gpt-4o",
    tools=["web_search", "calculator"],
)

Constructor parameters

Parameter       Required  Description
client          Yes       Your Dunetrace instance
agent_id        Yes       Must match your api_keys row
system_prompt   No        Used to compute a version fingerprint
model           No        Defaults to "unknown"
tools           No        Used by detectors like TOOL_AVOIDANCE
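
Only client and agent_id are required, so a minimal handler looks like:

callback = DunetraceCallbackHandler(dt, agent_id="my-langchain-agent")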

LangGraph (create_react_agent)

from langchain_openai import ChatOpenAI
from langchain_core.tools import tool
from langgraph.prebuilt import create_react_agent

llm = ChatOpenAI(model="gpt-4o", temperature=0)

@tool
def web_search(query: str) -> str:
    """Search the web for information on a topic."""
    return f"Results for: {query}"

agent = create_react_agent(llm, [web_search], prompt=SYSTEM_PROMPT)

result = agent.invoke(
    {"messages": [("human", "What is 42 * 17?")]},
    config={"callbacks": [callback]},   # ← add this
)

Async

result = await agent.ainvoke(
    {"messages": [("human", "What is 42 * 17?")]},
    config={"callbacks": [callback]},
)

The same handler works for both invoke() and ainvoke().

LangChain AgentExecutor (older pattern)

from langchain.agents import AgentExecutor, create_openai_tools_agent
from langchain_core.prompts import ChatPromptTemplate
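
# llm as created in the LangGraph example above; define the tool list for the executor
tools = [web_search]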

prompt = ChatPromptTemplate.from_messages([
    ("system", SYSTEM_PROMPT),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])

agent = create_openai_tools_agent(llm, tools, prompt)
executor = AgentExecutor(agent=agent, tools=tools)

result = executor.invoke(
    {"input": "What is the capital of France?"},
    config={"callbacks": [callback]},
)

RAG agents

If your agent uses a LangChain retriever, retrieval events are captured automatically — no extra code.

from langchain.tools.retriever import create_retriever_tool

retriever_tool = create_retriever_tool(
    retriever, "search_docs", "Search product documentation."
)
agent = create_react_agent(llm, [retriever_tool], prompt=SYSTEM_PROMPT)
result = agent.invoke(
    {"messages": [("human", "How do I configure feature X?")]},
    config={"callbacks": [callback]},
)

The handler extracts result_count and top_score from document metadata fields (score, relevance_score, or similarity). These feed the RAG_EMPTY_RETRIEVAL detector.
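
The retriever above can be any LangChain retriever. A minimal sketch backed by an in-memory FAISS store (placeholder texts; needs langchain-community and faiss-cpu; whether score metadata is attached depends on the retriever you use):

from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings

vectorstore = FAISS.from_texts(
    ["Feature X is configured in settings.yml.",
     "Feature Y requires an API key."],
    OpenAIEmbeddings(),
)
retriever = vectorstore.as_retriever(search_kwargs={"k": 4})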

Concurrency

One handler can be shared across concurrent invoke() calls. Each invocation is tracked by LangChain's root run_id.

import concurrent.futures

queries = ["What is 42 * 17?", "What is the capital of France?"]

with concurrent.futures.ThreadPoolExecutor(max_workers=4) as pool:
    futures = [
        pool.submit(agent.invoke, {"messages": [("human", q)]}, {"callbacks": [callback]})
        for q in queries
    ]
    results = [f.result() for f in futures]

Stale runs (invocations still incomplete after 30 minutes) are pruned automatically on each new on_chain_start.
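
The async equivalent fans out with asyncio.gather, again sharing one handler:

import asyncio

async def run_all(queries):
    return await asyncio.gather(*[
        agent.ainvoke(
            {"messages": [("human", q)]},
            config={"callbacks": [callback]},
        )
        for q in queries
    ])

results = asyncio.run(run_all(queries))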

Shutdown

import atexit
atexit.register(dt.shutdown)
# or
dt.shutdown(timeout=5)

What is and isn't captured

Captured automatically:

  • Every LLM call: model, token counts (prompt + completion), latency, finish reason
  • Every tool call: tool name, success/failure, output length
  • Every retriever call: index name, result count, top similarity score
  • Run-level: total steps, tool call count, exit reason

Never captured (hashed in-process): user input, prompts, completions, tool arguments, tool outputs, error messages, retrieval queries.
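
The exact scheme is internal to the SDK, but conceptually each sensitive payload is reduced to a digest before anything leaves the process, along these lines:

import hashlib

tool_output = "example sensitive payload"   # stands in for any payload listed above
digest = hashlib.sha256(tool_output.encode("utf-8")).hexdigest()
# the digest (plus lengths and counts) is sent; the raw text never is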

Connect Langfuse for deep analysis

If you run Langfuse alongside Dunetrace, you can wire the two together: when a signal fires, the dashboard offers an "Explain with Langfuse ↗" button that pulls the actual trace and produces a specific root-cause explanation.

How the IDs align

DunetraceCallbackHandler sets its run_id from the LangChain root run_id. Langfuse v4 independently derives its trace_id from the same root run_id, rendered as 32-character hex (no dashes), so the two IDs identify the same run.

System             ID format               Example
Dunetrace run_id   UUID with dashes        b5ed23be-e4f0-43bc-8625-...
Langfuse trace_id  32-char hex, no dashes  b5ed23bee4f043bc8625...

The Dunetrace API normalizes the format automatically when querying Langfuse.
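
If you ever need to convert between the two formats yourself, Python's standard uuid module round-trips them:

import uuid

run_id = "b5ed23be-e4f0-43bc-8625-0123456789ab"   # example Dunetrace run ID
as_hex = uuid.UUID(run_id).hex                    # 32-char hex, no dashes (Langfuse format)
dashed = str(uuid.UUID(hex=as_hex))               # back to the dashed form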

Access both IDs after a run

from langfuse.langchain import CallbackHandler as LangfuseCallbackHandler  # v4+

lf_cb = LangfuseCallbackHandler()   # reads LANGFUSE_* from env

result = agent.invoke(
    {"messages": [("human", query)]},
    config={"callbacks": [callback, lf_cb]},
)

dt.shutdown(timeout=5)

from langfuse import get_client
get_client().flush()   # ensure the trace is uploaded before reading its ID

dt_run_id   = callback.last_run_id   # Dunetrace run ID
lf_trace_id = lf_cb.last_trace_id    # Langfuse trace ID

Call the explain endpoint

POST /v1/signals/{signal_id}/explain
Authorization: Bearer <key>
Content-Type: application/json

{"langfuse_trace_id": "<lf_trace_id>"}

See All integrations → Langfuse for the full setup — credentials, .env vars, apply-fix and fix-status endpoints.

Troubleshooting

No runs appear in the dashboard

  • Call dt.shutdown() — events are flushed by the drain thread; without shutdown some may be lost on process exit.
  • Confirm the api_key matches an active = TRUE row in api_keys.
  • Pass debug=True to Dunetrace() for verbose logs (see the snippet below).
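
For the last point, the flag goes on the client constructor:

dt = Dunetrace(
    endpoint="https://your-dunetrace-ingest",
    api_key="dt_live_...",
    debug=True,   # verbose logs
)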

Token counts missing

Some providers and LangChain versions don't populate token_usage in llm_output. The handler falls back to usage_metadata on the message. If both are absent, token fields are omitted; this is expected behavior, and detectors still run.
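
To see what your provider actually returns, inspect the message directly (usage_metadata exists on AIMessage in recent langchain-core versions; older versions may lack it):

from langchain_openai import ChatOpenAI

msg = ChatOpenAI(model="gpt-4o").invoke("ping")
print(msg.usage_metadata)                          # e.g. input/output token counts, or None
print(msg.response_metadata.get("token_usage"))    # provider-level field, if present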

Detectors fire too aggressively

Tune detectors.yml on the server — see threshold tuning.