How it works
DunetraceCallbackHandler plugs into LangChain's callback system and translates LangChain events into Dunetrace events automatically.
| LangChain event | Dunetrace event |
|---|---|
| on_chain_start (outermost) | RUN_STARTED |
| on_chat_model_start / on_llm_start | LLM_CALLED |
| on_llm_end | LLM_RESPONDED (tokens + latency) |
| on_tool_start / on_agent_action | TOOL_CALLED |
| on_tool_end | TOOL_RESPONDED |
| on_retriever_start | RETRIEVAL_CALLED |
| on_retriever_end | RETRIEVAL_RESPONDED |
| on_chain_end (outermost) | RUN_COMPLETED |
| on_*_error | RUN_ERRORED |
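Under the hood this is a standard LangChain callback handler. A minimal sketch of the translation pattern (illustrative only, not Dunetrace's actual implementation):

```python
from langchain_core.callbacks import BaseCallbackHandler

class EventMappingSketch(BaseCallbackHandler):
    """Illustrates the hook-to-event mapping; the real handler emits to Dunetrace."""

    def on_chain_start(self, serialized, inputs, *, run_id, parent_run_id=None, **kwargs):
        if parent_run_id is None:  # outermost chain = one Dunetrace run
            print(f"RUN_STARTED run_id={run_id}")

    def on_llm_end(self, response, *, run_id, **kwargs):
        print(f"LLM_RESPONDED run_id={run_id}")  # tokens + latency extracted here

    def on_chain_end(self, outputs, *, run_id, parent_run_id=None, **kwargs):
        if parent_run_id is None:
            print(f"RUN_COMPLETED run_id={run_id}")
```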
Install
```bash
pip install 'dunetrace[langchain]'

# plus your LangChain stack
pip install langchain-openai langgraph
```
Create the handler
Instantiate once at startup and reuse across all invocations. The handler is thread-safe.
```python
from dunetrace import Dunetrace
from dunetrace.integrations.langchain import DunetraceCallbackHandler

dt = Dunetrace(
    endpoint="https://your-dunetrace-ingest",
    api_key="dt_live_...",
)

callback = DunetraceCallbackHandler(
    dt,
    agent_id="my-langchain-agent",   # must match the api_keys row
    system_prompt=SYSTEM_PROMPT,     # used for version hash
    model="gpt-4o",
    tools=["web_search", "calculator"],
)
```
Constructor parameters
| Parameter | Required | Description |
|---|---|---|
| `client` | Yes | Your Dunetrace instance |
| `agent_id` | Yes | Must match your api_keys row |
| `system_prompt` | No | Used to compute a version fingerprint |
| `model` | No | Defaults to "unknown" |
| `tools` | No | Used by detectors like TOOL_AVOIDANCE |
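The version fingerprint gives each prompt revision a stable identity so the dashboard can compare behavior across versions. The exact scheme is internal to Dunetrace; conceptually it is a content hash along these lines (hypothetical sketch):

```python
import hashlib

# Hypothetical: Dunetrace's real fingerprint scheme may include more fields.
def version_fingerprint(system_prompt: str) -> str:
    return hashlib.sha256(system_prompt.encode("utf-8")).hexdigest()[:16]
```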
LangGraph (create_react_agent)
```python
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent

llm = ChatOpenAI(model="gpt-4o", temperature=0)

@tool
def web_search(query: str) -> str:
    """Search the web for information on a topic."""
    return f"Results for: {query}"

agent = create_react_agent(llm, [web_search], prompt=SYSTEM_PROMPT)

result = agent.invoke(
    {"messages": [("human", "What is 42 * 17?")]},
    config={"callbacks": [callback]},  # ← add this
)
```
Async
```python
result = await agent.ainvoke(
    {"messages": [("human", "What is 42 * 17?")]},
    config={"callbacks": [callback]},
)
```
The same handler works for both invoke() and ainvoke().
LangChain AgentExecutor (older pattern)
```python
from langchain.agents import AgentExecutor, create_openai_tools_agent
from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_messages([
    ("system", SYSTEM_PROMPT),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])

tools = [web_search]  # reuse the tool defined above

agent = create_openai_tools_agent(llm, tools, prompt)
executor = AgentExecutor(agent=agent, tools=tools)

result = executor.invoke(
    {"input": "What is the capital of France?"},
    config={"callbacks": [callback]},
)
```
RAG agents
If your agent uses a LangChain retriever, retrieval events are captured automatically — no extra code.
```python
from langchain.tools.retriever import create_retriever_tool

# `retriever` is any LangChain retriever, e.g. vectorstore.as_retriever()
retriever_tool = create_retriever_tool(
    retriever, "search_docs", "Search product documentation."
)

agent = create_react_agent(llm, [retriever_tool], prompt=SYSTEM_PROMPT)

result = agent.invoke(
    {"messages": [("human", "How do I configure feature X?")]},
    config={"callbacks": [callback]},
)
```
The handler extracts `result_count` and `top_score` from document metadata fields (`score`, `relevance_score`, or `similarity`). These feed the RAG_EMPTY_RETRIEVAL detector.
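For example, a retrieval that returns documents carrying a `score` metadata field yields both values:

```python
from langchain_core.documents import Document

docs = [
    Document(page_content="Feature X is configured via...", metadata={"score": 0.87}),
    Document(page_content="Related setup notes...", metadata={"score": 0.61}),
]
# For this retrieval the handler reports result_count=2 and top_score=0.87.
```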
Concurrency
One handler can be shared across concurrent invoke() calls. Each invocation is tracked by LangChain's root run_id.
```python
import concurrent.futures

with concurrent.futures.ThreadPoolExecutor(max_workers=4) as pool:
    futures = [
        pool.submit(agent.invoke, {"messages": [("human", q)]}, {"callbacks": [callback]})
        for q in queries
    ]
    results = [f.result() for f in futures]
```
Stale runs (invocations still incomplete after 30 minutes) are pruned automatically on each new on_chain_start.
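Conceptually, the pruning policy looks like this (an illustration, not Dunetrace's actual code):

```python
import time

STALE_AFTER_SECONDS = 30 * 60  # the 30-minute window mentioned above

def prune_stale(active_runs: dict[str, float], now: float | None = None) -> None:
    """Drop tracked runs whose root chain started more than 30 minutes ago."""
    now = time.time() if now is None else now
    for run_id, started_at in list(active_runs.items()):
        if now - started_at > STALE_AFTER_SECONDS:
            del active_runs[run_id]
```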
Shutdown
```python
import atexit

atexit.register(dt.shutdown)

# or
dt.shutdown(timeout=5)
```
What is and isn't captured
Captured automatically:
- Every LLM call: model, token counts (prompt + completion), latency, finish reason
- Every tool call: tool name, success/failure, output length
- Every retriever call: index name, result count, top similarity score
- Run-level: total steps, tool call count, exit reason
Never captured (hashed in-process): user input, prompts, completions, tool arguments, tool outputs, error messages, retrieval queries.
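Conceptually, each sensitive field is reduced to a digest before it leaves the process, along these lines (illustrative; the exact scheme is internal to Dunetrace):

```python
import hashlib

# Only the digest is attached to the emitted event, never the raw text.
def privacy_hash(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()
```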
Connect Langfuse for deep analysis
If you run Langfuse alongside Dunetrace, you can wire the two together: when a signal fires, the dashboard offers an "Explain with Langfuse ↗" button that pulls the actual trace and produces a specific root-cause explanation.
How the IDs align
DunetraceCallbackHandler sets its run_id from the LangChain root run_id. Langfuse v4 independently uses the same LangChain root run_id as its trace_id, but in 32-character hex format (no dashes). They represent the same run.
| System | ID format | Example |
|---|---|---|
| Dunetrace run_id | UUID with dashes | b5ed23be-e4f0-43bc-8625-... |
| Langfuse trace_id | 32-char hex, no dashes | b5ed23bee4f043bc8625... |
The Dunetrace API normalises the format automatically when querying Langfuse.
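If you ever need to convert between the two formats yourself, it is a dash-stripping round trip (the UUID value here is a made-up placeholder):

```python
import uuid

dt_run_id = "b5ed23be-e4f0-43bc-8625-0a1b2c3d4e5f"  # placeholder example

lf_trace_id = dt_run_id.replace("-", "")              # UUID with dashes -> 32-char hex
assert str(uuid.UUID(hex=lf_trace_id)) == dt_run_id   # and back again
```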
Access both IDs after a run
```python
from langfuse import get_client
from langfuse.langchain import CallbackHandler as LangfuseCallbackHandler  # v4+

lf_cb = LangfuseCallbackHandler()  # reads LANGFUSE_* from env

result = agent.invoke(
    {"messages": [("human", query)]},
    config={"callbacks": [callback, lf_cb]},
)

dt.shutdown(timeout=5)
get_client().flush()  # ensure the Langfuse trace is uploaded

dt_run_id = callback.last_run_id   # Dunetrace run ID
lf_trace_id = lf_cb.last_trace_id  # Langfuse trace ID
```
Call the explain endpoint
```http
POST /v1/signals/{signal_id}/explain
Authorization: Bearer <key>
Content-Type: application/json

{"langfuse_trace_id": "<lf_trace_id>"}
```
See All integrations → Langfuse for the full setup — credentials, .env vars, apply-fix and fix-status endpoints.
Troubleshooting
No runs appear in the dashboard
- Call `dt.shutdown()`. Events are flushed by the drain thread; without shutdown, some may be lost on process exit.
- Confirm the `api_key` matches an `active = TRUE` row in `api_keys`.
- Pass `debug=True` to `Dunetrace()` for verbose logs.
Token counts missing
Some providers and LangChain versions don't populate `token_usage` in `llm_output`. The handler falls back to `usage_metadata` on the message. If both are absent, token fields are omitted; this is expected behavior, and detectors still run.
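In LangChain terms, the fallback order is roughly this (illustrative, not Dunetrace's exact code):

```python
from langchain_core.outputs import LLMResult

def extract_usage(response: LLMResult) -> dict | None:
    """Token counts: provider dict first, then the message's normalized field."""
    usage = (response.llm_output or {}).get("token_usage")
    if usage:
        return usage
    gen = response.generations[0][0]
    msg = getattr(gen, "message", None)          # present on ChatGeneration
    return getattr(msg, "usage_metadata", None)  # may be None -> fields omitted
```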
Detectors fire too aggressively
Tune `detectors.yml` on the server; see threshold tuning.