How it works
DunetraceCallbackHandler plugs into LangChain's callback system and translates LangChain events into Dunetrace events automatically.
| LangChain event | Dunetrace event |
|---|---|
| on_chain_start (outermost) | RUN_STARTED |
| on_chat_model_start / on_llm_start | LLM_CALLED |
| on_llm_end | LLM_RESPONDED (tokens + latency) |
| on_tool_start / on_agent_action | TOOL_CALLED |
| on_tool_end | TOOL_RESPONDED |
| on_retriever_start | RETRIEVAL_CALLED |
| on_retriever_end | RETRIEVAL_RESPONDED |
| on_chain_end (outermost) | RUN_COMPLETED |
| on_*_error | RUN_ERRORED |
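Under the hood this is a standard LangChain callback handler. A minimal sketch of the translation pattern (illustrative only, not Dunetrace's actual implementation):

```python
from langchain_core.callbacks import BaseCallbackHandler

class EventMappingSketch(BaseCallbackHandler):
    """Illustrates the hook-to-event mapping; the real handler emits to Dunetrace."""

    def on_chain_start(self, serialized, inputs, *, run_id, parent_run_id=None, **kwargs):
        if parent_run_id is None:  # outermost chain = one Dunetrace run
            print(f"RUN_STARTED run_id={run_id}")

    def on_llm_end(self, response, *, run_id, **kwargs):
        print(f"LLM_RESPONDED run_id={run_id}")  # tokens + latency extracted here

    def on_chain_end(self, outputs, *, run_id, parent_run_id=None, **kwargs):
        if parent_run_id is None:
            print(f"RUN_COMPLETED run_id={run_id}")
```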
Install
```bash
pip install 'dunetrace[langchain]'

# plus your LangChain stack
pip install langchain-openai langgraph
```
Create the handler
Instantiate once at startup and reuse across all invocations. The handler is thread-safe.
```python
from dunetrace import Dunetrace
from dunetrace.integrations.langchain import DunetraceCallbackHandler

dt = Dunetrace(
    endpoint="https://your-dunetrace-ingest",
    api_key="dt_live_...",
)

callback = DunetraceCallbackHandler(
    dt,
    agent_id="my-langchain-agent",   # must match the api_keys row
    system_prompt=SYSTEM_PROMPT,     # used for version hash
    model="gpt-4o",
    tools=["web_search", "calculator"],
)
```
Constructor parameters
| Parameter | Required | Description |
|---|---|---|
| `client` | Yes | Your Dunetrace instance |
| `agent_id` | Yes | Must match your api_keys row |
| `system_prompt` | No | Used to compute a version fingerprint |
| `model` | No | Defaults to "unknown" |
| `tools` | No | Used by detectors like TOOL_AVOIDANCE |
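The version fingerprint gives each prompt revision a stable identity so the dashboard can compare behavior across versions. The exact scheme is internal to Dunetrace; conceptually it is a content hash along these lines (hypothetical sketch):

```python
import hashlib

# Hypothetical: Dunetrace's real fingerprint scheme may include more fields.
def version_fingerprint(system_prompt: str) -> str:
    return hashlib.sha256(system_prompt.encode("utf-8")).hexdigest()[:16]
```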
LangGraph (create_react_agent)
```python
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent

llm = ChatOpenAI(model="gpt-4o", temperature=0)

@tool
def web_search(query: str) -> str:
    """Search the web for information on a topic."""
    return f"Results for: {query}"

agent = create_react_agent(llm, [web_search], prompt=SYSTEM_PROMPT)

result = agent.invoke(
    {"messages": [("human", "What is 42 * 17?")]},
    config={"callbacks": [callback]},  # ← add this
)
```
Async
```python
result = await agent.ainvoke(
    {"messages": [("human", "What is 42 * 17?")]},
    config={"callbacks": [callback]},
)
```
The same handler works for both invoke() and ainvoke().
LangChain AgentExecutor (older pattern)
```python
from langchain.agents import AgentExecutor, create_openai_tools_agent
from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_messages([
    ("system", SYSTEM_PROMPT),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])

tools = [web_search]  # reuse the tool defined above

agent = create_openai_tools_agent(llm, tools, prompt)
executor = AgentExecutor(agent=agent, tools=tools)

result = executor.invoke(
    {"input": "What is the capital of France?"},
    config={"callbacks": [callback]},
)
```
RAG agents
If your agent uses a LangChain retriever, retrieval events are captured automatically — no extra code.
```python
from langchain.tools.retriever import create_retriever_tool

# `retriever` is any LangChain retriever, e.g. vectorstore.as_retriever()
retriever_tool = create_retriever_tool(
    retriever, "search_docs", "Search product documentation."
)

agent = create_react_agent(llm, [retriever_tool], prompt=SYSTEM_PROMPT)

result = agent.invoke(
    {"messages": [("human", "How do I configure feature X?")]},
    config={"callbacks": [callback]},
)
```
The handler extracts `result_count` and `top_score` from document metadata fields (`score`, `relevance_score`, or `similarity`). These feed the RAG_EMPTY_RETRIEVAL detector.
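For example, a retrieval that returns documents carrying a `score` metadata field yields both values:

```python
from langchain_core.documents import Document

docs = [
    Document(page_content="Feature X is configured via...", metadata={"score": 0.87}),
    Document(page_content="Related setup notes...", metadata={"score": 0.61}),
]
# For this retrieval the handler reports result_count=2 and top_score=0.87.
```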
Concurrency
One handler can be shared across concurrent invoke() calls. Each invocation is tracked by LangChain's root run_id.
```python
import concurrent.futures

with concurrent.futures.ThreadPoolExecutor(max_workers=4) as pool:
    futures = [
        pool.submit(agent.invoke, {"messages": [("human", q)]}, {"callbacks": [callback]})
        for q in queries
    ]
    results = [f.result() for f in futures]
```
Stale runs (invocations still incomplete after 30 minutes) are pruned automatically on each new on_chain_start.
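Conceptually, the pruning policy looks like this (an illustration, not Dunetrace's actual code):

```python
import time

STALE_AFTER_SECONDS = 30 * 60  # the 30-minute window mentioned above

def prune_stale(active_runs: dict[str, float], now: float | None = None) -> None:
    """Drop tracked runs whose root chain started more than 30 minutes ago."""
    now = time.time() if now is None else now
    for run_id, started_at in list(active_runs.items()):
        if now - started_at > STALE_AFTER_SECONDS:
            del active_runs[run_id]
```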
Shutdown
```python
import atexit

atexit.register(dt.shutdown)

# or
dt.shutdown(timeout=5)
```
What is and isn't captured
Captured automatically:
- Every LLM call: model, token counts (prompt + completion), latency, finish reason
- Every tool call: tool name, success/failure, output length
- Every retriever call: index name, result count, top similarity score
- Run-level: total steps, tool call count, exit reason
Never captured (hashed in-process): user input, prompts, completions, tool arguments, tool outputs, error messages, retrieval queries.
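Conceptually, each sensitive field is reduced to a digest before it leaves the process, along these lines (illustrative; the exact scheme is internal to Dunetrace):

```python
import hashlib

# Only the digest is attached to the emitted event, never the raw text.
def privacy_hash(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()
```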
Connect Langfuse for deep analysis
If you run Langfuse alongside Dunetrace, you can wire the two together: when a signal fires, the dashboard offers an "Explain with Langfuse ↗" button that pulls the actual trace and produces a specific root-cause explanation.
How the IDs align
DunetraceCallbackHandler sets its run_id from the LangChain root run_id. Langfuse v4 independently uses the same LangChain root run_id as its trace_id, but in 32-character hex format (no dashes). They represent the same run.
| System | ID format | Example |
|---|---|---|
| Dunetrace run_id | UUID with dashes | b5ed23be-e4f0-43bc-8625-... |
| Langfuse trace_id | 32-char hex, no dashes | b5ed23bee4f043bc8625... |
The Dunetrace API normalises the format automatically when querying Langfuse.
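If you ever need to convert between the two formats yourself, it is a dash-stripping round trip (the UUID value here is a made-up placeholder):

```python
import uuid

dt_run_id = "b5ed23be-e4f0-43bc-8625-0a1b2c3d4e5f"  # placeholder example

lf_trace_id = dt_run_id.replace("-", "")              # UUID with dashes -> 32-char hex
assert str(uuid.UUID(hex=lf_trace_id)) == dt_run_id   # and back again
```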
Access both IDs after a run
```python
from langfuse import get_client
from langfuse.langchain import CallbackHandler as LangfuseCallbackHandler  # v4+

lf_cb = LangfuseCallbackHandler()  # reads LANGFUSE_* from env

result = agent.invoke(
    {"messages": [("human", query)]},
    config={"callbacks": [callback, lf_cb]},
)

dt.shutdown(timeout=5)
get_client().flush()  # ensure the Langfuse trace is uploaded

dt_run_id = callback.last_run_id   # Dunetrace run ID
lf_trace_id = lf_cb.last_trace_id  # Langfuse trace ID
```
Call the explain endpoint
```http
POST /v1/signals/{signal_id}/explain
Authorization: Bearer <key>
Content-Type: application/json

{"langfuse_trace_id": "<lf_trace_id>"}
```
See All integrations → Langfuse for the full setup — credentials, .env vars, apply-fix and fix-status endpoints.
Troubleshooting
No runs appear in the dashboard
- Call `dt.shutdown()`. Events are flushed by the drain thread; without shutdown, some may be lost on process exit.
- Confirm the `api_key` matches an `active = TRUE` row in `api_keys`.
- Pass `debug=True` to `Dunetrace()` for verbose logs.
Token counts missing
Some providers and LangChain versions don't populate `token_usage` in `llm_output`. The handler falls back to `usage_metadata` on the message. If both are absent, token fields are omitted; this is expected behavior, and detectors still run.
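In LangChain terms, the fallback order is roughly this (illustrative, not Dunetrace's exact code):

```python
from langchain_core.outputs import LLMResult

def extract_usage(response: LLMResult) -> dict | None:
    """Token counts: provider dict first, then the message's normalized field."""
    usage = (response.llm_output or {}).get("token_usage")
    if usage:
        return usage
    gen = response.generations[0][0]
    msg = getattr(gen, "message", None)          # present on ChatGeneration
    return getattr(msg, "usage_metadata", None)  # may be None -> fields omitted
```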
Detectors fire too aggressively
Tune `detectors.yml` on the server; see threshold tuning.