
All integrations

FastAPI, Flask, Django, OpenTelemetry, OpenLLMetry, Langfuse, Grafana Loki, Tempo, Honeycomb, Datadog. Side-by-side setup for each transport.

Langfuse — deep analysis

What you get: when a signal fires (e.g. TOOL_LOOP, GOAL_ABANDONMENT), the dashboard shows an "Explain with Langfuse ↗" button. Clicking it fetches the execution trace from Langfuse, runs an LLM analysis against the signal evidence + trace inputs/outputs, and returns a plain-English root cause with a specific prompt fix you can apply in one click.

Prerequisites

  • Langfuse account (cloud or self-hosted) with a project and API keys
  • One LLM API key for the analysis call (ANTHROPIC_API_KEY preferred, OPENAI_API_KEY accepted as fallback)

1. Install

pip install 'dunetrace[langchain,langfuse]'

2. Add credentials to .env

LANGFUSE_PUBLIC_KEY=pk-lf-...
LANGFUSE_SECRET_KEY=sk-lf-...
LANGFUSE_HOST=https://cloud.langfuse.com   # default for cloud; point at your own host if self-hosted

ANTHROPIC_API_KEY=sk-ant-...
# OPENAI_API_KEY=sk-...                    # accepted as fallback

Restart the API container after editing .env:

docker compose up -d api

3. Run both callbacks together

DunetraceCallbackHandler.last_run_id exposes the Dunetrace run ID, and Langfuse's last_trace_id gives the corresponding Langfuse trace ID. Pass the Langfuse trace ID to the explain endpoint to join the two systems.

from dunetrace import Dunetrace
from dunetrace.integrations.langchain import DunetraceCallbackHandler
from langfuse.langchain import CallbackHandler as LangfuseCallbackHandler  # v4+

dt = Dunetrace()
dt_cb = DunetraceCallbackHandler(dt, agent_id="my-agent", model="gpt-4o-mini")
lf_cb = LangfuseCallbackHandler()   # reads LANGFUSE_* from env

result = agent.invoke(
    {"messages": [("human", query)]},
    config={"callbacks": [dt_cb, lf_cb]},
)

dt.shutdown(timeout=5)

from langfuse import get_client
get_client().flush()   # ensure trace is uploaded

dt_run_id   = dt_cb.last_run_id    # e.g. "b5ed23be-e4f0-43bc-..."
lf_trace_id = lf_cb.last_trace_id  # e.g. "b5ed23bee4f043bc..." (same UUID, no dashes)

Langfuse v4 uses OTel-style 32-char hex IDs (no dashes). The Dunetrace API strips dashes automatically when querying Langfuse, so dt_run_id and lf_trace_id represent the same run even though their formats differ.
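The equivalence can be sketched in one line of Python (illustration only — the API performs this conversion for you when querying Langfuse):

```python
def to_langfuse_trace_id(dt_run_id: str) -> str:
    """Map a dashed Dunetrace run ID to the 32-char hex ID Langfuse v4 uses."""
    return dt_run_id.replace("-", "")

to_langfuse_trace_id("b5ed23be-e4f0-43bc-8625-914223875508")
# 'b5ed23bee4f043bc8625914223875508'
```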

4. Call the explain endpoint

POST /v1/signals/{signal_id}/explain
Content-Type: application/json
Authorization: Bearer <your-key>

{
  "langfuse_trace_id": "b5ed23bee4f043bc8625914223875508"
}

Response includes root_cause, fix_content, fix_type, apply_blocked, and langfuse_prompt_name.

fix_type        | Meaning                                                            | apply_blocked
prompt_addition | One sentence to append to the system prompt                        | false — apply button shown
code_change     | Code or infrastructure fix needed (CONTEXT_BLOAT, SLOW_STEP, etc.) | true — apply button hidden
no_auto_apply   | Security signal (PROMPT_INJECTION_SIGNAL) — never auto-apply       | true — blocked at API level

5. Apply a fix to a managed prompt

When langfuse_prompt_name is non-null and apply_blocked is false:

POST /v1/signals/{signal_id}/apply-fix
Content-Type: application/json
Authorization: Bearer <your-key>

{
  "fix_content": "Do not repeat a search query you have already executed in this run.",
  "langfuse_prompt_name": "research-agent-prompt"
}

The fix is appended to the current prompt text and published as a new version. The dashboard shows "Applied as v4 in Langfuse ↗" with a link.

6. Track fix effectiveness

GET /v1/signals/{signal_id}/fix-status
Authorization: Bearer <your-key>

Returns runs_after_fix, recurrences_after_fix, and a verdict: verified (≥10 runs, 0 recurrences), likely_fixed (≥5 runs, 0 recurrences), still_occurring, or insufficient_data.
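The verdict thresholds can be expressed directly. A sketch of the documented rules, not the server implementation (that a recurrence takes precedence over the run count is an assumption):

```python
def fix_verdict(runs_after_fix: int, recurrences_after_fix: int) -> str:
    """Documented verdict thresholds for /fix-status (illustration only)."""
    if recurrences_after_fix > 0:
        return "still_occurring"
    if runs_after_fix >= 10:
        return "verified"
    if runs_after_fix >= 5:
        return "likely_fixed"
    return "insufficient_data"
```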

What each tool sees

Concern                              | Dunetrace                    | Langfuse
Raw prompts & completions            | Never — SHA-256 hashed       | Yes — full content
Structural failures (loops, stalls…) | Automatic, 15 detectors      | Manual inspection
Proactive alerting                   | Slack / webhook in <15s      | No — passive
Prompt fix workflow                  | Explain + apply in one click | Manual editing
Trace timeline                       | Step graph (hashed)          | Full span tree with content

The Langfuse trace is never stored by Dunetrace — fetched, analysed, discarded. See the LangChain guide for the full runnable example with both callbacks.


OpenTelemetry

pip install 'dunetrace[otel]' opentelemetry-exporter-otlp-proto-grpc
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import SimpleSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.resources import Resource
from dunetrace.integrations.otel import DunetraceOTelExporter

resource = Resource.create({
    "service.name": "my-agent-service",
    "deployment.environment": "production",
})
provider = TracerProvider(resource=resource)
provider.add_span_processor(SimpleSpanProcessor(OTLPSpanExporter()))

dt = Dunetrace(otel_exporter=DunetraceOTelExporter(provider))

Each agent run produces a trace with a deterministic trace_id derived from run_id. Failure signals at run end are written as indexed attributes on the root span (dunetrace.signal.0.failure_type, .severity, .confidence). HIGH / CRITICAL signals set span.status = ERROR.

OpenLLMetry / OTel receiver

OpenLLMetry instruments 40+ AI frameworks and emits standard gen_ai.* OTel spans. Add DunetraceOTelReceiver as a second exporter and get behavioral detection with zero agent-code changes.

pip install 'dunetrace[otel]'
from dunetrace.integrations.otel_receiver import DunetraceOTelReceiver

DunetraceOTelReceiver.attach(provider, dt, agent_id="my-agent")

from traceloop.sdk import Traceloop
Traceloop.init(app_name="my-agent", tracer_provider=provider)

Each OTel trace becomes one Dunetrace run. Spans with gen_ai.request.model translate to llm_called / llm_responded. Spans with gen_ai.tool.name become tool_called / tool_responded.

gen_ai.* attribute handling

Attribute                         | Handling
gen_ai.request.model              | Passed as-is (not sensitive)
gen_ai.usage.prompt_tokens        | Passed as-is
gen_ai.usage.completion_tokens    | Passed as-is
gen_ai.completion.0.finish_reason | Passed as-is
gen_ai.tool.name                  | Passed as-is
gen_ai.prompt / gen_ai.completion | SHA-256 hashed at receiver boundary
gen_ai.prompt.0.content           | SHA-256 hashed at receiver boundary
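The hashing is equivalent to replacing the attribute value with its SHA-256 hex digest. A minimal sketch (not Dunetrace's actual implementation) of what happens at the receiver boundary:

```python
import hashlib

def hash_attribute(value: str) -> str:
    """Replace a sensitive attribute value with its SHA-256 hex digest."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()

hash_attribute("What is 2 + 2?")  # 64 hex chars; the original content is unrecoverable
```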

FastAPI / ASGI

from fastapi import FastAPI
from dunetrace import Dunetrace, DunetraceASGIMiddleware, get_current_run

dt = Dunetrace()
dt.auto_instrument()

app = FastAPI()
app.add_middleware(
    DunetraceASGIMiddleware,
    dt=dt, agent_id="my-api", model="gpt-4o",
)

@app.post("/chat")
async def chat(query: str):
    run = get_current_run()        # run opened by middleware
    resp = await openai_client.chat.completions.create(
        model="gpt-4o", messages=[{"role": "user", "content": query}],
    )
    return resp.choices[0].message.content

The run is also available on request.state.dunetrace_run.

Flask / WSGI

from flask import Flask
from dunetrace import Dunetrace, DunetraceWSGIMiddleware

dt = Dunetrace()
dt.auto_instrument()

app = Flask(__name__)
app.wsgi_app = DunetraceWSGIMiddleware(app.wsgi_app, dt=dt, agent_id="my-api")

Django

# wsgi.py
from dunetrace import Dunetrace, DunetraceWSGIMiddleware
from django.core.wsgi import get_wsgi_application

dt = Dunetrace()
dt.auto_instrument()
application = DunetraceWSGIMiddleware(get_wsgi_application(), dt=dt, agent_id="django-api")

Auto-instrumentation

dt.auto_instrument() patches supported AI framework clients at the class level so every LLM call made inside a dt.run() context (or inside a @dt.agent() function or middleware-wrapped request) is tracked automatically.

Supported frameworks: openai, anthropic, httpx, requests. Uninstalled frameworks are silently skipped. Calling auto_instrument() more than once is safe.

dt.auto_instrument()                         # patch all installed
dt.auto_instrument(["openai", "anthropic"])  # LLM clients only
dt.auto_instrument(["httpx", "requests"])    # HTTP clients only

Grafana / Loki

dt = Dunetrace(emit_as_json=True)

Writes every event to stdout as a Loki-compatible NDJSON line. Each line includes ts, level, logger, event_type, agent_id, run_id, step_index, payload.
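One hypothetical line, built in Python for illustration — the field names match the list above, but the payload contents and exact values are made up:

```python
import json

# Example of a single Loki-compatible NDJSON event (illustrative values).
event = {
    "ts": "2025-01-15T10:30:00.123456789Z",
    "level": "info",
    "logger": "dunetrace",
    "event_type": "tool_called",
    "agent_id": "my-agent",
    "run_id": "b5ed23be-e4f0-43bc-8625-914223875508",
    "step_index": 3,
    "payload": {"tool": "web_search"},
}
print(json.dumps(event))   # one line per event on stdout
```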

Minimal Promtail pipeline stage:

pipeline_stages:
  - json:
      expressions: {ts: ts, event_type: event_type, agent_id: agent_id}
  - timestamp:
      source: ts
      format: RFC3339Nano
  - labels:
      agent_id:
      event_type:

get_current_run()

Returns the active RunContext for the current async task or thread, or None. Works inside @dt.agent(), ASGI middleware, WSGI middleware, and direct dt.run().

from dunetrace import get_current_run

def some_helper():
    run = get_current_run()
    if run:
        run.tool_called("cache_lookup")
        result = cache.get(key)
        run.tool_responded("cache_lookup", success=result is not None)
        return result

Policies

Runtime guardrails evaluated mid-run after every tool_called, llm_responded, and tool_responded event. Policies fire at most once per run (except log policies, which fire every time).

Local policies (no backend required)

from dunetrace import Dunetrace, PolicyViolation

dt = Dunetrace()

# Stop the run if tool calls exceed 5
dt.add_policy(
    name="cap tool calls",
    condition={"trigger": "tool_call_count", "operator": "gt", "value": 5},
    action={"type": "stop"},
)

# Downgrade model when estimated cost exceeds $0.50
dt.add_policy(
    name="cost cap",
    condition={"trigger": "cost_usd", "operator": "gt", "value": 0.50},
    action={"type": "switch_model", "params": {"model": "gpt-4o-mini"}},
)

# Inject a corrective prompt when a loop is detected mid-run
dt.add_policy(
    name="loop fix",
    condition={"trigger": "signal", "operator": "eq", "value": "TOOL_LOOP"},
    action={"type": "inject_prompt", "params": {"prompt": "Stop repeating tool calls. Summarise what you know and answer directly."}},
)

Remote policies (dashboard-managed)

When api_key and endpoint are set, the SDK fetches policies from the backend at run start and caches them for 60 seconds. Policies defined in the dashboard apply automatically — no code changes needed.

dt = Dunetrace(api_key="dt_live_...", endpoint="https://ingest.dunetrace.com")
# Policies defined in the Policies page are pulled at run start.

Local policies (added via add_policy()) take priority over remote ones at the same priority level.

Condition reference

Trigger         | Type  | What it measures
tool_call_count | int   | Total tool calls in the run so far
step_count      | int   | Current step index
cost_usd        | float | Accumulated LLM cost in USD (model-aware pricing)
error_count     | int   | Failed tool calls (success=False)
finish_reason   | str   | Latest LLM finish_reason (e.g. "length", "stop")
llm_latency_ms  | int   | Latest LLM call latency in milliseconds
signal          | str   | Detector signal name — runs the full detector suite lazily (e.g. "TOOL_LOOP")

Supported operators: gt gte lt lte eq neq contains
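The operator semantics can be sketched as follows (illustration, not the SDK's implementation; that contains checks whether the observed value contains the condition value is an assumption):

```python
import operator as op

# Sketch of condition evaluation with the documented operators.
OPS = {
    "gt": op.gt, "gte": op.ge, "lt": op.lt, "lte": op.le,
    "eq": op.eq, "neq": op.ne,
    "contains": lambda observed, value: value in observed,
}

def condition_met(condition: dict, observed) -> bool:
    """Evaluate a policy condition against the latest observed value."""
    return OPS[condition["operator"]](observed, condition["value"])

condition_met({"trigger": "cost_usd", "operator": "gt", "value": 0.50}, 0.62)   # True
```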

Action reference

Action type   | Effect
stop          | Raises PolicyViolation; run exits with exit_reason="policy_violation"
switch_model  | Sets run.model_override — read it between steps to switch the model
inject_prompt | Appends to run.prompt_additions — pop with run.pop_prompt_addition() and prepend to next LLM call
log           | Emits policy.triggered event; no interruption; fires on every matching event
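Between steps, an agent loop can honour switch_model and inject_prompt by reading the documented run attributes. A sketch — the StubRun class below is a hypothetical stand-in for a real run context, used only to make the example self-contained:

```python
class StubRun:
    """Hypothetical stand-in for a real run context (illustration only)."""
    def __init__(self):
        self.model_override = None
        self.prompt_additions = []
    def pop_prompt_addition(self):
        # Mirrors the documented run.pop_prompt_addition() helper.
        return self.prompt_additions.pop(0) if self.prompt_additions else None

def next_call_params(run, default_model: str, prompt: str):
    """Pick the model and prompt for the next LLM call, honouring policy actions."""
    model = run.model_override or default_model    # switch_model action
    addition = run.pop_prompt_addition()           # inject_prompt action
    if addition:
        prompt = addition + "\n\n" + prompt
    return model, prompt

run = StubRun()
run.model_override = "gpt-4o-mini"
run.prompt_additions.append("Stop repeating tool calls.")
model, prompt = next_call_params(run, "gpt-4o", "Answer the question.")
```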

Dashboard CRUD

Policies can be created, edited, toggled, and deleted from the Policies page at http://localhost:3000. The backend REST API:

Endpoint                       | Description
GET /v1/policies               | List all policies
POST /v1/policies              | Create a policy
PUT /v1/policies/{id}          | Replace a policy
DELETE /v1/policies/{id}       | Delete a policy
PATCH /v1/policies/{id}/toggle | Enable / disable
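A POST /v1/policies body presumably mirrors the condition/action shape used by add_policy() above — a hedged sketch (whether the endpoint accepts exactly these fields is an assumption):

```python
import json

# Hypothetical request body for POST /v1/policies, reusing the
# condition/action shape from add_policy().
policy = {
    "name": "cap tool calls",
    "condition": {"trigger": "tool_call_count", "operator": "gt", "value": 5},
    "action": {"type": "stop"},
}
body = json.dumps(policy)   # POST with Content-Type: application/json and your bearer token
```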