Documentation

Run agents. Know when they break.

Dunetrace is runtime observability for AI agents. Fifteen behavioral detectors, deterministic explanations, Slack alerts in under fifteen seconds. These pages cover everything from a two-minute install to the database schema.

Start here
Quick start
Up and running in two minutes
Clone, docker compose up, instrument your agent, open the dashboard. Runs locally with no API key.
Architecture
How the pipeline works
Five services, one Postgres, one static dashboard. SDK → ingest → detector → explain → alerts. Failure modes included.
Integrate your agent
LangChain / LangGraph
One callback, zero agent changes
DunetraceCallbackHandler plugs into the LangChain callback system and translates every event automatically.
Custom Python agent
Decorator, middleware, or manual
Five paths: @dt.agent(), ASGI, WSGI, manual dt.run(), or OpenTelemetry receiver. Pick what fits.
TypeScript agent
No package required
Send events directly to the ingest HTTP endpoint from any TypeScript or Node.js agent. Same detectors and alerts as Python.
Integrations
FastAPI, Flask, OTel, Loki, Policies
OpenLLMetry, Grafana Loki, Tempo, Honeycomb, Datadog, runtime guardrail policies. Side-by-side setup for each.
Detectors
All 15 behavioral detectors
What each one catches, its threshold, how to tune detectors.yml, and shadow-mode evaluation.
Operate it
Dashboard
Mission control at :3000
Overview, Runs, Alerts, Analytics, Heatmap, Agents, Compare, Detectors. Auto-refreshes every 15s.
Alerts
Slack, webhook, weekly digest
Rate context, HMAC signatures, at-least-once delivery, and the Monday 9am UTC digest.
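The HMAC signatures mentioned above can be checked on the receiving side roughly as follows. This is a minimal sketch, not Dunetrace's documented scheme: it assumes the signature is a hex-encoded HMAC-SHA256 of the raw request body under a shared secret. The names `sign` and `verify`, the secret, and the sample payload are hypothetical; the Alerts page is the place to confirm the real header name and encoding.

```python
import hashlib
import hmac


def sign(secret: bytes, body: bytes) -> str:
    """Hex HMAC-SHA256 of the raw body (assumed scheme, not confirmed by the docs)."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify(secret: bytes, body: bytes, signature: str) -> bool:
    """Compare in constant time so timing does not reveal how much matched."""
    expected = sign(secret, body)
    return hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8"))


# Hypothetical secret and payload, for illustration only.
secret = b"example-shared-secret"
body = b'{"detector": "tool_loop", "agent_id": "search-agent"}'
good_signature = sign(secret, body)
```

Because delivery is at-least-once, a receiver should also tolerate seeing the same alert more than once.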
Policies
GitHub
Privacy, Terms & Code of Conduct
All policy documents live in the repository — privacy architecture, Apache 2.0 terms, and community standards.
FAQ
How is Dunetrace different from LangSmith or Langfuse?
LangSmith and Langfuse answer "what happened?" after you already know something broke — they store full traces for forensic investigation. Dunetrace answers a different question: "is something breaking right now?" It watches the structural pattern of every run and fires a Slack alert within 15 seconds. They're complementary: LangSmith tells you what happened in the run Dunetrace flagged.
Does Dunetrace store my prompts or agent outputs?
No. Every prompt, tool argument, and model output is SHA-256 hashed in-process before any data leaves your agent. The database schema has no column that could store raw content: storing it is structurally impossible, not just against policy. What is transmitted: model names, token counts, latencies, tool names, finish reasons, and step counts.
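To make the hashing step concrete, here is a minimal sketch of hashing content in-process before building an event. The event shape, the field names, and the `build_event` helper are hypothetical and are not the SDK's actual API; only the use of SHA-256 comes from the answer above.

```python
import hashlib


def hash_content(text: str) -> str:
    """Return the hex SHA-256 digest of a UTF-8 string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def build_event(model: str, prompt: str, prompt_tokens: int, latency_ms: float) -> dict:
    # Hypothetical event shape: the raw prompt never appears, only its digest
    # plus structural metadata (model name, token count, latency).
    return {
        "model": model,
        "prompt_hash": hash_content(prompt),
        "prompt_tokens": prompt_tokens,
        "latency_ms": latency_ms,
    }


event = build_event("example-model", "Summarize the attached report.", 7, 412.5)
```

The digest is always 64 hex characters regardless of prompt length, so identical prompts can still be correlated across runs without the text itself being stored.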
What is the overhead added to my agent?
Under 500 μs per run with default HTTP ingest. The SDK appends events to an in-memory ring buffer (<1 μs) and a background drain thread ships them asynchronously. Even if the ingest API is down, your agent is never blocked — events buffer up to 10,000 and are shipped when the API recovers.
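The buffer-and-drain pattern described above can be sketched as follows. This is an illustration under stated assumptions, not the SDK's implementation: the `EventBuffer` class, its `send` callback, and the drain interval are hypothetical, and a bounded `collections.deque` stands in for the ring buffer.

```python
import threading
from collections import deque


class EventBuffer:
    """Bounded in-memory buffer drained by a background thread (hypothetical sketch)."""

    def __init__(self, send, capacity=10_000, interval=0.05):
        self._events = deque(maxlen=capacity)  # bounded: a full deque discards from the far end
        self._send = send                      # callable that ships one event; may raise
        self._interval = interval
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._drain, daemon=True)
        self._thread.start()

    def emit(self, event):
        # deque.append is thread-safe and does no I/O, so the agent is never blocked.
        self._events.append(event)

    def _flush(self):
        while self._events:
            event = self._events.popleft()
            try:
                self._send(event)
            except Exception:
                # Ingest unavailable: put the event back and retry on the next cycle.
                self._events.appendleft(event)
                return

    def _drain(self):
        while not self._stop.is_set():
            self._flush()
            self._stop.wait(self._interval)
        self._flush()  # final attempt on shutdown

    def close(self):
        self._stop.set()
        self._thread.join()


sent = []
buf = EventBuffer(send=sent.append)
buf.emit({"type": "llm_call", "latency_ms": 412})
buf.close()
```

The agent thread only ever calls `emit`, so network latency and ingest outages are confined to the drain thread.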
Which frameworks are supported?
LangChain and LangGraph (via DunetraceCallbackHandler), FastAPI and Starlette (ASGI middleware), Flask and Django (WSGI middleware), any OpenAI / Anthropic / httpx / requests-based agent (via auto_instrument()), and any OpenTelemetry pipeline emitting gen_ai.* spans (via DunetraceOTelReceiver).
What is shadow mode?
Shadow mode runs a detector without alerting anyone: its signals are stored and visible in the dashboard but never delivered to Slack or webhooks. All 15 built-in detectors ship live. Custom detectors start in shadow mode, so you can validate precision against real traffic before adding them to LIVE_DETECTORS and letting them page anyone.
Can I self-host Dunetrace?
Yes — it's designed for self-hosting. docker compose up brings up the full stack (ingest API, detector worker, alerts worker, customer API, dashboard, Postgres) inside your own infrastructure. No data leaves your network unless you configure a Slack webhook or external webhook destination.
How do I tune detectors to reduce false positives?
Edit detectors.yml in the repo root and restart the detector service with docker compose restart detector. Named sections match agent_id and inherit from default, so you can give search agents a higher tool_loop.threshold without affecting other agents. See the detector reference for all configurable thresholds.
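The inheritance rule above (named sections match agent_id and inherit from default) can be sketched as a key-by-key merge. This assumes nested keys are merged recursively, which the text does not confirm. The `resolve` helper, the `search-agent` section name, the `window` key, and all numeric values are hypothetical; only `default` and `tool_loop.threshold` come from the answer above.

```python
from copy import deepcopy


def deep_merge(base: dict, override: dict) -> dict:
    """Return base with override applied key by key, recursing into nested dicts."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


def resolve(config: dict, agent_id: str) -> dict:
    """Effective settings for one agent: its named section layered over default."""
    return deep_merge(config.get("default", {}), config.get(agent_id, {}))


# Stand-in for a parsed detectors.yml; values are illustrative only.
config = {
    "default": {"tool_loop": {"threshold": 3, "window": 10}},
    "search-agent": {"tool_loop": {"threshold": 8}},
}

search_settings = resolve(config, "search-agent")
other_settings = resolve(config, "billing-agent")
```

Here the search agent gets the higher `tool_loop.threshold` while still inheriting every other setting, and an agent with no named section falls back to `default` unchanged.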
What is the license?
Apache 2.0. Free to use, modify, and distribute for any purpose — commercial or otherwise. See the LICENSE file for the full text.
Something missing?
Open an issue on GitHub or email the team. Docs PRs welcome.