What each detector catches
Fifteen structural detectors run automatically on every completed run. All thresholds are configurable via detectors.yml — no code changes, no rebuild.
| Detector | Trigger | Severity |
|---|---|---|
| PROMPT_INJECTION_SIGNAL | Input matches known injection / jailbreak patterns | CRIT |
| TOOL_LOOP | Same tool called ≥3× in a 5-tool-call window | HIGH |
| TOOL_THRASHING | Agent alternates between exactly two tools | HIGH |
| LLM_TRUNCATION_LOOP | finish_reason=length fires ≥2 times | HIGH |
| RETRY_STORM | Same tool fails 3+ times in a row | HIGH |
| EMPTY_LLM_RESPONSE | Zero-length output with finish_reason=stop | HIGH |
| CASCADING_TOOL_FAILURE | 3+ consecutive failures across 2+ distinct tools | HIGH |
| SLOW_STEP | Tool call >15s or LLM call >30s | MED/HIGH |
| CONTEXT_BLOAT | Prompt tokens grow 3× from first to last LLM call | MED |
| GOAL_ABANDONMENT | Tool use stops, then ≥4 consecutive LLM calls | MED |
| REASONING_STALL | LLM:tool ratio ≥4× (thinking without acting) | MED |
| STEP_COUNT_INFLATION | Run used >2× P75 step count for this agent | MED |
| FIRST_STEP_FAILURE | Error or empty output at step ≤2 | MED |
| RAG_EMPTY_RETRIEVAL | Retrieval returned 0 results, agent answered anyway | MED |
| TOOL_AVOIDANCE | Final answer without calling available tools | MED |
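As an illustration of how one of these triggers can be evaluated, here is a minimal sketch of the TOOL_LOOP rule (same tool called ≥3× in a 5-tool-call window). The function name `find_tool_loop` and the plain list-of-tool-names input are assumptions made for this example, not the detector service's actual code.

```python
from collections import Counter
from typing import Optional, Sequence


def find_tool_loop(
    tool_calls: Sequence[str], threshold: int = 3, window: int = 5
) -> Optional[str]:
    """Return the first tool seen >= `threshold` times within any run of
    `window` consecutive tool calls, or None if no window qualifies.

    Hypothetical helper: `tool_calls` is the ordered list of tool names
    invoked during one run.
    """
    # Slide a fixed-size window over the call sequence. Runs shorter than
    # the window are checked once, as a single partial window.
    for start in range(max(1, len(tool_calls) - window + 1)):
        counts = Counter(tool_calls[start:start + window])
        for tool, hits in counts.most_common(1):
            if hits >= threshold:
                return tool
    return None


looping = find_tool_loop(["search", "search", "read", "search", "write"])
healthy = find_tool_loop(["search", "read", "search", "read", "write"])
```

Here `looping` names the repeated tool, while `healthy` is `None` because no tool appears three times in any five-call window.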
STEP_COUNT_INFLATION needs a warm baseline. P75 is computed from the last 50 successfully completed runs for the same agent_id + agent_version. The detector stays silent until at least 10 such runs exist, then activates automatically.
Tuning thresholds
Edit detectors.yml in the repo root.
default:
  tool_loop:
    threshold: 2  # fire if same tool called ≥N times in window
  context_bloat:
    growth_factor: 4.0  # last/first prompt token ratio to trigger
web-research:
  tool_loop:
    threshold: 5  # search agents legitimately repeat queries
Named sections match agent_id and inherit from default, overriding only what you specify. Restart the detector to apply:
docker compose restart detector
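The inheritance rule (a named section overrides only the keys it specifies and takes everything else from default) can be sketched as a recursive merge. This is a rough illustration under assumptions: `deep_merge` and `resolve_config` are hypothetical names, and the file is represented as an already-parsed dict so the example needs no YAML library.

```python
from copy import deepcopy
from typing import Any, Dict

Config = Dict[str, Any]

# The detectors.yml example, written out as the dict a YAML parser
# would produce for it.
PARSED: Config = {
    "default": {
        "tool_loop": {"threshold": 2},
        "context_bloat": {"growth_factor": 4.0},
    },
    "web-research": {
        "tool_loop": {"threshold": 5},
    },
}


def deep_merge(base: Config, override: Config) -> Config:
    """Return a copy of `base` with `override` applied key by key."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: merge them instead of replacing.
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


def resolve_config(parsed: Config, agent_id: str) -> Config:
    """Effective settings for one agent: default plus its named section."""
    return deep_merge(parsed.get("default", {}), parsed.get(agent_id, {}))


web_research = resolve_config(PARSED, "web-research")
other_agent = resolve_config(PARSED, "some-other-agent")
```

With this merge, `web_research` gets `threshold: 5` for tool_loop but still inherits `growth_factor: 4.0`, and an agent with no named section resolves to the default values unchanged.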
Shadow mode
Every signal is stored with a shadow flag. The alerts worker only delivers signals where shadow = false.
All 15 built-in detectors ship live. Custom detectors start in shadow mode — signals stored and visible in the dashboard, but no Slack or webhook alert fires — until you add them to LIVE_DETECTORS:
# services/detector/detector_svc/db.py
LIVE_DETECTORS: set[str] = {
    "TOOL_LOOP",
    "YOUR_NEW_DETECTOR",  # promote once precision > 80%
    …
}
This lets you validate a new detector against real traffic before it pages anyone.
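The gating described in this section can be sketched as follows. The `Signal` dataclass, `make_signal`, and `deliverable` are hypothetical names for illustration only; the real schema and alerts worker may look different, and `LIVE_DETECTORS` here is a one-item stand-in for the real set.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set

# Illustrative subset; the real set lives in detector_svc/db.py.
LIVE_DETECTORS: Set[str] = {"TOOL_LOOP"}


@dataclass(frozen=True)
class Signal:
    detector: str
    run_id: str
    shadow: bool


def make_signal(
    detector: str, run_id: str, live: Set[str] = LIVE_DETECTORS
) -> Signal:
    """Every signal is stored; those from non-live detectors are shadow."""
    return Signal(detector, run_id, shadow=detector not in live)


def deliverable(signals: Iterable[Signal]) -> List[Signal]:
    """What the alerts worker would send: only non-shadow signals."""
    return [s for s in signals if not s.shadow]


stored = [
    make_signal("TOOL_LOOP", "run-1"),
    make_signal("YOUR_NEW_DETECTOR", "run-1"),
]
to_alert = deliverable(stored)
```

Both signals end up in `stored`, but only the TOOL_LOOP one reaches `to_alert`. Promoting the custom detector amounts to adding its name to the live set.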
Shadow signals in the dashboard
The Alerts page surfaces shadow signals in a dedicated section below the live alert groups. They are rendered with a dashed border, reduced opacity, and a SHADOW badge. The section only appears when at least one shadow signal exists.
To fetch shadow signals over the API, pass include_shadow=true:
curl "http://localhost:8002/v1/agents/my-agent/signals?include_shadow=true" \
  -H "Authorization: Bearer dt_dev_test"
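For scripted checks, the same request can be built with Python's standard library. This is a small sketch: `shadow_signals_request` is a hypothetical helper, and the response format is not specified here, so the example stops at constructing the request.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def shadow_signals_request(base_url: str, agent_id: str, token: str) -> Request:
    """Build the GET request from the curl example, including shadow signals."""
    query = urlencode({"include_shadow": "true"})
    url = f"{base_url}/v1/agents/{quote(agent_id, safe='')}/signals?{query}"
    return Request(url, headers={"Authorization": f"Bearer {token}"})


req = shadow_signals_request("http://localhost:8002", "my-agent", "dt_dev_test")
# To send it against a running instance:
#   from urllib.request import urlopen
#   body = urlopen(req).read()
```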
How detection works
- Detector worker polls Postgres every 5 seconds for completed or stalled runs
- Fetches all events for that run from `events`
- Replays them into a `RunState` (tool calls, LLM calls, retrievals, durations)
- Runs all 15 detectors against the state
- Writes any triggered `FailureSignal` rows
- Marks the run processed in `processed_runs`
Detection adds zero latency to the agent — it runs entirely after the run completes.
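The replay-then-detect steps can be sketched in a few lines. Everything about the event shape here (dicts with `kind`, `tool`, `ok`, and `finish_reason` keys) is an assumption for illustration, as are the function names; durations and database access are left out. Only two of the detectors from the table are shown.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class RunState:
    tool_calls: List[Dict] = field(default_factory=list)
    llm_calls: List[Dict] = field(default_factory=list)
    retrievals: List[Dict] = field(default_factory=list)


def replay(events: List[Dict]) -> RunState:
    """Fold an ordered event list into a RunState (hypothetical schema)."""
    state = RunState()
    buckets = {
        "tool": state.tool_calls,
        "llm": state.llm_calls,
        "retrieval": state.retrievals,
    }
    for event in events:
        bucket = buckets.get(event.get("kind"))
        if bucket is not None:
            bucket.append(event)
    return state


def llm_truncation_loop(state: RunState) -> Optional[str]:
    # finish_reason=length fires >= 2 times
    hits = sum(1 for c in state.llm_calls if c.get("finish_reason") == "length")
    return "LLM_TRUNCATION_LOOP" if hits >= 2 else None


def retry_storm(state: RunState) -> Optional[str]:
    # Same tool fails 3+ times in a row
    streak, last_tool = 0, None
    for call in state.tool_calls:
        if call.get("ok", True):
            streak, last_tool = 0, None
        elif call.get("tool") == last_tool:
            streak += 1
        else:
            streak, last_tool = 1, call.get("tool")
        if streak >= 3:
            return "RETRY_STORM"
    return None


DETECTORS: List[Callable[[RunState], Optional[str]]] = [
    llm_truncation_loop,
    retry_storm,
]


def detect(events: List[Dict]) -> List[str]:
    """Replay a finished run, then return the names of triggered detectors."""
    state = replay(events)
    return [name for name in (d(state) for d in DETECTORS) if name]
```

Because each detector is a pure function of the replayed state, a new one can be exercised against recorded event lists without touching the agent.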