Docs / Reference · Detectors

Detectors

Fifteen structural detectors run automatically on every completed run. All thresholds configurable. Shadow mode lets you evaluate new detectors against real traffic before they page anyone.

What each detector catches

Fifteen structural detectors run automatically on every completed run. All thresholds are configurable via detectors.yml — no code changes, no rebuild.

DetectorTriggerSeverity
PROMPT_INJECTION_SIGNALInput matches known injection / jailbreak patternsCRIT
TOOL_LOOPSame tool called ≥3× in a 5-tool-call windowHIGH
TOOL_THRASHINGAgent alternates between exactly two toolsHIGH
LLM_TRUNCATION_LOOPfinish_reason=length fires ≥2 timesHIGH
RETRY_STORMSame tool fails 3+ times in a rowHIGH
EMPTY_LLM_RESPONSEZero-length output with finish_reason=stopHIGH
CASCADING_TOOL_FAILURE3+ consecutive failures across 2+ distinct toolsHIGH
SLOW_STEPTool call >15s or LLM call >30sMED/HIGH
CONTEXT_BLOATPrompt tokens grow 3× from first to last LLM callMED
GOAL_ABANDONMENTTool use stops, then ≥4 consecutive LLM callsMED
REASONING_STALLLLM:tool ratio ≥4× — thinking without actingMED
STEP_COUNT_INFLATIONRun used >2× P75 step count for this agentMED
FIRST_STEP_FAILUREError or empty output at step ≤2MED
RAG_EMPTY_RETRIEVALRetrieval returned 0 results, agent answered anywayMED
TOOL_AVOIDANCEFinal answer without calling available toolsMED
STEP_COUNT_INFLATION needs a warm baseline. P75 is computed from the last 50 successfully completed runs for the same agent_id + agent_version. The detector stays silent until at least 10 such runs exist, then activates automatically.

Tuning thresholds

Edit detectors.yml in the repo root.

default:
  tool_loop:
    threshold: 2        # fire if same tool called ≥N times in window
  context_bloat:
    growth_factor: 4.0  # last/first prompt token ratio to trigger

web-research:
  tool_loop:
    threshold: 5        # search agents legitimately repeat queries

Named sections match agent_id and inherit from default, overriding only what you specify. Restart the detector to apply:

docker compose restart detector

Shadow mode

Every signal is stored with a shadow flag. The alerts worker only delivers signals where shadow = false.

All 15 built-in detectors ship live. Custom detectors start in shadow mode — signals stored and visible in the dashboard, but no Slack or webhook alert fires — until you add them to LIVE_DETECTORS:

# services/detector/detector_svc/db.py
LIVE_DETECTORS: set[str] = {
    "TOOL_LOOP",
    "YOUR_NEW_DETECTOR",   # promote once precision > 80%
    …
}

This lets you validate a new detector against real traffic before it pages anyone.

Shadow signals in the dashboard

The Alerts page surfaces shadow signals in a dedicated section below the live alert groups. Dashed border, reduced opacity, SHADOW badge. The section only appears when at least one shadow signal exists.

curl "http://localhost:8002/v1/agents/my-agent/signals?include_shadow=true" \
  -H "Authorization: Bearer dt_dev_test"

How detection works

  1. Detector worker polls Postgres every 5 seconds for completed or stalled runs
  2. Fetches all events for that run from events
  3. Replays them into a RunState — tool calls, LLM calls, retrievals, durations
  4. Runs all 15 detectors against the state
  5. Writes any triggered FailureSignal rows
  6. Marks the run processed in processed_runs

Detection adds zero latency to the agent — it runs entirely after the run completes.