Correction Capture
In plain terms
Corrections are first-class, structured records of a human fixing an agent’s output, for example “the charge code should have been DTHC, not THC”. Each correction captures exactly what was wrong, what it should have been, and why. It is then embedded into a vector store and retrieved on future relevant runs.
The analogy: a red-pen edit on a draft. The agent doesn’t just get the fix once: it keeps the marked-up page and checks it before writing anything similar again.
Unlike Learnings (free-text insights), corrections are structured: field-level, anchored to the run that produced the mistake, and groupable by real-world entity (a vendor, a customer) for accuracy analytics.
When to use it
- Human-in-the-loop review workflows — invoice reconciliation, customs filings, document extraction, anywhere a reviewer fixes agent output before it ships.
- High-variance decision spaces — every vendor names charges differently; each correction permanently teaches the agent that vendor’s convention.
- Accuracy measurement — corrections-per-run is the inverse of first-pass accuracy. See Observability for the exported metrics.
When NOT to use it
- General insights (“customs holds explain most delays”) → Learnings.
- Outcome tracking for agent decisions → Decision Log record_outcome.
- Deterministic tasks with a finite rule set: if a prompt update permanently fixes the issue, you don’t need a learning loop.
Configuration
| Property | Type | Required | Default | What it controls |
|---|---|---|---|---|
| vectorStore | VectorStore | Yes | none | Where corrections are indexed for semantic retrieval |
| collection | string | No | "agentium_corrections" | The named bucket inside the vector store |
| topK | number | No | 3 | How many relevant corrections to inject per run |
| minScore | number | No | none | Relevance floor (0–1) for retrieval. Recommended 0.3–0.5 |
| invalidateContradicted | boolean | No | true | Auto-invalidate unverified (llm-extracted) learnings that semantically collide with a new correction |
| contradictionThreshold | number | No | 0.85 | Similarity floor for contradiction invalidation |
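To make topK and minScore concrete, here is a minimal sketch of how a relevance floor and a result cap might combine at retrieval time. The `ScoredCorrection` and `RetrievalOptions` types and the `selectCorrections` function are hypothetical names for illustration, not part of the library, and applying the floor before the cap is an assumption.

```typescript
// Hypothetical shape of a scored vector-store hit; not the library's actual type.
interface ScoredCorrection {
  id: string;
  score: number; // similarity, assumed to lie between 0 and 1
}

// Mirrors two rows of the configuration table; the type itself is illustrative.
interface RetrievalOptions {
  topK?: number; // defaults to 3, as in the table
  minScore?: number; // no default: when omitted, no floor is applied
}

// Assumption: the floor is applied first, then the best topK hits are kept.
function selectCorrections(
  hits: ScoredCorrection[],
  { topK = 3, minScore }: RetrievalOptions = {},
): ScoredCorrection[] {
  return hits
    .filter((hit) => minScore === undefined || hit.score >= minScore)
    .sort((a, b) => b.score - a.score) // filter() returned a copy, so the input is not mutated
    .slice(0, topK);
}

const hits: ScoredCorrection[] = [
  { id: "c1", score: 0.91 },
  { id: "c2", score: 0.22 },
  { id: "c3", score: 0.35 },
  { id: "c4", score: 0.18 },
];

const withFloor = selectCorrections(hits, { minScore: 0.4 });
const withoutFloor = selectCorrections(hits);

console.log(withFloor.map((h) => h.id)); // [ 'c1' ]
console.log(withoutFloor.map((h) => h.id)); // [ 'c1', 'c3', 'c2' ]
```

Without a floor, the three best hits are injected even when two of them are only weakly related, which is the case the recommended 0.3–0.5 range for minScore is meant to avoid.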
The Correction record
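As a rough mental model, a correction can be pictured as the record sketched below. This is an assumption-laden illustration inferred from the description on this page: only entityKey and originalInput are names that appear here, while `CorrectionSketch`, `makeCorrection`, and every other field name are hypothetical.

```typescript
// Only two of the four scope levels are named on this page.
type Scope = "agent" | "user";

// Hypothetical record shape; field names other than entityKey and
// originalInput are illustrative assumptions, not the library's schema.
interface CorrectionSketch {
  field: string; // which output field was wrong
  wrongValue: string; // what the agent produced
  correctValue: string; // what it should have been
  reason?: string; // why the original value was wrong
  runId: string; // the run that produced the mistake
  entityKey?: string; // real-world entity (vendor, customer) used for grouping
  originalInput?: string; // the input that led to the mistake, if recorded
  scope: Scope;
}

// Fills in the agent-level default scope when the caller does not set one.
function makeCorrection(
  input: Omit<CorrectionSketch, "scope"> & { scope?: Scope },
): CorrectionSketch {
  return { ...input, scope: input.scope ?? "agent" };
}

const example = makeCorrection({
  field: "chargeCode",
  wrongValue: "THC",
  correctValue: "DTHC",
  reason: "This vendor bills destination terminal handling separately",
  runId: "run-123", // hypothetical identifier
  entityKey: "vendor:example", // hypothetical identifier
});

console.log(example.scope); // agent
```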
Corrections default to agent scope (unlike learnings, which default to user). Fixing an agent’s output is workflow knowledge: every user of that agent should benefit. The same four-level scope hierarchy applies for reads.
Three ways to record a correction
1. HTTP endpoint (review UIs, backend services)
Every agent served via createAgentRouter() with corrections enabled gets a corrections endpoint. It responds with 201 and the stored record, or with 404 if corrections aren’t configured for the agent.
2. Programmatic API
3. Agent tools (in-conversation)
When a user points out a mistake mid-conversation, the agent records it itself via the auto-exposed record_correction tool.
Retrieval at inference time
On every run, stored corrections are semantically matched against the current input, and the most relevant ones are injected into the system prompt.
Accuracy analytics
CorrectionStore exposes the raw material for accuracy dashboards.
Each recorded correction also emits a memory.correction.recorded event, which MetricsExporter consumes to compute per-agent correctionsTotal and correctionRate, exported via Prometheus and JSON.
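The arithmetic behind such a dashboard can be sketched as follows. The sketch assumes correctionRate means corrections divided by runs; how MetricsExporter actually defines it is not stated here. `RunEvent`, `CorrectionEvent`, `accuracyByEntity`, and the entity names are all hypothetical.

```typescript
// Hypothetical minimal records; real stored corrections carry more fields.
interface RunEvent {
  runId: string;
  entityKey: string;
}
interface CorrectionEvent {
  runId: string;
  entityKey: string;
}

interface EntityAccuracy {
  runs: number;
  corrections: number;
  correctionRate: number; // corrections per run (assumed definition)
  firstPassAccuracy: number; // share of runs that needed no correction
}

function accuracyByEntity(
  runs: RunEvent[],
  corrections: CorrectionEvent[],
): Map<string, EntityAccuracy> {
  const correctedRunIds = new Set(corrections.map((c) => c.runId));
  const totals = new Map<string, { runs: number; cleanRuns: number; corrections: number }>();
  const bucket = (key: string) => {
    let entry = totals.get(key);
    if (!entry) {
      entry = { runs: 0, cleanRuns: 0, corrections: 0 };
      totals.set(key, entry);
    }
    return entry;
  };

  for (const run of runs) {
    const entry = bucket(run.entityKey);
    entry.runs += 1;
    if (!correctedRunIds.has(run.runId)) entry.cleanRuns += 1;
  }
  for (const correction of corrections) {
    bucket(correction.entityKey).corrections += 1;
  }

  const result = new Map<string, EntityAccuracy>();
  totals.forEach((entry, key) => {
    result.set(key, {
      runs: entry.runs,
      corrections: entry.corrections,
      correctionRate: entry.runs === 0 ? 0 : entry.corrections / entry.runs,
      firstPassAccuracy: entry.runs === 0 ? 0 : entry.cleanRuns / entry.runs,
    });
  });
  return result;
}

const runs: RunEvent[] = [
  { runId: "r1", entityKey: "acme" },
  { runId: "r2", entityKey: "acme" },
  { runId: "r3", entityKey: "acme" },
  { runId: "r4", entityKey: "acme" },
  { runId: "r5", entityKey: "globex" },
  { runId: "r6", entityKey: "globex" },
];
const corrections: CorrectionEvent[] = [
  { runId: "r1", entityKey: "acme" }, // two fields fixed on the same run
  { runId: "r1", entityKey: "acme" },
  { runId: "r3", entityKey: "acme" },
];

const report = accuracyByEntity(runs, corrections);
console.log(report.get("acme")); // correctionRate 0.75, firstPassAccuracy 0.5
console.log(report.get("globex")); // correctionRate 0, firstPassAccuracy 1
```

Because one run can receive several field-level corrections, the two numbers are related but not simple complements: the hypothetical "acme" vendor has a correction rate of 0.75 while half of its runs were right the first time.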
Self-corrective invalidation (v2.5+)
A human correction is authoritative. By default, recording one automatically invalidates any unverified (llm-extracted) learnings that semantically collide with it (≥contradictionThreshold similarity) — the stale AI hypothesis is retired and the correction supersedes it. Human-authored learnings are never auto-invalidated. Each invalidation emits a memory.learning.invalidated event.
Disable with corrections: { invalidateContradicted: false }.
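The invalidation rule can be sketched in a few lines. This is an illustration under stated assumptions, not the library's implementation: it assumes cosine similarity as the similarity measure, and `LearningSketch`, `cosineSimilarity`, and `invalidateContradicted` (here a function, not the config flag) are hypothetical names.

```typescript
// Hypothetical learning record; the source labels are assumptions.
interface LearningSketch {
  id: string;
  source: "llm-extracted" | "human";
  embedding: number[];
  invalidated: boolean;
}

// Cosine similarity is assumed here; the page does not name the metric.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("embedding dimensions differ");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Marks unverified learnings at or above the threshold as invalidated and
// returns their ids. Human-authored learnings are never touched.
function invalidateContradicted(
  correctionEmbedding: number[],
  learnings: LearningSketch[],
  contradictionThreshold = 0.85,
): string[] {
  const invalidatedIds: string[] = [];
  for (const learning of learnings) {
    if (learning.invalidated || learning.source !== "llm-extracted") continue;
    const similarity = cosineSimilarity(correctionEmbedding, learning.embedding);
    if (similarity >= contradictionThreshold) {
      learning.invalidated = true;
      invalidatedIds.push(learning.id);
    }
  }
  return invalidatedIds;
}

const learnings: LearningSketch[] = [
  { id: "l1", source: "llm-extracted", embedding: [0.9, 0.1, 0], invalidated: false },
  { id: "l2", source: "human", embedding: [1, 0, 0], invalidated: false },
  { id: "l3", source: "llm-extracted", embedding: [0.5, 0.8, 0.2], invalidated: false },
];

const retired = invalidateContradicted([1, 0, 0], learnings);
console.log(retired); // [ 'l1' ]
```

In this toy data, l2 is an exact match for the correction but survives because it is human-authored, and l3 survives because its similarity (about 0.52) is below the 0.85 default.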
Regression evals (v2.5+)
Corrections recorded with originalInput become replayable test cases, proof that the learning loop works.
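A replay of this kind might look like the sketch below. It assumes each correction names a field and its corrected value and that the agent can be called as a plain async function. `ReplayableCorrection`, `AgentFn`, `replayCorrections`, and the stub agent are hypothetical stand-ins, not the library's eval API.

```typescript
// Hypothetical slice of a correction record; only originalInput is named on this page.
interface ReplayableCorrection {
  field: string;
  correctValue: string;
  originalInput?: string;
}

// Stand-in for invoking an agent: takes the original input, returns extracted fields.
type AgentFn = (input: string) => Promise<Record<string, string>>;

interface ReplayResult {
  total: number; // corrections that could be replayed
  passed: number; // replays where the agent now produces the corrected value
  failures: string[];
}

async function replayCorrections(
  corrections: ReplayableCorrection[],
  agent: AgentFn,
): Promise<ReplayResult> {
  const result: ReplayResult = { total: 0, passed: 0, failures: [] };
  for (const correction of corrections) {
    // Corrections without an originalInput cannot be replayed, so they are skipped.
    if (correction.originalInput === undefined) continue;
    result.total += 1;
    const output = await agent(correction.originalInput);
    const actual = output[correction.field];
    if (actual === correction.correctValue) {
      result.passed += 1;
    } else {
      result.failures.push(
        `${correction.field}: expected ${correction.correctValue}, got ${String(actual)}`,
      );
    }
  }
  return result;
}

// Stub agent for the demo: it has learned the DTHC convention but nothing else.
const stubAgent: AgentFn = async (input) => ({
  chargeCode: input.includes("destination") ? "DTHC" : "THC",
});

const recorded: ReplayableCorrection[] = [
  { field: "chargeCode", correctValue: "DTHC", originalInput: "destination terminal handling" },
  { field: "chargeCode", correctValue: "OTHC", originalInput: "origin terminal handling" },
  { field: "chargeCode", correctValue: "DTHC" }, // no originalInput: skipped
];

const replay = replayCorrections(recorded, stubAgent);
replay.then((r) => console.log(`${r.passed}/${r.total} passed`, r.failures));
```

A pass rate that rises over time on the same set of recorded corrections is the signal to look for; a replay that still fails points at a correction the agent is not retrieving or not applying.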
Tools
| Tool | What it does |
|---|---|
| record_correction | Record a correction when the user points out a mistake |
| search_corrections | Semantic search over past corrections (optionally filtered by entityKey) |
Cross-references
- Learned Knowledge — free-text insights (vs. structured corrections)
- Multi-User Isolation — the scope hierarchy in full
- Observability — correction-rate metrics
- Express Transport — the corrections HTTP endpoint