Overview
Agents hit an 85–90% reliability ceiling without self-correction. The ReflectionManager adds a critique loop that catches errors, detects stuck loops, and records post-mortem lessons for continuous improvement.
Quick Start
Configuration
How it runs (v2.5+)
When reflection.enabled is set, every agent.run() automatically:
- Critiques the output with the critic model (correctness, completeness, relevance, clarity → 0–1 score)
- Revises if the critique fails: the feedback is appended to the conversation and the agent produces an improved response, up to maxReflections cycles. Usage from revisions accumulates into output.usage (and cost tracking).
- Emits a reflection.critique event per critique pass, consumed by MetricsExporter as avgCritiqueScore (see Observability)
- Attaches the final verdict to the output:
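The cycle described above can be sketched in a few lines. This is a minimal, synchronous illustration under stated assumptions, not the library's implementation: the Critique shape (including the passed flag), the ReflectedOutput type, and the runWithReflection, toyGenerate, and toyCritic names are all hypothetical. Only the 0–1 score, the feedback being appended to the conversation, and the maxReflections cap come from the description above.

```typescript
// Hypothetical shapes; the real ReflectionManager types may differ.
interface Critique {
  score: number; // overall score in the 0-1 range
  passed: boolean; // assumed pass/fail flag
  feedback: string;
}

interface ReflectedOutput {
  text: string;
  critique: Critique; // final verdict attached to the output
  revisions: number;
}

type Generate = (conversation: string[]) => string;
type Critic = (text: string) => Critique;

function runWithReflection(
  prompt: string,
  generate: Generate,
  critic: Critic,
  maxReflections: number,
): ReflectedOutput {
  const conversation: string[] = [prompt];
  let text = generate(conversation);
  let critique = critic(text);
  let revisions = 0;

  // Revise while the critique fails, up to maxReflections cycles.
  while (!critique.passed && revisions < maxReflections) {
    // The feedback is appended to the conversation before the next attempt.
    conversation.push(text, `Critique feedback: ${critique.feedback}`);
    text = generate(conversation);
    critique = critic(text);
    revisions += 1;
  }
  return { text, critique, revisions };
}

// Toy stand-ins: the "model" improves once it has seen feedback.
const toyGenerate: Generate = (conversation) =>
  conversation.some((m) => m.startsWith("Critique feedback:"))
    ? "detailed answer"
    : "short";

const toyCritic: Critic = (text) =>
  text.length > 5
    ? { score: 0.9, passed: true, feedback: "" }
    : { score: 0.3, passed: false, feedback: "Answer is incomplete." };

const reflected = runWithReflection("Explain X", toyGenerate, toyCritic, 2);
console.log(reflected.revisions, reflected.critique.score); // 1 0.9
```

In a real agent both the generator and the critic would be asynchronous model calls; the control flow stays the same.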
Confidence-gated escalation
output.critique.score is the hook for routing low-confidence outputs to human review instead of shipping them:
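One way to wire this up is a small routing function that compares the score against a threshold. The sketch below is illustrative only: the ScoredOutput type, the routeByConfidence helper, the route names, and the 0.7 threshold are assumptions. Only the output.critique.score field comes from the text above.

```typescript
// Only `critique.score` is taken from the documentation; the rest is illustrative.
interface ScoredOutput {
  text: string;
  critique?: { score: number };
}

type Route = "ship" | "human_review";

// Hypothetical threshold; tune it against your own review data.
const REVIEW_THRESHOLD = 0.7;

function routeByConfidence(
  output: ScoredOutput,
  threshold: number = REVIEW_THRESHOLD,
): Route {
  // Treat a missing critique as low confidence rather than shipping blind.
  const score = output.critique?.score ?? 0;
  return score >= threshold ? "ship" : "human_review";
}

console.log(routeByConfidence({ text: "ok", critique: { score: 0.92 } })); // ship
console.log(routeByConfidence({ text: "hmm", critique: { score: 0.41 } })); // human_review
```

Defaulting a missing critique to the review path is a conservative choice: an output that was never critiqued gets the same treatment as one that scored poorly.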
Features
Output Critique
After the LLM generates a response, the reflection manager evaluates it on:
- Correctness: factual accuracy, no hallucinations
- Completeness: addresses all parts of the query
- Relevance: stays on topic
- Clarity: well-structured output
If the critique fails, the agent revises its response (up to maxReflections times).
Loop Escape Detection
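As a rough illustration of the idea in this section, repeated calls can be detected by counting consecutive identical (tool, arguments) pairs. The sketch below is an assumption about how such a check might look, not the library's implementation: ToolCall, canonical, createLoopDetector, and the maxRepeats parameter are hypothetical names.

```typescript
interface ToolCall {
  tool: string;
  args: Record<string, unknown>;
}

// Serialize a value with object keys sorted, so {a, b} and {b, a} compare equal.
function canonical(value: unknown): string {
  if (value === undefined) return "undefined";
  if (Array.isArray(value)) return `[${value.map(canonical).join(",")}]`;
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const parts = Object.keys(obj)
      .sort()
      .map((k) => `${JSON.stringify(k)}:${canonical(obj[k])}`);
    return `{${parts.join(",")}}`;
  }
  return JSON.stringify(value);
}

// Returns a recorder that flags a loop once the same call has been
// seen maxRepeats times in a row.
function createLoopDetector(maxRepeats: number) {
  let lastKey: string | null = null;
  let repeatCount = 0;

  return (call: ToolCall): { stuck: boolean; repeatCount: number } => {
    const key = `${call.tool}:${canonical(call.args)}`;
    // A different call resets the streak to 1.
    repeatCount = key === lastKey ? repeatCount + 1 : 1;
    lastKey = key;
    return { stuck: repeatCount >= maxRepeats, repeatCount };
  };
}

const detectLoop = createLoopDetector(3);
const searchCall: ToolCall = { tool: "search", args: { q: "weather", limit: 5 } };
detectLoop(searchCall);
detectLoop({ tool: "search", args: { limit: 5, q: "weather" } }); // key order is ignored
console.log(detectLoop(searchCall).stuck); // true on the third identical call
```

Counting only consecutive repeats keeps the check cheap, but it would miss an agent that alternates between two calls; a sliding window over recent keys is one possible extension.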
Detects when agents get stuck calling the same tool with the same arguments repeatedly.
Post-Mortem Learning
When a run fails and memory is available, the reflection manager generates a lesson and stores it in LearnedKnowledge.
Plan Critique
Before executing tool calls, review the planned actions.
Using a Cheaper Critic
Save costs by using a smaller model for critique.
Events
| Event | Payload |
|---|---|
| reflection.critique | { runId, pass, score, feedback } |
| reflection.loop.escaped | { runId, tool, repeatCount } |
| reflection.postmortem | { runId, lesson, category } |
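For consumers written in TypeScript, the payloads above could be modeled as a discriminated union. This is a sketch under assumptions: the field names come from the table, but the field types, the type discriminator, and the summarizeEvent and averageCritiqueScore helpers are hypothetical and not part of the documented API.

```typescript
// Field names follow the table; field types and the `type` tag are assumptions.
type ReflectionEvent =
  | { type: "reflection.critique"; runId: string; pass: number; score: number; feedback: string }
  | { type: "reflection.loop.escaped"; runId: string; tool: string; repeatCount: number }
  | { type: "reflection.postmortem"; runId: string; lesson: string; category: string };

// Produce a one-line log message for any reflection event.
function summarizeEvent(ev: ReflectionEvent): string {
  switch (ev.type) {
    case "reflection.critique":
      return `[${ev.runId}] pass ${ev.pass} scored ${ev.score.toFixed(2)}`;
    case "reflection.loop.escaped":
      return `[${ev.runId}] escaped loop on ${ev.tool} after ${ev.repeatCount} repeats`;
    case "reflection.postmortem":
      return `[${ev.runId}] lesson (${ev.category}): ${ev.lesson}`;
  }
}

// Mean score over critique events, or null when there are none.
function averageCritiqueScore(events: ReflectionEvent[]): number | null {
  let total = 0;
  let count = 0;
  for (const ev of events) {
    if (ev.type === "reflection.critique") {
      total += ev.score;
      count += 1;
    }
  }
  return count === 0 ? null : total / count;
}

const sampleEvents: ReflectionEvent[] = [
  { type: "reflection.critique", runId: "run-1", pass: 1, score: 0.25, feedback: "Missing units." },
  { type: "reflection.critique", runId: "run-1", pass: 2, score: 0.75, feedback: "" },
  { type: "reflection.loop.escaped", runId: "run-2", tool: "search", repeatCount: 3 },
];

sampleEvents.forEach((ev) => console.log(summarizeEvent(ev)));
console.log(averageCritiqueScore(sampleEvents)); // 0.5
```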