Observability
Why observability matters
In enterprise settings, AI must be not only usable but governable:
- Compliance audits — every judgment needs a paper trail
- Incident triage — why did the Agent decide this?
- Cost governance — token spend, model calls, external API usage
- Quality traceability — every step toward a conclusion is replayable
Four observability layers
| Layer | What you observe | Typical tooling |
|---|---|---|
| Business | Sessions, skill calls, delivered artifacts | Workbench audit logs |
| Agent | Thoughts, tool calls, sub-agent delegation | Session execution trace |
| System | Model calls, tokens, latency | Prometheus + Grafana |
| Infrastructure | CPU / memory / network / storage | Standard cloud monitoring |
Session replay
Every session can be fully replayed.
The replay panel includes:
- Timestamp for each step
- Each model call (prompt + completion)
- Each tool call (arguments + return value)
- Token usage and cost per step
With this record, an incident can be reproduced within minutes, and the same trail serves as evidence for compliance audits.
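As a rough illustration of the data behind the replay panel, the sketch below models one recorded step and rebuilds a chronological view with per-session totals. The `ReplayStep` schema, its field names, and the `replay` and `session_totals` helpers are assumptions made for this example, not the product's actual data model.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class ReplayStep:
    """One recorded step of a session (hypothetical schema)."""
    timestamp: float           # seconds since the epoch
    kind: str                  # "model_call" or "tool_call"
    payload: dict[str, Any]    # prompt + completion, or arguments + return value
    tokens: int = 0
    cost_usd: float = 0.0


def replay(steps: list[ReplayStep]) -> list[str]:
    """Return one readable line per step, ordered by timestamp."""
    ordered = sorted(steps, key=lambda s: s.timestamp)
    return [
        f"{i}. t={s.timestamp:.1f} {s.kind} tokens={s.tokens} cost=${s.cost_usd:.4f}"
        for i, s in enumerate(ordered, start=1)
    ]


def session_totals(steps: list[ReplayStep]) -> tuple[int, float]:
    """Sum token usage and cost across all steps of a session."""
    return sum(s.tokens for s in steps), sum(s.cost_usd for s in steps)


if __name__ == "__main__":
    steps = [
        ReplayStep(2.0, "tool_call", {"arguments": {"q": "invoice"}, "return": ["doc-1"]}),
        ReplayStep(1.0, "model_call", {"prompt": "Find the invoice", "completion": "search"},
                   tokens=120, cost_usd=0.0024),
    ]
    for line in replay(steps):
        print(line)
    print(session_totals(steps))
```

Keeping prompt, completion, arguments, and return values in the same record as the timestamp and cost is what lets a single pass over the steps serve both incident reproduction and audit.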
Ledger: disciplined step execution
The Ledger is the Orchestrator’s core component. It incrementally records:
- `step_status` tracking (pending / running / success / failed)
- Confidence for each matched item
- Semantic backtracking on low-confidence matches, for multi-turn validation
- Discipline checks and replan fuse conditions
Agents cannot skip Ledger writes: the rule is write-then-act, not “log when done.”
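The write-then-act rule can be sketched as a wrapper that records each status change before and after the action runs. This is a minimal in-memory illustration: the `Ledger` class, `LedgerEntry`, and `run_step` are hypothetical names, and the sketch covers only `step_status` tracking, not confidence scores, backtracking, or fuse conditions.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class StepStatus(str, Enum):
    PENDING = "pending"
    RUNNING = "running"
    SUCCESS = "success"
    FAILED = "failed"


@dataclass
class LedgerEntry:
    step_id: str
    status: StepStatus


class Ledger:
    """Append-only record of status changes (hypothetical, in-memory)."""

    def __init__(self) -> None:
        self.entries: list[LedgerEntry] = []

    def write(self, step_id: str, status: StepStatus) -> None:
        self.entries.append(LedgerEntry(step_id, status))

    def history(self, step_id: str) -> list[str]:
        return [e.status.value for e in self.entries if e.step_id == step_id]


def run_step(ledger: Ledger, step_id: str, action: Callable[[], object]) -> object:
    """Write to the ledger first, then act, then record the outcome."""
    ledger.write(step_id, StepStatus.PENDING)
    ledger.write(step_id, StepStatus.RUNNING)  # recorded before the action starts
    try:
        result = action()
    except Exception:
        ledger.write(step_id, StepStatus.FAILED)
        raise
    ledger.write(step_id, StepStatus.SUCCESS)
    return result


if __name__ == "__main__":
    ledger = Ledger()
    run_step(ledger, "match-invoice", lambda: "ok")
    print(ledger.history("match-invoice"))
```

Because the action is only reachable through `run_step`, a step that crashes mid-flight still leaves a `running` entry followed by `failed`, rather than no trace at all.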
Distributed tracing
Built on OpenTelemetry:
- Trace ID spans the full request path
- Calls across agents, sub-agents, and external APIs are linked
- Integrates with standard APM systems (Jaeger, Tempo, DataDog, etc.)
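To show how one trace ID can link an agent, a sub-agent, and an external call, the sketch below hand-rolls context propagation in the W3C Trace Context `traceparent` format, which OpenTelemetry uses by default. It deliberately avoids the OpenTelemetry SDK so it runs with the standard library alone; `SpanContext`, `start_span`, `to_traceparent`, and `from_traceparent` are illustrative names, not OpenTelemetry APIs, and a real deployment would rely on the SDK's tracer and propagators instead.

```python
import secrets
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SpanContext:
    trace_id: str                          # 32 hex chars, shared across the request
    span_id: str                           # 16 hex chars, unique per operation
    parent_span_id: Optional[str] = None   # None for the root span


def start_span(parent: Optional[SpanContext] = None) -> SpanContext:
    """Start a root span, or a child span that inherits the parent's trace ID."""
    trace_id = parent.trace_id if parent else secrets.token_hex(16)
    parent_id = parent.span_id if parent else None
    return SpanContext(trace_id, secrets.token_hex(8), parent_id)


def to_traceparent(ctx: SpanContext) -> str:
    """Serialize as a traceparent header value (version 00, sampled flag 01)."""
    return f"00-{ctx.trace_id}-{ctx.span_id}-01"


def from_traceparent(header: str) -> SpanContext:
    """Parse a traceparent header value into the caller's span context."""
    _version, trace_id, span_id, _flags = header.split("-")
    return SpanContext(trace_id, span_id)


if __name__ == "__main__":
    agent_span = start_span()                      # request enters the agent
    header = to_traceparent(agent_span)            # sent along with the sub-agent call
    sub_agent_span = start_span(from_traceparent(header))
    api_span = start_span(sub_agent_span)          # external API call made by the sub-agent
    print(agent_span.trace_id == sub_agent_span.trace_id == api_span.trace_id)
```

Each hop creates a new span ID but reuses the incoming trace ID, which is what allows an APM backend to stitch the hops into one request path.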
Cost visibility
Aggregate by:
- Tenant / user / session / skill
- Model / time window
- Success / failure
Useful for model cost optimization and budget control.
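The aggregation dimensions above can be combined freely. As a minimal sketch, assuming each usage record is a dictionary with hypothetical keys such as `tenant`, `model`, `success`, and `cost_usd`, a single helper can group cost by any subset of them:

```python
from collections import defaultdict
from typing import Iterable


def aggregate_cost(records: Iterable[dict], *dimensions: str) -> dict[tuple, float]:
    """Sum cost_usd per unique combination of the given dimension values."""
    totals: dict[tuple, float] = defaultdict(float)
    for record in records:
        key = tuple(record[d] for d in dimensions)
        totals[key] += record["cost_usd"]
    return dict(totals)


if __name__ == "__main__":
    # Illustrative records only; field names and values are made up for the example.
    records = [
        {"tenant": "acme", "model": "model-a", "success": True, "cost_usd": 0.25},
        {"tenant": "acme", "model": "model-b", "success": False, "cost_usd": 0.5},
        {"tenant": "globex", "model": "model-a", "success": True, "cost_usd": 0.125},
    ]
    print(aggregate_cost(records, "tenant"))
    print(aggregate_cost(records, "model", "success"))
```

Grouping by `success` alongside `model` makes it easy to see how much spend went to calls that failed, which is often the first place to look when trimming a budget.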
Related docs
- Hooks — low-level entry point for observability hooks
- Monitoring & operations
- Security & compliance