Memory system

Why Memory matters

Session context is short-lived, but users need to:

Keep cross-session facts (customer preferences, past decisions)
Attach domain knowledge bases (patent corpora, automotive standards, policies)
Let Agents retrieve relevant background on demand

Memory tiers

Tier	Lifecycle	Typical content
Session memory	Current session	Dialogue history, working variables
User memory	Cross-session, per user	Preferences, terminology
Tenant memory	Cross-user, per org	Policies, glossaries, templates
Knowledge base	External	Domain docs, references, historical outputs
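The Lifecycle column can be read as a statement about which identifiers bound each tier's visibility. The following is a minimal sketch of that idea; `Tier`, `Scope`, and `scope_key` are hypothetical names for illustration, not the platform's API, and the key layout is an assumption.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    SESSION = "session"    # current session only
    USER = "user"          # cross-session, per user
    TENANT = "tenant"      # cross-user, per org
    KNOWLEDGE_BASE = "kb"  # external


@dataclass(frozen=True)
class Scope:
    org_id: str
    user_id: str
    session_id: str


def scope_key(tier: Tier, scope: Scope) -> tuple:
    """Return the identifiers that bound a tier's visibility (assumed layout)."""
    if tier is Tier.SESSION:
        return (scope.org_id, scope.user_id, scope.session_id)
    if tier is Tier.USER:
        return (scope.org_id, scope.user_id)
    if tier is Tier.TENANT:
        return (scope.org_id,)
    # Knowledge base is external; its scoping is outside this sketch.
    return ()
```

Under this layout, two sessions of the same user share a user-memory key but have distinct session-memory keys.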

Retrieval

The system supports:

Semantic search — vector similarity
Keyword search — exact match
Structured queries — metadata (time, author, tags)
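To make the three modes concrete, here is a small in-memory sketch. The `Doc` class and the three function names are hypothetical and chosen for illustration; the real system's interfaces are not described in this page, and a production setup would use a vector index rather than a linear scan.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Doc:
    text: str
    embedding: list[float]
    meta: dict = field(default_factory=dict)


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def semantic_search(docs: list[Doc], query_vec: list[float], top_k: int = 3) -> list[Doc]:
    """Semantic search: rank documents by vector similarity to the query."""
    ranked = sorted(docs, key=lambda d: cosine(d.embedding, query_vec), reverse=True)
    return ranked[:top_k]


def keyword_search(docs: list[Doc], term: str) -> list[Doc]:
    """Keyword search: keep documents containing the exact (case-sensitive) term."""
    return [d for d in docs if term in d.text]


def structured_query(docs: list[Doc], **filters) -> list[Doc]:
    """Structured query: keep documents whose metadata equals every filter."""
    return [d for d in docs if all(d.meta.get(k) == v for k, v in filters.items())]
```

For example, `structured_query(docs, author="ana")` would return only documents whose metadata lists that author, regardless of their text or embedding.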

Cross-session search

With the Graft skill, Agents can search and reference outputs across Sessions when the user authorizes it.
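The authorization requirement can be pictured as a gate in front of the search. The sketch below is not the Graft skill's interface; `SessionOutput` and `search_across_sessions` are hypothetical names, and restricting results to the requesting user's own sessions is an assumption.

```python
from dataclasses import dataclass


@dataclass
class SessionOutput:
    session_id: str
    owner: str
    text: str


def search_across_sessions(
    outputs: list[SessionOutput], user: str, query: str, authorized: bool
) -> list[SessionOutput]:
    """Return the user's outputs from any session that mention the query."""
    if not authorized:
        # No authorization from the user means no cross-session access.
        raise PermissionError("cross-session search requires user authorization")
    needle = query.lower()
    # Assumption: only the requesting user's own sessions are searched.
    return [o for o in outputs if o.owner == user and needle in o.text.lower()]
```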

From KB to LTM

KB (Knowledge Base) and LTM (Long-Term Memory) serve different roles:

	KB	LTM
Granularity	Coarse (whole doc / one Bug record)	Fine (single fact / preference)
Storage	Original text + vector index	Structured facts + provenance metadata
Purpose	Precise RAG on raw content	Retrieve processed patterns and insights
Write path	User import or system artifacts	Conversation or scheduled processors

They are layers, not substitutes: KB is the raw material, and LTM holds the patterns distilled from that material.
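The layering can be sketched as two record types linked by provenance. The class and field names here (`KBRecord`, `LTMFact`, `trace_sources`) are hypothetical and only mirror the rows of the table above; the actual schemas are not given in this page.

```python
from dataclasses import dataclass, field


@dataclass
class KBRecord:
    """Coarse unit: a whole document or one Bug record, kept as original text."""
    record_id: str
    text: str
    embedding: list[float] = field(default_factory=list)  # vector index entry


@dataclass
class LTMFact:
    """Fine unit: a single fact or preference plus provenance metadata."""
    statement: str
    confidence: float
    source_ids: list[str] = field(default_factory=list)


def trace_sources(fact: LTMFact, kb: dict[str, KBRecord]) -> list[KBRecord]:
    """Follow provenance from a distilled fact back to its raw KB records."""
    return [kb[i] for i in fact.source_ids if i in kb]
```

Keeping the source ids on the fact is what lets a consumer go from a distilled pattern back to the raw text when it needs the details.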

Two memory flows

LTM facts arrive through two parallel streams:


Flow 1 (live): runs on every turn and produces fine-grained facts. It answers “how should the Agent talk to me?” and is carried by memory-sdk.

Flow 2 (KB batch): runs as a periodic batch and produces aggregated facts. It answers “what patterns emerged for this project / org?” and is carried by processor-class Skills.

Processor pattern (generic template)

Any Skill that turns structured KB rows into LTM facts follows the same shape:

Stage	Content
Trigger	Platform LTM cron (default ~6h), not real-time
Incremental	Only rows without `ltm_extracted_at`
Aggregation	Multi-dimensional rollups (typically 2–4 dims)
confidence	Baseline + count weighting + severity weighting, capped at 0.95
Provenance	`source_*_ids` list source records
Update vs insert	Compare `source_*_ids` to decide whether to update an existing fact or insert a new one
False-positive rollback	When source rows are invalidated, the next cron run recomputes
Capacity guard	Per-run cap + max fact length + max provenance ids
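The stages above can be sketched as a single function that one cron run would call. This is an illustration under stated assumptions, not the platform's implementation: only `ltm_extracted_at` and the 0.95 cap come from the table. Every other name, constant, and formula (the log-based count weighting, the severity bonus, keying facts by dimension values, the row fields `id`, `component`, `category`, `severity`) is hypothetical.

```python
import math
from collections import defaultdict
from datetime import datetime, timezone

# Illustrative constants; only the 0.95 cap comes from the table.
BASELINE = 0.5
SEVERITY_BONUS = 0.1
CONFIDENCE_CAP = 0.95
MAX_FACTS_PER_RUN = 100
MAX_FACT_LEN = 200
MAX_SOURCE_IDS = 20


def confidence(count: int, severity_bonus: float = 0.0) -> float:
    """Baseline + count weighting + severity weighting, capped at 0.95."""
    return min(CONFIDENCE_CAP, BASELINE + 0.1 * math.log2(count) + severity_bonus)


def run_processor(rows, facts, dims=("component", "category")):
    """One scheduled run. rows: KB records as dicts; facts: dict keyed by dims."""
    # Incremental: only rows that have not been extracted yet.
    pending = [r for r in rows if r.get("ltm_extracted_at") is None]

    # Aggregation: roll up pending rows by the chosen dimensions.
    groups = defaultdict(list)
    for r in pending:
        groups[tuple(r[d] for d in dims)].append(r)

    now = datetime.now(timezone.utc).isoformat()
    written = 0
    for key, members in groups.items():
        if written >= MAX_FACTS_PER_RUN:
            break  # capacity guard: leftover rows stay pending for the next run
        old = facts.get(key)
        old_ids = set(old["source_ids"]) if old else set()
        new_ids = {r["id"] for r in members}
        for r in members:
            r["ltm_extracted_at"] = now
        if new_ids <= old_ids:
            continue  # update vs insert: same sources, nothing to write
        ids = sorted(old_ids | new_ids)
        high = any(r.get("severity") == "high" for r in members)
        text = f"{len(ids)} records share {dict(zip(dims, key))}"
        facts[key] = {
            "fact": text[:MAX_FACT_LEN],          # capacity guard: fact length
            "confidence": confidence(len(ids), SEVERITY_BONUS if high else 0.0),
            "source_ids": ids[:MAX_SOURCE_IDS],   # provenance, capped
            "op": "update" if old else "insert",
        }
        written += 1
    return facts
```

The sketch leaves out false-positive rollback. In this shape it would amount to clearing `ltm_extracted_at` on the surviving rows of an affected group and dropping the fact, so that the next run rebuilds it from valid sources only.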

Implemented processors

Processor	Input (KB)	Output (LTM)	Consumer
bug-import	Historical Bug records	four-dimensional `bug_pattern` facts	L2 code-review

Future processors (requirement-change, review comments, incident reports, …) can follow the same pattern. New processors are Skills; build them with skill-architect, using bug-import as the model.

Why processing is scheduled

Reason	Explanation
Patterns need volume	A single row doesn’t show a pattern; at least N rows are needed
Cost control	LLM and aggregation costs are batched into cron windows instead of blocking chat
Avoid half-baked facts	Processing the first row in real time yields noisy facts backed by a single piece of evidence
Match business rhythm	Bugs / requirement batches are naturally periodic

