Memory System Design

Memory provides the persistence layer that allows agents to maintain continuity across sessions and reason over accumulated knowledge. Simple agents rely entirely on the context window for memory and lose all state when a session ends. Sophisticated agents implement layered memory architectures that balance immediate context needs with long-term knowledge retention.

When to Use

Use this skill when:

  • Building agents that must persist across sessions
  • Needing to maintain entity consistency across conversations
  • Implementing reasoning over accumulated knowledge
  • Designing systems that learn from past interactions
  • Creating knowledge bases that grow over time
  • Building temporal-aware systems that track state changes

Don't use this skill when:

  • Single-session agents with no persistence needs
  • Stateless API endpoints
  • Tasks where context window alone is sufficient

Core Concepts

Memory exists on a spectrum from immediate context to permanent storage. At one extreme, working memory in the context window is available to the model with no retrieval step but vanishes when the session ends. At the other extreme, permanent storage persists indefinitely but must be retrieved before it can enter context.

Key insight: Simple vector stores lack relationship and temporal structure. Knowledge graphs preserve relationships for reasoning. Temporal knowledge graphs add validity periods for time-aware queries.

Memory Architecture Layers

| Layer | Latency | Persistence | Use Case |
| --- | --- | --- | --- |
| Working Memory | Zero | Session-volatile | Context window, active reasoning |
| Short-Term Memory | Low | Session-persistent | Task state, conversation tracking |
| Long-Term Memory | Medium | Cross-session | User preferences, learned patterns |
| Entity Memory | Medium | Cross-session | Identity tracking, property storage |
| Temporal Knowledge Graph | Higher | Permanent | Time-aware queries, state history |

Why Vector Stores Fall Short

Vector RAG provides semantic retrieval by embedding queries and documents in a shared space. This works for document retrieval but lacks structure for agent memory.

Limitations:

  • Relationship loss: Can retrieve "Customer X purchased Product Y" but cannot answer "What products did customers who purchased Y also buy?"
  • Temporal blindness: No mechanism to distinguish current facts from outdated facts
  • No entity consistency: Same entity mentioned differently may not be linked

Benchmark Performance Comparison

| Memory System | DMR Accuracy | Retrieval Latency | Notes |
| --- | --- | --- | --- |
| Zep (Temporal KG) | 94.8% | 2.58s | Best accuracy, fast retrieval |
| MemGPT | 93.4% | Variable | Good general performance |
| GraphRAG | ~75-85% | Variable | 20-35% gains over baseline |
| Vector RAG | ~60-70% | Fast | Loses relationship structure |
| Recursive Summarization | 35.3% | Low | Severe information loss |

Memory Implementation Patterns

Pattern 1: File-System-as-Memory

The file system itself serves as a memory layer. Simple, requires no additional infrastructure.

memory/{user_id}/
├── preferences.json        # User preferences
├── entities/               # Entity directory
│   ├── project-alpha.json
│   └── customer-123.json
└── history/                # Temporal records
    └── 2025-12-27.jsonl

Advantages: Simplicity, transparency, portability
Disadvantages: No semantic search, no relationship tracking
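
The layout above could be driven by a small helper such as the following. This is a minimal sketch, not a reference implementation: the class name `FileMemory` and its method names are hypothetical, and it assumes one JSON file per entity and one JSONL file per day of history.

```python
import json
from datetime import date, datetime, timezone
from pathlib import Path
from typing import Optional


class FileMemory:
    """Hypothetical file-backed memory rooted at <root>/<user_id>/."""

    def __init__(self, root: Path, user_id: str):
        self.base = Path(root) / user_id
        # Create the entities/ and history/ directories from the layout.
        (self.base / "entities").mkdir(parents=True, exist_ok=True)
        (self.base / "history").mkdir(parents=True, exist_ok=True)

    def save_preferences(self, prefs: dict) -> None:
        path = self.base / "preferences.json"
        path.write_text(json.dumps(prefs, indent=2), encoding="utf-8")

    def load_preferences(self) -> dict:
        path = self.base / "preferences.json"
        if not path.exists():
            return {}  # No preferences stored yet
        return json.loads(path.read_text(encoding="utf-8"))

    def save_entity(self, entity_id: str, properties: dict) -> None:
        path = self.base / "entities" / f"{entity_id}.json"
        path.write_text(json.dumps(properties, indent=2), encoding="utf-8")

    def append_history(self, event: dict, day: Optional[date] = None) -> None:
        """Append one timestamped record to the JSONL file for `day`."""
        day = day or date.today()
        record = {"ts": datetime.now(timezone.utc).isoformat(), **event}
        path = self.base / "history" / f"{day.isoformat()}.jsonl"
        with path.open("a", encoding="utf-8") as f:
            f.write(json.dumps(record) + "\n")

    def read_history(self, day: date) -> list:
        path = self.base / "history" / f"{day.isoformat()}.jsonl"
        if not path.exists():
            return []
        with path.open(encoding="utf-8") as f:
            return [json.loads(line) for line in f if line.strip()]
```

Appending to a per-day JSONL file keeps writes cheap and leaves the history readable with ordinary text tools, which is the transparency advantage listed above.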

Pattern 2: Vector RAG with Metadata

Vector stores enhanced with rich metadata provide semantic search with filtering.

# `vector_store` is a placeholder for your vector database client

# Store with temporal metadata
vector_store.add({
    "content": "User prefers dark mode",
    "entity_tags": ["user:123", "preference:ui"],
    "valid_from": "2025-01-01",
    "valid_until": None,  # None = still valid
    "confidence": 0.95
})

# Query with metadata filters (only facts that are still valid)
results = vector_store.search(
    query="UI preferences",
    filters={"entity_tags": "user:123", "valid_until": None}
)
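
To make the filtering step concrete, here is a self-contained toy version. It is only a sketch under stated assumptions: `ToyVectorStore`, `embed`, and `cosine` are hypothetical names, the bag-of-words "embedding" stands in for a real embedding model, and the `entity_tag` and `as_of` parameters are one possible way to express the metadata filters.

```python
import math
from collections import Counter
from typing import Optional


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a real system would call an embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """Hypothetical in-memory store: similarity ranking plus metadata filters."""

    def __init__(self):
        self.records = []

    def add(self, record: dict) -> None:
        self.records.append({**record, "_vec": embed(record["content"])})

    def search(self, query: str, entity_tag: Optional[str] = None,
               as_of: Optional[str] = None, top_k: int = 3) -> list:
        qvec = embed(query)
        hits = []
        for r in self.records:
            # Metadata filter 1: the record must carry the requested tag.
            if entity_tag and entity_tag not in r.get("entity_tags", []):
                continue
            # Metadata filter 2: the record must be valid at `as_of`.
            # ISO-8601 date strings compare correctly as plain strings.
            if as_of is not None:
                if r.get("valid_from", "") > as_of:
                    continue
                until = r.get("valid_until")
                if until is not None and until <= as_of:
                    continue
            hits.append((cosine(qvec, r["_vec"]), r["content"]))
        hits.sort(key=lambda h: h[0], reverse=True)
        return [content for score, content in hits[:top_k] if score > 0]
```

Filtering before scoring means expired or unrelated records never compete on similarity, which addresses the "temporal blindness" limitation for simple cases without a graph.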

Pattern 3: Knowledge Graph

Explicitly model entities and relationships for complex queries.

# `graph` is a placeholder for your graph database client

# Define relationships
graph.add_triple(
    subject="user:123",
    predicate="purchased",
    object="product:456"
)

# Query relationships (Cypher-style pattern match)
products = graph.query("""
    MATCH (u:User)-[:purchased]->(p:Product)
    WHERE u.id = '123'
    RETURN p
""")

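The "customers who purchased Y also bought" question from the vector-store limitations is a two-hop traversal, which a graph answers directly. The sketch below uses a hypothetical in-memory `TripleStore` rather than a real graph database; the class and the `also_purchased` helper are illustrative names only.

```python
from collections import defaultdict


class TripleStore:
    """Hypothetical in-memory triple store indexed in both directions."""

    def __init__(self):
        self._out = defaultdict(set)  # (subject, predicate) -> objects
        self._in = defaultdict(set)   # (predicate, object) -> subjects

    def add_triple(self, subject: str, predicate: str, obj: str) -> None:
        self._out[(subject, predicate)].add(obj)
        self._in[(predicate, obj)].add(subject)

    def objects(self, subject: str, predicate: str) -> set:
        return set(self._out.get((subject, predicate), ()))

    def subjects(self, predicate: str, obj: str) -> set:
        return set(self._in.get((predicate, obj), ()))


def also_purchased(graph: TripleStore, product: str) -> set:
    """Products bought by customers who also bought `product`."""
    related = set()
    # Hop 1: product -> customers who purchased it.
    for customer in graph.subjects("purchased", product):
        # Hop 2: each customer -> everything else they purchased.
        related |= graph.objects(customer, "purchased")
    related.discard(product)
    return related
```

Keeping a reverse index (`_in`) is what makes the first hop cheap; a store indexed only by subject would have to scan every customer.
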
Pattern 4: Temporal Knowledge Graph

Add validity periods to facts for time-travel queries.

# `temporal_graph` is a placeholder for your temporal graph client

# Store fact with temporal validity
temporal_graph.add_fact({
    "subject": "user:123",
    "predicate": "lives_at",
    "object": "address:456",
    "valid_from": "2024-01-01",
    "valid_until": "2025-06-15"  # Moved out
})

# Query: What was the user's address on 2024-03-01?
temporal_graph.query("""
    MATCH (user)-[r:lives_at]->(address)
    WHERE user.id = '123'
    AND r.valid_from <= '2024-03-01'
    AND (r.valid_until IS NULL OR r.valid_until > '2024-03-01')
    RETURN address
""")

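The same validity check can be written in plain Python. This is a minimal sketch with hypothetical names (`Fact`, `TemporalFactStore`); it assumes ISO-8601 date strings and treats each validity period as half-open, so a fact is valid from `valid_from` up to but not including `valid_until`, matching the query above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Fact:
    subject: str
    predicate: str
    obj: str
    valid_from: str                    # ISO-8601 date, inclusive
    valid_until: Optional[str] = None  # ISO-8601 date, exclusive; None = current

    def holds_at(self, when: str) -> bool:
        # ISO-8601 dates sort correctly as strings.
        started = self.valid_from <= when
        not_ended = self.valid_until is None or when < self.valid_until
        return started and not_ended


class TemporalFactStore:
    """Hypothetical in-memory store supporting as-of queries."""

    def __init__(self):
        self.facts = []

    def add_fact(self, fact: Fact) -> None:
        self.facts.append(fact)

    def query(self, subject: str, predicate: str, as_of: str) -> list:
        return [
            f.obj for f in self.facts
            if f.subject == subject
            and f.predicate == predicate
            and f.holds_at(as_of)
        ]
```

With half-open intervals, a move on 2025-06-15 never yields two addresses for that day: the old fact ends exactly where the new one begins.
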
Memory Retrieval Patterns

| Pattern | Description | Use Case |
| --- | --- | --- |
| Semantic | Embedding similarity search | Document retrieval |
| Entity-Based | Traverse graph relationships | Related entity queries |
| Temporal | Filter by validity period | Historical state queries |
| Hybrid | Combine semantic + structured | Complex multi-factor queries |

Memory Consolidation

Memories accumulate and require periodic consolidation:

  1. Identify outdated facts: Facts with expired validity
  2. Merge related facts: Combine redundant information
  3. Update validity periods: Refresh temporal metadata
  4. Archive obsolete: Move to cold storage
  5. Rebuild indexes: Optimize retrieval performance
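
The five steps above can be sketched as a single pass over a list of fact dictionaries. This is an illustration under assumptions, not a prescribed algorithm: the function name `consolidate` is hypothetical, facts use the `subject`/`predicate`/`object`/`valid_from`/`valid_until` fields from the temporal examples, "merging" is reduced to dropping exact duplicates, and step 3 assumes every predicate is single-valued (a newer fact supersedes an older open-ended one).

```python
from collections import defaultdict


def consolidate(facts: list, today: str):
    """Return (active, archived, index) without mutating the input facts."""
    # Step 2: merge redundant information (here: drop exact duplicates).
    unique = {}
    for f in facts:
        key = (f["subject"], f["predicate"], f["object"], f["valid_from"])
        unique.setdefault(key, dict(f))  # copy so the input stays untouched
    ordered = sorted(unique.values(), key=lambda f: f["valid_from"])

    # Step 3: update validity periods. Assumes single-valued predicates:
    # a newer fact closes the open-ended fact it supersedes.
    latest = {}
    for f in ordered:
        slot = (f["subject"], f["predicate"])
        prev = latest.get(slot)
        if prev is not None and prev.get("valid_until") is None:
            prev["valid_until"] = f["valid_from"]
        latest[slot] = f

    # Steps 1 and 4: identify expired facts and move them to the archive.
    active, archived = [], []
    for f in ordered:
        until = f.get("valid_until")
        if until is not None and until <= today:
            archived.append(f)
        else:
            active.append(f)

    # Step 5: rebuild the lookup index over active facts only.
    index = defaultdict(list)
    for f in active:
        index[(f["subject"], f["predicate"])].append(f)
    return active, archived, dict(index)
```

In a real system the archived list would be written to cold storage and the index would be whatever structure the retrieval layer uses; both are simplified here to plain Python containers.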

Memory System Selection Guide

| Requirement | Recommended System |
| --- | --- |
| Simple persistence | File-system memory |
| Semantic search | Vector RAG with metadata |
| Relationship reasoning | Knowledge graph |
| Temporal validity | Temporal knowledge graph |
| Maximum accuracy | Zep or similar temporal KG |

Example: Entity Tracking

from datetime import datetime
from typing import Any


class EntityMemory:
    def __init__(self, memory):
        # `memory` is any backend exposing store(), retrieve_entity(),
        # and update_with_history()
        self.memory = memory

    def remember_entity(self, entity_id: str, properties: dict) -> None:
        """Track entity across conversations"""
        self.memory.store({
            "type": "entity",
            "id": entity_id,
            "properties": properties,
            "last_updated": datetime.now().isoformat()
        })

    def get_entity(self, entity_id: str) -> dict:
        """Retrieve entity with all properties"""
        return self.memory.retrieve_entity(entity_id)

    def update_property(self, entity_id: str, key: str, value: Any) -> None:
        """Update single property with temporal tracking"""
        self.memory.update_with_history(entity_id, key, value)
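
The class above delegates to `self.memory`, whose implementation is not shown. One possible backend is sketched below, purely as an assumption: `InMemoryBackend` is a hypothetical name, the three method names are taken from the calls in the example, and the change-log format is invented for illustration.

```python
from datetime import datetime, timezone
from typing import Any, Optional


class InMemoryBackend:
    """Hypothetical backend providing the methods EntityMemory calls."""

    def __init__(self):
        self._entities = {}  # entity id -> latest record
        self._history = []   # append-only log of property changes

    def store(self, record: dict) -> None:
        # Copy the properties so later caller-side edits do not leak in.
        self._entities[record["id"]] = {
            **record,
            "properties": dict(record.get("properties", {})),
        }

    def retrieve_entity(self, entity_id: str) -> Optional[dict]:
        # Returns None for unknown entities; callers should check for it.
        return self._entities.get(entity_id)

    def update_with_history(self, entity_id: str, key: str, value: Any) -> None:
        entity = self._entities.get(entity_id)
        if entity is None:
            raise KeyError(f"unknown entity: {entity_id}")
        now = datetime.now(timezone.utc).isoformat()
        # Record the old value before overwriting it.
        self._history.append({
            "id": entity_id,
            "key": key,
            "old": entity["properties"].get(key),
            "new": value,
            "changed_at": now,
        })
        entity["properties"][key] = value
        entity["last_updated"] = now

    def history(self, entity_id: str) -> list:
        return [h for h in self._history if h["id"] == entity_id]
```

An instance of this class could serve as the `self.memory` attribute that the example relies on. Swapping it for a file-backed or graph-backed implementation would not change the `EntityMemory` interface.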

Guidelines

  1. Match memory architecture to query requirements
  2. Implement progressive disclosure for memory access: load summaries first and fetch details only when needed
  3. Use temporal validity to prevent outdated information conflicts
  4. Consolidate memories periodically to prevent unbounded growth
  5. Design the system to handle memory retrieval failures gracefully
  6. Consider privacy implications of persistent memory
  7. Implement backup and recovery for critical memories
  8. Monitor memory growth and performance over time
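
Guideline 5 can be illustrated with a fallback chain over memory layers. This is a minimal sketch, assuming each layer is exposed as a callable that takes a query and returns a list; the function name `recall` and the layer names in the usage are hypothetical.

```python
import logging

logger = logging.getLogger("memory")


def recall(retrievers: list, query: str):
    """Try each (name, retriever) pair in order.

    Returns (layer_name, results) from the first layer that yields results,
    or (None, []) if every layer fails or comes back empty.
    """
    for name, retrieve in retrievers:
        try:
            results = retrieve(query)
        except Exception as exc:
            # A failing layer is logged and skipped, never fatal.
            logger.warning("memory layer %s failed: %s", name, exc)
            continue
        if results:
            return name, results
    return None, []
```

Returning the layer name alongside the results lets the caller tell a graph hit from a fallback hit, and the explicit `(None, [])` case forces the agent to handle "nothing remembered" instead of assuming data exists.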

Success Output

When this skill is successfully applied, output:

✅ SKILL COMPLETE: memory-systems

Completed:
- [x] Memory architecture selected (file-system/vector/knowledge-graph/temporal)
- [x] Memory layers implemented (working/short-term/long-term/entity/temporal)
- [x] Retrieval patterns configured (semantic/entity-based/temporal/hybrid)
- [x] Consolidation strategy defined

Outputs:
- Memory system implementation files
- Configuration for chosen architecture
- Documentation of memory layers and access patterns

Completion Checklist

Before marking this skill as complete, verify:

  • Memory architecture selected based on requirements
  • At least 2 memory layers implemented (e.g., working + long-term)
  • Retrieval mechanism functional (can store and retrieve)
  • Temporal validity implemented if using temporal KG
  • Entity consistency maintained across memory operations
  • Memory consolidation strategy documented
  • Privacy and security considerations addressed
  • Backup and recovery procedures established

Failure Indicators

This skill has FAILED if:

  • ❌ Memory system cannot persist data across sessions
  • ❌ Retrieval returns outdated or conflicting information
  • ❌ Entity references are inconsistent or broken
  • ❌ Memory grows unbounded without consolidation
  • ❌ Temporal queries return wrong historical states
  • ❌ Privacy requirements violated (e.g., data leakage)
  • ❌ Recovery from corruption not possible

When NOT to Use

Do NOT use this skill when:

  • Single-session agents with no persistence needs - use context window only
  • Stateless API endpoints - no memory required between requests
  • Tasks where context window alone is sufficient - simpler alternative available
  • Real-time systems where retrieval latency is unacceptable - use in-memory cache instead
  • Data must remain ephemeral for compliance - use session-only storage
  • System cannot tolerate additional infrastructure - stick with file-based memory

Use alternatives instead:

  • Single session → Context window management
  • Simple persistence → File-system memory
  • Compliance constraints → Ephemeral storage patterns

Anti-Patterns (Avoid)

| Anti-Pattern | Problem | Solution |
| --- | --- | --- |
| Using vector store for relationship queries | Cannot answer "who purchased what" type queries | Use knowledge graph or temporal KG |
| No temporal validity tracking | Outdated facts conflict with current facts | Add validity periods to all facts |
| Unbounded memory growth | Performance degrades, storage costs increase | Implement consolidation and archival |
| Single memory layer | No separation between working and long-term | Implement layered architecture |
| No entity consistency | Same entity has multiple IDs/representations | Use entity tracking with canonical IDs |
| Skipping backup strategy | Data loss on corruption | Implement regular backups and recovery |
| Ignoring privacy implications | Compliance violations, data leakage | Design with privacy from the start |

Principles

This skill embodies:

  • #2 First Principles Thinking - Understand memory requirements before selecting architecture
  • #3 Keep It Simple (KISS) - Start with file-system memory, upgrade only when needed
  • #4 Separation of Concerns - Separate memory layers (working/short/long/entity/temporal)
  • #5 Eliminate Ambiguity - Clear entity identification and consistency rules
  • #8 No Assumptions - Verify memory retrieval success, don't assume data exists

Full Standard: CODITECT-STANDARD-AUTOMATION.md

Skills

  • context-fundamentals - Context basics (prerequisite)
  • context-optimization - Memory-based context loading
  • multi-agent-patterns - Shared memory across agents

Agents

  • context-health-analyst - Memory system monitoring

Scripts

  • external/Agent-Skills-for-Context-Engineering/skills/memory-systems/scripts/memory_store.py - Vector store and knowledge graph implementations