Memory System Design
Memory provides the persistence layer that allows agents to maintain continuity across sessions and reason over accumulated knowledge. Simple agents rely entirely on context for memory, losing all state when sessions end. Sophisticated agents implement layered memory architectures that balance immediate context needs with long-term knowledge retention.
When to Use
✅ Use this skill when:
- Building agents that must persist across sessions
- Maintaining entity consistency across conversations
- Implementing reasoning over accumulated knowledge
- Designing systems that learn from past interactions
- Creating knowledge bases that grow over time
- Building time-aware systems that track state changes
❌ Don't use this skill for:
- Single-session agents with no persistence needs
- Stateless API endpoints
- Tasks where context window alone is sufficient
Core Concepts
Memory exists on a spectrum from immediate context to permanent storage. At one extreme, working memory in the context window provides zero-latency access but vanishes when sessions end. At the other extreme, permanent storage persists indefinitely but requires retrieval to enter context.
Key insight: Simple vector stores lack relationship and temporal structure. Knowledge graphs preserve relationships for reasoning. Temporal knowledge graphs add validity periods for time-aware queries.
Memory Architecture Layers
| Layer | Latency | Persistence | Use Case |
|---|---|---|---|
| Working Memory | Zero | Session-volatile | Context window, active reasoning |
| Short-Term Memory | Low | Session-persistent | Task state, conversation tracking |
| Long-Term Memory | Medium | Cross-session | User preferences, learned patterns |
| Entity Memory | Medium | Cross-session | Identity tracking, property storage |
| Temporal Knowledge Graph | Higher | Permanent | Time-aware queries, state history |
Why Vector Stores Fall Short
Vector RAG provides semantic retrieval by embedding queries and documents in a shared space. This works for document retrieval but lacks structure for agent memory.
Limitations:
- Relationship loss: Can retrieve "Customer X purchased Product Y" but cannot answer "What products did customers who purchased Y also buy?"
- Temporal blindness: No mechanism to distinguish current facts from outdated facts
- No entity consistency: Same entity mentioned differently may not be linked
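The relationship-loss limitation can be illustrated with a minimal sketch. The purchase records and the `also_bought` helper are hypothetical and only serve as an illustration: the question needs a two-hop traversal (product to customers, then customers to their other products), which similarity search over isolated text chunks does not express.

```python
from collections import defaultdict

# Hypothetical purchase records: (customer, product) pairs
PURCHASES = [
    ("customer:1", "product:Y"),
    ("customer:1", "product:A"),
    ("customer:2", "product:Y"),
    ("customer:2", "product:B"),
    ("customer:3", "product:C"),
]


def also_bought(purchases, product):
    """Return other products bought by customers who bought `product`."""
    by_customer = defaultdict(set)
    for customer, item in purchases:
        by_customer[customer].add(item)

    related = set()
    for items in by_customer.values():
        if product in items:
            # Second hop: everything else this customer bought
            related |= items - {product}
    return sorted(related)
```

Once the relationships are stored explicitly, the query is a few lines of set operations. With only embedded sentences such as "Customer X purchased Product Y", the same answer would require retrieving and joining every matching chunk.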
Benchmark Performance Comparison
| Memory System | DMR Accuracy | Retrieval Latency | Notes |
|---|---|---|---|
| Zep (Temporal KG) | 94.8% | 2.58s | Best accuracy, fast retrieval |
| MemGPT | 93.4% | Variable | Good general performance |
| GraphRAG | ~75-85% | Variable | 20-35% gains over baseline |
| Vector RAG | ~60-70% | Fast | Loses relationship structure |
| Recursive Summarization | 35.3% | Low | Severe information loss |
Memory Implementation Patterns
Pattern 1: File-System-as-Memory
The file system itself serves as a memory layer. This approach is simple and requires no additional infrastructure.
memory/{user_id}/
├── preferences.json        # User preferences
├── entities/               # Entity directory
│   ├── project-alpha.json
│   └── customer-123.json
└── history/                # Temporal records
    └── 2025-12-27.jsonl
Advantages: Simplicity, transparency, portability
Disadvantages: No semantic search, no relationship tracking
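A minimal sketch of this pattern, assuming the directory layout shown above. The `FileMemory` class and its method names are hypothetical and not part of any library; only the Python standard library is used.

```python
import json
from datetime import date, datetime, timezone
from pathlib import Path


class FileMemory:
    """Hypothetical helper that follows the memory/{user_id}/ layout."""

    def __init__(self, root, user_id):
        self.base = Path(root) / user_id
        (self.base / "entities").mkdir(parents=True, exist_ok=True)
        (self.base / "history").mkdir(parents=True, exist_ok=True)

    def save_entity(self, entity_id, properties):
        path = self.base / "entities" / f"{entity_id}.json"
        path.write_text(json.dumps(properties, indent=2), encoding="utf-8")

    def load_entity(self, entity_id):
        path = self.base / "entities" / f"{entity_id}.json"
        if not path.exists():
            return None  # Callers must handle a missing entity
        return json.loads(path.read_text(encoding="utf-8"))

    def append_history(self, event):
        # One JSON object per line, in a file named after today's date
        record = {"ts": datetime.now(timezone.utc).isoformat(), **event}
        path = self.base / "history" / f"{date.today().isoformat()}.jsonl"
        with path.open("a", encoding="utf-8") as fh:
            fh.write(json.dumps(record) + "\n")
        return path
```

Append-only JSONL keeps history writes cheap and leaves the files readable with ordinary tools, which is where the transparency and portability of this pattern come from.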
Pattern 2: Vector RAG with Metadata
Vector stores enhanced with rich metadata provide semantic search with filtering.
# Store with temporal metadata
vector_store.add({
    "content": "User prefers dark mode",
    "entity_tags": ["user:123", "preference:ui"],
    "valid_from": "2025-01-01",
    "valid_until": None,
    "confidence": 0.95
})

# Query with metadata filters
results = vector_store.search(
    query="UI preferences",
    filters={"entity_tags": "user:123", "valid_until": None}
)
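To make the filter-then-rank behaviour concrete, here is a toy sketch of a store with the same `add` / `search` shape. Everything in it is an assumption made for illustration: `ToyVectorStore`, `embed`, `cosine`, and `top_k` are hypothetical names, and the bag-of-words "embedding" merely stands in for a real embedding model.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words embedding; a real system would call an embedding model."""
    return Counter(text.lower().split())


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class ToyVectorStore:
    def __init__(self):
        self.records = []

    def add(self, record):
        self.records.append({**record, "_vec": embed(record["content"])})

    def search(self, query, filters=None, top_k=3):
        filters = filters or {}
        qvec = embed(query)

        def matches(rec):
            for key, wanted in filters.items():
                value = rec.get(key)
                # List-valued metadata (e.g. entity_tags) matches on membership
                if isinstance(value, list):
                    if wanted not in value:
                        return False
                elif value != wanted:
                    return False
            return True

        # Filter on metadata first, then rank the survivors by similarity
        candidates = [r for r in self.records if matches(r)]
        candidates.sort(key=lambda r: cosine(qvec, r["_vec"]), reverse=True)
        return [r["content"] for r in candidates[:top_k]]
```

Filtering on `valid_until` before ranking keeps an expired preference from outranking the current one just because its wording is closer to the query.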
Pattern 3: Knowledge Graph
Explicitly model entities and relationships for complex queries.
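As a rough illustration of what explicit modelling means, the sketch below stores subject-predicate-object triples in memory and answers pattern queries with wildcards. `TripleStore` and `match` are hypothetical names, not the API of any graph database; the parameter name `object` shadows the Python builtin on purpose so that it mirrors the triple vocabulary.

```python
class TripleStore:
    """Hypothetical in-memory store of (subject, predicate, object) triples."""

    def __init__(self):
        self.triples = set()

    def add_triple(self, subject, predicate, object):
        self.triples.add((subject, predicate, object))

    def match(self, subject=None, predicate=None, object=None):
        """Return triples matching the pattern; None acts as a wildcard."""
        return sorted(
            (s, p, o)
            for (s, p, o) in self.triples
            if subject in (None, s)
            and predicate in (None, p)
            and object in (None, o)
        )
```

Calling `match(subject="user:123", predicate="purchased")` returns every purchase by that user, and `match(object="product:456")` finds every entity connected to the product. A production graph database adds indexing and a query language on top of the same idea.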
# Define relationships
graph.add_triple(
    subject="user:123",
    predicate="purchased",
    object="product:456"
)

# Query relationships
products = graph.query("""
    MATCH (u:User)-[:purchased]->(p:Product)
    WHERE u.id = '123'
    RETURN p
""")
Pattern 4: Temporal Knowledge Graph
Add validity periods to facts for time-travel queries.
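The validity-window check at the heart of a time-travel query can be sketched in plain Python. The `Fact` dataclass and `facts_as_of` function are hypothetical names; the sketch assumes `valid_from` is inclusive, `valid_until` is exclusive, and a missing `valid_until` means the fact is still current.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Fact:
    subject: str
    predicate: str
    object: str
    valid_from: date
    valid_until: Optional[date] = None  # None means "still valid"


def facts_as_of(facts, subject, predicate, as_of):
    """Return objects of facts valid on `as_of` (from inclusive, until exclusive)."""
    return [
        f.object
        for f in facts
        if f.subject == subject
        and f.predicate == predicate
        and f.valid_from <= as_of
        and (f.valid_until is None or f.valid_until > as_of)
    ]
```

Making the end of the window exclusive means a fact that ends on the day its successor begins never overlaps with it, so a query for that day returns exactly one answer.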
# Store fact with temporal validity
temporal_graph.add_fact({
    "subject": "user:123",
    "predicate": "lives_at",
    "object": "address:456",
    "valid_from": "2024-01-01",
    "valid_until": "2025-06-15"  # Moved out
})

# Query: What was the user's address on 2024-03-01?
temporal_graph.query("""
    MATCH (user)-[r:lives_at]->(address)
    WHERE user.id = '123'
      AND r.valid_from <= '2024-03-01'
      AND (r.valid_until IS NULL OR r.valid_until > '2024-03-01')
    RETURN address
""")
Memory Retrieval Patterns
| Pattern | Description | Use Case |
|---|---|---|
| Semantic | Embedding similarity search | Document retrieval |
| Entity-Based | Traverse graph relationships | Related entity queries |
| Temporal | Filter by validity period | Historical state queries |
| Hybrid | Combine semantic + structured | Complex multi-factor queries |
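The hybrid row in the table above can be sketched as structured filters followed by a relevance ranking. This is an illustrative sketch under stated assumptions: `hybrid_search` and `keyword_score` are hypothetical names, the memory records are plain dicts with ISO date strings, and word overlap stands in for embedding similarity.

```python
def keyword_score(query, text):
    """Stand-in for embedding similarity: fraction of query words found in text."""
    q = set(query.lower().split())
    t = set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0


def hybrid_search(memories, query, entity=None, as_of=None, top_k=3):
    """Apply structured filters (entity, validity) first, then rank by relevance."""
    hits = []
    for m in memories:
        if entity is not None and m["entity"] != entity:
            continue
        if as_of is not None:
            # ISO date strings compare correctly as plain strings
            if m["valid_from"] > as_of:
                continue
            if m["valid_until"] is not None and m["valid_until"] <= as_of:
                continue
        score = keyword_score(query, m["content"])
        if score > 0:
            hits.append((score, m["content"]))
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [content for _, content in hits[:top_k]]
```

Leaving `entity` and `as_of` unset reduces this to a purely semantic search, so one function covers three of the four retrieval patterns in the table. Only the entity-based traversal needs a graph.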
Memory Consolidation
Memories accumulate and require periodic consolidation:
- Identify outdated facts: Facts with expired validity
- Merge related facts: Combine redundant information
- Update validity periods: Refresh temporal metadata
- Archive obsolete facts: Move to cold storage
- Rebuild indexes: Optimize retrieval performance
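A minimal sketch of one consolidation pass over the first four steps, assuming facts are dicts with `subject`, `predicate`, `object`, `valid_from`, and `valid_until` keys holding ISO date strings. The `consolidate` function is hypothetical; index rebuilding is backend-specific and is left out.

```python
def consolidate(facts, today):
    """Split facts into (active, archived) and merge exact duplicates.

    `valid_until` may be None for open-ended facts. Input dicts are not mutated.
    """
    active, archived = {}, []
    for fact in facts:
        expired = fact["valid_until"] is not None and fact["valid_until"] <= today
        if expired:
            archived.append(fact)  # Candidate for cold storage
            continue
        key = (fact["subject"], fact["predicate"], fact["object"])
        kept = active.get(key)
        if kept is None:
            active[key] = dict(fact)  # Copy so the caller's data stays intact
        else:
            # Redundant fact: keep one record with the earliest valid_from
            kept["valid_from"] = min(kept["valid_from"], fact["valid_from"])
    return list(active.values()), archived
```

Returning the archived facts instead of deleting them keeps the pass reversible, which matters if a validity period later turns out to be wrong.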
Memory System Selection Guide
| Requirement | Recommended System |
|---|---|
| Simple persistence | File-system memory |
| Semantic search | Vector RAG with metadata |
| Relationship reasoning | Knowledge graph |
| Temporal validity | Temporal knowledge graph |
| Maximum accuracy | Zep or similar temporal KG |
Example: Entity Tracking
from datetime import datetime
from typing import Any


class EntityMemory:
    def __init__(self, memory):
        # `memory` is any backend exposing store(), retrieve_entity(),
        # and update_with_history(); it is not defined in this document
        self.memory = memory

    def remember_entity(self, entity_id: str, properties: dict):
        """Track entity across conversations"""
        self.memory.store({
            "type": "entity",
            "id": entity_id,
            "properties": properties,
            "last_updated": datetime.now().isoformat()
        })

    def get_entity(self, entity_id: str) -> dict:
        """Retrieve entity with all properties"""
        return self.memory.retrieve_entity(entity_id)

    def update_property(self, entity_id: str, key: str, value: Any):
        """Update single property with temporal tracking"""
        self.memory.update_with_history(entity_id, key, value)
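The entity-tracking example relies on a backend that is not shown. One possible shape for it, as a sketch only: `InMemoryBackend` and its `history` method are hypothetical, and a real deployment would persist to disk or a database instead of a dict.

```python
from datetime import datetime, timezone
from typing import Any, Optional


class InMemoryBackend:
    """Hypothetical backend exposing the three methods EntityMemory calls."""

    def __init__(self):
        self._entities = {}
        self._history = []

    def store(self, record: dict) -> None:
        self._entities[record["id"]] = record

    def retrieve_entity(self, entity_id: str) -> Optional[dict]:
        return self._entities.get(entity_id)

    def update_with_history(self, entity_id: str, key: str, value: Any) -> None:
        record = self._entities.setdefault(
            entity_id, {"type": "entity", "id": entity_id, "properties": {}}
        )
        now = datetime.now(timezone.utc).isoformat()
        # Keep the previous value so earlier states can be reconstructed
        self._history.append({
            "id": entity_id,
            "key": key,
            "old": record["properties"].get(key),
            "new": value,
            "changed_at": now,
        })
        record["properties"][key] = value
        record["last_updated"] = now

    def history(self, entity_id: str) -> list:
        return [h for h in self._history if h["id"] == entity_id]
```

An instance of this class could serve as the `self.memory` object that `EntityMemory` relies on. Recording the old value alongside the new one is what turns a plain property update into temporal tracking.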
Guidelines
- Match memory architecture to query requirements
- Implement progressive disclosure for memory access
- Use temporal validity to prevent outdated information conflicts
- Consolidate memories periodically to prevent unbounded growth
- Design the system to handle memory retrieval failures gracefully
- Consider privacy implications of persistent memory
- Implement backup and recovery for critical memories
- Monitor memory growth and performance over time
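The guideline on retrieval failures can be sketched as a small wrapper that degrades to a default instead of aborting the agent's turn. `safe_retrieve` is a hypothetical helper, and the `"memory"` logger name is an arbitrary choice.

```python
import logging

logger = logging.getLogger("memory")


def safe_retrieve(retrieve, key, default=None):
    """Call `retrieve(key)`; return `default` on a miss or a backend error."""
    try:
        value = retrieve(key)
    except Exception:
        # Log and degrade instead of failing the whole agent turn
        logger.warning("memory retrieval failed for %r", key, exc_info=True)
        return default
    return default if value is None else value
```

Any callable that takes a key works as `retrieve`, for example a dict's `get` method or a backend's lookup function. Catching `Exception` broadly is a deliberate choice for this sketch; a production system would likely narrow it to the errors its backend actually raises.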
Success Output
When this skill is successfully applied, output:
✅ SKILL COMPLETE: memory-systems
Completed:
- [x] Memory architecture selected (file-system/vector/knowledge-graph/temporal)
- [x] Memory layers implemented (working/short-term/long-term/entity/temporal)
- [x] Retrieval patterns configured (semantic/entity-based/temporal/hybrid)
- [x] Consolidation strategy defined
Outputs:
- Memory system implementation files
- Configuration for chosen architecture
- Documentation of memory layers and access patterns
Completion Checklist
Before marking this skill as complete, verify:
- Memory architecture selected based on requirements
- At least 2 memory layers implemented (e.g., working + long-term)
- Retrieval mechanism functional (can store and retrieve)
- Temporal validity implemented if using temporal KG
- Entity consistency maintained across memory operations
- Memory consolidation strategy documented
- Privacy and security considerations addressed
- Backup and recovery procedures established
Failure Indicators
This skill has FAILED if:
- ❌ Memory system cannot persist data across sessions
- ❌ Retrieval returns outdated or conflicting information
- ❌ Entity references are inconsistent or broken
- ❌ Memory grows unbounded without consolidation
- ❌ Temporal queries return wrong historical states
- ❌ Privacy requirements violated (e.g., data leakage)
- ❌ Recovery from corruption not possible
When NOT to Use
Do NOT use this skill when:
- Single-session agents with no persistence needs - use context window only
- Stateless API endpoints - no memory required between requests
- Tasks where context window alone is sufficient - simpler alternative available
- Real-time systems where retrieval latency is unacceptable - use in-memory cache instead
- Data must remain ephemeral for compliance - use session-only storage
- System cannot tolerate additional infrastructure - stick with file-based memory
Use alternatives instead:
- Single session → Context window management
- Simple persistence → File-system memory
- Compliance constraints → Ephemeral storage patterns
Anti-Patterns (Avoid)
| Anti-Pattern | Problem | Solution |
|---|---|---|
| Using vector store for relationship queries | Cannot answer "who purchased what" type queries | Use knowledge graph or temporal KG |
| No temporal validity tracking | Outdated facts conflict with current facts | Add validity periods to all facts |
| Unbounded memory growth | Performance degrades, storage costs increase | Implement consolidation and archival |
| Single memory layer | No separation between working and long-term | Implement layered architecture |
| No entity consistency | Same entity has multiple IDs/representations | Use entity tracking with canonical IDs |
| Skipping backup strategy | Data loss on corruption | Implement regular backups and recovery |
| Ignoring privacy implications | Compliance violations, data leakage | Design with privacy from the start |
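The canonical-ID remedy for the entity consistency anti-pattern can be sketched as an alias table. `EntityResolver` is a hypothetical class, and the normalization (lowercasing and collapsing whitespace) is an assumption; real systems often add fuzzy or embedding-based matching.

```python
class EntityResolver:
    """Hypothetical alias table mapping surface mentions to canonical IDs."""

    def __init__(self):
        self._aliases = {}

    @staticmethod
    def _normalize(name):
        # Lowercase and collapse runs of whitespace
        return " ".join(name.lower().split())

    def register(self, canonical_id, *aliases):
        for alias in aliases:
            self._aliases[self._normalize(alias)] = canonical_id

    def resolve(self, mention):
        """Return the canonical ID for a mention, or None if it is unknown."""
        return self._aliases.get(self._normalize(mention))
```

Resolving every mention before writing to memory means all facts about one entity attach to a single ID, whichever spelling appeared in the conversation.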
Principles
This skill embodies:
- #2 First Principles Thinking - Understand memory requirements before selecting architecture
- #3 Keep It Simple (KISS) - Start with file-system memory, upgrade only when needed
- #4 Separation of Concerns - Separate memory layers (working/short/long/entity/temporal)
- #5 Eliminate Ambiguity - Clear entity identification and consistency rules
- #8 No Assumptions - Verify memory retrieval success, don't assume data exists
Full Standard: CODITECT-STANDARD-AUTOMATION.md
Related Components
Skills
- context-fundamentals - Context basics (prerequisite)
- context-optimization - Memory-based context loading
- multi-agent-patterns - Shared memory across agents
Agents
- context-health-analyst - Memory system monitoring
Scripts
- external/Agent-Skills-for-Context-Engineering/skills/memory-systems/scripts/memory_store.py - Vector store and knowledge graph implementations