Memory System Implementation Guide

Building Multi-Layer Memory Architecture for Agentic AI

Document ID: A2-MEMORY-GUIDE
Version: 1.0
Category: P1 - Implementation Guides

Memory Architecture Overview

The Four Memory Layers

Layer	Persistence	Capacity	Access Speed	Update Frequency
Parametric	Permanent	Billions of parameters	Instant	Training only
Short-term	Session	8K-200K tokens	Instant	Every turn
Long-term	Persistent	Unlimited	50-500ms	As needed
Audit	Immutable	Unlimited	100-1000ms	Every action

Short-Term Memory

Context Window Management

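class ContextWindow:
    max_tokens: int = 128000
    reserve_tokens: int = 4000  # assumed: headroom kept free for the model's reply

    def __init__(self):
        self.messages = []

    @property
    def available_tokens(self):
        used = sum(m.token_count for m in self.messages)
        return self.max_tokens - self.reserve_tokens - used

    def add_message(self, message):
        if message.token_count > self.available_tokens:
            self._truncate_for_space(message.token_count)
        self.messages.append(message)

    def _truncate_for_space(self, needed):
        # Intended strategy, in order:
        # 1. Remove old tool results
        # 2. Summarize middle section
        # 3. Apply sliding window
        # Only step 3 is implemented here: drop the oldest messages until there is room.
        while self.messages and self.available_tokens < needed:
            self.messages.pop(0)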
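
The three truncation steps named in the comments above could be sketched as a standalone function. This is a minimal illustration, not the guide's implementation: the Message type, its role values, the keep_recent parameter, and the fixed 20-token summary size are all assumptions, and the summary step is a placeholder where a real system would call a model.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Message:
    role: str          # hypothetical roles: "system", "user", "assistant", "tool"
    text: str
    token_count: int


def used_tokens(messages: List[Message]) -> int:
    return sum(m.token_count for m in messages)


def truncate_for_space(messages: List[Message], budget: int,
                       keep_recent: int = 4) -> List[Message]:
    """Return a copy of messages trimmed toward `budget` tokens."""
    if keep_recent < 1:
        raise ValueError("keep_recent must be at least 1")
    msgs = list(messages)

    # Step 1: drop the oldest tool results, leaving the most recent turns alone.
    i = 0
    while used_tokens(msgs) > budget and i < len(msgs) - keep_recent:
        if msgs[i].role == "tool":
            del msgs[i]
        else:
            i += 1

    # Step 2: collapse the middle section into a single summary message.
    # Placeholder only: a real system would ask a model to write the summary.
    if used_tokens(msgs) > budget and len(msgs) > keep_recent + 1:
        middle = msgs[1:-keep_recent]
        summary = Message("system", f"[summary of {len(middle)} earlier messages]", 20)
        if used_tokens(middle) > summary.token_count:
            msgs = msgs[:1] + [summary] + msgs[-keep_recent:]

    # Step 3: sliding window. Keep the first message, drop the next-oldest until it fits.
    while used_tokens(msgs) > budget and len(msgs) > 2:
        del msgs[1]

    return msgs
```

The function returns a new list rather than mutating its input, so the caller can compare before and after or log what was dropped.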

Session State

Current task and progress
Extracted entities
User preferences (session-level)
Topic stack and pending questions
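
One way to hold the session-state items listed above is a single dataclass. The field names, the 0.0 to 1.0 progress convention, and the push/pop helpers are illustrative assumptions rather than part of the guide.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class SessionState:
    session_id: str
    current_task: Optional[str] = None
    progress: float = 0.0  # assumed convention: fraction complete, 0.0 to 1.0
    entities: Dict[str, Any] = field(default_factory=dict)
    preferences: Dict[str, Any] = field(default_factory=dict)  # session-level only
    topic_stack: List[str] = field(default_factory=list)
    pending_questions: List[str] = field(default_factory=list)

    def push_topic(self, topic: str) -> None:
        """Enter a new topic; the previous one stays underneath it."""
        self.topic_stack.append(topic)

    def pop_topic(self) -> Optional[str]:
        """Leave the current topic, or return None if the stack is empty."""
        return self.topic_stack.pop() if self.topic_stack else None
```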

Working Memory (Scratchpad)

Current reasoning chain
Intermediate computations
Hypotheses being evaluated
Retrieved evidence buffer
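
The scratchpad contents listed above might be grouped as follows. This is a sketch under assumptions: hypotheses are stored with a numeric score, and the evidence buffer is bounded at 8 items so it cannot grow without limit; neither detail comes from the guide.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Any, Deque, Dict, List, Optional


@dataclass
class WorkingMemory:
    reasoning_chain: List[str] = field(default_factory=list)
    intermediate: Dict[str, Any] = field(default_factory=dict)
    hypotheses: Dict[str, float] = field(default_factory=dict)  # hypothesis -> score
    # Bounded buffer: once full, adding evidence discards the oldest item.
    evidence: Deque[str] = field(default_factory=lambda: deque(maxlen=8))

    def best_hypothesis(self) -> Optional[str]:
        """Return the highest-scoring hypothesis, or None if there are none."""
        if not self.hypotheses:
            return None
        return max(self.hypotheses, key=self.hypotheses.get)

    def reset(self) -> None:
        """Clear the scratchpad, for example when a task finishes."""
        self.reasoning_chain.clear()
        self.intermediate.clear()
        self.hypotheses.clear()
        self.evidence.clear()
```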

Long-Term Memory

Vector Store Implementation

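from typing import List, Tuple

class VectorStore:
    # Interface only; each backend supplies the method bodies.
    async def upsert(self, id, vector, metadata) -> bool: ...
    async def query(self, vector, top_k, filter) -> List[Tuple]: ...
    async def delete(self, id) -> bool: ...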
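
To make the interface above concrete, here is a small in-memory stand-in that follows the same three method signatures. It is meant for tests and local experiments only: it does a brute-force cosine-similarity scan, treats filter as an exact-match dictionary on metadata, and returns (id, score, metadata) tuples. Those three choices are assumptions, since the guide does not specify them.

```python
import asyncio
import math
from typing import Any, Dict, List, Optional, Tuple


def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class InMemoryVectorStore:
    def __init__(self) -> None:
        self._rows: Dict[str, Tuple[List[float], Dict[str, Any]]] = {}

    async def upsert(self, id: str, vector: List[float],
                     metadata: Dict[str, Any]) -> bool:
        # Insert or overwrite; copies protect the store from later caller mutation.
        self._rows[id] = (list(vector), dict(metadata))
        return True

    async def query(self, vector: List[float], top_k: int,
                    filter: Optional[Dict[str, Any]] = None
                    ) -> List[Tuple[str, float, Dict[str, Any]]]:
        hits = [
            (row_id, cosine(vector, vec), meta)
            for row_id, (vec, meta) in self._rows.items()
            if not filter or all(meta.get(k) == v for k, v in filter.items())
        ]
        hits.sort(key=lambda h: h[1], reverse=True)  # most similar first
        return hits[:top_k]

    async def delete(self, id: str) -> bool:
        # True only if something was actually removed.
        return self._rows.pop(id, None) is not None
```

A brute-force scan is linear in the number of stored vectors, which is why a dedicated backend with an approximate index becomes attractive as the collection grows.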

Recommended Backends:

< 1M vectors: pgvector
> 1M vectors: Pinecone, Weaviate

Knowledge Graph

Entity nodes with embeddings
Relationship edges with weights
Semantic + structural queries
Backends: Neo4j or Amazon Neptune
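
The combination of semantic and structural queries can be illustrated without a graph database. In this toy sketch, the semantic step picks the entity whose embedding is closest to the query vector, and the structural step follows that entity's weighted edges. The class name, method names, and min_weight parameter are hypothetical; a production system would express the same idea in the query language of its chosen backend.

```python
import math
from typing import Dict, List, Optional, Tuple


def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class TinyKnowledgeGraph:
    def __init__(self) -> None:
        self.embeddings: Dict[str, List[float]] = {}
        # source entity -> list of (relation, target entity, weight)
        self.edges: Dict[str, List[Tuple[str, str, float]]] = {}

    def add_entity(self, name: str, embedding: List[float]) -> None:
        self.embeddings[name] = embedding
        self.edges.setdefault(name, [])

    def add_relation(self, src: str, relation: str, dst: str,
                     weight: float = 1.0) -> None:
        if src not in self.embeddings or dst not in self.embeddings:
            raise KeyError("both endpoints must be added as entities first")
        self.edges[src].append((relation, dst, weight))

    def nearest_entity(self, query_vec: List[float]) -> Optional[str]:
        """Semantic step: the entity most similar to the query vector."""
        if not self.embeddings:
            return None
        return max(self.embeddings,
                   key=lambda n: cosine(query_vec, self.embeddings[n]))

    def query(self, query_vec: List[float],
              min_weight: float = 0.0) -> List[Tuple[str, str, str, float]]:
        """Structural step: outgoing edges of the nearest entity, strongest first."""
        seed = self.nearest_entity(query_vec)
        if seed is None:
            return []
        kept = [e for e in self.edges[seed] if e[2] >= min_weight]
        kept.sort(key=lambda e: e[2], reverse=True)
        return [(seed, rel, dst, w) for rel, dst, w in kept]
```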

Episodic Memory

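from dataclasses import dataclass
from typing import Dict, List

@dataclass
class Episode:
    task: str                # what the agent was asked to do
    actions: List[Dict]      # steps taken, in order
    outcome: str
    success: bool
    reflections: List[str]
    lessons: List[str]
    embedding: List[float]   # vector used for similarity recall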
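
Recalling past episodes by similarity could look like the sketch below. The Episode fields are taken from the definition above; the recall_similar function, its successes_only flag, and the use of cosine similarity are assumptions made for illustration.

```python
import math
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Episode:
    task: str
    actions: List[Dict]
    outcome: str
    success: bool
    reflections: List[str]
    lessons: List[str]
    embedding: List[float]


def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recall_similar(episodes: List[Episode], query_embedding: List[float],
                   top_k: int = 3, successes_only: bool = False) -> List[Episode]:
    """Return up to top_k episodes, most similar to the query first."""
    pool = [e for e in episodes if e.success or not successes_only]
    ranked = sorted(pool, key=lambda e: cosine(query_embedding, e.embedding),
                    reverse=True)
    return ranked[:top_k]
```

Failed episodes are kept in the pool by default because their lessons are often the most useful thing to surface before a similar task.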

Audit Memory

Immutable Action Log

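from dataclasses import dataclass
from enum import Enum
from typing import Dict

class ActionType(Enum):
    TOOL_CALL = "tool_call"
    DECISION = "decision"
    STATE_CHANGE = "state_change"

@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class AuditRecord:
    id: str
    timestamp: float
    action_type: ActionType
    agent_id: str
    action_name: str
    action_input: Dict
    action_output: Dict
    previous_hash: str  # hash of the preceding record (chain integrity)
    record_hash: str    # hash of this record's own contents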
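
The previous_hash and record_hash fields suggest a hash chain. The sketch below shows one way such a chain could be built and verified, using SHA-256 over a canonical JSON encoding. The helper names, the all-zero genesis value, the use of plain dictionaries instead of AuditRecord, and the choice of SHA-256 are assumptions; the guide does not prescribe them. Inputs and outputs must be JSON-serializable for this encoding to work.

```python
import hashlib
import json
import time
import uuid
from typing import Any, Dict, List

GENESIS_HASH = "0" * 64  # assumed placeholder for the first record's previous_hash


def compute_hash(record: Dict[str, Any]) -> str:
    """Hash every field except record_hash, using a stable key order."""
    body = {k: v for k, v in record.items() if k != "record_hash"}
    payload = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_record(log: List[Dict[str, Any]], action_type: str, agent_id: str,
                  action_name: str, action_input: Dict[str, Any],
                  action_output: Dict[str, Any]) -> Dict[str, Any]:
    record: Dict[str, Any] = {
        "id": str(uuid.uuid4()),
        "timestamp": time.time(),
        "action_type": action_type,
        "agent_id": agent_id,
        "action_name": action_name,
        "action_input": action_input,
        "action_output": action_output,
        # Link to the previous record so any later edit breaks the chain.
        "previous_hash": log[-1]["record_hash"] if log else GENESIS_HASH,
    }
    record["record_hash"] = compute_hash(record)
    log.append(record)
    return record


def verify_chain(log: List[Dict[str, Any]]) -> bool:
    """True if every record links to its predecessor and matches its own hash."""
    previous = GENESIS_HASH
    for record in log:
        if record["previous_hash"] != previous:
            return False
        if record["record_hash"] != compute_hash(record):
            return False
        previous = record["record_hash"]
    return True
```

A chain like this only detects tampering; preventing it still depends on where the log is stored, for example append-only or write-once storage.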

Decision Trace

Decision point with context
Options considered
Selected option with rationale
Confidence score
Relevant memories/evidence used
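
The decision-trace items above map naturally onto a small immutable record. In this sketch the field names, the 0.0 to 1.0 confidence range, and the validation rules are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class DecisionPoint:
    context: str                       # the situation in which the decision arose
    options: Tuple[str, ...]           # options considered
    selected: str                      # the option that was chosen
    rationale: str                     # why it was chosen
    confidence: float                  # assumed range: 0.0 to 1.0
    evidence_ids: Tuple[str, ...] = () # ids of the memories or evidence consulted

    def __post_init__(self) -> None:
        if self.selected not in self.options:
            raise ValueError("selected must be one of the options considered")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0.0 and 1.0")
```

Tuples are used instead of lists so that a recorded decision cannot be altered by appending to one of its fields afterwards.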

Memory Patterns by Paradigm

Paradigm	Primary Memory	Key Components
LSR	Context window	Minimal retrieval
GS	Long-term + Evidence ledger	Vector store, citations
EP	Episodic + Reflexion	Working memory, learning
VE	Audit + State register	Immutable logs, protocols

Unified Memory Manager

from __future__ import annotations  # return types below are defined elsewhere

class MemoryManager:
    # Interface only; method bodies depend on the chosen backends.

    # Short-term
    def get_session(self, session_id) -> SessionState: ...
    def get_working_memory(self, session_id) -> WorkingMemory: ...

    # Long-term
    async def store_knowledge(self, content, metadata, type): ...
    async def retrieve_knowledge(self, query, top_k, type): ...

    # Episodic
    async def store_episode(self, episode): ...
    async def recall_similar_episodes(self, query, top_k): ...

    # Audit
    async def log_action(self, **kwargs) -> AuditRecord: ...
    async def record_decision(self, **kwargs) -> DecisionPoint: ...

    # Composite
    async def build_context(self, session_id, query,
                            include_episodes, include_knowledge): ...
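
The composite build_context call is the least obvious method in the interface above, so here is one possible shape for it. This sketch keeps everything in plain in-memory lists and ranks text by keyword overlap as a stand-in for vector search. The class name, the top_k parameter, the default flag values, and the section labels in the output are all assumptions.

```python
import asyncio
from typing import Dict, List


def _overlap(query: str, text: str) -> int:
    """Count shared lowercase words; a crude stand-in for embedding similarity."""
    return len(set(query.lower().split()) & set(text.lower().split()))


def _top_matches(query: str, items: List[str], top_k: int) -> List[str]:
    ranked = sorted(items, key=lambda t: _overlap(query, t), reverse=True)
    return [t for t in ranked[:top_k] if _overlap(query, t) > 0]


class SketchMemoryManager:
    def __init__(self) -> None:
        self.sessions: Dict[str, List[str]] = {}  # session_id -> message lines
        self.knowledge: List[str] = []
        self.episodes: List[str] = []

    async def retrieve_knowledge(self, query: str, top_k: int) -> List[str]:
        return _top_matches(query, self.knowledge, top_k)

    async def recall_similar_episodes(self, query: str, top_k: int) -> List[str]:
        return _top_matches(query, self.episodes, top_k)

    async def build_context(self, session_id: str, query: str,
                            include_episodes: bool = True,
                            include_knowledge: bool = True,
                            top_k: int = 3) -> str:
        async def nothing() -> List[str]:
            return []

        # Run both lookups concurrently; skipped sources resolve to empty lists.
        knowledge, episodes = await asyncio.gather(
            self.retrieve_knowledge(query, top_k) if include_knowledge else nothing(),
            self.recall_similar_episodes(query, top_k) if include_episodes else nothing(),
        )
        parts = ["Conversation:"] + self.sessions.get(session_id, [])
        if knowledge:
            parts += ["Relevant knowledge:"] + knowledge
        if episodes:
            parts += ["Similar past episodes:"] + episodes
        return "\n".join(parts)
```

Gathering the lookups concurrently matters more with real backends, where each retrieval is a network round trip in the latency ranges given in the layer table.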

Performance Optimization

Caching Strategies

Cache Type	TTL	Use Case
Embedding cache	24h	Repeated queries
Query result cache	5m	Hot queries
Session cache	Session	Active sessions
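
The TTL values in the table could be applied with a small expiring cache like the one below. This is a single-process sketch with no size limit or eviction policy beyond expiry; the class name and the injectable clock parameter (included so expiry can be tested without waiting) are assumptions.

```python
import time
from typing import Any, Callable, Dict, Hashable, Optional, Tuple


class TTLCache:
    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._data: Dict[Hashable, Tuple[float, Any]] = {}  # key -> (expires_at, value)

    def set(self, key: Hashable, value: Any) -> None:
        self._data[key] = (self._clock() + self._ttl, value)

    def get(self, key: Hashable) -> Optional[Any]:
        entry = self._data.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._data[key]  # expired: remove lazily on read
            return None
        return value


# TTLs taken from the table above.
embedding_cache = TTLCache(ttl_seconds=24 * 60 * 60)  # 24h, repeated queries
query_result_cache = TTLCache(ttl_seconds=5 * 60)     # 5m, hot queries
```

Because get returns None for a miss, this sketch cannot distinguish a missing key from a cached None value; a sentinel object would be needed if that distinction matters.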

Scaling Recommendations

Component	Horizontal	Vertical
Vector store	✓ (sharding)	✓
Knowledge graph	Limited	✓
Audit log	✓ (partitioning)	✓
Session state	✓ (Redis cluster)	Limited
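
Sharding and partitioning both come down to a routing function that maps a record to a storage location. The sketch below shows a hash-based shard router and a time-based partition key. Both function names are hypothetical, and partitioning the audit log by calendar month is an assumption; the table above says only that the audit log scales by partitioning.

```python
import hashlib
from datetime import datetime, timezone


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index that is stable across processes and restarts."""
    if num_shards < 1:
        raise ValueError("num_shards must be at least 1")
    # hashlib is used instead of the built-in hash(), which is randomized
    # per process for strings and would route the same key inconsistently.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def audit_partition(timestamp: float) -> str:
    """Return a year-month partition name (UTC) for a Unix timestamp."""
    moment = datetime.fromtimestamp(timestamp, tz=timezone.utc)
    return moment.strftime("%Y-%m")
```

Simple modulo routing moves most keys when num_shards changes, so a deployment that expects to add shards often may prefer consistent hashing instead.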

Document maintained by CODITECT Platform Team
