
Memory Retrieval Skill


How to Use This Skill

  1. Review the patterns and examples below
  2. Apply the relevant patterns to your implementation
  3. Follow the best practices outlined in this skill

Intelligent long-term memory retrieval that surfaces relevant prior work context while optimizing for signal-to-noise ratio and token efficiency. Integrates with the CODITECT anti-forgetting system to enable informed sessions.

When to Use This Skill

Use this skill when:

  • Starting a new session that continues prior work
  • Giving the orchestrator context about project history
  • Looking for relevant decisions, patterns, or error solutions
  • Checking work status (in-progress, blocked, completed)
  • Avoiding duplicate work or conflicting decisions
  • Needing to understand "what was done before" on a topic

Don't use this skill when:

  • Simple queries answerable from current context
  • Creating brand new project with no history
  • Memory system not yet initialized (/cx not run)

What It Automates

Before: (Manual context gathering)

# Search for relevant history manually
/cxq "authentication"
/cxq --decisions
/cxq --patterns --language python
# Read through dozens of results
# Manually summarize what's relevant
# Hope you didn't miss anything important
# Consume 5000+ tokens on raw output

After: (Automated intelligent retrieval)

# Single command with optimized output
python3 scripts/memory-retrieval.py --topic "authentication" --budget 2000

# Result: Structured, relevant context
# - Work status summary
# - Key decisions (ranked by relevance)
# - Applicable patterns
# - Known issues/solutions
# - Token count: ~1500 (within budget)

Core Features

1. Hierarchical Memory Architecture

L0: Working Memory    → Current session context
L1: Short-Term Memory → Last 7 days (decisions, WIP, blockers)
L2: Long-Term Memory  → Historical knowledge (patterns, errors, conventions)
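The three tiers could be modeled as simple data objects. This is an illustrative sketch only; the actual schema lives in context-storage/context.db, and these names are assumptions:

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Optional

@dataclass(frozen=True)
class MemoryTier:
    level: str
    scope: str
    retention: Optional[timedelta]  # None = session-scoped or unbounded

# Illustrative definitions mirroring the hierarchy above
TIERS = [
    MemoryTier("L0", "working: current session context", None),
    MemoryTier("L1", "short-term: decisions, WIP, blockers", timedelta(days=7)),
    MemoryTier("L2", "long-term: patterns, errors, conventions", None),
]
```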

2. Multi-Granular Retrieval

Retrieves multiple knowledge types in parallel:

  • Decisions: Architecture, technology, API choices
  • Patterns: Reusable code implementations
  • Errors: Problem-solution pairs
  • Work Status: In-progress, blocked, completed items
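The parallel fan-out over knowledge types could look like the following sketch. The per-type query functions here are hypothetical stand-ins; the real queries run against context.db:

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical per-type query functions (stand-ins for real context.db queries)
def query_decisions(topic): return [{"type": "decision", "topic": topic}]
def query_patterns(topic):  return [{"type": "pattern", "topic": topic}]
def query_errors(topic):    return [{"type": "error", "topic": topic}]
def query_status(topic):    return [{"type": "status", "topic": topic}]

def retrieve_all(topic):
    """Run the four knowledge-type queries in parallel and merge results."""
    queries = [query_decisions, query_patterns, query_errors, query_status]
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = [pool.submit(q, topic) for q in queries]
        return [item for f in futures for item in f.result()]
```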

3. Relevance Scoring

score = base_similarity * freshness_weight * project_boost * type_priority
  • Semantic similarity: How related to current query
  • Freshness weight: Recent items score higher
  • Project boost: Same project items prioritized
  • Type priority: Decisions/blockers > patterns > general notes
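The formula translates directly to code. The specific weights below (type priorities, the 1.2 project boost) are illustrative assumptions, not the shipped values:

```python
def relevance_score(base_similarity, freshness_weight, same_project, item_type):
    """Composite score per the formula above; the weights are illustrative."""
    # Decisions/blockers > patterns > general notes
    type_priority = {"decision": 1.0, "blocker": 1.0, "pattern": 0.8}.get(item_type, 0.5)
    project_boost = 1.2 if same_project else 1.0
    return base_similarity * freshness_weight * project_boost * type_priority
```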

4. Token-Aware Compression

  • Summarize low-priority items (save tokens)
  • Preserve full detail for high-priority items
  • Hierarchical expansion (summary → details on demand)
  • Strict budget enforcement
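A minimal sketch of strict budget enforcement with tiered detail. The 60% cutover from full detail to summaries is an assumed heuristic, and the token estimator and summarizer are passed in by the caller:

```python
def enforce_budget(items, budget, estimate_tokens, summarize):
    """Greedy packing: full detail while well under budget, summaries once it
    tightens, and a hard stop so the budget is never exceeded."""
    out, used = [], 0
    for item in sorted(items, key=lambda i: i["score"], reverse=True):
        text = item["full"] if used < budget * 0.6 else summarize(item)
        cost = estimate_tokens(text)
        if used + cost > budget:
            break  # strict budget enforcement
        out.append(text)
        used += cost
    return out, used
```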

Usage

Basic Retrieval

# Standard retrieval (2000 token budget)
python3 scripts/memory-retrieval.py --topic "database migration"

# Quick status check (500 token budget)
python3 scripts/memory-retrieval.py --status --budget 500

# Deep context (5000 token budget)
python3 scripts/memory-retrieval.py --topic "API design" --budget 5000 --detailed

Focused Retrieval

# Only decisions
python3 scripts/memory-retrieval.py --decisions --topic "authentication"

# Only patterns for a language
python3 scripts/memory-retrieval.py --patterns --language rust

# Only errors matching a type
python3 scripts/memory-retrieval.py --errors --type "TypeError"

# Only work status
python3 scripts/memory-retrieval.py --status

Output Formats

# Summary format (default) - for context injection
python3 scripts/memory-retrieval.py --topic "auth" --format summary

# Detailed format - for deep understanding
python3 scripts/memory-retrieval.py --topic "auth" --format detailed

# JSON format - for programmatic use
python3 scripts/memory-retrieval.py --topic "auth" --format json

Orchestrator Integration

# Called by orchestrators for context gathering
Task(
    subagent_type="memory-context-agent",
    prompt="""
    Retrieve context for: "Implement user dashboard"
    Budget: 2000 tokens
    Format: summary
    Focus: decisions, patterns, work_status
    """
)

Output Examples

Summary Format

## Prior Context Summary (1847 tokens)

**Work Status:**
- [IN_PROGRESS] User service refactor - 70% (yesterday)
- [BLOCKED] Payment integration - awaiting Stripe keys
- [COMPLETED] Database schema v2 (3 days ago)

**Relevant Decisions:**
1. PostgreSQL for ACID compliance (ADR-003)
2. Repository pattern for data access
3. JWT with 24h/7d token lifecycle

**Applicable Patterns:**
- async_handler pattern (Python) - error handling
- repository_base class - data access

**Known Issues:**
- Connection pool exhaustion - fixed with max_connections=20

JSON Format

{
  "retrieval_id": "mem_20251212_143022",
  "topic": "user authentication",
  "token_count": 1847,
  "work_status": [
    {
      "title": "User service refactor",
      "status": "in_progress",
      "progress": 70,
      "last_updated": "2025-12-11"
    }
  ],
  "decisions": [
    {
      "id": "dec_001",
      "summary": "Use PostgreSQL for ACID compliance",
      "type": "database",
      "relevance_score": 0.92,
      "source": "ADR-003"
    }
  ],
  "patterns": [...],
  "errors": [...],
  "expand_available": ["full_decisions", "pattern_code", "error_details"]
}
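The JSON output is convenient for programmatic use. A hedged consumer sketch, using a sample trimmed to the schema shown above:

```python
import json

# Trimmed sample matching the JSON schema above
raw_json = '''{
  "topic": "user authentication",
  "token_count": 1847,
  "work_status": [{"title": "User service refactor", "status": "in_progress", "progress": 70}],
  "decisions": [{"id": "dec_001", "summary": "Use PostgreSQL for ACID compliance", "relevance_score": 0.92}]
}'''

payload = json.loads(raw_json)
# Surface blockers and high-relevance decisions first
blocked = [w for w in payload["work_status"] if w["status"] == "blocked"]
top = [d["summary"] for d in payload["decisions"] if d["relevance_score"] > 0.8]
```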

Configuration

Environment Variables

# Relevance thresholds
MEMORY_HIGH_RELEVANCE=0.8
MEMORY_MEDIUM_RELEVANCE=0.5
MEMORY_MIN_RELEVANCE=0.3

# Token budgets
MEMORY_BUDGET_MINIMAL=500
MEMORY_BUDGET_STANDARD=2000
MEMORY_BUDGET_COMPREHENSIVE=5000

# Freshness weights
MEMORY_FRESHNESS_TODAY=1.0
MEMORY_FRESHNESS_WEEK=0.8
MEMORY_FRESHNESS_MONTH=0.5
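The freshness weights map item age to a score multiplier; a minimal sketch mirroring the MEMORY_FRESHNESS_* values above (the 0.3 floor for items older than a month is an assumption):

```python
from datetime import date

def freshness_weight(last_updated: date, today: date) -> float:
    """Time-decay weight matching the MEMORY_FRESHNESS_* defaults."""
    age = (today - last_updated).days
    if age < 1:
        return 1.0   # MEMORY_FRESHNESS_TODAY
    if age <= 7:
        return 0.8   # MEMORY_FRESHNESS_WEEK
    if age <= 30:
        return 0.5   # MEMORY_FRESHNESS_MONTH
    return 0.3       # older than a month (assumed floor)
```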

Config File

{
  "memory_retrieval": {
    "default_budget": 2000,
    "default_format": "summary",
    "relevance_threshold": 0.3,
    "max_results_per_type": 10,
    "enable_semantic_search": true,
    "enable_compression": true
  }
}
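Loading this file could merge user overrides over the documented defaults; a sketch (the loader name is hypothetical):

```python
import json

DEFAULTS = {
    "default_budget": 2000,
    "default_format": "summary",
    "relevance_threshold": 0.3,
    "max_results_per_type": 10,
    "enable_semantic_search": True,
    "enable_compression": True,
}

def load_config(text: str) -> dict:
    """Merge a config-file JSON string over the defaults above."""
    user = json.loads(text).get("memory_retrieval", {})
    return {**DEFAULTS, **user}
```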

Integration Points

1. Session Start Hook

# Automatically retrieve context at session start
# See: hooks/pre-session-memory.md

2. Orchestrator Planning Phase

# Orchestrators call memory-context-agent before planning
# Ensures plans account for prior work

3. Error Resolution

# When error detected, auto-retrieve similar error solutions
/recall --errors "TypeError in async handler"

4. Decision Making

# Before making architectural decisions
/recall --decisions --topic "database choice"

Token Efficiency Analysis

Retrieval Type     Raw Output     With Skill    Savings
Full recall        5000+ tokens   2000 tokens   60%
Status check       2000+ tokens   500 tokens    75%
Decision lookup    3000+ tokens   800 tokens    73%
Pattern search     4000+ tokens   1000 tokens   75%

Average savings: 70% token reduction while maintaining context quality.
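The Savings column follows directly from the two token counts; a quick check in Python (numbers taken from the table above):

```python
# Recompute the savings column from the raw / with-skill token counts
rows = {
    "full_recall": (5000, 2000),
    "status_check": (2000, 500),
    "decision_lookup": (3000, 800),
    "pattern_search": (4000, 1000),
}
savings = {name: 1 - skill / raw for name, (raw, skill) in rows.items()}
average = sum(savings.values()) / len(savings)  # ≈ 0.71, i.e. ~70%
```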

Error Handling

# Graceful degradation
try:
results = memory_retrieval(topic, budget)
except DatabaseNotFound:
return "Memory system not initialized. Run /cx first."
except NoRelevantResults:
return "No relevant prior context found for: {topic}"
except BudgetExceeded:
return compressed_results[:budget]

Related Components

  • Agent: memory-context-agent - Orchestrates retrieval
  • Command: /recall - User-facing interface
  • Script: scripts/memory-retrieval.py - Core implementation
  • Hook: hooks/pre-session-memory.md - Auto-retrieval trigger
  • Database: context-storage/context.db - Memory storage

Success Metrics

  • Relevance Rate: >80% of results rated useful
  • Token Efficiency: 60-75% savings vs raw retrieval
  • Coverage: 100% of blockers/WIP surfaced
  • Latency: <2 seconds for standard retrieval

Success Output

When successful, this skill MUST output:

✅ SKILL COMPLETE: memory-retrieval

Completed:
- [x] Context retrieved for topic: "{topic}"
- [x] Relevance scoring applied
- [x] Token budget enforced ({actual} / {budget} tokens)
- [x] Multi-granular retrieval (decisions, patterns, errors, status)

Outputs:
- Prior context summary (markdown format)
- Token count: {count} tokens ({percent}% of budget)
- Results: {decisions} decisions, {patterns} patterns, {errors} errors, {status} work items

Relevance Distribution:
- High relevance (>0.8): X items
- Medium relevance (0.5-0.8): X items
- Low relevance (0.3-0.5): X items

Freshness:
- Last 24 hours: X items
- Last 7 days: X items
- Older: X items
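The relevance distribution above can be derived by bucketing scores against the MEMORY_*_RELEVANCE thresholds; a minimal sketch:

```python
def bucket_relevance(scores):
    """Count items per relevance bucket, matching the thresholds above."""
    high = sum(1 for s in scores if s > 0.8)
    medium = sum(1 for s in scores if 0.5 <= s <= 0.8)
    low = sum(1 for s in scores if 0.3 <= s < 0.5)
    return {"high": high, "medium": medium, "low": low}
```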

Completion Checklist

Before marking this skill as complete, verify:

  • Context database exists and contains data (/cx has been run)
  • Retrieval query executed against all memory types (decisions/patterns/errors/status)
  • Relevance scores calculated for all results
  • Results ranked by composite score (similarity × freshness × type priority)
  • Token count calculated and within budget
  • Low-priority items summarized (not full detail)
  • High-priority items include full context
  • Output formatted correctly (summary/detailed/json as requested)
  • No relevant blockers or WIP items missed

Failure Indicators

This skill has FAILED if:

  • ❌ Database not found (context.db missing, /cx never run)
  • ❌ Query returns no results when relevant context exists
  • ❌ Token budget exceeded (output truncated mid-item)
  • ❌ Relevance scoring broken (all items score 0 or 1)
  • ❌ Missing critical blockers from recent sessions
  • ❌ Stale results prioritized over fresh ones (freshness weighting broken)
  • ❌ JSON output malformed (syntax errors)
  • ❌ Retrieval latency >5 seconds for standard query

When NOT to Use

Do NOT use this skill when:

  • Context database not initialized (run /cx first)
  • Brand new project with no history (no context to retrieve)
  • Simple question answerable from current session (no memory lookup needed)
  • Real-time data required (memory is historical, not current)
  • Memory system disabled for privacy/compliance reasons
  • Working in isolated sandbox (no shared context across sessions)

Use alternatives:

  • /cxq - For direct SQL queries to context database
  • grep/search - For searching source code or documentation
  • /recall --status - For quick work status check only
  • current-session-context - For analyzing active conversation

Anti-Patterns (Avoid)

  • Retrieving all history → Token explosion (10K+ tokens). Solution: always set a budget (default 2000)
  • Ignoring relevance scores → Low-quality results included. Solution: filter by the min_relevance threshold
  • No freshness weighting → Outdated items ranked first. Solution: apply time decay to scores
  • Single-type retrieval → Relevant context from other types missed. Solution: query all types (decisions/patterns/errors)
  • Manual summarization → Inconsistent compression. Solution: use automated summarization
  • No semantic search → Synonyms and related concepts missed. Solution: enable sentence-transformers if available
  • Forgetting project context → Wrong project's results returned. Solution: always filter by the current project ID

Principles

This skill embodies:

  • #1 Search Before Create - Retrieves prior work to avoid duplication
  • #5 Eliminate Ambiguity - Relevance scores make ranking objective
  • #6 Clear, Understandable, Explainable - Structured output with source citations
  • #8 No Assumptions - Validates database exists before querying
  • Token Efficiency - 60-75% savings through intelligent compression
  • Signal-to-Noise Optimization - Prioritizes recent, relevant, high-value items
  • Progressive Disclosure - Summary format with option to expand details

Standard: CODITECT-STANDARD-AUTOMATION.md


Version: 1.1.0 | Updated: 2026-01-04 | Quality Standard Applied | Dependencies: context-db.py, sentence-transformers (optional)