
Memory Retrieval Skill


How to Use This Skill

  1. Review the patterns and examples below
  2. Apply the relevant patterns to your implementation
  3. Follow the best practices outlined in this skill

Intelligent long-term memory retrieval that surfaces relevant prior work context while optimizing for signal-to-noise ratio and token efficiency. Integrates with the CODITECT anti-forgetting system to enable informed sessions.

When to Use This Skill

Use this skill when:

  • Starting a new session that continues prior work
  • Giving the orchestrator context about project history
  • Looking for relevant decisions, patterns, or error solutions
  • Checking work status (in-progress, blocked, completed)
  • Avoiding duplicate work or conflicting decisions
  • Needing to understand "what was done before" on a topic

Don't use this skill when:

  • Simple queries answerable from current context
  • Creating brand new project with no history
  • Memory system not yet initialized (/cx not run)

What It Automates

Before: (Manual context gathering)

# Search for relevant history manually
/cxq "authentication"
/cxq --decisions
/cxq --patterns --language python
# Read through dozens of results
# Manually summarize what's relevant
# Hope you didn't miss anything important
# Consume 5000+ tokens on raw output

After: (Automated intelligent retrieval)

# Single command with optimized output
python3 scripts/memory-retrieval.py --topic "authentication" --budget 2000

# Result: Structured, relevant context
# - Work status summary
# - Key decisions (ranked by relevance)
# - Applicable patterns
# - Known issues/solutions
# - Token count: ~1500 (within budget)

Core Features

1. Hierarchical Memory Architecture

L0: Working Memory    → Current session context
L1: Short-Term Memory → Last 7 days (decisions, WIP, blockers)
L2: Long-Term Memory  → Historical knowledge (patterns, errors, conventions)
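The three tiers could be modeled as simple data objects. This is an illustrative sketch only; the actual schema lives in context-storage/context.db, and these names are assumptions:

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Optional

@dataclass(frozen=True)
class MemoryTier:
    level: str
    scope: str
    retention: Optional[timedelta]  # None = session-scoped or unbounded

# Illustrative definitions mirroring the hierarchy above
TIERS = [
    MemoryTier("L0", "working: current session context", None),
    MemoryTier("L1", "short-term: decisions, WIP, blockers", timedelta(days=7)),
    MemoryTier("L2", "long-term: patterns, errors, conventions", None),
]
```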

2. Multi-Granular Retrieval

Retrieves multiple knowledge types in parallel:

  • Decisions: Architecture, technology, API choices
  • Patterns: Reusable code implementations
  • Errors: Problem-solution pairs
  • Work Status: In-progress, blocked, completed items
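The parallel fan-out over knowledge types could look like the following sketch. The per-type query functions here are hypothetical stand-ins; the real queries run against context.db:

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical per-type query functions (stand-ins for real context.db queries)
def query_decisions(topic): return [{"type": "decision", "topic": topic}]
def query_patterns(topic):  return [{"type": "pattern", "topic": topic}]
def query_errors(topic):    return [{"type": "error", "topic": topic}]
def query_status(topic):    return [{"type": "status", "topic": topic}]

def retrieve_all(topic):
    """Run the four knowledge-type queries in parallel and merge results."""
    queries = [query_decisions, query_patterns, query_errors, query_status]
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = [pool.submit(q, topic) for q in queries]
        return [item for f in futures for item in f.result()]
```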

3. Relevance Scoring

score = base_similarity * freshness_weight * project_boost * type_priority
  • Semantic similarity: How related to current query
  • Freshness weight: Recent items score higher
  • Project boost: Same project items prioritized
  • Type priority: Decisions/blockers > patterns > general notes
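The formula translates directly to code. The specific weights below (type priorities, the 1.2 project boost) are illustrative assumptions, not the shipped values:

```python
def relevance_score(base_similarity, freshness_weight, same_project, item_type):
    """Composite score per the formula above; the weights are illustrative."""
    # Decisions/blockers > patterns > general notes
    type_priority = {"decision": 1.0, "blocker": 1.0, "pattern": 0.8}.get(item_type, 0.5)
    project_boost = 1.2 if same_project else 1.0
    return base_similarity * freshness_weight * project_boost * type_priority
```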

4. Token-Aware Compression

  • Summarize low-priority items (save tokens)
  • Preserve full detail for high-priority items
  • Hierarchical expansion (summary → details on demand)
  • Strict budget enforcement
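A minimal sketch of strict budget enforcement with tiered detail. The 60% cutover from full detail to summaries is an assumed heuristic, and the token estimator and summarizer are passed in by the caller:

```python
def enforce_budget(items, budget, estimate_tokens, summarize):
    """Greedy packing: full detail while well under budget, summaries once it
    tightens, and a hard stop so the budget is never exceeded."""
    out, used = [], 0
    for item in sorted(items, key=lambda i: i["score"], reverse=True):
        text = item["full"] if used < budget * 0.6 else summarize(item)
        cost = estimate_tokens(text)
        if used + cost > budget:
            break  # strict budget enforcement
        out.append(text)
        used += cost
    return out, used
```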

Usage

Basic Retrieval

# Standard retrieval (2000 token budget)
python3 scripts/memory-retrieval.py --topic "database migration"

# Quick status check (500 token budget)
python3 scripts/memory-retrieval.py --status --budget 500

# Deep context (5000 token budget)
python3 scripts/memory-retrieval.py --topic "API design" --budget 5000 --detailed

Focused Retrieval

# Only decisions
python3 scripts/memory-retrieval.py --decisions --topic "authentication"

# Only patterns for a language
python3 scripts/memory-retrieval.py --patterns --language rust

# Only errors matching a type
python3 scripts/memory-retrieval.py --errors --type "TypeError"

# Only work status
python3 scripts/memory-retrieval.py --status

Output Formats

# Summary format (default) - for context injection
python3 scripts/memory-retrieval.py --topic "auth" --format summary

# Detailed format - for deep understanding
python3 scripts/memory-retrieval.py --topic "auth" --format detailed

# JSON format - for programmatic use
python3 scripts/memory-retrieval.py --topic "auth" --format json

Orchestrator Integration

# Called by orchestrators for context gathering
Task(
    subagent_type="memory-context-agent",
    prompt="""
    Retrieve context for: "Implement user dashboard"
    Budget: 2000 tokens
    Format: summary
    Focus: decisions, patterns, work_status
    """
)

Output Examples

Summary Format

## Prior Context Summary (1847 tokens)

**Work Status:**
- [IN_PROGRESS] User service refactor - 70% (yesterday)
- [BLOCKED] Payment integration - awaiting Stripe keys
- [COMPLETED] Database schema v2 (3 days ago)

**Relevant Decisions:**
1. PostgreSQL for ACID compliance (ADR-003)
2. Repository pattern for data access
3. JWT with 24h/7d token lifecycle

**Applicable Patterns:**
- async_handler pattern (Python) - error handling
- repository_base class - data access

**Known Issues:**
- Connection pool exhaustion - fixed with max_connections=20

JSON Format

{
  "retrieval_id": "mem_20251212_143022",
  "topic": "user authentication",
  "token_count": 1847,
  "work_status": [
    {
      "title": "User service refactor",
      "status": "in_progress",
      "progress": 70,
      "last_updated": "2025-12-11"
    }
  ],
  "decisions": [
    {
      "id": "dec_001",
      "summary": "Use PostgreSQL for ACID compliance",
      "type": "database",
      "relevance_score": 0.92,
      "source": "ADR-003"
    }
  ],
  "patterns": [...],
  "errors": [...],
  "expand_available": ["full_decisions", "pattern_code", "error_details"]
}
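The JSON output is convenient for programmatic use. A hedged consumer sketch, using a sample trimmed to the schema shown above:

```python
import json

# Trimmed sample matching the JSON schema above
raw_json = '''{
  "topic": "user authentication",
  "token_count": 1847,
  "work_status": [{"title": "User service refactor", "status": "in_progress", "progress": 70}],
  "decisions": [{"id": "dec_001", "summary": "Use PostgreSQL for ACID compliance", "relevance_score": 0.92}]
}'''

payload = json.loads(raw_json)
# Surface blockers and high-relevance decisions first
blocked = [w for w in payload["work_status"] if w["status"] == "blocked"]
top = [d["summary"] for d in payload["decisions"] if d["relevance_score"] > 0.8]
```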

Configuration

Environment Variables

# Relevance thresholds
MEMORY_HIGH_RELEVANCE=0.8
MEMORY_MEDIUM_RELEVANCE=0.5
MEMORY_MIN_RELEVANCE=0.3

# Token budgets
MEMORY_BUDGET_MINIMAL=500
MEMORY_BUDGET_STANDARD=2000
MEMORY_BUDGET_COMPREHENSIVE=5000

# Freshness weights
MEMORY_FRESHNESS_TODAY=1.0
MEMORY_FRESHNESS_WEEK=0.8
MEMORY_FRESHNESS_MONTH=0.5
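The freshness weights map item age to a score multiplier; a minimal sketch mirroring the MEMORY_FRESHNESS_* values above (the 0.3 floor for items older than a month is an assumption):

```python
from datetime import date

def freshness_weight(last_updated: date, today: date) -> float:
    """Time-decay weight matching the MEMORY_FRESHNESS_* defaults."""
    age = (today - last_updated).days
    if age < 1:
        return 1.0   # MEMORY_FRESHNESS_TODAY
    if age <= 7:
        return 0.8   # MEMORY_FRESHNESS_WEEK
    if age <= 30:
        return 0.5   # MEMORY_FRESHNESS_MONTH
    return 0.3       # older than a month (assumed floor)
```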

Config File

{
  "memory_retrieval": {
    "default_budget": 2000,
    "default_format": "summary",
    "relevance_threshold": 0.3,
    "max_results_per_type": 10,
    "enable_semantic_search": true,
    "enable_compression": true
  }
}
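Loading this file could merge user overrides over the documented defaults; a sketch (the loader name is hypothetical):

```python
import json

DEFAULTS = {
    "default_budget": 2000,
    "default_format": "summary",
    "relevance_threshold": 0.3,
    "max_results_per_type": 10,
    "enable_semantic_search": True,
    "enable_compression": True,
}

def load_config(text: str) -> dict:
    """Merge a config-file JSON string over the defaults above."""
    user = json.loads(text).get("memory_retrieval", {})
    return {**DEFAULTS, **user}
```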

Integration Points

1. Session Start Hook

# Automatically retrieve context at session start
# See: hooks/pre-session-memory.md

2. Orchestrator Planning Phase

# Orchestrators call memory-context-agent before planning
# Ensures plans account for prior work

3. Error Resolution

# When error detected, auto-retrieve similar error solutions
/recall --errors "TypeError in async handler"

4. Decision Making

# Before making architectural decisions
/recall --decisions --topic "database choice"

Token Efficiency Analysis

Retrieval Type     Raw Output     With Skill    Savings
Full recall        5000+ tokens   2000 tokens   60%
Status check       2000+ tokens   500 tokens    75%
Decision lookup    3000+ tokens   800 tokens    73%
Pattern search     4000+ tokens   1000 tokens   75%

Average savings: 70% token reduction while maintaining context quality.
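The Savings column follows directly from the two token counts; a quick check in Python (numbers taken from the table above):

```python
# Recompute the savings column from the raw / with-skill token counts
rows = {
    "full_recall": (5000, 2000),
    "status_check": (2000, 500),
    "decision_lookup": (3000, 800),
    "pattern_search": (4000, 1000),
}
savings = {name: 1 - skill / raw for name, (raw, skill) in rows.items()}
average = sum(savings.values()) / len(savings)  # ≈ 0.71, i.e. ~70%
```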

Error Handling

# Graceful degradation
try:
results = memory_retrieval(topic, budget)
except DatabaseNotFound:
return "Memory system not initialized. Run /cx first."
except NoRelevantResults:
return "No relevant prior context found for: {topic}"
except BudgetExceeded:
return compressed_results[:budget]

Related Components

  • Agent: memory-context-agent - Orchestrates retrieval
  • Command: /recall - User-facing interface
  • Script: scripts/memory-retrieval.py - Core implementation
  • Hook: hooks/pre-session-memory.md - Auto-retrieval trigger
  • Database: context-storage/context.db - Memory storage

Success Metrics

  • Relevance Rate: >80% of results rated useful
  • Token Efficiency: 60-75% savings vs raw retrieval
  • Coverage: 100% of blockers/WIP surfaced
  • Latency: <2 seconds for standard retrieval

Success Output

When successful, this skill MUST output:

✅ SKILL COMPLETE: memory-retrieval

Completed:
- [x] Context retrieved for topic: "{topic}"
- [x] Relevance scoring applied
- [x] Token budget enforced ({actual} / {budget} tokens)
- [x] Multi-granular retrieval (decisions, patterns, errors, status)

Outputs:
- Prior context summary (markdown format)
- Token count: {count} tokens ({percent}% of budget)
- Results: {decisions} decisions, {patterns} patterns, {errors} errors, {status} work items

Relevance Distribution:
- High relevance (>0.8): X items
- Medium relevance (0.5-0.8): X items
- Low relevance (0.3-0.5): X items

Freshness:
- Last 24 hours: X items
- Last 7 days: X items
- Older: X items
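The relevance distribution above can be derived by bucketing scores against the MEMORY_*_RELEVANCE thresholds; a minimal sketch:

```python
def bucket_relevance(scores):
    """Count items per relevance bucket, matching the thresholds above."""
    high = sum(1 for s in scores if s > 0.8)
    medium = sum(1 for s in scores if 0.5 <= s <= 0.8)
    low = sum(1 for s in scores if 0.3 <= s < 0.5)
    return {"high": high, "medium": medium, "low": low}
```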

Completion Checklist

Before marking this skill as complete, verify:

  • Context database exists and contains data (/cx has been run)
  • Retrieval query executed against all memory types (decisions/patterns/errors/status)
  • Relevance scores calculated for all results
  • Results ranked by composite score (similarity × freshness × type priority)
  • Token count calculated and within budget
  • Low-priority items summarized (not full detail)
  • High-priority items include full context
  • Output formatted correctly (summary/detailed/json as requested)
  • No relevant blockers or WIP items missed

Failure Indicators

This skill has FAILED if:

  • ❌ Database not found (context.db missing, /cx never run)
  • ❌ Query returns no results when relevant context exists
  • ❌ Token budget exceeded (output truncated mid-item)
  • ❌ Relevance scoring broken (all items score 0 or 1)
  • ❌ Missing critical blockers from recent sessions
  • ❌ Stale results prioritized over fresh ones (freshness weighting broken)
  • ❌ JSON output malformed (syntax errors)
  • ❌ Retrieval latency >5 seconds for standard query

When NOT to Use

Do NOT use this skill when:

  • Context database not initialized (run /cx first)
  • Brand new project with no history (no context to retrieve)
  • Simple question answerable from current session (no memory lookup needed)
  • Real-time data required (memory is historical, not current)
  • Memory system disabled for privacy/compliance reasons
  • Working in isolated sandbox (no shared context across sessions)

Use alternatives:

  • /cxq - For direct SQL queries to context database
  • grep/search - For searching source code or documentation
  • /recall --status - For quick work status check only
  • current-session-context - For analyzing active conversation

Anti-Patterns (Avoid)

  • Retrieving all history → Token explosion (10K+ tokens). Solution: always set a budget (default 2000)
  • Ignoring relevance scores → Low-quality results included. Solution: filter by the min_relevance threshold
  • No freshness weighting → Outdated items ranked first. Solution: apply time decay to scores
  • Single-type retrieval → Relevant context from other types missed. Solution: query all types (decisions/patterns/errors)
  • Manual summarization → Inconsistent compression. Solution: use automated summarization
  • No semantic search → Synonyms and related concepts missed. Solution: enable sentence-transformers if available
  • Forgetting project context → Wrong project's results returned. Solution: always filter by the current project ID

Principles

This skill embodies:

  • #1 Search Before Create - Retrieves prior work to avoid duplication
  • #5 Eliminate Ambiguity - Relevance scores make ranking objective
  • #6 Clear, Understandable, Explainable - Structured output with source citations
  • #8 No Assumptions - Validates database exists before querying
  • Token Efficiency - 60-75% savings through intelligent compression
  • Signal-to-Noise Optimization - Prioritizes recent, relevant, high-value items
  • Progressive Disclosure - Summary format with option to expand details

Standard: CODITECT-STANDARD-AUTOMATION.md


Version: 1.1.0 | Updated: 2026-01-04 | Quality Standard Applied | Dependencies: context-db.py, sentence-transformers (optional)