# Memory Retrieval Skill
## How to Use This Skill
- Review the patterns and examples below
- Apply the relevant patterns to your implementation
- Follow the best practices outlined in this skill
Intelligent long-term memory retrieval that surfaces relevant prior work context while optimizing for signal-to-noise ratio and token efficiency. Integrates with the CODITECT anti-forgetting system to enable informed sessions.
## When to Use This Skill
Use this skill when:
- Starting a new session that continues prior work
- Orchestrator needs context about project history
- Looking for relevant decisions, patterns, or error solutions
- Checking work status (in-progress, blocked, completed)
- Avoiding duplicate work or conflicting decisions
- Need to understand "what was done before" on a topic
Don't use this skill when:
- Simple queries answerable from current context
- Creating brand new project with no history
- Memory system not yet initialized (`/cx` not run)
## What It Automates

**Before** (manual context gathering):

```bash
# Search for relevant history manually
/cxq "authentication"
/cxq --decisions
/cxq --patterns --language python

# Read through dozens of results
# Manually summarize what's relevant
# Hope you didn't miss anything important
# Consume 5000+ tokens on raw output
```
**After** (automated intelligent retrieval):

```bash
# Single command with optimized output
python3 scripts/memory-retrieval.py --topic "authentication" --budget 2000

# Result: Structured, relevant context
# - Work status summary
# - Key decisions (ranked by relevance)
# - Applicable patterns
# - Known issues/solutions
# - Token count: ~1500 (within budget)
```
## Core Features

### 1. Hierarchical Memory Architecture

```text
L0: Working Memory    → Current session context
L1: Short-Term Memory → Last 7 days (decisions, WIP, blockers)
L2: Long-Term Memory  → Historical knowledge (patterns, errors, conventions)
```
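The tier boundaries above can be sketched as a simple routing function. This is an illustrative sketch, not the shipped implementation; in particular, using the session start time as the L0 cutoff is an assumption.

```python
from datetime import datetime, timedelta

def memory_tier(last_updated, session_start, now=None):
    """Route a memory item to a tier by age (illustrative thresholds)."""
    now = now or datetime.now()
    if last_updated >= session_start:
        return "L0"  # working memory: touched in the current session
    if now - last_updated <= timedelta(days=7):
        return "L1"  # short-term memory: last 7 days
    return "L2"      # long-term memory: historical knowledge
```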
### 2. Multi-Granular Retrieval
Retrieves multiple knowledge types in parallel:
- Decisions: Architecture, technology, API choices
- Patterns: Reusable code implementations
- Errors: Problem-solution pairs
- Work Status: In-progress, blocked, completed items
### 3. Relevance Scoring

```text
score = base_similarity * freshness_weight * project_boost * type_priority
```
- Semantic similarity: How related to current query
- Freshness weight: Recent items score higher
- Project boost: Same project items prioritized
- Type priority: Decisions/blockers > patterns > general notes
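In code, the composite score might look like the sketch below. The boost and priority constants are illustrative assumptions, not the shipped values.

```python
def relevance_score(base_similarity, freshness_weight, same_project, item_type):
    """Composite score: similarity x freshness x project boost x type priority.
    Boost and priority constants here are illustrative."""
    project_boost = 1.2 if same_project else 1.0
    type_priority = {"decision": 1.2, "blocker": 1.2,
                     "pattern": 1.0, "note": 0.8}.get(item_type, 1.0)
    return base_similarity * freshness_weight * project_boost * type_priority
```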
### 4. Token-Aware Compression
- Summarize low-priority items (save tokens)
- Preserve full detail for high-priority items
- Hierarchical expansion (summary → details on demand)
- Strict budget enforcement
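A minimal sketch of strict budget enforcement: high-relevance items keep full detail, everything else is reduced to its summary, and packing stops before the budget is exceeded. The 4-characters-per-token estimate and the 0.8 cutoff are assumptions.

```python
def pack_within_budget(items, budget, est_tokens=lambda s: len(s) // 4):
    """Greedy packing: full text for high-relevance items, summaries for the
    rest, stopping at the token budget (never truncating mid-item)."""
    out, used = [], 0
    for item in sorted(items, key=lambda i: i["score"], reverse=True):
        text = item["full"] if item["score"] > 0.8 else item["summary"]
        cost = est_tokens(text)
        if used + cost > budget:
            break  # strict budget enforcement
        out.append(text)
        used += cost
    return out, used
```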
## Usage

### Basic Retrieval

```bash
# Standard retrieval (2000 token budget)
python3 scripts/memory-retrieval.py --topic "database migration"

# Quick status check (500 token budget)
python3 scripts/memory-retrieval.py --status --budget 500

# Deep context (5000 token budget)
python3 scripts/memory-retrieval.py --topic "API design" --budget 5000 --detailed
```
### Focused Retrieval

```bash
# Only decisions
python3 scripts/memory-retrieval.py --decisions --topic "authentication"

# Only patterns for a language
python3 scripts/memory-retrieval.py --patterns --language rust

# Only errors matching a type
python3 scripts/memory-retrieval.py --errors --type "TypeError"

# Only work status
python3 scripts/memory-retrieval.py --status
```
### Output Formats

```bash
# Summary format (default) - for context injection
python3 scripts/memory-retrieval.py --topic "auth" --format summary

# Detailed format - for deep understanding
python3 scripts/memory-retrieval.py --topic "auth" --format detailed

# JSON format - for programmatic use
python3 scripts/memory-retrieval.py --topic "auth" --format json
```
### Orchestrator Integration

```python
# Called by orchestrators for context gathering
Task(
    subagent_type="memory-context-agent",
    prompt="""
    Retrieve context for: "Implement user dashboard"
    Budget: 2000 tokens
    Format: summary
    Focus: decisions, patterns, work_status
    """
)
```
## Output Examples

### Summary Format

```markdown
## Prior Context Summary (1847 tokens)

**Work Status:**
- [IN_PROGRESS] User service refactor - 70% (yesterday)
- [BLOCKED] Payment integration - awaiting Stripe keys
- [COMPLETED] Database schema v2 (3 days ago)

**Relevant Decisions:**
1. PostgreSQL for ACID compliance (ADR-003)
2. Repository pattern for data access
3. JWT with 24h/7d token lifecycle

**Applicable Patterns:**
- async_handler pattern (Python) - error handling
- repository_base class - data access

**Known Issues:**
- Connection pool exhaustion - fixed with max_connections=20
```
### JSON Format

```json
{
  "retrieval_id": "mem_20251212_143022",
  "topic": "user authentication",
  "token_count": 1847,
  "work_status": [
    {
      "title": "User service refactor",
      "status": "in_progress",
      "progress": 70,
      "last_updated": "2025-12-11"
    }
  ],
  "decisions": [
    {
      "id": "dec_001",
      "summary": "Use PostgreSQL for ACID compliance",
      "type": "database",
      "relevance_score": 0.92,
      "source": "ADR-003"
    }
  ],
  "patterns": [...],
  "errors": [...],
  "expand_available": ["full_decisions", "pattern_code", "error_details"]
}
```
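A consumer of the JSON format can triage the payload before injecting it into context, e.g. pulling out high-relevance decisions and blocked work items. This helper is hypothetical, not part of the script.

```python
def triage(payload, threshold=0.8):
    """Split a JSON retrieval payload into high-relevance decisions and
    blocked work items (hypothetical consumer-side helper)."""
    decisions = [d for d in payload.get("decisions", [])
                 if d.get("relevance_score", 0) >= threshold]
    blocked = [w for w in payload.get("work_status", [])
               if w.get("status") == "blocked"]
    return decisions, blocked
```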
## Configuration

### Environment Variables

```bash
# Relevance thresholds
MEMORY_HIGH_RELEVANCE=0.8
MEMORY_MEDIUM_RELEVANCE=0.5
MEMORY_MIN_RELEVANCE=0.3

# Token budgets
MEMORY_BUDGET_MINIMAL=500
MEMORY_BUDGET_STANDARD=2000
MEMORY_BUDGET_COMPREHENSIVE=5000

# Freshness weights
MEMORY_FRESHNESS_TODAY=1.0
MEMORY_FRESHNESS_WEEK=0.8
MEMORY_FRESHNESS_MONTH=0.5
```
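A step-decay weight driven by these variables might look like the sketch below (defaults mirror the documented values). The documented buckets stop at the month weight, so applying it to everything older than a week is an assumption.

```python
import os
from datetime import datetime, timedelta

def freshness_weight(last_updated, now):
    """Step-decay freshness weight read from the env vars above,
    falling back to the documented defaults."""
    age = now - last_updated
    if age < timedelta(days=1):
        return float(os.environ.get("MEMORY_FRESHNESS_TODAY", 1.0))
    if age < timedelta(days=7):
        return float(os.environ.get("MEMORY_FRESHNESS_WEEK", 0.8))
    return float(os.environ.get("MEMORY_FRESHNESS_MONTH", 0.5))
```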
### Config File

```json
{
  "memory_retrieval": {
    "default_budget": 2000,
    "default_format": "summary",
    "relevance_threshold": 0.3,
    "max_results_per_type": 10,
    "enable_semantic_search": true,
    "enable_compression": true
  }
}
```
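Loading this file with fallback to the documented defaults could look like the following sketch (the config path is an assumption):

```python
import json

DEFAULTS = {"default_budget": 2000, "default_format": "summary",
            "relevance_threshold": 0.3, "max_results_per_type": 10,
            "enable_semantic_search": True, "enable_compression": True}

def load_config(path="config.json"):
    """Merge the user's memory_retrieval config over the defaults;
    a missing file or missing keys fall back silently."""
    try:
        with open(path) as f:
            user = json.load(f).get("memory_retrieval", {})
    except FileNotFoundError:
        user = {}
    return {**DEFAULTS, **user}
```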
## Integration Points

### 1. Session Start Hook

Context is retrieved automatically at session start. See: `hooks/pre-session-memory.md`

### 2. Orchestrator Planning Phase

Orchestrators call `memory-context-agent` before planning, ensuring plans account for prior work.

### 3. Error Resolution

When an error is detected, similar error solutions are retrieved automatically:

```bash
/recall --errors "TypeError in async handler"
```

### 4. Decision Making

Before making architectural decisions:

```bash
/recall --decisions --topic "database choice"
```
## Token Efficiency Analysis
| Retrieval Type | Raw Output | With Skill | Savings |
|---|---|---|---|
| Full recall | 5000+ tokens | 2000 tokens | 60% |
| Status check | 2000+ tokens | 500 tokens | 75% |
| Decision lookup | 3000+ tokens | 800 tokens | 73% |
| Pattern search | 4000+ tokens | 1000 tokens | 75% |
Average savings: 70% token reduction while maintaining context quality.
## Error Handling

```python
# Graceful degradation
try:
    results = memory_retrieval(topic, budget)
except DatabaseNotFound:
    return "Memory system not initialized. Run /cx first."
except NoRelevantResults:
    return f"No relevant prior context found for: {topic}"
except BudgetExceeded:
    return compressed_results[:budget]
```
## Related Components

- **Agent:** `memory-context-agent` - Orchestrates retrieval
- **Command:** `/recall` - User-facing interface
- **Script:** `scripts/memory-retrieval.py` - Core implementation
- **Hook:** `hooks/pre-session-memory.md` - Auto-retrieval trigger
- **Database:** `context-storage/context.db` - Memory storage
## Success Metrics
- Relevance Rate: >80% of results rated useful
- Token Efficiency: 60-75% savings vs raw retrieval
- Coverage: 100% of blockers/WIP surfaced
- Latency: <2 seconds for standard retrieval
## Success Output

When successful, this skill MUST output:

```text
✅ SKILL COMPLETE: memory-retrieval

Completed:
- [x] Context retrieved for topic: "{topic}"
- [x] Relevance scoring applied
- [x] Token budget enforced ({actual} / {budget} tokens)
- [x] Multi-granular retrieval (decisions, patterns, errors, status)

Outputs:
- Prior context summary (markdown format)
- Token count: {count} tokens ({percent}% of budget)
- Results: {decisions} decisions, {patterns} patterns, {errors} errors, {status} work items

Relevance Distribution:
- High relevance (>0.8): X items
- Medium relevance (0.5-0.8): X items
- Low relevance (0.3-0.5): X items

Freshness:
- Last 24 hours: X items
- Last 7 days: X items
- Older: X items
```
## Completion Checklist

Before marking this skill as complete, verify:

- [ ] Context database exists and contains data (`/cx` has been run)
- [ ] Retrieval query executed against all memory types (decisions/patterns/errors/status)
- [ ] Relevance scores calculated for all results
- [ ] Results ranked by composite score (similarity × freshness × type priority)
- [ ] Token count calculated and within budget
- [ ] Low-priority items summarized (not full detail)
- [ ] High-priority items include full context
- [ ] Output formatted correctly (summary/detailed/json as requested)
- [ ] No relevant blockers or WIP items missed
## Failure Indicators
This skill has FAILED if:
- ❌ Database not found (context.db missing, /cx never run)
- ❌ Query returns no results when relevant context exists
- ❌ Token budget exceeded (output truncated mid-item)
- ❌ Relevance scoring broken (all items score 0 or 1)
- ❌ Missing critical blockers from recent sessions
- ❌ Stale results prioritized over fresh ones (freshness weighting broken)
- ❌ JSON output malformed (syntax errors)
- ❌ Retrieval latency >5 seconds for standard query
## When NOT to Use
Do NOT use this skill when:
- Context database not initialized (run /cx first)
- Brand new project with no history (no context to retrieve)
- Simple question answerable from current session (no memory lookup needed)
- Real-time data required (memory is historical, not current)
- Memory system disabled for privacy/compliance reasons
- Working in isolated sandbox (no shared context across sessions)
Use alternatives:

- `/cxq` - for direct SQL queries against the context database
- `grep`/search - for searching source code or documentation
- `/recall --status` - for a quick work-status check only
- `current-session-context` - for analyzing the active conversation
## Anti-Patterns (Avoid)
| Anti-Pattern | Problem | Solution |
|---|---|---|
| Retrieving all history | Token explosion (10K+ tokens) | Always set budget (default 2000) |
| Ignoring relevance scores | Low-quality results included | Filter by min_relevance threshold |
| No freshness weighting | Outdated items ranked first | Apply time decay to scores |
| Single-type retrieval | Missing relevant context from other types | Query all types (decisions/patterns/errors) |
| Manual summarization | Inconsistent compression | Use automated summarization |
| No semantic search | Miss synonyms and related concepts | Enable sentence-transformers if available |
| Forgetting project context | Wrong project results returned | Always filter by current project ID |
## Principles
This skill embodies:
- #1 Search Before Create - Retrieves prior work to avoid duplication
- #5 Eliminate Ambiguity - Relevance scores make ranking objective
- #6 Clear, Understandable, Explainable - Structured output with source citations
- #8 No Assumptions - Validates database exists before querying
- Token Efficiency - 60-75% savings through intelligent compression
- Signal-to-Noise Optimization - Prioritizes recent, relevant, high-value items
- Progressive Disclosure - Summary format with option to expand details
---

**Standard:** CODITECT-STANDARD-AUTOMATION.md
**Version:** 1.1.0 | **Updated:** 2026-01-04 | Quality Standard Applied
**Dependencies:** `context-db.py`, `sentence-transformers` (optional)