
Context Database Enhancement Analysis

Author: Claude Code Analysis
Date: 2025-12-22
Status: Proposed
Priority: P1 - Critical for MoE Agent Continuity


Executive Summary

The CODITECT context database system has a significant gap: agent session outputs enter the context database only when /cx is run manually, and the messages that are captured lose their agent-specific metadata. This analysis identifies root causes and proposes targeted enhancements to context-db.py and unified-message-extractor.py.

Key Findings

| Finding | Impact | Status |
|---|---|---|
| Agent JSONL files ARE discoverable (505 files) | N/A | ✅ Working |
| Agent JSONL files CAN be extracted (19 messages from test file) | N/A | ✅ Working |
| Agent-specific metadata NOT preserved (agentId, parentUuid, etc.) | Loss of agent lineage tracking | ❌ Gap |
| Only 14 agent messages in 94K total messages (0.01%) | Nearly all agent work invisible to queries | ❌ Gap |
| Batch /cx not run after recent agent sessions | New content not indexed | ❌ Gap |

Root Cause

The extraction pipeline works correctly but has two gaps:

  1. No automatic capture on agent completion - Agent outputs only enter the database if /cx is manually run
  2. Agent metadata loss - Fields like agentId, parentUuid, isSidechain are not preserved

Current Architecture

Agent Runs → ~/.claude/projects/agent-*.jsonl (stored)
        ↓
[MANUAL] User runs /cx
        ↓
unified-message-extractor.py → context-storage/unified_messages.jsonl
        ↓
context-db.py index_messages() → context.db (SQLite FTS5)
        ↓
/cxq queries

Statistics

| Metric | Value |
|---|---|
| Total JSONL files | 1,040 |
| Agent JSONL files | 505 (48.6%) |
| Main session files | 535 (51.4%) |
| unified_messages.jsonl lines | 94,147 |
| Messages with agentId | 14 (0.01%) |
| Last extraction | Dec 22, 2025 15:05 |
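
The file counts above can be reproduced with a short script. This is a minimal sketch, assuming agent transcripts are the files named agent-<id>.jsonl anywhere under the projects directory; the function name count_session_files is hypothetical and not part of the existing scripts.

```python
from pathlib import Path


def count_session_files(projects_dir: Path) -> dict:
    """Count agent vs. main-session JSONL files under projects_dir."""
    all_files = list(projects_dir.rglob("*.jsonl"))
    # Assumption: agent transcripts are named agent-<id>.jsonl.
    agent_count = sum(1 for p in all_files if p.name.startswith("agent-"))
    total = len(all_files)
    return {
        "total": total,
        "agent": agent_count,
        "main": total - agent_count,
        # Share of agent files, guarded against an empty directory.
        "agent_pct": round(100 * agent_count / total, 1) if total else 0.0,
    }
```

Calling count_session_files(Path.home() / ".claude" / "projects") would report the split for a local installation.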

Gap Analysis

Gap 1: Agent Metadata Not Preserved

Agent JSONL entry structure:

{
"parentUuid": "b101bc78-e0c9-4824-b815-3e94f6cc3ba3",
"isSidechain": true,
"userType": "external",
"sessionId": "60cacdf6-403f-4350-8abc-42111016a762",
"agentId": "ab782dc",
"slug": "kind-chasing-hennessy",
"type": "user",
"message": { "role": "user", "content": "..." },
"uuid": "62a64845-97ca-4ee9-967b-930196b50567",
"timestamp": "2025-12-23T03:52:49.876Z"
}

Unified message format (current):

{
"hash": "...",
"content": "...",
"role": "user",
"provenance": {
"source_file": "/.../agent-ab782dc.jsonl",
"session_id": "agent-ab782dc", // ← Derived from filename only
"source_line": 1
}
}

Lost metadata:

  • parentUuid - Links to parent session conversation
  • agentId - Unique agent identifier
  • isSidechain - Whether this is a branching conversation
  • uuid - Individual message UUID for threading
  • slug - Human-readable session name
  • version - Claude Code version that ran the agent

Gap 2: No Automatic Agent Capture

Agent sessions complete without triggering context extraction. Users must manually run /cx to capture outputs, which is easily forgotten.

Impact: MoE expert panel outputs from today's session were NOT in the database until manually extracted during this analysis.

Gap 3: Query Limitations for Agent Content

Current context-db.py queries don't distinguish:

  • Agent vs main session messages
  • Parent-child relationships between sessions
  • Sidechain conversations

Enhancement Proposals

Enhancement 1: Preserve Agent Metadata (P0)

File: scripts/unified-message-extractor.py
Functions: _parse_entry() and create_unified_message()

Change: Add agent-specific fields to unified message format.

New unified format:

{
"hash": "...",
"content": "...",
"role": "user",
"provenance": {
"source_type": "jsonl",
"source_file": "...",
"session_id": "60cacdf6-403f-4350-8abc-42111016a762",
"source_line": 1
},
"agent_context": {
"agent_id": "ab782dc",
"parent_uuid": "b101bc78-e0c9-4824-b815-3e94f6cc3ba3",
"is_sidechain": true,
"uuid": "62a64845-97ca-4ee9-967b-930196b50567",
"slug": "kind-chasing-hennessy"
}
}

Implementation notes:

  • Add agent_context field in create_unified_message()
  • Extract from entry in _parse_entry() when fields present
  • Keep null/empty for main session messages
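
The field mapping described above could be sketched as follows. This is an illustration, not the extractor's actual code: the helper name extract_agent_context and the AGENT_FIELDS table are hypothetical, and the sketch assumes that an entry belongs to an agent exactly when it carries a non-empty agentId.

```python
from typing import Any, Optional

# Raw JSONL field name -> proposed snake_case key in agent_context.
AGENT_FIELDS = {
    "agentId": "agent_id",
    "parentUuid": "parent_uuid",
    "isSidechain": "is_sidechain",
    "uuid": "uuid",
    "slug": "slug",
}


def extract_agent_context(entry: dict[str, Any]) -> Optional[dict[str, Any]]:
    """Build the agent_context block, or return None for main-session entries."""
    # Assumption: only agent entries carry a non-empty agentId.
    if not entry.get("agentId"):
        return None
    # Missing optional fields are kept as None rather than dropped.
    return {new: entry.get(old) for old, new in AGENT_FIELDS.items()}
```

In this sketch, create_unified_message() would store the returned value under the agent_context key, so main-session messages end up with a null value there.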

Enhancement 2: Add Agent Query Support (P0)

File: scripts/context-db.py

New flags:

/cxq --agents               # Show all agent messages only
/cxq --agent-id ab782dc     # Messages from specific agent
/cxq --parent-session UUID  # Messages linked to parent session
/cxq --sidechains           # Show sidechain conversations

Schema change (new column):

ALTER TABLE messages ADD COLUMN agent_context TEXT;  -- JSON blob
CREATE INDEX idx_agent_id ON messages(json_extract(agent_context, '$.agent_id'));
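
The following sketch exercises the schema change against an in-memory database to show how an --agent-id lookup might be served. It assumes messages is an ordinary table (ALTER TABLE ... ADD COLUMN cannot be applied to an FTS5 virtual table) and that the SQLite build includes the JSON functions. The three-column stand-in table and the helper messages_for_agent are hypothetical; the real context.db schema is not shown in this analysis.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# Stand-in for the existing table; the real schema likely has more columns.
conn.execute("CREATE TABLE messages (hash TEXT PRIMARY KEY, content TEXT, role TEXT)")

# Proposed migration: JSON blob column plus an expression index on agent_id.
conn.execute("ALTER TABLE messages ADD COLUMN agent_context TEXT")
conn.execute(
    "CREATE INDEX idx_agent_id "
    "ON messages(json_extract(agent_context, '$.agent_id'))"
)

rows = [
    ("h1", "main session message", "user", None),
    ("h2", "agent message", "assistant",
     json.dumps({"agent_id": "ab782dc", "is_sidechain": True})),
]
conn.executemany("INSERT INTO messages VALUES (?, ?, ?, ?)", rows)


def messages_for_agent(conn: sqlite3.Connection, agent_id: str) -> list:
    """Return (hash, content) rows for one agent, as --agent-id might."""
    cur = conn.execute(
        "SELECT hash, content FROM messages "
        "WHERE json_extract(agent_context, '$.agent_id') = ?",
        (agent_id,),
    )
    return cur.fetchall()
```

Because the WHERE clause repeats the indexed expression exactly, SQLite is able to use idx_agent_id for the lookup instead of scanning every row.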

Enhancement 3: Auto-Capture Agent Outputs (P1)

File: .claude/hooks/post-agent-completion.yaml (new)

Mechanism: Claude Code hook that triggers context extraction after Task tool completion.

# .claude/hooks/post-agent-completion.yaml
name: auto-capture-agent
trigger: task_completion
condition: agent_type != "main"
action: |
  python3 scripts/unified-message-extractor.py \
    --jsonl ~/.claude/projects/*-*/agent-${AGENT_ID}.jsonl \
    --no-index

Alternative: Background watcher daemon (more complex but more reliable).
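
The watcher alternative could start from a simple polling scan rather than a filesystem-event library. The sketch below is one possible shape under that assumption: the function name find_changed_agent_files and the seen dictionary are hypothetical, and change detection relies only on file modification times.

```python
from pathlib import Path


def find_changed_agent_files(projects_dir: Path, seen: dict) -> list:
    """Return agent JSONL files that are new or modified since the last call.

    `seen` maps each path to its last observed mtime and is updated in place.
    """
    changed = []
    for path in sorted(projects_dir.rglob("agent-*.jsonl")):
        mtime = path.stat().st_mtime_ns
        # A file is reported when first seen or when its mtime has moved.
        if seen.get(path) != mtime:
            seen[path] = mtime
            changed.append(path)
    return changed
```

A daemon built on this would call the function in a loop with a sleep interval and pass each returned path to unified-message-extractor.py via --jsonl. Polling trades some latency for having no dependencies beyond the standard library.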

Enhancement 4: Agent Lineage Visualization (P2)

New command: /cxq --agent-tree

Session: 60cacdf6-403f-4350-8abc-42111016a762
├─ User: "Run MoE pattern for CodiFlow"
├─ Assistant: "I'll coordinate 5 experts..."
│ ├─ [Agent: ab782dc] Architecture Judge
│ │ └─ 19 messages, 36/40 score
│ ├─ [Agent: a02eb04] Quality Judge
│ │ └─ 23 messages, 7/40 score
│ └─ ...
└─ Assistant: "Judges complete, synthesizing..."
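
A reduced version of this tree can be produced from the proposed agent_context block alone. The sketch below groups messages by agent_id and prints per-agent counts. The function render_agent_tree is hypothetical, and judge names and scores are omitted because they are not part of the proposed metadata.

```python
from collections import Counter


def render_agent_tree(session_id: str, messages: list) -> str:
    """Render per-agent message counts for one session as a text tree."""
    # Only messages carrying the proposed agent_context block are counted.
    counts = Counter(
        m["agent_context"]["agent_id"]
        for m in messages
        if m.get("agent_context")
    )
    lines = [f"Session: {session_id}"]
    agents = sorted(counts.items())
    for i, (agent_id, n) in enumerate(agents):
        last = i == len(agents) - 1
        branch = "└─" if last else "├─"
        indent = "   " if last else "│  "
        lines.append(f"{branch} [Agent: {agent_id}]")
        lines.append(f"{indent}└─ {n} messages")
    return "\n".join(lines)
```

Nesting agents under the assistant turn that spawned them, as in the mock-up above, would additionally require resolving parent_uuid against the parent session's messages.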

Implementation Priority

| Enhancement | Priority | Effort | Impact |
|---|---|---|---|
| Preserve agent metadata | P0 | 4h | Enables all other features |
| Add agent query support | P0 | 6h | Makes agent work discoverable |
| Auto-capture agent outputs | P1 | 8h | Prevents data loss |
| Agent lineage visualization | P2 | 8h | UX improvement |

Total estimated effort: 26 hours


Immediate Workaround

Until enhancements are implemented, manually capture agent outputs:

# After completing MoE or multi-agent workflows:
/cx

# Verify agent outputs captured:
python3 scripts/context-db.py --recent 50 | grep -i agent

Related Documents

  • scripts/unified-message-extractor.py - Current extraction logic
  • scripts/context-db.py - Current query system
  • internal/architecture/adrs/ADR-005-* - Token tracking ADR
  • docs/guides/MEMORY-MANAGEMENT-GUIDE.md - User documentation

Decision Required

Recommendation: Implement Enhancements 1 and 2 (P0) immediately to address the MoE judge output capture issue discovered today.

Approval Status: PENDING


Last Updated: 2025-12-22T20:00:00Z
Compliance: CODITECT Research Standard v1.0.0