Context Database Enhancement Analysis
Author: Claude Code Analysis
Date: 2025-12-22
Status: Proposed
Priority: P1 - Critical for MoE Agent Continuity
Executive Summary
The CODITECT context database system has a significant gap: agent session outputs reach the context database only when /cx is run manually, and agent-specific metadata is dropped during extraction. This analysis identifies root causes and proposes targeted enhancements to context-db.py and unified-message-extractor.py.
Key Findings
| Finding | Impact | Status |
|---|---|---|
| Agent JSONL files ARE discoverable (505 files) | N/A | ✅ Working |
| Agent JSONL files CAN be extracted (19 messages from test file) | N/A | ✅ Working |
| Agent-specific metadata NOT preserved (agentId, parentUuid, etc.) | Loss of agent lineage tracking | ❌ Gap |
| Only 14 agent messages in 94K total messages (0.01%) | Agent work is nearly absent from query results | ❌ Gap |
| Batch /cx not run after recent agent sessions | New content not indexed | ❌ Gap |
Root Cause
The extraction pipeline works correctly but has two gaps:
- No automatic capture on agent completion - Agent outputs only enter the database if /cx is manually run
- Agent metadata loss - Fields like agentId, parentUuid, isSidechain are not preserved
Current Architecture
Agent Runs → ~/.claude/projects/agent-*.jsonl (stored)
↓
[MANUAL] User runs /cx
↓
unified-message-extractor.py → context-storage/unified_messages.jsonl
↓
context-db.py index_messages() → context.db (SQLite FTS5)
↓
/cxq queries
Statistics
| Metric | Value |
|---|---|
| Total JSONL files | 1,040 |
| Agent JSONL files | 505 (48.6%) |
| Main session files | 535 (51.4%) |
| unified_messages.jsonl lines | 94,147 |
| Messages with agentId | 14 (0.01%) |
| Last extraction | Dec 22, 2025 15:05 |
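The figures in this table can be recomputed with a short script. The following is a minimal sketch, assuming agent transcripts use the `agent-*.jsonl` naming shown in the architecture diagram and that an agent-derived record carries an `agentId` key somewhere in its JSON. The function names are hypothetical and are not part of the existing scripts.

```python
import json
from pathlib import Path
from typing import Any, Dict, Iterable, Tuple


def classify_files(paths: Iterable[Path]) -> Dict[str, float]:
    """Count agent vs main-session transcripts by filename prefix."""
    agent = main = 0
    for path in paths:
        if path.suffix != ".jsonl":
            continue  # ignore anything that is not a JSONL transcript
        if path.name.startswith("agent-"):
            agent += 1
        else:
            main += 1
    total = agent + main
    agent_pct = round(100 * agent / total, 1) if total else 0.0
    return {"total": total, "agent": agent, "main": main, "agent_pct": agent_pct}


def has_key(obj: Any, key: str) -> bool:
    """Return True if `key` appears anywhere in a nested JSON structure."""
    if isinstance(obj, dict):
        return key in obj or any(has_key(v, key) for v in obj.values())
    if isinstance(obj, list):
        return any(has_key(v, key) for v in obj)
    return False


def count_agent_messages(lines: Iterable[str], key: str = "agentId") -> Tuple[int, int]:
    """Return (records containing `key`, total valid records)."""
    hits = total = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed lines rather than abort the count
        total += 1
        hits += has_key(record, key)
    return hits, total
```

In practice `classify_files` would be fed something like `Path.home().joinpath(".claude/projects").rglob("*.jsonl")`, and `count_agent_messages` an open handle on `context-storage/unified_messages.jsonl`.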
Gap Analysis
Gap 1: Agent Metadata Not Preserved
Agent JSONL entry structure:
{
  "parentUuid": "b101bc78-e0c9-4824-b815-3e94f6cc3ba3",
  "isSidechain": true,
  "userType": "external",
  "sessionId": "60cacdf6-403f-4350-8abc-42111016a762",
  "agentId": "ab782dc",
  "slug": "kind-chasing-hennessy",
  "type": "user",
  "message": { "role": "user", "content": "..." },
  "uuid": "62a64845-97ca-4ee9-967b-930196b50567",
  "timestamp": "2025-12-23T03:52:49.876Z"
}
Unified message format (current):
{
  "hash": "...",
  "content": "...",
  "role": "user",
  "provenance": {
    "source_file": "/.../agent-ab782dc.jsonl",
    "session_id": "agent-ab782dc", // ← Derived from filename only
    "source_line": 1
  }
}
Lost metadata:
- parentUuid - Links to parent session conversation
- agentId - Unique agent identifier
- isSidechain - Whether this is a branching conversation
- uuid - Individual message UUID for threading
- slug - Human-readable session name
- version - Claude Code version that ran the agent
Gap 2: No Automatic Agent Capture
Agent sessions complete without triggering context extraction. Users must manually run /cx to capture outputs, which is easily forgotten.
Impact: MoE expert panel outputs from today's session were NOT in the database until manually extracted during this analysis.
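A forgotten /cx run can at least be detected after the fact. The sketch below lists agent transcripts that changed after the unified file was last written. It assumes the modification time of `unified_messages.jsonl` is a reasonable proxy for the last extraction. `uncaptured_agent_files` is a hypothetical helper, not an existing function.

```python
from pathlib import Path
from typing import List


def uncaptured_agent_files(projects_dir: Path, unified_file: Path) -> List[Path]:
    """List agent transcripts modified after the unified file was last written."""
    # Assumption: the unified file's mtime approximates the time of the last /cx run.
    # A missing unified file means nothing has been extracted yet.
    last_extraction = unified_file.stat().st_mtime if unified_file.exists() else 0.0
    return sorted(
        path
        for path in projects_dir.rglob("agent-*.jsonl")
        if path.stat().st_mtime > last_extraction
    )
```

A non-empty result signals that /cx should be run before querying for recent agent output.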
Gap 3: Query Limitations for Agent Content
Current context-db.py queries don't distinguish:
- Agent vs main session messages
- Parent-child relationships between sessions
- Sidechain conversations
Enhancement Proposals
Enhancement 1: Preserve Agent Metadata (P0)
File: scripts/unified-message-extractor.py
Functions: _parse_entry() and create_unified_message()
Change: Add agent-specific fields to unified message format.
New unified format:
{
  "hash": "...",
  "content": "...",
  "role": "user",
  "provenance": {
    "source_type": "jsonl",
    "source_file": "...",
    "session_id": "60cacdf6-403f-4350-8abc-42111016a762",
    "source_line": 1
  },
  "agent_context": {
    "agent_id": "ab782dc",
    "parent_uuid": "b101bc78-e0c9-4824-b815-3e94f6cc3ba3",
    "is_sidechain": true,
    "uuid": "62a64845-97ca-4ee9-967b-930196b50567",
    "slug": "kind-chasing-hennessy"
  }
}
Implementation notes:
- Add agent_context field in create_unified_message()
- Extract from entry in _parse_entry() when fields present
- Keep null/empty for main session messages
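The implementation notes above could translate into something like the following. This is a sketch only: the real signatures of `_parse_entry()` and `create_unified_message()` are not shown in this analysis, so the helpers `build_agent_context` and `resolve_session_id` are hypothetical. It also assumes that the presence of `agentId` is what distinguishes an agent entry from a main-session entry.

```python
from typing import Any, Dict, Optional

# Raw JSONL keys mapped to the snake_case keys of the proposed agent_context block.
AGENT_FIELD_MAP = {
    "agentId": "agent_id",
    "parentUuid": "parent_uuid",
    "isSidechain": "is_sidechain",
    "uuid": "uuid",
    "slug": "slug",
}


def build_agent_context(entry: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Return the agent_context block, or None for main-session entries."""
    # Assumption: only agent transcripts carry a non-empty agentId.
    if not entry.get("agentId"):
        return None
    return {dst: entry.get(src) for src, dst in AGENT_FIELD_MAP.items()}


def resolve_session_id(entry: Dict[str, Any], filename_fallback: str) -> str:
    """Prefer the entry's own sessionId over an ID derived from the filename."""
    return entry.get("sessionId") or filename_fallback
```

Returning `None` for main-session entries keeps the unified format backward compatible: existing consumers that ignore `agent_context` see no change.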
Enhancement 2: Add Agent Query Support (P0)
File: scripts/context-db.py
New flags:
/cxq --agents # Show all agent messages only
/cxq --agent-id ab782dc # Messages from specific agent
/cxq --parent-session UUID # Messages linked to parent session
/cxq --sidechains # Show sidechain conversations
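One way to wire these flags into a query is to translate them into a parameterized WHERE clause. The sketch below assumes the `agent_context` JSON column proposed in this enhancement and a `session_id` column on `messages`. The second is an assumption, since the current schema is not shown here. `build_parser` and `agent_filters` are hypothetical names.

```python
import argparse
from typing import List, Tuple


def build_parser() -> argparse.ArgumentParser:
    """Declare the four proposed agent-related flags."""
    parser = argparse.ArgumentParser(prog="cxq")
    parser.add_argument("--agents", action="store_true", help="agent messages only")
    parser.add_argument("--agent-id", metavar="ID", help="messages from one agent")
    parser.add_argument("--parent-session", metavar="UUID", help="agent messages under a parent session")
    parser.add_argument("--sidechains", action="store_true", help="sidechain conversations")
    return parser


def agent_filters(args: argparse.Namespace) -> Tuple[str, List[str]]:
    """Translate parsed flags into (WHERE clause, bound parameters)."""
    clauses: List[str] = []
    params: List[str] = []
    if args.agents:
        clauses.append("agent_context IS NOT NULL")
    if args.agent_id:
        clauses.append("json_extract(agent_context, '$.agent_id') = ?")
        params.append(args.agent_id)
    if args.parent_session:
        # Assumption: messages has a session_id column holding the parent sessionId.
        clauses.append("agent_context IS NOT NULL AND session_id = ?")
        params.append(args.parent_session)
    if args.sidechains:
        # SQLite's json_extract returns integer 1 for JSON true.
        clauses.append("json_extract(agent_context, '$.is_sidechain') = 1")
    return " AND ".join(clauses) or "1=1", params
```

Binding values through `?` placeholders rather than string formatting keeps user-supplied IDs out of the SQL text.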
Schema change (new column):
ALTER TABLE messages ADD COLUMN agent_context TEXT; -- JSON blob
CREATE INDEX idx_agent_id ON messages(json_extract(agent_context, '$.agent_id'));
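Because `ALTER TABLE ... ADD COLUMN` fails if the column already exists, the migration is safer when made idempotent. The following sketch assumes `messages` is an ordinary table with `hash` and `content` columns, not the FTS5 virtual table itself, since virtual tables cannot be altered this way. It also assumes the bundled SQLite includes the JSON functions. Both function names are hypothetical.

```python
import sqlite3
from typing import List, Tuple


def ensure_agent_context_column(conn: sqlite3.Connection) -> None:
    """Add the agent_context column and its expression index if missing."""
    # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
    columns = {row[1] for row in conn.execute("PRAGMA table_info(messages)")}
    if "agent_context" not in columns:
        conn.execute("ALTER TABLE messages ADD COLUMN agent_context TEXT")
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_agent_id "
        "ON messages(json_extract(agent_context, '$.agent_id'))"
    )
    conn.commit()


def messages_for_agent(conn: sqlite3.Connection, agent_id: str) -> List[Tuple[str, str]]:
    """Fetch (hash, content) rows for one agent using the indexed expression."""
    cursor = conn.execute(
        "SELECT hash, content FROM messages "
        "WHERE json_extract(agent_context, '$.agent_id') = ? ORDER BY hash",
        (agent_id,),
    )
    return cursor.fetchall()
```

The WHERE expression is written exactly as in the index definition so that SQLite's planner can match it to `idx_agent_id`.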
Enhancement 3: Auto-Capture Agent Outputs (P1)
File: hooks/post-agent-completion.md (new)
Mechanism: Claude Code hook that triggers context extraction after Task tool completion.
# .claude/hooks/post-agent-completion.yaml
name: auto-capture-agent
trigger: task_completion
condition: agent_type != "main"
action: |
  python3 scripts/unified-message-extractor.py \
    --jsonl ~/.claude/projects/*-*/agent-${AGENT_ID}.jsonl \
    --no-index
Alternative: Background watcher daemon (more complex but more reliable).
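The watcher alternative could start as a simple polling loop, long before any real daemon is written. This is a rough sketch under stated assumptions: it polls modification times rather than using OS file notifications, it reuses the `--jsonl` and `--no-index` flags from the hook example above, and `scan_changed`, `extract_command`, and `watch` are hypothetical names.

```python
import subprocess
import time
from pathlib import Path
from typing import Dict, List, Optional


def scan_changed(projects_dir: Path, seen: Dict[Path, float]) -> List[Path]:
    """Return agent transcripts that are new or modified since the previous scan."""
    changed = []
    for path in sorted(projects_dir.rglob("agent-*.jsonl")):
        mtime = path.stat().st_mtime
        if seen.get(path) != mtime:
            seen[path] = mtime  # remember the version we are about to process
            changed.append(path)
    return changed


def extract_command(path: Path) -> List[str]:
    """Build the extractor invocation for a single transcript."""
    return ["python3", "scripts/unified-message-extractor.py",
            "--jsonl", str(path), "--no-index"]


def watch(projects_dir: Path, interval: float = 5.0,
          max_cycles: Optional[int] = None) -> None:
    """Poll for changed agent transcripts and run the extractor on each."""
    seen: Dict[Path, float] = {}
    scan_changed(projects_dir, seen)  # prime: ignore files that already exist
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        time.sleep(interval)
        for path in scan_changed(projects_dir, seen):
            subprocess.run(extract_command(path), check=False)
        cycles += 1
```

A transcript that is still being written will be picked up again on each change, so the extractor needs to tolerate re-processing the same file. The `hash` field in the unified format suggests deduplication is already possible.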
Enhancement 4: Agent Lineage Visualization (P2)
New command: /cxq --agent-tree
Session: 60cacdf6-403f-4350-8abc-42111016a762
├─ User: "Run MoE pattern for CodiFlow"
├─ Assistant: "I'll coordinate 5 experts..."
│ ├─ [Agent: ab782dc] Architecture Judge
│ │ └─ 19 messages, 36/40 score
│ ├─ [Agent: a02eb04] Quality Judge
│ │ └─ 23 messages, 7/40 score
│ └─ ...
└─ Assistant: "Judges complete, synthesizing..."
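A first cut of the tree needs only the `agent_context` and `provenance` fields from Enhancement 1. The sketch below is deliberately simplified: it groups messages by agent under one session and prints message counts. Judge names and scores, as in the mock-up above, are not part of the unified format and would need another source. `render_agent_tree` is a hypothetical function.

```python
from collections import Counter
from typing import Dict, Iterable


def render_agent_tree(session_id: str, messages: Iterable[Dict]) -> str:
    """Render one session's agents with their message counts as a text tree."""
    counts: Counter = Counter()
    for msg in messages:
        ctx = msg.get("agent_context")
        # Skip main-session messages and messages from other sessions.
        if not ctx or msg.get("provenance", {}).get("session_id") != session_id:
            continue
        counts[ctx["agent_id"]] += 1

    lines = [f"Session: {session_id}"]
    agent_ids = list(counts)  # Counter keeps first-seen order
    for i, agent_id in enumerate(agent_ids):
        branch = "└─" if i == len(agent_ids) - 1 else "├─"
        n = counts[agent_id]
        noun = "message" if n == 1 else "messages"
        lines.append(f"{branch} [Agent: {agent_id}] {n} {noun}")
    return "\n".join(lines)
```

Nesting agents under the specific assistant turn that spawned them would additionally require following `parent_uuid` links, which this sketch does not attempt.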
Implementation Priority
| Enhancement | Priority | Effort | Impact |
|---|---|---|---|
| Preserve agent metadata | P0 | 4h | Enables all other features |
| Add agent query support | P0 | 6h | Makes agent work discoverable |
| Auto-capture agent outputs | P1 | 8h | Prevents data loss |
| Agent lineage visualization | P2 | 8h | UX improvement |
Total estimated effort: 26 hours
Immediate Workaround
Until enhancements are implemented, manually capture agent outputs:
# After completing MoE or multi-agent workflows:
/cx
# Verify agent outputs captured:
python3 scripts/context-db.py --recent 50 | grep -i agent
Related Documentation
- scripts/unified-message-extractor.py - Current extraction logic
- scripts/context-db.py - Current query system
- internal/architecture/adrs/ADR-005-* - Token tracking ADR
- docs/guides/MEMORY-MANAGEMENT-GUIDE.md - User documentation
Decision Required
Recommendation: Implement Enhancements 1 and 2 (P0) immediately to address the MoE judge output capture issue discovered today.
Approval Status: PENDING
Last Updated: 2025-12-22T20:00:00Z
Compliance: CODITECT Research Standard v1.0.0