CODITECT Metadata Gaps Analysis

What's missing to enable full visibility into multi-agent workflows

Executive Summary

Current CODITECT infrastructure captures 70% of needed data. Critical gaps exist in:

  1. Agent Identity & Handoffs (30% gap)
  2. Session Context (25% gap)
  3. Intent Signals (40% gap)
  4. Causality Links (50% gap)
  5. Real-time State (60% gap)

Current State: What We Have ✅

1. Message Content

Source: unique_messages.jsonl

{
  "hash": "abc123...",
  "message": {
    "role": "user" | "assistant",
    "content": "text content"
  },
  "first_seen": "2025-11-17T20:00:00Z",
  "checkpoint": "checkpoint-id"
}

What's good:

  • ✅ Message content preserved
  • ✅ Role (user/assistant)
  • ✅ Timestamp
  • ✅ Checkpoint linkage

What's missing:

  • ❌ Which specific AI agent (if assistant)
  • ❌ Session ID (can't group related messages)
  • ❌ Parent message reference (conversation threading)
  • ❌ Intent/purpose of message
  • ❌ User identity beyond "user"

2. Git Data

Source: .git/ repository

git log --format="%H|%an|%ae|%at|%s"
abc123|Hal Casteel|hal@az1.ai|1700237400|Add OAuth2 middleware

What's good:

  • ✅ Commit hash, author, email, timestamp, message
  • ✅ File changes (git show --name-status)
  • ✅ Diff stats (insertions/deletions)

What's missing:

  • ❌ Link to conversation that led to commit
  • ❌ Link to TASKLIST item being addressed
  • ❌ Link to PROJECT-PLAN goal
  • ❌ AI agent assistance metadata (which agent helped)
  • ❌ Session ID during which commit was made
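
As a minimal sketch, the pipe-delimited `git log` output shown above can be turned into structured records before any linking is attempted. The function name `parse_git_log_line` and the output keys are illustrative assumptions, not part of the existing tooling.

```python
from datetime import datetime, timezone

def parse_git_log_line(line: str) -> dict:
    """Parse one line produced by: git log --format="%H|%an|%ae|%at|%s"."""
    # Split on the first four pipes only, so a "|" inside the subject survives.
    # Assumption: author names and emails do not themselves contain "|".
    commit_hash, author, email, unix_ts, subject = line.rstrip("\n").split("|", 4)
    return {
        "hash": commit_hash,
        "author": author,
        "email": email,
        # %at is a Unix timestamp; normalize it to ISO 8601 in UTC.
        "timestamp": datetime.fromtimestamp(int(unix_ts), tz=timezone.utc).isoformat(),
        "subject": subject,
    }

record = parse_git_log_line(
    "abc123|Hal Casteel|hal@az1.ai|1700237400|Add OAuth2 middleware"
)
```

Converting the timestamp to ISO 8601 puts commits on the same time axis as the `first_seen` field in unique_messages.jsonl, which is what any later commit-to-session correlation would need.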

3. Strategic Planning

Source: project-plan.md

## Phase 1: Foundation
- [ ] Setup authentication system
- [ ] Database schema design

What's good:

  • ✅ Goals and phases
  • ✅ Checkbox state
  • ✅ Structured hierarchy

What's missing:

  • ❌ Timestamps (when goal was added/completed)
  • ❌ Owner/assignee
  • ❌ Priority/urgency
  • ❌ Dependencies between goals
  • ❌ Estimated vs actual effort
  • ❌ Link to related TASKLIST items

4. Tactical Execution

Source: tasklist.md

- [x] Setup OAuth2 middleware
- [ ] Add JWT token validation

What's good:

  • ✅ Task description
  • ✅ Checkbox state

What's missing:

  • ❌ Task ID (stable identifier)
  • ❌ Created/started/completed timestamps
  • ❌ Owner/assignee
  • ❌ AI agents used
  • ❌ Related files
  • ❌ Parent PROJECT-PLAN goal
  • ❌ Estimated effort
  • ❌ Actual effort spent
  • ❌ Blockers/dependencies

Critical Metadata Gaps

Gap #1: Agent Identity & Tracking

Problem: Can't distinguish between different AI agents

Current state:

{
  "role": "assistant" // Which assistant? Claude? Specialized agent?
}

Needed:

{
  "role": "assistant",
  "agent": {
    "type": "specialized",
    "name": "rust-expert-developer",
    "invocation_method": "Task tool proxy",
    "capabilities": ["async Rust", "tokio", "actix-web"]
  }
}

Impact: HIGH

  • Can't measure agent effectiveness
  • Can't track agent-to-agent handoffs
  • Can't recommend best agent for task

Implementation needed:

  • Agent detection from content patterns
  • Explicit agent metadata in messages
  • Agent registry/catalog

Gap #2: Session Context

Problem: Can't group related messages into coherent sessions

Current state:

  • Messages are timestamped but not grouped
  • No session boundaries
  • Can't distinguish "planning session" from "implementation session"

Needed:

{
  "session": {
    "id": "session-2025-11-17-oauth2-impl",
    "type": "implementation",
    "started_at": "2025-11-17T14:00:00Z",
    "ended_at": "2025-11-17T16:30:00Z",
    "context": {
      "project": "coditect-cloud-backend",
      "feature": "OAuth2 authentication",
      "phase": "Phase 1: Foundation"
    },
    "participants": [
      {"type": "human", "name": "Hal Casteel"},
      {"type": "ai_agent", "name": "rust-expert-developer"}
    ]
  }
}

Impact: HIGH

  • Can't understand context shifts
  • Can't measure session productivity
  • Can't resume interrupted sessions

Implementation needed:

  • Session detection algorithm
  • Session metadata storage
  • Session boundary markers in exports
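
One possible session detection algorithm, sketched under the assumption that a long idle gap between messages marks a session boundary. The 30-minute threshold and the function names are hypothetical; only the `first_seen` field comes from the current message format.

```python
from datetime import datetime, timedelta

def parse_ts(value: str) -> datetime:
    # Older Python versions reject a trailing "Z" in fromisoformat().
    return datetime.fromisoformat(value.replace("Z", "+00:00"))

def group_into_sessions(messages, max_gap=timedelta(minutes=30)):
    """Split messages into sessions wherever the idle gap exceeds max_gap."""
    sessions = []
    previous = None
    for msg in sorted(messages, key=lambda m: parse_ts(m["first_seen"])):
        ts = parse_ts(msg["first_seen"])
        if previous is None or ts - previous > max_gap:
            sessions.append([])  # idle gap too long: start a new session
        sessions[-1].append(msg)
        previous = ts
    return sessions

messages = [
    {"hash": "a", "first_seen": "2025-11-17T14:00:00Z"},
    {"hash": "b", "first_seen": "2025-11-17T14:10:00Z"},
    {"hash": "c", "first_seen": "2025-11-17T16:00:00Z"},
]
sessions = group_into_sessions(messages)
```

A time-gap heuristic cannot tell a planning session from an implementation session; that distinction would still need the explicit `type` field proposed above.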

Gap #3: Intent Signals

Problem: Can't automatically determine WHY work is being done

Current state:

  • Must infer intent from content
  • No explicit goal linkage
  • No activity classification

Needed:

{
  "intent": {
    "activity_type": "implementing",
    "feature": "OAuth2 middleware",
    "component": "authentication",
    "linked_task": "TASK-003",
    "linked_goal": "Setup authentication system",
    "priority": "P0"
  }
}

Impact: MEDIUM

  • Requires fuzzy matching (error-prone)
  • Manual classification needed
  • Weak task→goal correlation

Implementation needed:

  • Structured intent metadata in messages
  • Auto-detection from message patterns
  • User-prompted intent clarification

Gap #4: Causality Links

Problem: Can't trace cause-effect relationships

Current state:

  • Messages exist in isolation
  • Commits exist in isolation
  • No explicit links between:
    • Conversation → Code change
    • Task completion → Checkpoint
    • Problem → Solution

Needed:

{
  "message_id": "msg-789",
  "caused_by": ["msg-785", "msg-786"], // Parent messages
  "caused": [
    {"type": "git_commit", "hash": "abc123"},
    {"type": "task_update", "task_id": "TASK-003", "status": "completed"}
  ]
}

Impact: VERY HIGH

  • Can't build causal graph
  • Can't trace decisions to outcomes
  • Can't answer "why did we do this?"

Implementation needed:

  • Message threading/parent references
  • Explicit causality metadata
  • Event correlation engine
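
Once `caused_by` references exist, answering "why did we do this?" becomes a graph traversal. A minimal sketch, assuming records shaped like the example above; `trace_causes` is a hypothetical helper, not an existing API.

```python
from collections import deque

def trace_causes(records, message_id):
    """Return every message that transitively led to message_id (breadth-first)."""
    parents = {r["message_id"]: r.get("caused_by", []) for r in records}
    seen = []
    queue = deque(parents.get(message_id, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # already visited; also guards against cycles
        seen.append(current)
        queue.extend(parents.get(current, []))
    return seen

records = [
    {"message_id": "msg-785", "caused_by": []},
    {"message_id": "msg-786", "caused_by": ["msg-785"]},
    {"message_id": "msg-789", "caused_by": ["msg-785", "msg-786"]},
]
ancestors = trace_causes(records, "msg-789")
```

The same traversal run over the `caused` entries in the forward direction would list the commits and task updates that a given message produced.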

Gap #5: Real-time State

Problem: Current state must be inferred from historical data

Current state:

  • "What's being worked on now?" requires querying latest messages
  • "Which task is active?" requires parsing TASKLIST
  • No live state indicator

Needed:

{
  "current_state": {
    "active_session": "session-123",
    "active_tasks": ["TASK-003", "TASK-005"],
    "active_files": ["src/middleware/jwt.rs"],
    "active_agents": ["rust-expert-developer"],
    "context": {
      "feature": "JWT validation",
      "activity": "implementing",
      "phase": "Phase 1: Foundation"
    },
    "last_updated": "2025-11-17T20:30:00Z"
  }
}

Impact: HIGH

  • Dashboard must reconstruct state (slow)
  • No "right now" indicator
  • Can't detect stalled work

Implementation needed:

  • Live state tracking service
  • State snapshot on every event
  • State persistence mechanism
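
"State snapshot on every event" can be sketched as a pure function that folds each event into the previous snapshot. The event type names (`session_started`, `task_started`, `task_completed`) are assumptions made for illustration; the state keys follow the example above.

```python
def apply_event(state: dict, event: dict) -> dict:
    """Return a new state snapshot after applying one event."""
    # Copy the task list so earlier snapshots are never mutated.
    new_state = {**state, "active_tasks": list(state.get("active_tasks", []))}
    if event["type"] == "session_started":
        new_state["active_session"] = event["session_id"]
    elif event["type"] == "task_started":
        if event["task_id"] not in new_state["active_tasks"]:
            new_state["active_tasks"].append(event["task_id"])
    elif event["type"] == "task_completed":
        new_state["active_tasks"] = [
            t for t in new_state["active_tasks"] if t != event["task_id"]
        ]
    new_state["last_updated"] = event["timestamp"]
    return new_state

events = [
    {"type": "session_started", "session_id": "session-123",
     "timestamp": "2025-11-17T20:00:00Z"},
    {"type": "task_started", "task_id": "TASK-003",
     "timestamp": "2025-11-17T20:05:00Z"},
    {"type": "task_started", "task_id": "TASK-005",
     "timestamp": "2025-11-17T20:10:00Z"},
    {"type": "task_completed", "task_id": "TASK-003",
     "timestamp": "2025-11-17T20:30:00Z"},
]
state = {}
for event in events:
    state = apply_event(state, event)
```

Because each snapshot carries `last_updated`, stalled work can be flagged by comparing that value against the current time, with no replay of history.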

Gap #6: Multi-Agent Workflow Metadata

Problem: No metadata for agent-to-agent collaboration

Current state:

  • Can't track agent handoffs
  • Can't measure agent coordination
  • No workflow state machine

Needed:

{
  "workflow": {
    "id": "workflow-oauth2-impl",
    "type": "feature_implementation",
    "stages": [
      {
        "name": "planning",
        "agent": "orchestrator",
        "started": "2025-11-17T14:00:00Z",
        "completed": "2025-11-17T14:30:00Z",
        "output": "Implementation plan with 5 tasks"
      },
      {
        "name": "implementation",
        "agent": "rust-expert-developer",
        "started": "2025-11-17T14:30:00Z",
        "status": "in_progress",
        "handed_off_from": "orchestrator"
      }
    ]
  }
}

Impact: CRITICAL for multi-agent

  • Can't visualize agent workflows
  • Can't optimize agent coordination
  • Can't detect handoff failures

Implementation needed:

  • Workflow state machine
  • Agent handoff metadata
  • Workflow orchestration tracking
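
With stage records like those above, handoff failures can be detected by checking consecutive stages against each other. This is a rough sketch under the assumption that stages are stored in execution order; `find_handoff_failures` is a hypothetical name.

```python
def find_handoff_failures(stages):
    """Check that each stage was handed off from the agent of the previous stage."""
    problems = []
    for previous, current in zip(stages, stages[1:]):
        if current.get("handed_off_from") != previous["agent"]:
            problems.append(
                f"{current['name']}: expected handoff from {previous['agent']}"
            )
        if "completed" not in previous:
            problems.append(
                f"{current['name']}: started before {previous['name']} completed"
            )
    return problems

stages = [
    {"name": "planning", "agent": "orchestrator",
     "started": "2025-11-17T14:00:00Z", "completed": "2025-11-17T14:30:00Z"},
    {"name": "implementation", "agent": "rust-expert-developer",
     "started": "2025-11-17T14:30:00Z", "status": "in_progress",
     "handed_off_from": "orchestrator"},
]
problems = find_handoff_failures(stages)

# Same workflow, but the second stage lost its handoff reference.
broken = [stages[0], {"name": "implementation", "agent": "rust-expert-developer"}]
broken_problems = find_handoff_failures(broken)
```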

Gap #7: Effort & Time Tracking

Problem: No time/effort metadata

Current state:

  • Timestamps exist but no duration
  • No effort estimates vs actuals
  • Can't measure productivity

Needed:

{
  "task": {
    "id": "TASK-003",
    "estimated_hours": 4,
    "actual_hours": 5.5,
    "started_at": "2025-11-17T14:00:00Z",
    "completed_at": "2025-11-17T19:30:00Z",
    "time_breakdown": {
      "planning": 0.5,
      "implementation": 3.5,
      "testing": 1.0,
      "debugging": 0.5
    }
  }
}

Impact: MEDIUM

  • Can't predict future work
  • Can't measure efficiency
  • Can't identify bottlenecks

Implementation needed:

  • Time tracking integration
  • Effort estimation framework
  • Activity classification
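
If start and completion timestamps are recorded, actual effort can be derived rather than entered by hand. A small sketch using the values from the example above; `hours_between` is a hypothetical helper, and it measures wall-clock elapsed time, which only approximates effort when work is uninterrupted.

```python
from datetime import datetime

def hours_between(started_at: str, completed_at: str) -> float:
    """Elapsed wall-clock hours between two ISO 8601 timestamps."""
    start = datetime.fromisoformat(started_at.replace("Z", "+00:00"))
    end = datetime.fromisoformat(completed_at.replace("Z", "+00:00"))
    return (end - start).total_seconds() / 3600

task = {
    "id": "TASK-003",
    "estimated_hours": 4,
    "started_at": "2025-11-17T14:00:00Z",
    "completed_at": "2025-11-17T19:30:00Z",
    "time_breakdown": {
        "planning": 0.5, "implementation": 3.5, "testing": 1.0, "debugging": 0.5,
    },
}
actual_hours = hours_between(task["started_at"], task["completed_at"])
overrun_hours = actual_hours - task["estimated_hours"]
breakdown_total = sum(task["time_breakdown"].values())
```

Here the elapsed time is 5.5 hours, which matches both `actual_hours` and the sum of `time_breakdown` in the example, so the overrun against the 4-hour estimate is 1.5 hours.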

Gap #8: File → Task → Goal Mapping

Problem: Must infer relationships via fuzzy matching

Current state:

  • File changes not explicitly linked to tasks
  • Tasks not explicitly linked to goals
  • Weak correlation (60-70% accuracy)

Needed:

{
  "file": "src/middleware/oauth2.rs",
  "linked_tasks": ["TASK-003", "TASK-004"],
  "linked_goals": ["Setup authentication system"],
  "linked_conversations": ["msg-785", "msg-786", "msg-789"]
}

Impact: HIGH

  • Impact analysis inaccurate
  • Task completion detection weak
  • Can't auto-update TASKLIST

Implementation needed:

  • Explicit metadata in git commits
  • Task ID references in commit messages
  • Automated linkage on file save

Proposed Metadata Schema

Enhanced Message Format

{
  "message_id": "msg-1234",
  "hash": "content-hash",
  "session_id": "session-2025-11-17-oauth2",
  "timestamp": "2025-11-17T15:30:00Z",
  "parent_message_id": "msg-1233",

  "actor": {
    "type": "human" | "ai_agent",
    "name": "Hal Casteel" | "rust-expert-developer",
    "email": "hal@az1.ai"
  },

  "content": {
    "role": "user" | "assistant",
    "text": "Implementing OAuth2 middleware",
    "intent": {
      "activity": "implementing",
      "feature": "OAuth2 middleware",
      "component": "authentication"
    }
  },

  "context": {
    "project": "coditect-cloud-backend",
    "session_type": "implementation",
    "active_files": ["src/middleware/oauth2.rs"],
    "linked_task": "TASK-003",
    "linked_goal": "Setup authentication system",
    "phase": "Phase 1: Foundation"
  },

  "causality": {
    "caused_by": ["msg-1230", "msg-1231"],
    "caused": [
      {"type": "git_commit", "hash": "abc123"},
      {"type": "file_edit", "path": "src/middleware/oauth2.rs"}
    ]
  },

  "checkpoint": "checkpoint-id"
}

Enhanced Task Format

{
  "task_id": "TASK-003",
  "title": "Implement OAuth2 middleware",
  "description": "Add OAuth2 token validation middleware",

  "status": "completed",
  "created_at": "2025-11-15T10:00:00Z",
  "started_at": "2025-11-17T14:00:00Z",
  "completed_at": "2025-11-17T19:30:00Z",

  "owner": {
    "type": "human",
    "name": "Hal Casteel"
  },

  "agents_used": [
    {"name": "rust-expert-developer", "messages": 15}
  ],

  "linked_goal": {
    "phase": "Phase 1: Foundation",
    "goal": "Setup authentication system"
  },

  "effort": {
    "estimated_hours": 4,
    "actual_hours": 5.5
  },

  "artifacts": {
    "files_created": ["src/middleware/oauth2.rs"],
    "files_modified": ["src/middleware/mod.rs", "Cargo.toml"],
    "commits": ["abc123", "def456"],
    "conversations": ["session-2025-11-17-oauth2"]
  }
}

Implementation Priority

P0 - Critical (Needed for MVP)

  1. Session ID - Group related messages
  2. Agent Identity - Track which agent
  3. Task IDs - Stable task references
  4. File→Task links - Connect code to tasks

P1 - High Value (Needed for 360° view)

  1. Intent metadata - Activity classification
  2. Causality links - Message threading
  3. Current state - Real-time tracking
  4. Timestamps - Created/started/completed

P2 - Nice to Have (Analytics)

  1. Effort tracking - Time estimates vs actuals
  2. Workflow stages - Multi-agent orchestration
  3. Dependencies - Task/goal relationships

Integration Strategy

Phase 1: Minimal Metadata (Week 1-2)

Add to existing systems without breaking changes:

  • Session ID in exports (file naming convention)
  • Agent detection from content patterns
  • Task IDs in TASKLIST comments (extractable by regex)

Phase 2: Enhanced Metadata (Week 3-4)

Structured metadata additions:

  • JSON front-matter in tasklist.md
  • Git commit message format standardization
  • Session metadata in checkpoints

Phase 3: Full Schema (Week 5-6)

Complete metadata integration:

  • Enhanced message format
  • Real-time state tracking
  • Causality graph database

Success Metrics

After metadata enhancements:

  • ✅ 95%+ agent identification accuracy
  • ✅ 90%+ file→task linkage accuracy
  • ✅ 100% session grouping coverage
  • ✅ <5s state reconstruction time
  • ✅ Complete causality tracing

Appendix: Quick Wins

Quick Win #1: Agent Detection (2 hours)

Use regex patterns to detect agents from content:

# Map each agent name to a regex that hints at its involvement.
agent_patterns = {
    'rust-expert-developer': r'async.*Rust|tokio|actix',
    'database-architect': r'PostgreSQL|schema.*design',
}
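
The pattern table above could be applied to message content roughly as follows. This is a sketch, not the shipped detector: `detect_agents` is a hypothetical name, and case-insensitive matching is an added assumption.

```python
import re

agent_patterns = {
    "rust-expert-developer": r"async.*Rust|tokio|actix",
    "database-architect": r"PostgreSQL|schema.*design",
}

# Compile once up front; IGNORECASE is an assumption, not part of the original table.
compiled_patterns = {
    name: re.compile(pattern, re.IGNORECASE)
    for name, pattern in agent_patterns.items()
}

def detect_agents(content: str) -> list:
    """Return the names of all agents whose pattern appears in the content."""
    return [
        name for name, pattern in compiled_patterns.items()
        if pattern.search(content)
    ]

rust_hits = detect_agents("Use tokio::spawn for the async handler")
db_hits = detect_agents("Refine the PostgreSQL schema design")
no_hits = detect_agents("Update the README")
```

Content-based detection only indicates which agent a message looks like it came from; explicit agent metadata (Gap #1) remains the reliable source.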

Quick Win #2: Session from Filename (30 minutes)

Extract session context from export filenames:

"2025-11-17-EXPORT-oauth2-implementation.txt"
→ session_id: "2025-11-17-oauth2-implementation"
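
A minimal sketch of that extraction, assuming every export follows the `<date>-EXPORT-<slug>.txt` naming shown above; the function name is hypothetical.

```python
import re

# Assumed naming convention: YYYY-MM-DD-EXPORT-<slug>.txt
EXPORT_NAME = re.compile(r"^(\d{4}-\d{2}-\d{2})-EXPORT-(.+)\.txt$")

def session_id_from_filename(filename: str):
    """Derive a session ID from an export filename, or None if it does not match."""
    match = EXPORT_NAME.match(filename)
    if match is None:
        return None
    date, slug = match.groups()
    return f"{date}-{slug}"

session_id = session_id_from_filename("2025-11-17-EXPORT-oauth2-implementation.txt")
```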

Quick Win #3: Task ID in Comments (1 hour)

Add task IDs to TASKLIST:

- [ ] Implement OAuth2 middleware  <!-- TASK-003 -->
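
With IDs stored in HTML comments, the TASKLIST stays readable in rendered Markdown while remaining machine-parseable. A rough parser sketch, assuming one task per line in the format above; `parse_tasklist` and the second sample task are illustrative.

```python
import re

TASK_LINE = re.compile(
    r"^- \[(?P<done>[ xX])\] (?P<title>.*?)\s*<!--\s*(?P<task_id>TASK-\d+)\s*-->\s*$"
)

def parse_tasklist(markdown: str) -> list:
    """Extract tasks that carry a TASK-nnn comment; lines without an ID are skipped."""
    tasks = []
    for line in markdown.splitlines():
        match = TASK_LINE.match(line.strip())
        if match:
            tasks.append({
                "task_id": match["task_id"],
                "title": match["title"],
                "done": match["done"].lower() == "x",
            })
    return tasks

sample = """\
- [ ] Implement OAuth2 middleware  <!-- TASK-003 -->
- [x] Add JWT token validation <!-- TASK-004 -->
- [ ] Task without an ID yet
"""
tasks = parse_tasklist(sample)
```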

Quick Win #4: Commit Message Format (1 hour)

Standardize git commits:

feat(auth): Add OAuth2 middleware

Task: TASK-003
Phase: Phase 1 - Foundation
Files: src/middleware/oauth2.rs
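
Once commits follow this format, the file→task link can be read straight from the message body. A sketch under two assumptions: the `Task`, `Phase`, and `Files` keys appear one per line, and multiple files are comma-separated. `parse_commit_message` is a hypothetical helper.

```python
KNOWN_KEYS = {"Task", "Phase", "Files"}

def parse_commit_message(message: str) -> dict:
    """Split a standardized commit message into its subject and metadata fields."""
    lines = message.strip().splitlines()
    metadata = {"subject": lines[0]}
    for line in lines[1:]:
        key, separator, value = line.partition(":")
        if separator and key.strip() in KNOWN_KEYS:
            metadata[key.strip().lower()] = value.strip()
    if "files" in metadata:
        # Assumption: several files are listed on one line, separated by commas.
        metadata["files"] = [path.strip() for path in metadata["files"].split(",")]
    return metadata

commit = parse_commit_message("""\
feat(auth): Add OAuth2 middleware

Task: TASK-003
Phase: Phase 1 - Foundation
Files: src/middleware/oauth2.rs
""")
```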

Next Steps:

  1. Review metadata gaps with team
  2. Prioritize P0 items for MVP
  3. Implement quick wins (4.5 hours total)
  4. Design full metadata schema
  5. Roll out incrementally