Skip to main content

ADR-072: Execution Tracking

Status

Accepted - January 13, 2026

Context

Every task execution must be tracked for:

  1. Audit: Who executed what, when, with what result
  2. Metrics: Duration, token usage, success rates
  3. Learning: Data for affinity score improvement
  4. Debugging: Failure analysis, retry patterns

Decision

Execution Record Schema

CREATE TABLE task_executions (
id UUID PRIMARY KEY,
task_id VARCHAR(20) REFERENCES atomic_tasks(task_id),
agent_name VARCHAR(100) NOT NULL,

-- Timing
started_at TIMESTAMP NOT NULL,
completed_at TIMESTAMP,
duration_ms INTEGER,

-- Outcome
outcome VARCHAR(20) CHECK (outcome IN
('success', 'partial', 'failure', 'timeout', 'aborted')),

-- Context
session_id VARCHAR(100), -- Claude Code session
machine_id VARCHAR(100), -- From machine-id.json
model_used VARCHAR(50), -- e.g., 'claude-opus-4.5'

-- Metrics
tokens_input INTEGER,
tokens_output INTEGER,
retry_number INTEGER DEFAULT 0,

-- Output
artifacts_created TEXT[],
error_message TEXT,
execution_log JSONB,

created_at TIMESTAMP DEFAULT NOW()
);

Execution Lifecycle

┌─────────┐    start    ┌────────────┐   complete   ┌───────────┐
│ READY │────────────▶│IN_PROGRESS │─────────────▶│ COMPLETED │
└─────────┘ └────────────┘ └───────────┘
│ │
│ timeout/abort │ (any outcome)
▼ ▼
┌────────────┐ ┌───────────┐
│ ABORTED │ │ METRICS │
└────────────┘ │ UPDATED │
└───────────┘

Metrics Collection

On completion, update aggregates:

def record_execution(execution: TaskExecution) -> None:
# 1. Insert execution record
db.insert(execution)

# 2. Update task status
task = get_task(execution.task_id)
task.status = 'completed' if execution.outcome == 'success' else 'failed'
task.completed_at = execution.completed_at
task.execution_duration_ms = execution.duration_ms
task.outcome = execution.outcome
db.update(task)

# 3. Update affinity scores (triggers learning loop)
update_agent_affinity(execution)

Circuit Breaker Integration

Track failures for circuit breaker:

# Per-agent failure tracking
if execution.outcome in ('failure', 'timeout'):
circuit = get_circuit(execution.agent_name)
circuit.record_failure()
elif execution.outcome == 'success':
circuit = get_circuit(execution.agent_name)
circuit.record_success()

Consequences

Positive

  • Complete audit trail for compliance
  • Rich data for performance analysis
  • Enables learning feedback loop
  • Supports retry decisions

Negative

  • Storage growth (mitigate with retention policy)
  • Write amplification on completion

References

Local:

External (coditect-core):

  • ADR-068: Large Project Plan Token Economics

Decision Date: January 13, 2026 Implementation Priority: P1 (Core feature)