ADR-073: Learning Feedback Loop

Status

Accepted - January 13, 2026

Context

Agent-task affinity scores (ADR-071) should improve over time based on actual execution outcomes. This requires:

  1. Tracking outcomes per agent/task-type/domain
  2. Updating scores without overfitting to recent results
  3. Handling cold-start for new agents/tasks

Decision

Exponential Moving Average (EMA)

Use EMA with alpha=0.3 for smooth learning that balances recent and historical performance:

def update_affinity_ema(
    current_value: float | None,
    new_observation: float,
    alpha: float = 0.3
) -> float:
    """
    EMA update formula: new_ema = alpha * observation + (1 - alpha) * current

    alpha=0.3 means:
    - 30% weight on the new observation
    - 70% weight on the existing average
    - Half-life of ~2 observations
    """
    if current_value is None:
        return new_observation
    return alpha * new_observation + (1 - alpha) * current_value
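The "half-life of ~2 observations" claim can be checked directly: the prior average retains a weight of (1 - alpha)^n after n updates, and 0.7² = 0.49. A minimal sketch, restating the update rule above:

```python
def ema(current, obs, alpha=0.3):
    # Same rule as update_affinity_ema: blend the new observation into the average.
    return obs if current is None else alpha * obs + (1 - alpha) * current

# Start at 1.0, then feed two observations of 0.0; the original value's
# weight drops to 0.7 ** 2 = 0.49, i.e. it has roughly halved.
value = 1.0
for _ in range(2):
    value = ema(value, 0.0)
# value is now ~0.49
```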

Affinity Update Process

def update_agent_affinity(execution: TaskExecution) -> None:
    task = get_task(execution.task_id)

    # Get or create the affinity record
    affinity = get_or_create_affinity(
        agent_name=execution.agent_name,
        task_type=task.task_type,
        domain=task.domain
    )

    # Update execution counts
    affinity.total_executions += 1
    if execution.outcome == 'success':
        affinity.successful_executions += 1

    # Update duration EMA
    if execution.duration_ms:
        affinity.avg_duration_ms = update_affinity_ema(
            affinity.avg_duration_ms,
            execution.duration_ms
        )

    # Update token usage EMA
    total_tokens = (execution.tokens_input or 0) + (execution.tokens_output or 0)
    if total_tokens:
        affinity.avg_token_usage = update_affinity_ema(
            affinity.avg_token_usage,
            total_tokens
        )

    # Recalculate the composite affinity score
    affinity.affinity_score = calculate_base_affinity(affinity)
    affinity.last_updated = datetime.now()

    save_affinity(affinity)
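The update routine above assumes an affinity record carrying counters and EMA fields. One minimal shape is sketched below; the field set matches what the routine touches, but the derived `success_rate` property and the defaults are assumptions, not a specification:

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class AgentTaskAffinity:
    agent_name: str
    task_type: str
    domain: str
    total_executions: int = 0
    successful_executions: int = 0
    avg_duration_ms: Optional[float] = None
    avg_token_usage: Optional[float] = None
    affinity_score: float = 0.5          # neutral until enough data accrues
    last_updated: Optional[datetime] = None

    @property
    def success_rate(self) -> float:
        # Derived rather than stored, so it can never drift out of sync
        # with the underlying counters.
        if self.total_executions == 0:
            return 0.0
        return self.successful_executions / self.total_executions
```

Deriving `success_rate` from the two counters avoids a third field that every update path would have to keep consistent.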

Composite Affinity Calculation

def calculate_base_affinity(affinity: AgentTaskAffinity) -> float:
    """
    Calculate base affinity from historical metrics.
    Used in weighted combination with semantic similarity.
    """
    if affinity.total_executions < 3:
        return 0.5  # Neutral score for insufficient data

    # Success rate is the primary factor
    success_weight = 0.6

    # Efficiency factor (faster = better, capped)
    efficiency = 1.0
    if affinity.avg_duration_ms:
        # Normalize to 0-1 range, assuming 10min (600,000 ms) is baseline
        efficiency = min(1.0, 600000 / affinity.avg_duration_ms)
    efficiency_weight = 0.2

    # Token efficiency (fewer = better, capped)
    token_efficiency = 1.0
    if affinity.avg_token_usage:
        # Normalize, assuming 10K tokens is baseline
        token_efficiency = min(1.0, 10000 / affinity.avg_token_usage)
    token_weight = 0.2

    return (
        affinity.success_rate * success_weight +
        efficiency * efficiency_weight +
        token_efficiency * token_weight
    )

Cold Start Handling

For agents/tasks with insufficient data:

def get_effective_affinity(agent: str, task: AtomicTask) -> float:
    affinity = get_affinity(agent, task.task_type, task.domain)

    if affinity is None or affinity.total_executions < 3:
        # Fall back to domain expertise
        return get_domain_expertise_score(agent, task.domain)

    return affinity.affinity_score

Learning Dashboard

Expose learning metrics via API:

@app.get("/agents/{agent}/learning")
def get_agent_learning_stats(agent: str):
    affinities = get_all_affinities(agent)

    return {
        "agent": agent,
        "total_task_types": len(affinities),
        "affinities": [
            {
                "task_type": a.task_type,
                "domain": a.domain,
                "success_rate": a.success_rate,
                "total_executions": a.total_executions,
                "affinity_score": a.affinity_score,
                "trend": calculate_trend(a)  # Improving/stable/declining
            }
            for a in affinities
        ]
    }
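`calculate_trend` is not specified in this ADR. One plausible sketch classifies the direction by comparing a recent-window success rate against the lifetime rate; the function name matches the endpoint above, but the two-rate signature and the threshold are assumptions:

```python
def calculate_trend(recent_rate: float, lifetime_rate: float,
                    threshold: float = 0.05) -> str:
    # Hypothetical helper: a recent window meaningfully above the lifetime
    # rate means the agent is improving; meaningfully below, declining.
    if recent_rate > lifetime_rate + threshold:
        return "improving"
    if recent_rate < lifetime_rate - threshold:
        return "declining"
    return "stable"
```

For example, `calculate_trend(0.9, 0.7)` classifies the agent as improving.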

Consequences

Positive

  • Continuous improvement without manual intervention
  • Balances recent and historical performance
  • Graceful handling of new agents/tasks
  • Observable learning via dashboard

Negative

  • Requires minimum executions for reliable scores
  • May perpetuate bias (successful agents get more tasks)

Mitigation

  • Exploration factor: 10% of tasks assigned to non-optimal agents
  • Periodic affinity decay to prevent staleness
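Both mitigations above can be sketched briefly. The 10% exploration rate comes from the bullet list; the decay factor, function names, and data shapes are illustrative assumptions:

```python
import random

EXPLORATION_RATE = 0.10   # 10% of tasks go to a non-optimal agent (from above)
DECAY_FACTOR = 0.99       # hypothetical per-period pull toward the neutral prior

def pick_agent(ranked_agents: list[str], rng: random.Random) -> str:
    # ranked_agents is sorted best-first by affinity score.
    if len(ranked_agents) > 1 and rng.random() < EXPLORATION_RATE:
        return rng.choice(ranked_agents[1:])  # explore: skip the top choice
    return ranked_agents[0]                   # exploit: best-known agent

def decay_affinity(score: float) -> float:
    # Periodically shrink stale scores toward the neutral 0.5 so an agent
    # that stops being scheduled does not keep an outdated edge forever.
    return 0.5 + (score - 0.5) * DECAY_FACTOR
```

Exploration keeps feedback flowing to lower-ranked agents, which directly counters the self-reinforcing bias listed under Negative consequences.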

References

Local:

External (coditect-core):

  • ADR-068: Large Project Plan Token Economics

Decision Date: January 13, 2026
Implementation Priority: P2 (After core execution tracking)