DAAO Routing Skill

Metadata

name: daao-routing
version: 1.0.0
category: orchestration
status: active
priority: P1
derived_from: Claude Operating Preferences v6.0

When to Use This Skill

Use this skill when:

  • Routing tasks to optimal model tiers (Haiku/Sonnet/Opus)
  • Optimizing cost vs quality tradeoffs
  • Building heterogeneous multi-model workflows
  • Estimating task difficulty programmatically

Core Pattern: Difficulty-Aware Agent Orchestration (DAAO)

Key Insight: 11.21% accuracy improvement at 64% of baseline cost through intelligent model selection.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Optional

class ModelTier(Enum):
    HAIKU = "claude-haiku-4-5-20251001"
    SONNET = "claude-sonnet-4-5-20251022"
    OPUS = "claude-opus-4-5-20251101"

# Model pricing per million tokens (USD)
MODEL_PRICING = {
    ModelTier.HAIKU: {"input": 0.80, "output": 4.00},
    ModelTier.SONNET: {"input": 3.00, "output": 15.00},
    ModelTier.OPUS: {"input": 5.00, "output": 25.00},
}

@dataclass
class DifficultyEstimate:
    """Result of difficulty estimation."""
    score: float  # 0.0-1.0
    factors: Dict[str, float]
    recommended_model: ModelTier
    confidence: float
    rationale: str

class DifficultyAwareOrchestrator:
    """Route tasks by difficulty to the optimal model tier."""

    DIFFICULTY_FACTORS = {
        "reasoning_depth": 0.25,
        "domain_expertise": 0.20,
        "context_complexity": 0.15,
        "output_precision": 0.15,
        "tool_coordination": 0.10,
        "error_sensitivity": 0.10,
        "novelty": 0.05,
    }

    def estimate_difficulty(self, task: str) -> DifficultyEstimate:
        """Estimate task difficulty and recommend a model tier."""
        factors = self._analyze_factors(task)
        score = sum(
            factors[k] * self.DIFFICULTY_FACTORS[k]
            for k in self.DIFFICULTY_FACTORS
        )

        if score < 0.3:
            model = ModelTier.HAIKU
        elif score < 0.7:
            model = ModelTier.SONNET
        else:
            model = ModelTier.OPUS

        return DifficultyEstimate(
            score=score,
            factors=factors,
            recommended_model=model,
            confidence=0.85,
            rationale=self._generate_rationale(factors, model),
        )

    def _analyze_factors(self, task: str) -> Dict[str, float]:
        """Score each difficulty factor with simple keyword heuristics."""
        task_lower = task.lower()
        factors = {}

        # Reasoning depth
        reasoning_keywords = ["debug", "analyze", "design", "research"]
        factors["reasoning_depth"] = min(
            sum(0.25 for k in reasoning_keywords if k in task_lower), 1.0
        )

        # Domain expertise
        domain_keywords = ["architecture", "security", "database", "ml"]
        factors["domain_expertise"] = min(
            sum(0.25 for k in domain_keywords if k in task_lower), 1.0
        )

        # Context complexity
        factors["context_complexity"] = 0.5 if len(task) > 200 else 0.3

        # Output precision
        precision_keywords = ["exact", "specific", "format", "schema"]
        factors["output_precision"] = min(
            sum(0.25 for k in precision_keywords if k in task_lower), 1.0
        )

        # Tool coordination
        tool_keywords = ["file", "search", "api", "database"]
        factors["tool_coordination"] = min(
            sum(0.25 for k in tool_keywords if k in task_lower), 1.0
        )

        # Error sensitivity
        error_keywords = ["production", "critical", "important"]
        factors["error_sensitivity"] = min(
            sum(0.33 for k in error_keywords if k in task_lower), 1.0
        )

        # Novelty
        factors["novelty"] = 0.3  # Default moderate novelty

        return factors

    def _generate_rationale(
        self,
        factors: Dict[str, float],
        model: ModelTier,
    ) -> str:
        """Summarize the top contributing factors."""
        top_factors = sorted(
            factors.items(), key=lambda x: x[1], reverse=True
        )[:3]
        factor_str = ", ".join(f"{k}={v:.2f}" for k, v in top_factors)
        return f"Top factors: {factor_str}. Model: {model.name}"
```
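
The weighted sum in `estimate_difficulty` can be checked by hand. Below is a minimal standalone sketch that restates the weights above with illustrative factor values for a hard debugging task (the factor values themselves are assumptions for the example, not output of the analyzer):

```python
# Factor weights (mirrors DIFFICULTY_FACTORS above)
WEIGHTS = {
    "reasoning_depth": 0.25,
    "domain_expertise": 0.20,
    "context_complexity": 0.15,
    "output_precision": 0.15,
    "tool_coordination": 0.10,
    "error_sensitivity": 0.10,
    "novelty": 0.05,
}

# Illustrative factor values for a hard debugging task (assumed)
factors = {
    "reasoning_depth": 0.95,
    "domain_expertise": 0.85,
    "context_complexity": 0.50,
    "output_precision": 0.40,
    "tool_coordination": 0.60,
    "error_sensitivity": 0.90,
    "novelty": 0.30,
}

# Weighted sum, then the same thresholds as estimate_difficulty()
score = sum(factors[k] * WEIGHTS[k] for k in WEIGHTS)
tier = "HAIKU" if score < 0.3 else "SONNET" if score < 0.7 else "OPUS"
print(f"{score:.4f} -> {tier}")  # 0.7075 -> OPUS
```

Because the weights sum to 1.0, the score stays in the 0.0-1.0 range whenever each factor does.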

Heterogeneous Workflow Builder

```python
from dataclasses import dataclass, field  # field is needed for default_factory
from typing import Dict, List, Optional

@dataclass
class WorkflowStep:
    """Single step in a heterogeneous workflow."""
    task: str
    model: ModelTier
    depends_on: List[str] = field(default_factory=list)
    estimated_cost: float = 0.0

class HeterogeneousWorkflowBuilder:
    """Build workflows mixing model tiers."""

    def __init__(self, orchestrator: DifficultyAwareOrchestrator):
        self.orchestrator = orchestrator

    def build_workflow(
        self,
        tasks: List[str],
        dependencies: Optional[Dict[str, List[str]]] = None,
    ) -> List[WorkflowStep]:
        """Build an optimized workflow from a task list."""
        steps = []
        dependencies = dependencies or {}

        for task in tasks:
            estimate = self.orchestrator.estimate_difficulty(task)
            step = WorkflowStep(
                task=task,
                model=estimate.recommended_model,
                depends_on=dependencies.get(task, []),
                estimated_cost=self._estimate_cost(estimate.recommended_model),
            )
            steps.append(step)

        return steps

    def _estimate_cost(self, model: ModelTier) -> float:
        """Estimate cost for a typical task (assumes 1K input, 2K output tokens)."""
        pricing = MODEL_PRICING[model]
        return (pricing["input"] * 1 + pricing["output"] * 2) / 1000
```
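
The per-task estimate above makes it easy to compare a routed workflow against an all-Opus baseline. A standalone sketch, restating the pricing table with an illustrative 2-Haiku/2-Sonnet/1-Opus task mix (the mix is an assumption for the example; it is not the source of the 64%-of-baseline figure):

```python
# Pricing per million tokens, matching MODEL_PRICING above
PRICING = {
    "HAIKU": {"input": 0.80, "output": 4.00},
    "SONNET": {"input": 3.00, "output": 15.00},
    "OPUS": {"input": 5.00, "output": 25.00},
}

def task_cost(tier: str, in_tokens: int = 1_000, out_tokens: int = 2_000) -> float:
    """Cost in USD for one task at the given tier."""
    p = PRICING[tier]
    return (p["input"] * in_tokens + p["output"] * out_tokens) / 1_000_000

# A routed workflow: two easy tasks, two medium, one hard (assumed mix)
routed = ["HAIKU", "HAIKU", "SONNET", "SONNET", "OPUS"]
routed_cost = sum(task_cost(t) for t in routed)
baseline_cost = task_cost("OPUS") * len(routed)  # all-Opus baseline

print(f"routed=${routed_cost:.4f} baseline=${baseline_cost:.4f} "
      f"ratio={routed_cost / baseline_cost:.0%}")
```

With this mix the routed plan runs at roughly half the all-Opus cost; the actual ratio depends entirely on how many tasks land in each band.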

Usage Example

```python
orchestrator = DifficultyAwareOrchestrator()

# Simple task -> Haiku
estimate = orchestrator.estimate_difficulty("Format this JSON data")

# Medium task -> Sonnet
estimate = orchestrator.estimate_difficulty("Generate unit tests for auth module")

# Complex task -> Opus
estimate = orchestrator.estimate_difficulty(
    "Debug the distributed race condition in production payment system"
)
```

The tiers shown are the intended routing. Note that the toy keyword heuristic in `_analyze_factors` underestimates the last two tasks (few of its keywords fire), so a production deployment should replace it with a richer estimator, such as embeddings or historical outcome data, before expecting these tasks to score into the Sonnet and Opus bands.

Success Output

When this skill is successfully applied, you should see:

✅ SKILL COMPLETE: daao-routing

Completed:
- [x] Task difficulty estimated with confidence score
- [x] Optimal model tier selected (Haiku/Sonnet/Opus)
- [x] Cost savings calculated vs. baseline (all-Opus)
- [x] Heterogeneous workflow built with dependencies
- [x] Model routing decisions documented with rationale

Outputs:
- DifficultyEstimate object with score (0.0-1.0)
- Recommended model tier with justification
- Cost estimate per task (input + output tokens)
- Workflow steps with model assignments
- Performance metrics (accuracy improvement, cost reduction)

Example Output:
Task: "Debug distributed race condition in payment system"
- Difficulty Score: 0.78 (HIGH)
- Top Factors: reasoning_depth=0.95, domain_expertise=0.85, error_sensitivity=0.90
- Recommended Model: OPUS
- Estimated Cost: $0.055 (1K input, 2K output, per the Opus pricing above)
- Rationale: Complex debugging requires advanced reasoning capabilities

Completion Checklist

Before marking this skill as complete, verify:

  • Difficulty factors analyzed for task (7 dimensions scored)
  • Difficulty score calculated (0.0-1.0 scale)
  • Model tier selected based on thresholds (<0.3=Haiku, 0.3-0.7=Sonnet, >0.7=Opus)
  • Cost estimate provided with input/output token breakdown
  • Rationale generated explaining top 3 contributing factors
  • Confidence score calculated (typically 0.85)
  • Alternative models considered if edge case (score near threshold)
  • Workflow dependencies respected (prerequisite tasks completed first)
  • Performance metrics tracked (accuracy, cost, latency)
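
The "alternative models considered if edge case" item can be made concrete. A hedged sketch of one possible policy (the ±0.05 margin, the `route_with_margin` helper, and the bump-up rule are all assumptions, not part of the DAAO listing above): when a score lands within the margin of a threshold, prefer the stronger tier and flag the decision.

```python
from enum import Enum

class Tier(Enum):
    HAIKU = 1
    SONNET = 2
    OPUS = 3

THRESHOLDS = (0.3, 0.7)  # Haiku/Sonnet and Sonnet/Opus boundaries
MARGIN = 0.05            # "near threshold" band (assumed; tune per workload)

def route_with_margin(score: float) -> tuple:
    """Return (tier, flagged): bump ambiguous scores up one tier and flag them."""
    base = Tier.HAIKU if score < 0.3 else Tier.SONNET if score < 0.7 else Tier.OPUS
    near = any(abs(score - t) <= MARGIN for t in THRESHOLDS)
    if near and base is not Tier.OPUS:
        # Prefer the stronger tier when the estimate is ambiguous
        return Tier(base.value + 1), True
    return base, near

print(route_with_margin(0.28))  # just under 0.3 -> bumped to SONNET, flagged
print(route_with_margin(0.50))  # comfortably mid-band -> SONNET, not flagged
```

Bumping up trades a small cost increase for fewer retries; the opposite policy (bump down, retry on failure) may be preferable when cost dominates.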

Failure Indicators

This skill has FAILED if:

  • ❌ Model selection doesn't match difficulty (trivial task routed to Opus)
  • ❌ No rationale provided for model choice
  • ❌ Cost estimate missing or inaccurate
  • ❌ Difficulty factors not analyzed (default values used)
  • ❌ Workflow dependencies ignored (tasks run out of order)
  • ❌ No fallback handling (circuit breaker open, no graceful degradation)
  • ❌ Confidence score <0.7 without flagging uncertainty
  • ❌ No performance tracking (can't measure actual vs. estimated difficulty)
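
The graceful-degradation failure mode above can be avoided with a simple fallback chain. A minimal sketch, assuming a step-down-never-up policy; the `is_available` health check (e.g. backed by a circuit breaker) is a hypothetical callable, not part of the DAAO listing:

```python
# Degradation order when a tier is unavailable (assumed policy: step down only)
FALLBACK = {"OPUS": "SONNET", "SONNET": "HAIKU", "HAIKU": None}

def route_with_fallback(tier: str, is_available) -> str:
    """Walk the fallback chain until an available tier is found."""
    current = tier
    while current is not None:
        if is_available(current):
            return current
        current = FALLBACK[current]
    raise RuntimeError("No model tier available; queue the task for retry")

# Simulate Opus being down (circuit open)
down = {"OPUS"}
chosen = route_with_fallback("OPUS", lambda t: t not in down)
print(chosen)  # SONNET
```

The caller should surface the substitution (e.g. in the routing rationale) so degraded results are distinguishable from normally routed ones.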

When NOT to Use

Do NOT use this skill when:

  • Single-model workflows - If all tasks require Opus-level reasoning, routing adds overhead
  • Latency-critical paths - Difficulty estimation adds 50-100ms overhead
  • Uniform task difficulty - Batch of identical tasks should use same model
  • Non-LLM operations - Don't route database queries or API calls through DAAO
  • Cost is irrelevant - If budget unlimited, just use Opus for everything
  • Offline/local models - DAAO assumes cloud API pricing (Haiku/Sonnet/Opus)
  • Streaming responses - Difficulty must be known before starting stream

Use alternatives:

  • Uniform difficulty → Direct model selection without DAAO
  • Latency-critical → Pre-compute difficulty scores, cache routing decisions
  • Local models → Use model size (7B/13B/70B) instead of tier-based routing
  • Streaming → Estimate difficulty from first N tokens, route mid-stream if needed
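
The latency-critical alternative (pre-compute and cache routing decisions) can be sketched with `functools.lru_cache`. The normalization scheme and the length-based placeholder scorer below are assumptions so the example is self-contained; in practice the cached function would wrap `DifficultyAwareOrchestrator.estimate_difficulty`:

```python
from functools import lru_cache

def _normalize(task: str) -> str:
    # Collapse whitespace and case so near-identical tasks share a cache entry
    return " ".join(task.lower().split())

@lru_cache(maxsize=4096)
def cached_difficulty(normalized_task: str) -> float:
    # Stand-in for estimate_difficulty().score; a trivial length-based
    # placeholder keeps this sketch runnable
    return min(len(normalized_task) / 400, 1.0)

def route_cached(task: str) -> float:
    return cached_difficulty(_normalize(task))

route_cached("Format this JSON data")
route_cached("format   THIS json data")    # normalizes to the same key
print(cached_difficulty.cache_info().hits)  # 1
```

Cache hits skip the estimation step entirely, which removes the routing overhead from the hot path for repeated task shapes.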

Anti-Patterns (Avoid)

| Anti-Pattern | Problem | Solution |
| --- | --- | --- |
| Always Opus | 3x cost, same accuracy for simple tasks | Use DAAO to route simple tasks to Haiku (e.g., "Format JSON") |
| Always Haiku | Poor accuracy on complex tasks, retry overhead | Route high-difficulty tasks to Sonnet/Opus upfront |
| No difficulty tracking | Can't validate routing accuracy over time | Log actual vs. estimated difficulty, tune thresholds |
| Ignoring confidence | Low confidence (<0.7) treated same as high | Flag low-confidence estimates for human review or default to Sonnet |
| Static thresholds | <0.3 Haiku, >0.7 Opus may not be optimal | A/B test thresholds, adjust based on accuracy/cost metrics |
| No circuit breaker | When Opus unavailable, system fails completely | Gracefully degrade to Sonnet, then Haiku with retry logic |
| Over-engineering simple flows | 3-task workflow doesn't need heterogeneous routing | Use DAAO only when workflow has 5+ tasks with varied difficulty |
| Blind reliance on keywords | "debug" doesn't always mean high difficulty | Combine keyword analysis with context length, domain signals |
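
The "No difficulty tracking" row calls for logging estimated against observed difficulty. A hedged sketch (the `RoutingTracker` name and the observed-difficulty proxy are assumptions; a common proxy is 1.0 when a task had to be retried on a higher tier, near 0.0 when the first attempt succeeded):

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class RoutingTracker:
    """Log estimated vs. observed difficulty to validate routing over time."""
    records: List[tuple] = field(default_factory=list)

    def log(self, estimated: float, observed: float) -> None:
        self.records.append((estimated, observed))

    def mean_abs_error(self) -> float:
        """Calibration error of the estimator over all logged tasks."""
        if not self.records:
            return 0.0
        return sum(abs(e - o) for e, o in self.records) / len(self.records)

tracker = RoutingTracker()
tracker.log(estimated=0.78, observed=0.80)  # Opus task, estimate held up
tracker.log(estimated=0.20, observed=0.55)  # routed to Haiku, needed a retry
print(f"MAE: {tracker.mean_abs_error():.3f}")  # MAE: 0.185
```

A rising MAE, or systematic underestimation, is the signal to retune the factor weights or the 0.3/0.7 thresholds.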

Principles

This skill embodies CODITECT automation principles:

#1 Recycle → Extend → Re-Use → Create

  • Recycle routing decisions - Cache difficulty scores for similar tasks
  • Extend factor weights - Adjust weights based on performance data
  • Re-use workflow templates - Common patterns (research → implement → test)
  • Create custom factors - Domain-specific difficulty signals when needed

#2 First Principles Thinking

  • Cost-quality tradeoff - Understand marginal utility of Opus vs. Sonnet
  • Difficulty dimensions - Break down task complexity into measurable factors
  • Model capabilities - Know what each tier excels at (Haiku=speed, Opus=reasoning)

#5 Eliminate Ambiguity

  • Explicit difficulty score - 0.78 is unambiguous, "complex" is vague
  • Rationale required - Always explain why model was selected
  • Confidence thresholds - Define what LOW (<0.7) vs. HIGH (>0.85) means

#6 Clear, Understandable, Explainable

  • Factor breakdown - Show top 3 contributing factors for transparency
  • Cost transparency - Display estimated cost per task upfront
  • Workflow visualization - Diagram showing model assignments across steps

#8 No Assumptions

  • Validate routing decisions - Track actual task outcomes to tune difficulty estimation
  • Don't assume static pricing - Update MODEL_PRICING when Anthropic changes rates
  • Test edge cases - Tasks scoring exactly 0.3 or 0.7 need clear assignment rules

#10 Automation First

  • Auto-route all tasks - No manual model selection in production
  • Auto-tune thresholds - Use performance data to adjust <0.3, >0.7 boundaries
  • Auto-scale workflows - DAAO enables cost-efficient parallelization

Related Files

  • agents/difficulty-aware-orchestrator.md - Agent implementation
  • commands/daao-route.md - CLI command
  • scripts/core/agent_dispatcher.py - Integration point