# DAAO Routing Skill

## Metadata

- name: daao-routing
- version: 1.0.0
- category: orchestration
- status: active
- priority: P1
- derived_from: Claude Operating Preferences v6.0
## When to Use This Skill

Use this skill when:
- Routing tasks to optimal model tiers (Haiku/Sonnet/Opus)
- Optimizing cost vs quality tradeoffs
- Building heterogeneous multi-model workflows
- Estimating task difficulty programmatically
## Core Pattern: Difficulty-Aware Agent Orchestration (DAAO)

**Key Insight:** 11.21% accuracy improvement at 64% of baseline cost through intelligent model selection.
```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Optional


class ModelTier(Enum):
    HAIKU = "claude-haiku-4-5-20251001"
    SONNET = "claude-sonnet-4-5-20251022"
    OPUS = "claude-opus-4-5-20251101"


# Model pricing per million tokens
MODEL_PRICING = {
    ModelTier.HAIKU: {"input": 0.80, "output": 4.00},
    ModelTier.SONNET: {"input": 3.00, "output": 15.00},
    ModelTier.OPUS: {"input": 5.00, "output": 25.00},
}


@dataclass
class DifficultyEstimate:
    """Result of difficulty estimation"""
    score: float  # 0.0-1.0
    factors: Dict[str, float]
    recommended_model: ModelTier
    confidence: float
    rationale: str


class DifficultyAwareOrchestrator:
    """Route tasks by difficulty to the optimal model"""

    DIFFICULTY_FACTORS = {
        "reasoning_depth": 0.25,
        "domain_expertise": 0.20,
        "context_complexity": 0.15,
        "output_precision": 0.15,
        "tool_coordination": 0.10,
        "error_sensitivity": 0.10,
        "novelty": 0.05,
    }

    def estimate_difficulty(self, task: str) -> DifficultyEstimate:
        """Estimate task difficulty and recommend a model"""
        factors = self._analyze_factors(task)
        score = sum(
            factors[k] * self.DIFFICULTY_FACTORS[k]
            for k in self.DIFFICULTY_FACTORS
        )
        if score < 0.3:
            model = ModelTier.HAIKU
        elif score < 0.7:
            model = ModelTier.SONNET
        else:
            model = ModelTier.OPUS
        return DifficultyEstimate(
            score=score,
            factors=factors,
            recommended_model=model,
            confidence=0.85,
            rationale=self._generate_rationale(factors, model),
        )

    def _analyze_factors(self, task: str) -> Dict[str, float]:
        """Analyze the task for difficulty factors"""
        task_lower = task.lower()
        factors = {}

        # Reasoning depth
        reasoning_keywords = ["debug", "analyze", "design", "research"]
        factors["reasoning_depth"] = min(
            sum(0.25 for k in reasoning_keywords if k in task_lower), 1.0
        )

        # Domain expertise
        domain_keywords = ["architecture", "security", "database", "ml"]
        factors["domain_expertise"] = min(
            sum(0.25 for k in domain_keywords if k in task_lower), 1.0
        )

        # Context complexity
        factors["context_complexity"] = 0.5 if len(task) > 200 else 0.3

        # Output precision
        precision_keywords = ["exact", "specific", "format", "schema"]
        factors["output_precision"] = min(
            sum(0.25 for k in precision_keywords if k in task_lower), 1.0
        )

        # Tool coordination
        tool_keywords = ["file", "search", "api", "database"]
        factors["tool_coordination"] = min(
            sum(0.25 for k in tool_keywords if k in task_lower), 1.0
        )

        # Error sensitivity
        error_keywords = ["production", "critical", "important"]
        factors["error_sensitivity"] = min(
            sum(0.33 for k in error_keywords if k in task_lower), 1.0
        )

        # Novelty
        factors["novelty"] = 0.3  # Default moderate novelty

        return factors

    def _generate_rationale(
        self,
        factors: Dict[str, float],
        model: ModelTier,
    ) -> str:
        top_factors = sorted(
            factors.items(), key=lambda x: x[1], reverse=True
        )[:3]
        factor_str = ", ".join(f"{k}={v:.2f}" for k, v in top_factors)
        return f"Top factors: {factor_str}. Model: {model.name}"
```
## Heterogeneous Workflow Builder
```python
from dataclasses import dataclass, field  # `field` is required for the mutable depends_on default
from typing import Dict, List, Optional


@dataclass
class WorkflowStep:
    """Single step in a heterogeneous workflow"""
    task: str
    model: ModelTier
    depends_on: List[str] = field(default_factory=list)
    estimated_cost: float = 0.0


class HeterogeneousWorkflowBuilder:
    """Build workflows mixing model tiers"""

    def __init__(self, orchestrator: DifficultyAwareOrchestrator):
        self.orchestrator = orchestrator

    def build_workflow(
        self,
        tasks: List[str],
        dependencies: Optional[Dict[str, List[str]]] = None,
    ) -> List[WorkflowStep]:
        """Build an optimized workflow from a task list"""
        steps = []
        dependencies = dependencies or {}
        for task in tasks:
            estimate = self.orchestrator.estimate_difficulty(task)
            step = WorkflowStep(
                task=task,
                model=estimate.recommended_model,
                depends_on=dependencies.get(task, []),
                estimated_cost=self._estimate_cost(estimate.recommended_model),
            )
            steps.append(step)
        return steps

    def _estimate_cost(self, model: ModelTier) -> float:
        """Estimate cost in dollars for a typical task"""
        pricing = MODEL_PRICING[model]
        # Assume 1K input and 2K output tokens as typical; pricing is
        # per million tokens, so dividing by 1000 yields dollars.
        return (pricing["input"] * 1 + pricing["output"] * 2) / 1000
```
## Usage Example
```python
orchestrator = DifficultyAwareOrchestrator()

# Simple task -> Haiku
estimate = orchestrator.estimate_difficulty("Format this JSON data")
# illustrative: score≈0.15, model=HAIKU

# Medium task -> Sonnet
estimate = orchestrator.estimate_difficulty("Generate unit tests for auth module")
# illustrative: score≈0.52, model=SONNET

# Complex task -> Opus
estimate = orchestrator.estimate_difficulty(
    "Debug the distributed race condition in production payment system"
)
# illustrative: score≈0.78, model=OPUS
```

The scores shown are illustrative targets, not exact outputs of the keyword heuristic above; that heuristic is a placeholder, and a production estimator should be calibrated against real task outcomes.
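The cost-savings comparison against an all-Opus baseline can be sketched standalone. This is a minimal sketch: `task_cost` and `savings_vs_all_opus` are illustrative helper names, not part of the skill's API, and the pricing dict mirrors `MODEL_PRICING` above.

```python
# Per-million-token pricing, mirroring MODEL_PRICING above.
PRICING = {
    "haiku": {"input": 0.80, "output": 4.00},
    "sonnet": {"input": 3.00, "output": 15.00},
    "opus": {"input": 5.00, "output": 25.00},
}


def task_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in dollars for one task at the given token counts."""
    p = PRICING[model]
    return (p["input"] * input_tokens + p["output"] * output_tokens) / 1_000_000


def savings_vs_all_opus(assignments, input_tokens=1000, output_tokens=2000):
    """Fraction of the all-Opus baseline spent by the routed workflow."""
    routed = sum(task_cost(m, input_tokens, output_tokens) for m in assignments)
    baseline = task_cost("opus", input_tokens, output_tokens) * len(assignments)
    return routed / baseline


# Three tasks routed haiku/sonnet/opus spend ~59% of the all-Opus baseline.
ratio = savings_vs_all_opus(["haiku", "sonnet", "opus"])
```

At these prices a fully routed mixed workload lands in the same neighborhood as the 64%-of-baseline figure quoted above; the exact ratio depends on the task mix and token counts.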
## Success Output

When this skill is successfully applied, you should see:
✅ SKILL COMPLETE: daao-routing
Completed:
- [x] Task difficulty estimated with confidence score
- [x] Optimal model tier selected (Haiku/Sonnet/Opus)
- [x] Cost savings calculated vs. baseline (all-Opus)
- [x] Heterogeneous workflow built with dependencies
- [x] Model routing decisions documented with rationale
Outputs:
- DifficultyEstimate object with score (0.0-1.0)
- Recommended model tier with justification
- Cost estimate per task (input + output tokens)
- Workflow steps with model assignments
- Performance metrics (accuracy improvement, cost reduction)
Example Output:
Task: "Debug distributed race condition in payment system"
- Difficulty Score: 0.78 (HIGH)
- Top Factors: reasoning_depth=0.95, domain_expertise=0.85, error_sensitivity=0.90
- Recommended Model: OPUS
- Estimated Cost: $0.03 (1K input, 2K output)
- Rationale: Complex debugging requires advanced reasoning capabilities
## Completion Checklist

Before marking this skill as complete, verify:
- Difficulty factors analyzed for task (7 dimensions scored)
- Difficulty score calculated (0.0-1.0 scale)
- Model tier selected based on thresholds (<0.3 = Haiku, 0.3-0.7 = Sonnet, ≥0.7 = Opus)
- Cost estimate provided with input/output token breakdown
- Rationale generated explaining top 3 contributing factors
- Confidence score calculated (typically 0.85)
- Alternative models considered if edge case (score near threshold)
- Workflow dependencies respected (prerequisite tasks completed first)
- Performance metrics tracked (accuracy, cost, latency)
## Failure Indicators

This skill has FAILED if:
- ❌ Model selection doesn't match difficulty (trivial task routed to Opus)
- ❌ No rationale provided for model choice
- ❌ Cost estimate missing or inaccurate
- ❌ Difficulty factors not analyzed (default values used)
- ❌ Workflow dependencies ignored (tasks run out of order)
- ❌ No fallback handling (circuit breaker open, no graceful degradation)
- ❌ Confidence score <0.7 without flagging uncertainty
- ❌ No performance tracking (can't measure actual vs. estimated difficulty)
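The last indicator, performance tracking, needs only a small amount of bookkeeping. A minimal sketch, with illustrative names (`record`, `calibration_error`, and the log's field names are not part of the skill's API):

```python
from statistics import mean

# One dict per routed task: estimated score at routing time, actual
# difficulty observed after completion (e.g. from retries or review).
routing_log: list = []


def record(task_id: str, estimated: float, actual: float) -> None:
    """Append one routing outcome for later threshold tuning."""
    routing_log.append({"task": task_id, "estimated": estimated, "actual": actual})


def calibration_error() -> float:
    """Mean absolute gap between estimated and observed difficulty (0.0-1.0 scale)."""
    return mean(abs(r["estimated"] - r["actual"]) for r in routing_log)
```

A persistently large calibration error is the signal to re-weight `DIFFICULTY_FACTORS` or move the 0.3/0.7 thresholds.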
## When NOT to Use

Do NOT use this skill when:
- Single-model workflows - If all tasks require Opus-level reasoning, routing adds overhead
- Latency-critical paths - Difficulty estimation adds 50-100ms overhead
- Uniform task difficulty - Batch of identical tasks should use same model
- Non-LLM operations - Don't route database queries or API calls through DAAO
- Cost is irrelevant - If budget unlimited, just use Opus for everything
- Offline/local models - DAAO assumes cloud API pricing (Haiku/Sonnet/Opus)
- Streaming responses - Difficulty must be known before starting stream
Use alternatives:
- Uniform difficulty → Direct model selection without DAAO
- Latency-critical → Pre-compute difficulty scores, cache routing decisions
- Local models → Use model size (7B/13B/70B) instead of tier-based routing
- Streaming → Estimate difficulty from first N tokens, route mid-stream if needed
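The latency-critical alternative above, caching routing decisions, can be sketched with a memoized router. The names and the placeholder scoring heuristic are illustrative; in practice the cached function would wrap `DifficultyAwareOrchestrator.estimate_difficulty`.

```python
from functools import lru_cache


@lru_cache(maxsize=4096)
def cached_route(normalized_task: str) -> str:
    """Route once per distinct task; repeat lookups skip estimation entirely."""
    # Placeholder stand-in for the real difficulty estimator.
    score = min(len(normalized_task) / 400, 1.0)
    if score < 0.3:
        return "haiku"
    if score < 0.7:
        return "sonnet"
    return "opus"


def route(task: str) -> str:
    # Normalize so trivial whitespace/case differences still hit the cache.
    return cached_route(" ".join(task.lower().split()))
```

The normalization step matters: without it, near-duplicate task strings would miss the cache and pay the estimation overhead again.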
## Anti-Patterns (Avoid)
| Anti-Pattern | Problem | Solution |
|---|---|---|
| Always Opus | 3x cost, same accuracy for simple tasks | Use DAAO to route simple tasks to Haiku (e.g., "Format JSON") |
| Always Haiku | Poor accuracy on complex tasks, retry overhead | Route high-difficulty tasks to Sonnet/Opus upfront |
| No difficulty tracking | Can't validate routing accuracy over time | Log actual vs. estimated difficulty, tune thresholds |
| Ignoring confidence | Low confidence (<0.7) treated same as high | Flag low-confidence estimates for human review or default to Sonnet |
| Static thresholds | <0.3 Haiku, >0.7 Opus may not be optimal | A/B test thresholds, adjust based on accuracy/cost metrics |
| No circuit breaker | When Opus unavailable, system fails completely | Gracefully degrade to Sonnet, then Haiku with retry logic |
| Over-engineering simple flows | 3-task workflow doesn't need heterogeneous routing | Use DAAO only when workflow has 5+ tasks with varied difficulty |
| Blind reliance on keywords | "debug" doesn't always mean high difficulty | Combine keyword analysis with context length, domain signals |
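The "No circuit breaker" row can be sketched as a fallback chain that degrades Opus → Sonnet → Haiku instead of failing outright. `call` and `unavailable` are illustrative stand-ins for the real API client and availability check.

```python
# Next tier to try when the current one is unavailable; None ends the chain.
FALLBACK_CHAIN = {"opus": "sonnet", "sonnet": "haiku", "haiku": None}


def call_with_fallback(model, call, unavailable):
    """Walk down the tier chain until a model answers.

    Returns (model_actually_used, result) so the caller can log the
    degradation for later review.
    """
    while model is not None:
        if not unavailable(model):
            return model, call(model)
        model = FALLBACK_CHAIN[model]
    raise RuntimeError("all model tiers unavailable")
```

Returning the tier actually used, rather than just the result, keeps the degradation visible to the performance-tracking layer.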
## Principles

This skill embodies CODITECT automation principles:
### #1 Recycle → Extend → Re-Use → Create
- Recycle routing decisions - Cache difficulty scores for similar tasks
- Extend factor weights - Adjust weights based on performance data
- Re-use workflow templates - Common patterns (research → implement → test)
- Create custom factors - Domain-specific difficulty signals when needed
### #2 First Principles Thinking
- Cost-quality tradeoff - Understand marginal utility of Opus vs. Sonnet
- Difficulty dimensions - Break down task complexity into measurable factors
- Model capabilities - Know what each tier excels at (Haiku=speed, Opus=reasoning)
### #5 Eliminate Ambiguity
- Explicit difficulty score - 0.78 is unambiguous, "complex" is vague
- Rationale required - Always explain why model was selected
- Confidence thresholds - Define what LOW (<0.7) vs. HIGH (>0.85) means
### #6 Clear, Understandable, Explainable
- Factor breakdown - Show top 3 contributing factors for transparency
- Cost transparency - Display estimated cost per task upfront
- Workflow visualization - Diagram showing model assignments across steps
### #8 No Assumptions
- Validate routing decisions - Track actual task outcomes to tune difficulty estimation
- Don't assume static pricing - Update MODEL_PRICING when Anthropic changes rates
- Test edge cases - Tasks scoring exactly 0.3 or 0.7 need clear assignment rules
### #10 Automation First
- Auto-route all tasks - No manual model selection in production
- Auto-tune thresholds - Use performance data to adjust <0.3, >0.7 boundaries
- Auto-scale workflows - DAAO enables cost-efficient parallelization
## Related Components

- `agents/difficulty-aware-orchestrator.md` - Agent implementation
- `commands/daao-route.md` - CLI command
- `scripts/core/agent_dispatcher.py` - Integration point