DAAO Routing Skill

Metadata

name: daao-routing
version: 1.0.0
category: orchestration
status: active
priority: P1
derived_from: Claude Operating Preferences v6.0

When to Use This Skill

Use this skill when:

  • Routing tasks to optimal model tiers (Haiku/Sonnet/Opus)
  • Optimizing cost vs quality tradeoffs
  • Building heterogeneous multi-model workflows
  • Estimating task difficulty programmatically

Core Pattern: Difficulty-Aware Agent Orchestration (DAAO)

Key Insight: 11.21% accuracy improvement at 64% of baseline cost through intelligent model selection.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Optional

class ModelTier(Enum):
    HAIKU = "claude-haiku-4-5-20251001"
    SONNET = "claude-sonnet-4-5-20251022"
    OPUS = "claude-opus-4-5-20251101"

# Model pricing per million tokens (USD)
MODEL_PRICING = {
    ModelTier.HAIKU: {"input": 0.80, "output": 4.00},
    ModelTier.SONNET: {"input": 3.00, "output": 15.00},
    ModelTier.OPUS: {"input": 5.00, "output": 25.00},
}

@dataclass
class DifficultyEstimate:
    """Result of difficulty estimation."""
    score: float  # 0.0-1.0
    factors: Dict[str, float]
    recommended_model: ModelTier
    confidence: float
    rationale: str

class DifficultyAwareOrchestrator:
    """Route tasks by difficulty to the optimal model tier."""

    DIFFICULTY_FACTORS = {
        "reasoning_depth": 0.25,
        "domain_expertise": 0.20,
        "context_complexity": 0.15,
        "output_precision": 0.15,
        "tool_coordination": 0.10,
        "error_sensitivity": 0.10,
        "novelty": 0.05,
    }

    def estimate_difficulty(self, task: str) -> DifficultyEstimate:
        """Estimate task difficulty and recommend a model tier."""
        factors = self._analyze_factors(task)
        score = sum(
            factors[k] * self.DIFFICULTY_FACTORS[k]
            for k in self.DIFFICULTY_FACTORS
        )

        if score < 0.3:
            model = ModelTier.HAIKU
        elif score < 0.7:
            model = ModelTier.SONNET
        else:
            model = ModelTier.OPUS

        return DifficultyEstimate(
            score=score,
            factors=factors,
            recommended_model=model,
            confidence=0.85,
            rationale=self._generate_rationale(factors, model),
        )

    def _analyze_factors(self, task: str) -> Dict[str, float]:
        """Score each difficulty factor with simple keyword heuristics."""
        task_lower = task.lower()
        factors = {}

        # Reasoning depth
        reasoning_keywords = ["debug", "analyze", "design", "research"]
        factors["reasoning_depth"] = min(
            sum(0.25 for k in reasoning_keywords if k in task_lower), 1.0
        )

        # Domain expertise
        domain_keywords = ["architecture", "security", "database", "ml"]
        factors["domain_expertise"] = min(
            sum(0.25 for k in domain_keywords if k in task_lower), 1.0
        )

        # Context complexity
        factors["context_complexity"] = 0.5 if len(task) > 200 else 0.3

        # Output precision
        precision_keywords = ["exact", "specific", "format", "schema"]
        factors["output_precision"] = min(
            sum(0.25 for k in precision_keywords if k in task_lower), 1.0
        )

        # Tool coordination
        tool_keywords = ["file", "search", "api", "database"]
        factors["tool_coordination"] = min(
            sum(0.25 for k in tool_keywords if k in task_lower), 1.0
        )

        # Error sensitivity
        error_keywords = ["production", "critical", "important"]
        factors["error_sensitivity"] = min(
            sum(0.33 for k in error_keywords if k in task_lower), 1.0
        )

        # Novelty
        factors["novelty"] = 0.3  # Default moderate novelty

        return factors

    def _generate_rationale(
        self,
        factors: Dict[str, float],
        model: ModelTier,
    ) -> str:
        """Summarize the top contributing factors."""
        top_factors = sorted(
            factors.items(), key=lambda x: x[1], reverse=True
        )[:3]
        factor_str = ", ".join(f"{k}={v:.2f}" for k, v in top_factors)
        return f"Top factors: {factor_str}. Model: {model.name}"
```
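
The weighted sum in `estimate_difficulty` can be checked by hand. Below is a minimal standalone sketch that restates the weights above with illustrative factor values for a hard debugging task (the factor values themselves are assumptions for the example, not output of the analyzer):

```python
# Factor weights (mirrors DIFFICULTY_FACTORS above)
WEIGHTS = {
    "reasoning_depth": 0.25,
    "domain_expertise": 0.20,
    "context_complexity": 0.15,
    "output_precision": 0.15,
    "tool_coordination": 0.10,
    "error_sensitivity": 0.10,
    "novelty": 0.05,
}

# Illustrative factor values for a hard debugging task (assumed)
factors = {
    "reasoning_depth": 0.95,
    "domain_expertise": 0.85,
    "context_complexity": 0.50,
    "output_precision": 0.40,
    "tool_coordination": 0.60,
    "error_sensitivity": 0.90,
    "novelty": 0.30,
}

# Weighted sum, then the same thresholds as estimate_difficulty()
score = sum(factors[k] * WEIGHTS[k] for k in WEIGHTS)
tier = "HAIKU" if score < 0.3 else "SONNET" if score < 0.7 else "OPUS"
print(f"{score:.4f} -> {tier}")  # 0.7075 -> OPUS
```

Because the weights sum to 1.0, the score stays in the 0.0-1.0 range whenever each factor does.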

Heterogeneous Workflow Builder

```python
from dataclasses import dataclass, field  # field is needed for default_factory
from typing import Dict, List, Optional

@dataclass
class WorkflowStep:
    """Single step in a heterogeneous workflow."""
    task: str
    model: ModelTier
    depends_on: List[str] = field(default_factory=list)
    estimated_cost: float = 0.0

class HeterogeneousWorkflowBuilder:
    """Build workflows mixing model tiers."""

    def __init__(self, orchestrator: DifficultyAwareOrchestrator):
        self.orchestrator = orchestrator

    def build_workflow(
        self,
        tasks: List[str],
        dependencies: Optional[Dict[str, List[str]]] = None,
    ) -> List[WorkflowStep]:
        """Build an optimized workflow from a task list."""
        steps = []
        dependencies = dependencies or {}

        for task in tasks:
            estimate = self.orchestrator.estimate_difficulty(task)
            step = WorkflowStep(
                task=task,
                model=estimate.recommended_model,
                depends_on=dependencies.get(task, []),
                estimated_cost=self._estimate_cost(estimate.recommended_model),
            )
            steps.append(step)

        return steps

    def _estimate_cost(self, model: ModelTier) -> float:
        """Estimate cost for a typical task (assumes 1K input, 2K output tokens)."""
        pricing = MODEL_PRICING[model]
        return (pricing["input"] * 1 + pricing["output"] * 2) / 1000
```
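
The per-task estimate above makes it easy to compare a routed workflow against an all-Opus baseline. A standalone sketch, restating the pricing table with an illustrative 2-Haiku/2-Sonnet/1-Opus task mix (the mix is an assumption for the example; it is not the source of the 64%-of-baseline figure):

```python
# Pricing per million tokens, matching MODEL_PRICING above
PRICING = {
    "HAIKU": {"input": 0.80, "output": 4.00},
    "SONNET": {"input": 3.00, "output": 15.00},
    "OPUS": {"input": 5.00, "output": 25.00},
}

def task_cost(tier: str, in_tokens: int = 1_000, out_tokens: int = 2_000) -> float:
    """Cost in USD for one task at the given tier."""
    p = PRICING[tier]
    return (p["input"] * in_tokens + p["output"] * out_tokens) / 1_000_000

# A routed workflow: two easy tasks, two medium, one hard (assumed mix)
routed = ["HAIKU", "HAIKU", "SONNET", "SONNET", "OPUS"]
routed_cost = sum(task_cost(t) for t in routed)
baseline_cost = task_cost("OPUS") * len(routed)  # all-Opus baseline

print(f"routed=${routed_cost:.4f} baseline=${baseline_cost:.4f} "
      f"ratio={routed_cost / baseline_cost:.0%}")
```

With this mix the routed plan runs at roughly half the all-Opus cost; the actual ratio depends entirely on how many tasks land in each band.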

Usage Example

```python
orchestrator = DifficultyAwareOrchestrator()

# Simple task -> Haiku
estimate = orchestrator.estimate_difficulty("Format this JSON data")

# Medium task -> Sonnet
estimate = orchestrator.estimate_difficulty("Generate unit tests for auth module")

# Complex task -> Opus
estimate = orchestrator.estimate_difficulty(
    "Debug the distributed race condition in production payment system"
)
```

The tiers shown are the intended routing. Note that the toy keyword heuristic in `_analyze_factors` underestimates the last two tasks (few of its keywords fire), so a production deployment should replace it with a richer estimator, such as embeddings or historical outcome data, before expecting these tasks to score into the Sonnet and Opus bands.

Success Output

When this skill is successfully applied, you should see:

✅ SKILL COMPLETE: daao-routing

Completed:
- [x] Task difficulty estimated with confidence score
- [x] Optimal model tier selected (Haiku/Sonnet/Opus)
- [x] Cost savings calculated vs. baseline (all-Opus)
- [x] Heterogeneous workflow built with dependencies
- [x] Model routing decisions documented with rationale

Outputs:
- DifficultyEstimate object with score (0.0-1.0)
- Recommended model tier with justification
- Cost estimate per task (input + output tokens)
- Workflow steps with model assignments
- Performance metrics (accuracy improvement, cost reduction)

Example Output:
Task: "Debug distributed race condition in payment system"
- Difficulty Score: 0.78 (HIGH)
- Top Factors: reasoning_depth=0.95, domain_expertise=0.85, error_sensitivity=0.90
- Recommended Model: OPUS
- Estimated Cost: $0.055 (1K input, 2K output, per the Opus pricing above)
- Rationale: Complex debugging requires advanced reasoning capabilities

Completion Checklist

Before marking this skill as complete, verify:

  • Difficulty factors analyzed for task (7 dimensions scored)
  • Difficulty score calculated (0.0-1.0 scale)
  • Model tier selected based on thresholds (<0.3=Haiku, 0.3-0.7=Sonnet, >0.7=Opus)
  • Cost estimate provided with input/output token breakdown
  • Rationale generated explaining top 3 contributing factors
  • Confidence score calculated (typically 0.85)
  • Alternative models considered if edge case (score near threshold)
  • Workflow dependencies respected (prerequisite tasks completed first)
  • Performance metrics tracked (accuracy, cost, latency)
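
The "alternative models considered if edge case" item can be made concrete. A hedged sketch of one possible policy (the ±0.05 margin, the `route_with_margin` helper, and the bump-up rule are all assumptions, not part of the DAAO listing above): when a score lands within the margin of a threshold, prefer the stronger tier and flag the decision.

```python
from enum import Enum

class Tier(Enum):
    HAIKU = 1
    SONNET = 2
    OPUS = 3

THRESHOLDS = (0.3, 0.7)  # Haiku/Sonnet and Sonnet/Opus boundaries
MARGIN = 0.05            # "near threshold" band (assumed; tune per workload)

def route_with_margin(score: float) -> tuple:
    """Return (tier, flagged): bump ambiguous scores up one tier and flag them."""
    base = Tier.HAIKU if score < 0.3 else Tier.SONNET if score < 0.7 else Tier.OPUS
    near = any(abs(score - t) <= MARGIN for t in THRESHOLDS)
    if near and base is not Tier.OPUS:
        # Prefer the stronger tier when the estimate is ambiguous
        return Tier(base.value + 1), True
    return base, near

print(route_with_margin(0.28))  # just under 0.3 -> bumped to SONNET, flagged
print(route_with_margin(0.50))  # comfortably mid-band -> SONNET, not flagged
```

Bumping up trades a small cost increase for fewer retries; the opposite policy (bump down, retry on failure) may be preferable when cost dominates.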

Failure Indicators

This skill has FAILED if:

  • ❌ Model selection doesn't match difficulty (trivial task routed to Opus)
  • ❌ No rationale provided for model choice
  • ❌ Cost estimate missing or inaccurate
  • ❌ Difficulty factors not analyzed (default values used)
  • ❌ Workflow dependencies ignored (tasks run out of order)
  • ❌ No fallback handling (circuit breaker open, no graceful degradation)
  • ❌ Confidence score <0.7 without flagging uncertainty
  • ❌ No performance tracking (can't measure actual vs. estimated difficulty)
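
The graceful-degradation failure mode above can be avoided with a simple fallback chain. A minimal sketch, assuming a step-down-never-up policy; the `is_available` health check (e.g. backed by a circuit breaker) is a hypothetical callable, not part of the DAAO listing:

```python
# Degradation order when a tier is unavailable (assumed policy: step down only)
FALLBACK = {"OPUS": "SONNET", "SONNET": "HAIKU", "HAIKU": None}

def route_with_fallback(tier: str, is_available) -> str:
    """Walk the fallback chain until an available tier is found."""
    current = tier
    while current is not None:
        if is_available(current):
            return current
        current = FALLBACK[current]
    raise RuntimeError("No model tier available; queue the task for retry")

# Simulate Opus being down (circuit open)
down = {"OPUS"}
chosen = route_with_fallback("OPUS", lambda t: t not in down)
print(chosen)  # SONNET
```

The caller should surface the substitution (e.g. in the routing rationale) so degraded results are distinguishable from normally routed ones.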

When NOT to Use

Do NOT use this skill when:

  • Single-model workflows - If all tasks require Opus-level reasoning, routing adds overhead
  • Latency-critical paths - Difficulty estimation adds 50-100ms overhead
  • Uniform task difficulty - Batch of identical tasks should use same model
  • Non-LLM operations - Don't route database queries or API calls through DAAO
  • Cost is irrelevant - If budget unlimited, just use Opus for everything
  • Offline/local models - DAAO assumes cloud API pricing (Haiku/Sonnet/Opus)
  • Streaming responses - Difficulty must be known before starting stream

Use alternatives:

  • Uniform difficulty → Direct model selection without DAAO
  • Latency-critical → Pre-compute difficulty scores, cache routing decisions
  • Local models → Use model size (7B/13B/70B) instead of tier-based routing
  • Streaming → Estimate difficulty from first N tokens, route mid-stream if needed
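
The latency-critical alternative (pre-compute and cache routing decisions) can be sketched with `functools.lru_cache`. The normalization scheme and the length-based placeholder scorer below are assumptions so the example is self-contained; in practice the cached function would wrap `DifficultyAwareOrchestrator.estimate_difficulty`:

```python
from functools import lru_cache

def _normalize(task: str) -> str:
    # Collapse whitespace and case so near-identical tasks share a cache entry
    return " ".join(task.lower().split())

@lru_cache(maxsize=4096)
def cached_difficulty(normalized_task: str) -> float:
    # Stand-in for estimate_difficulty().score; a trivial length-based
    # placeholder keeps this sketch runnable
    return min(len(normalized_task) / 400, 1.0)

def route_cached(task: str) -> float:
    return cached_difficulty(_normalize(task))

route_cached("Format this JSON data")
route_cached("format   THIS json data")    # normalizes to the same key
print(cached_difficulty.cache_info().hits)  # 1
```

Cache hits skip the estimation step entirely, which removes the routing overhead from the hot path for repeated task shapes.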

Anti-Patterns (Avoid)

| Anti-Pattern | Problem | Solution |
| --- | --- | --- |
| Always Opus | 3x cost, same accuracy for simple tasks | Use DAAO to route simple tasks to Haiku (e.g., "Format JSON") |
| Always Haiku | Poor accuracy on complex tasks, retry overhead | Route high-difficulty tasks to Sonnet/Opus upfront |
| No difficulty tracking | Can't validate routing accuracy over time | Log actual vs. estimated difficulty, tune thresholds |
| Ignoring confidence | Low confidence (<0.7) treated same as high | Flag low-confidence estimates for human review or default to Sonnet |
| Static thresholds | <0.3 Haiku, >0.7 Opus may not be optimal | A/B test thresholds, adjust based on accuracy/cost metrics |
| No circuit breaker | When Opus unavailable, system fails completely | Gracefully degrade to Sonnet, then Haiku with retry logic |
| Over-engineering simple flows | 3-task workflow doesn't need heterogeneous routing | Use DAAO only when workflow has 5+ tasks with varied difficulty |
| Blind reliance on keywords | "debug" doesn't always mean high difficulty | Combine keyword analysis with context length, domain signals |
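
The "No difficulty tracking" row calls for logging estimated against observed difficulty. A hedged sketch (the `RoutingTracker` name and the observed-difficulty proxy are assumptions; a common proxy is 1.0 when a task had to be retried on a higher tier, near 0.0 when the first attempt succeeded):

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class RoutingTracker:
    """Log estimated vs. observed difficulty to validate routing over time."""
    records: List[tuple] = field(default_factory=list)

    def log(self, estimated: float, observed: float) -> None:
        self.records.append((estimated, observed))

    def mean_abs_error(self) -> float:
        """Calibration error of the estimator over all logged tasks."""
        if not self.records:
            return 0.0
        return sum(abs(e - o) for e, o in self.records) / len(self.records)

tracker = RoutingTracker()
tracker.log(estimated=0.78, observed=0.80)  # Opus task, estimate held up
tracker.log(estimated=0.20, observed=0.55)  # routed to Haiku, needed a retry
print(f"MAE: {tracker.mean_abs_error():.3f}")  # MAE: 0.185
```

A rising MAE, or systematic underestimation, is the signal to retune the factor weights or the 0.3/0.7 thresholds.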

Principles

This skill embodies CODITECT automation principles:

#1 Recycle → Extend → Re-Use → Create

  • Recycle routing decisions - Cache difficulty scores for similar tasks
  • Extend factor weights - Adjust weights based on performance data
  • Re-use workflow templates - Common patterns (research → implement → test)
  • Create custom factors - Domain-specific difficulty signals when needed

#2 First Principles Thinking

  • Cost-quality tradeoff - Understand marginal utility of Opus vs. Sonnet
  • Difficulty dimensions - Break down task complexity into measurable factors
  • Model capabilities - Know what each tier excels at (Haiku=speed, Opus=reasoning)

#5 Eliminate Ambiguity

  • Explicit difficulty score - 0.78 is unambiguous, "complex" is vague
  • Rationale required - Always explain why model was selected
  • Confidence thresholds - Define what LOW (<0.7) vs. HIGH (>0.85) means

#6 Clear, Understandable, Explainable

  • Factor breakdown - Show top 3 contributing factors for transparency
  • Cost transparency - Display estimated cost per task upfront
  • Workflow visualization - Diagram showing model assignments across steps

#8 No Assumptions

  • Validate routing decisions - Track actual task outcomes to tune difficulty estimation
  • Don't assume static pricing - Update MODEL_PRICING when Anthropic changes rates
  • Test edge cases - Tasks scoring exactly 0.3 or 0.7 need clear assignment rules

#10 Automation First

  • Auto-route all tasks - No manual model selection in production
  • Auto-tune thresholds - Use performance data to adjust <0.3, >0.7 boundaries
  • Auto-scale workflows - DAAO enables cost-efficient parallelization

Related Files

  • agents/difficulty-aware-orchestrator.md - Agent implementation
  • commands/daao-route.md - CLI command
  • scripts/core/agent_dispatcher.py - Integration point