Thinking Budget System

Status: Production | Version: 1.0.0 | Created: 2025-12-22 | Category: Cost Optimization


Overview

The Thinking Budget System provides intelligent management of extended thinking token allocation for AI agents. It optimizes costs while maintaining quality by automatically selecting appropriate thinking budgets based on task type, complexity, and agent capabilities.

Key Benefits:

  • 📉 Cost Reduction - Up to 75% cost savings on routine tasks
  • 🎯 Quality Assurance - Ensures adequate thinking for critical tasks
  • 📊 Usage Analytics - Comprehensive tracking and optimization insights
  • ⚙️ Adaptive Budgets - Automatic adjustment based on task complexity
  • 🤖 Agent-Aware - Respects per-agent budget constraints

Quick Start

1. Get Budget Recommendation

# For a specific task
python3 scripts/thinking-budget-manager.py \
--task code_review \
--complexity standard

# Output:
# Budget Level: high
# Tokens: 16,384
# Estimated Cost: $0.2518

2. Estimate Task Cost

# Natural language task description
python3 scripts/thinking-budget-manager.py \
--estimate "comprehensive security audit of authentication system"

# Output:
# Detected Type: security_audit
# Detected Complexity: complex
# Recommended Level: ultrathink
# Estimated Cost: $0.9890

3. Track Usage

# After agent execution
python3 scripts/thinking-budget-manager.py --track \
--task code_review \
--budget-level high \
--tokens-used 12500 \
--agent qa-specialist

4. View Statistics

# Show comprehensive usage stats
python3 scripts/thinking-budget-manager.py --stats

# JSON output for automation
python3 scripts/thinking-budget-manager.py --stats --json

Budget Levels

The system provides five budget levels optimized for different task types:

| Level | Tokens | Use Cases | Cost/Session | Cost Multiplier |
|---|---|---|---|---|
| ultrathink | 65,536 | Architecture design, strategic planning, complex system analysis | $0.99 | 4.0x |
| high | 16,384 | QA review, security audit, critical analysis | $0.25 | 2.0x |
| medium | 4,096 | Feature implementation, bug fixes, refactoring | $0.07 | 1.0x |
| low | 1,024 | Formatting, documentation, simple edits | $0.02 | 0.25x |
| none | null | Information retrieval, status checks, file reads | $0.00 | 0.0x |

Cost Calculation:

  • Base model: Claude Sonnet 4.5
  • Thinking tokens: $0.015 per 1K
  • Output tokens: $0.003 per 1K (estimated 2K output)
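Under these parameters, the per-session cost reduces to a simple formula. A minimal sketch (the function name is illustrative, not the script's internal API):

```python
def estimate_session_cost(thinking_tokens: int, output_tokens: int = 2000) -> float:
    """Estimate session cost: $0.015 per 1K thinking tokens + $0.003 per 1K output tokens."""
    return thinking_tokens / 1000 * 0.015 + output_tokens / 1000 * 0.003

# The "high" level (16,384 thinking tokens) works out to the $0.2518 shown earlier:
print(f"${estimate_session_cost(16_384):.4f}")  # → $0.2518
```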

Task-to-Budget Mapping

The system automatically maps task types to appropriate budget levels:

Ultrathink Tasks (65,536 tokens)

When to use:

  • Multi-system architecture design
  • Strategic planning and roadmaps
  • Complex algorithm design
  • Security architecture review
  • Cross-component dependency analysis

Task types:

  • architecture_review
  • spec_creation
  • system_design
  • strategic_planning
  • dependency_analysis

High Tasks (16,384 tokens)

When to use:

  • Code quality review
  • Security audit
  • Design pattern selection
  • Performance optimization analysis
  • Integration planning

Task types:

  • code_review
  • qa_review
  • security_audit
  • performance_analysis
  • design_decision

Medium Tasks (4,096 tokens)

When to use:

  • Feature implementation
  • Bug fixes
  • Code refactoring
  • API design
  • Test writing

Task types:

  • feature_implementation
  • bug_fix
  • refactoring
  • api_design
  • test_writing

Low Tasks (1,024 tokens)

When to use:

  • Code formatting
  • Documentation updates
  • Simple edits
  • Configuration changes
  • Typo fixes

Task types:

  • formatting
  • documentation
  • simple_edit
  • config_change
  • typo_fix

None Tasks (no extended thinking)

When to use:

  • Information retrieval
  • Status checks
  • Simple queries
  • File reads

Task types:

  • info_retrieval
  • status_check
  • file_read
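The mapping above amounts to a plain lookup table. A sketch of how it can be represented (names are illustrative; the actual structure lives in config/thinking-budgets.json):

```python
TASK_BUDGET_LEVELS = {
    # ultrathink (65,536 tokens)
    "architecture_review": "ultrathink", "spec_creation": "ultrathink",
    "system_design": "ultrathink", "strategic_planning": "ultrathink",
    "dependency_analysis": "ultrathink",
    # high (16,384 tokens)
    "code_review": "high", "qa_review": "high", "security_audit": "high",
    "performance_analysis": "high", "design_decision": "high",
    # medium (4,096 tokens)
    "feature_implementation": "medium", "bug_fix": "medium", "refactoring": "medium",
    "api_design": "medium", "test_writing": "medium",
    # low (1,024 tokens)
    "formatting": "low", "documentation": "low", "simple_edit": "low",
    "config_change": "low", "typo_fix": "low",
    # none (no extended thinking)
    "info_retrieval": "none", "status_check": "none", "file_read": "none",
}

LEVEL_TOKENS = {"ultrathink": 65_536, "high": 16_384, "medium": 4_096, "low": 1_024, "none": 0}

def budget_for(task_type: str) -> int:
    """Look up the token budget for a task type, defaulting to medium for unknown tasks."""
    return LEVEL_TOKENS[TASK_BUDGET_LEVELS.get(task_type, "medium")]
```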

Complexity Adjustments

The system automatically adjusts budgets based on task complexity:

| Complexity | Adjustment | Example |
|---|---|---|
| critical | +2 levels | Production security issue → ultrathink |
| complex | +1 level | Multi-file refactoring → high |
| standard | No change | Normal bug fix → medium |
| simple | -1 level | Single-line fix → low |
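One way to implement this adjustment is as an index shift over the ordered levels, clamped at both ends. A sketch under that assumption (not necessarily the script's actual code):

```python
LEVELS = ["none", "low", "medium", "high", "ultrathink"]
ADJUSTMENT = {"critical": +2, "complex": +1, "standard": 0, "simple": -1}

def adjust_level(base_level: str, complexity: str) -> str:
    """Shift the base budget level by the complexity adjustment, clamped to valid levels."""
    i = LEVELS.index(base_level) + ADJUSTMENT[complexity]
    return LEVELS[max(0, min(i, len(LEVELS) - 1))]

print(adjust_level("medium", "complex"))  # bug_fix (medium) + complex → high
print(adjust_level("high", "critical"))   # near the top already, clamps to ultrathink
```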

Example:

# Standard bug fix → medium (4,096 tokens)
python3 scripts/thinking-budget-manager.py --task bug_fix --complexity standard

# Complex bug fix → high (16,384 tokens)
python3 scripts/thinking-budget-manager.py --task bug_fix --complexity complex

# Simple bug fix → low (1,024 tokens)
python3 scripts/thinking-budget-manager.py --task bug_fix --complexity simple

Agent Configuration

Agents can specify thinking budget preferences in their frontmatter:

---
name: qa-specialist
description: Quality assurance expert
tools: Read, Grep, Bash
model: sonnet

thinking_config:
  default_level: high      # Default budget
  max_level: ultrathink    # Maximum allowed
  min_level: medium        # Quality floor
  adaptive: true           # Enable auto-adjustment
  track_usage: true        # Enable analytics
---

Configuration fields:

  • default_level - Default budget for this agent's tasks
  • max_level - Maximum budget (cost control)
  • min_level - Minimum budget (quality assurance)
  • adaptive - Enable/disable complexity-based adjustment
  • track_usage - Enable/disable usage tracking
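Together, min_level and max_level act as a clamp on whatever level the complexity logic recommends. A sketch of how that constraint might be applied (illustrative, not the manager's actual implementation):

```python
LEVELS = ["none", "low", "medium", "high", "ultrathink"]

def apply_agent_constraints(level: str, min_level: str = "none",
                            max_level: str = "ultrathink") -> str:
    """Clamp a recommended budget level into an agent's [min_level, max_level] range."""
    i = LEVELS.index(level)
    i = max(i, LEVELS.index(min_level))  # quality floor
    i = min(i, LEVELS.index(max_level))  # cost ceiling
    return LEVELS[i]

# A doc agent capped at medium never escalates to ultrathink:
print(apply_agent_constraints("ultrathink", max_level="medium"))  # → medium
```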

See Agent Frontmatter: Thinking Configuration for complete documentation.

Usage Analytics

View Overall Statistics

python3 scripts/thinking-budget-manager.py --stats

Output:

======================================================================
THINKING BUDGET USAGE STATISTICS
======================================================================

📊 Overall Statistics:
Total Sessions: 125
Total Thinking Tokens: 2,450,000
Total Cost: $38.25
Avg Tokens/Session: 19,600
Avg Cost/Session: $0.31

📈 By Budget Level:
ultrathink - 12 sessions, 786,432 tokens, $11.8645
high - 45 sessions, 737,280 tokens, $11.1412
medium - 52 sessions, 213,248 tokens, $3.3237
low - 16 sessions, 16,384 tokens, $0.2518

🎯 Top Tasks by Usage:
code_review - 45 sessions, 737,280 tokens, $11.1412
feature_implementation - 35 sessions, 143,360 tokens, $2.2354
architecture_review - 12 sessions, 786,432 tokens, $11.8645
documentation - 20 sessions, 20,480 tokens, $0.3144

🤖 By Agent:
qa-specialist - 45 sessions, 737,280 tokens, $11.1412
backend-developer - 35 sessions, 143,360 tokens, $2.2354
system-architect - 12 sessions, 786,432 tokens, $11.8645
doc-writer - 20 sessions, 20,480 tokens, $0.3144
======================================================================

JSON Export

# Export statistics as JSON
python3 scripts/thinking-budget-manager.py --stats --json > thinking-stats.json

# Use in automation
cat thinking-stats.json | jq '.total_cost'

Usage Data Location

File: .coditect/thinking-usage.json

Schema:

{
  "total_sessions": 125,
  "total_thinking_tokens": 2450000,
  "total_cost": 38.25,
  "by_level": { ... },
  "by_task": { ... },
  "by_agent": { ... },
  "sessions": [ ... ]
}
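The per-session averages shown by --stats can be recomputed directly from these aggregate fields. An illustrative check using the totals from the example above:

```python
import json

# In practice this would be loaded from .coditect/thinking-usage.json
stats = json.loads('{"total_sessions": 125, "total_thinking_tokens": 2450000, "total_cost": 38.25}')

avg_tokens = stats["total_thinking_tokens"] / stats["total_sessions"]
avg_cost = stats["total_cost"] / stats["total_sessions"]
print(f"{avg_tokens:,.0f} tokens, ${avg_cost:.2f}/session")  # → 19,600 tokens, $0.31/session
```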

Cost Optimization Strategies

Strategy 1: Right-Size Default Budgets

Problem: All agents use medium budget by default

Solution:

# Documentation agent - reduce to low
thinking_config:
  default_level: low   # Save 75% on routine docs

# QA agent - increase to high
thinking_config:
  default_level: high  # Ensure thorough reviews

Impact: 40-60% cost reduction on routine tasks

Strategy 2: Set Maximum Budgets

Problem: Agents occasionally use ultrathink when not needed

Solution:

# Prevent doc agent from expensive thinking
thinking_config:
  max_level: medium  # Cap at $0.07/session

Impact: Prevents unexpected cost spikes

Strategy 3: Enable Adaptive Budgets

Problem: Variable workload complexity

Solution:

# Auto-adjust based on task complexity
thinking_config:
  adaptive: true  # Simple tasks → lower budgets

Impact: 20-30% cost savings on mixed workloads

Strategy 4: Set Quality Floors

Problem: Critical agents under-thinking

Solution:

# Security agent must use minimum high budget
thinking_config:
  min_level: high  # Never compromise security

Impact: Maintains quality on critical tasks

Strategy 5: Monitor and Adjust

Process:

  1. Enable tracking: track_usage: true
  2. Review weekly: --stats
  3. Identify inefficiencies
  4. Adjust agent configs
  5. Measure impact

Impact: Continuous 10-15% improvement over time

Integration Examples

Example 1: Budget Selection

from pathlib import Path
from thinking_budget_manager import ThinkingBudgetManager, ComplexityLevel

# Initialize manager
manager = ThinkingBudgetManager(Path.cwd())

# Get budget for task
level, tokens = manager.get_thinking_budget(
    task_type="code_review",
    complexity=ComplexityLevel.STANDARD,
    agent_name="qa-specialist",
)

print(f"Use {level} budget: {tokens:,} tokens")
# Output: Use high budget: 16,384 tokens

Example 2: Cost Estimation

# Estimate task cost
estimates = manager.estimate_task_cost(
    "comprehensive security audit of authentication system"
)

print(f"Recommended: {estimates['recommended_level']}")
print(f"Cost: ${estimates['recommended_cost']:.4f}")
# Output:
# Recommended: ultrathink
# Cost: $0.9890

Example 3: Usage Tracking

# Track actual usage
manager.track_usage(
    task_type="code_review",
    budget_level="high",
    tokens_used=12500,
    agent_name="qa-specialist",
)

# Get statistics
stats = manager.get_stats()
print(f"Total cost: ${stats['total_cost']:.2f}")

Example 4: Agent Orchestration

def execute_agent_task(agent_name: str, task: str, description: str):
    """Execute agent task with optimal thinking budget."""
    manager = ThinkingBudgetManager(Path.cwd())

    # Get recommended budget
    level, tokens = manager.get_thinking_budget(
        task_type=task,
        complexity=detect_complexity(description),
        agent_name=agent_name,
    )

    # Execute with budget
    result = run_agent(
        agent=agent_name,
        task=description,
        thinking_budget=tokens,
    )

    # Track usage
    manager.track_usage(
        task_type=task,
        budget_level=level,
        tokens_used=result.thinking_tokens,
        agent_name=agent_name,
    )

    return result

CLI Reference

Get Budget Recommendation

python3 scripts/thinking-budget-manager.py \
--task TASK_TYPE \
--complexity {simple|standard|complex|critical} \
[--agent AGENT_NAME] \
[--json]

Estimate Task Cost

python3 scripts/thinking-budget-manager.py \
--estimate "task description" \
[--json]

Track Usage

python3 scripts/thinking-budget-manager.py --track \
--task TASK_TYPE \
--budget-level LEVEL \
--tokens-used COUNT \
[--agent AGENT_NAME]

View Statistics

python3 scripts/thinking-budget-manager.py --stats [--json]

List Options

# List all task types
python3 scripts/thinking-budget-manager.py --list-tasks

# List all budget levels
python3 scripts/thinking-budget-manager.py --list-levels

Best Practices

1. Start with Defaults

Use recommended defaults from configuration:

# Check default for task type
python3 scripts/thinking-budget-manager.py --list-tasks | grep code_review
# Output: code_review → high

2. Adjust Based on Data

Enable tracking and review regularly:

# Weekly review
python3 scripts/thinking-budget-manager.py --stats

# Look for:
# - Agents consistently under/over budget
# - High-cost tasks that could be optimized
# - Budget efficiency opportunities

3. Set Appropriate Constraints

Configure agents with realistic limits:

# Routine agent
thinking_config:
default_level: low
max_level: medium # Prevent cost spikes

# Critical agent
thinking_config:
default_level: high
min_level: medium # Ensure quality

4. Use Complexity Markers

Explicitly specify complexity when known:

# Critical production issue
--complexity critical # Forces higher budget

# Simple typo fix
--complexity simple # Saves cost

5. Monitor Cost Trends

Track costs over time:

# Export monthly
python3 scripts/thinking-budget-manager.py --stats --json > monthly-stats.json

# Compare month-over-month
# Identify optimization opportunities
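Given two exported snapshots, the month-over-month comparison is a small diff over the totals. A sketch operating on the loaded JSON (the helper name is illustrative):

```python
def cost_change_pct(prev_stats: dict, curr_stats: dict) -> float:
    """Percent change in total cost between two exported stats snapshots."""
    prev, curr = prev_stats["total_cost"], curr_stats["total_cost"]
    return round((curr - prev) / prev * 100, 1)

# e.g. two monthly exports loaded via json.load():
print(cost_change_pct({"total_cost": 38.25}, {"total_cost": 32.50}))  # → -15.0
```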

Troubleshooting

Issue: High Costs

Symptoms:

  • Total cost growing unexpectedly
  • Agents using ultrathink frequently

Solutions:

  1. Review top cost drivers: --stats
  2. Set max_level constraints on agents
  3. Enable adaptive budgets for variable workloads
  4. Check for tasks incorrectly classified as complex

Issue: Low Quality

Symptoms:

  • Agent outputs insufficient
  • Reviews missing issues
  • Designs incomplete

Solutions:

  1. Increase default_level for agent
  2. Set min_level quality floor
  3. Verify task complexity detection
  4. Set adaptive: false for consistent quality

Issue: Tracking Not Working

Symptoms:

  • Usage file not updating
  • Statistics empty or incorrect

Solutions:

  1. Verify .coditect/ directory exists
  2. Check file permissions on thinking-usage.json
  3. Ensure track_usage: true in agent config
  4. Verify tracking command includes all required parameters
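Steps 1 and 2 can be scripted as a quick sanity check of the usage file. A sketch assuming the schema shown under Usage Data Location (function names are illustrative):

```python
import json
from pathlib import Path

REQUIRED_KEYS = {"total_sessions", "total_thinking_tokens", "total_cost",
                 "by_level", "by_task", "by_agent", "sessions"}

def usage_data_problems(data: dict) -> list:
    """Return a list of schema problems in loaded usage data (empty list means OK)."""
    missing = REQUIRED_KEYS - data.keys()
    return [f"missing keys: {sorted(missing)}"] if missing else []

def check_usage_file(path: str = ".coditect/thinking-usage.json") -> list:
    """Check that the usage file exists and matches the documented schema."""
    p = Path(path)
    if not p.exists():
        return [f"{path} does not exist (run a tracked session first)"]
    return usage_data_problems(json.loads(p.read_text()))
```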

Configuration Files

Main Config: config/thinking-budgets.json

  • Budget level definitions
  • Task-to-budget mappings
  • Cost estimation parameters
  • Agent overrides

Usage Data: .coditect/thinking-usage.json

  • Session history
  • Aggregate statistics
  • Cost tracking
  • Agent usage patterns

Last Updated: 2025-12-22 | Compliance: CODITECT Documentation Standard v1.0.0 | Version: 1.0.0