Thinking Budget System

Status: Production | Version: 1.0.0 | Created: 2025-12-22 | Category: Cost Optimization


Overview

The Thinking Budget System provides intelligent management of extended thinking token allocation for AI agents. It optimizes costs while maintaining quality by automatically selecting appropriate thinking budgets based on task type, complexity, and agent capabilities.

Key Benefits:

  • 📉 Cost Reduction - Up to 75% cost savings on routine tasks
  • 🎯 Quality Assurance - Ensures adequate thinking for critical tasks
  • 📊 Usage Analytics - Comprehensive tracking and optimization insights
  • ⚙️ Adaptive Budgets - Automatic adjustment based on task complexity
  • 🤖 Agent-Aware - Respects per-agent budget constraints

Quick Start

1. Get Budget Recommendation

# For a specific task
python3 scripts/thinking-budget-manager.py \
--task code_review \
--complexity standard

# Output:
# Budget Level: high
# Tokens: 16,384
# Estimated Cost: $0.2518

2. Estimate Task Cost

# Natural language task description
python3 scripts/thinking-budget-manager.py \
--estimate "comprehensive security audit of authentication system"

# Output:
# Detected Type: security_audit
# Detected Complexity: complex
# Recommended Level: ultrathink
# Estimated Cost: $0.9890

3. Track Usage

# After agent execution
python3 scripts/thinking-budget-manager.py --track \
--task code_review \
--budget-level high \
--tokens-used 12500 \
--agent qa-specialist

4. View Statistics

# Show comprehensive usage stats
python3 scripts/thinking-budget-manager.py --stats

# JSON output for automation
python3 scripts/thinking-budget-manager.py --stats --json

Budget Levels

The system provides five budget levels optimized for different task types:

| Level | Tokens | Use Cases | Cost/Session | Cost Multiplier |
|---|---|---|---|---|
| ultrathink | 65,536 | Architecture design, strategic planning, complex system analysis | $0.99 | 4.0x |
| high | 16,384 | QA review, security audit, critical analysis | $0.25 | 2.0x |
| medium | 4,096 | Feature implementation, bug fixes, refactoring | $0.07 | 1.0x |
| low | 1,024 | Formatting, documentation, simple edits | $0.02 | 0.25x |
| none | null | Information retrieval, status checks, file reads | $0.00 | 0.0x |

Cost Calculation:

  • Base model: Claude Sonnet 4.5
  • Thinking tokens: $0.015 per 1K
  • Output tokens: $0.003 per 1K (estimated 2K output)
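Under these parameters, the per-session cost reduces to a simple formula. A minimal sketch (the function name is illustrative, not the script's internal API):

```python
def estimate_session_cost(thinking_tokens: int, output_tokens: int = 2000) -> float:
    """Estimate session cost: $0.015 per 1K thinking tokens + $0.003 per 1K output tokens."""
    return thinking_tokens / 1000 * 0.015 + output_tokens / 1000 * 0.003

# The "high" level (16,384 thinking tokens) works out to the $0.2518 shown earlier:
print(f"${estimate_session_cost(16_384):.4f}")  # → $0.2518
```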

Task-to-Budget Mapping

The system automatically maps task types to appropriate budget levels:

Ultrathink Tasks (65,536 tokens)

When to use:

  • Multi-system architecture design
  • Strategic planning and roadmaps
  • Complex algorithm design
  • Security architecture review
  • Cross-component dependency analysis

Task types:

  • architecture_review
  • spec_creation
  • system_design
  • strategic_planning
  • dependency_analysis

High Tasks (16,384 tokens)

When to use:

  • Code quality review
  • Security audit
  • Design pattern selection
  • Performance optimization analysis
  • Integration planning

Task types:

  • code_review
  • qa_review
  • security_audit
  • performance_analysis
  • design_decision

Medium Tasks (4,096 tokens)

When to use:

  • Feature implementation
  • Bug fixes
  • Code refactoring
  • API design
  • Test writing

Task types:

  • feature_implementation
  • bug_fix
  • refactoring
  • api_design
  • test_writing

Low Tasks (1,024 tokens)

When to use:

  • Code formatting
  • Documentation updates
  • Simple edits
  • Configuration changes
  • Typo fixes

Task types:

  • formatting
  • documentation
  • simple_edit
  • config_change
  • typo_fix

None Tasks (no extended thinking)

When to use:

  • Information retrieval
  • Status checks
  • Simple queries
  • File reads

Task types:

  • info_retrieval
  • status_check
  • file_read
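The mapping above amounts to a plain lookup table. A sketch of how it can be represented (names are illustrative; the actual structure lives in config/thinking-budgets.json):

```python
TASK_BUDGET_LEVELS = {
    # ultrathink (65,536 tokens)
    "architecture_review": "ultrathink", "spec_creation": "ultrathink",
    "system_design": "ultrathink", "strategic_planning": "ultrathink",
    "dependency_analysis": "ultrathink",
    # high (16,384 tokens)
    "code_review": "high", "qa_review": "high", "security_audit": "high",
    "performance_analysis": "high", "design_decision": "high",
    # medium (4,096 tokens)
    "feature_implementation": "medium", "bug_fix": "medium", "refactoring": "medium",
    "api_design": "medium", "test_writing": "medium",
    # low (1,024 tokens)
    "formatting": "low", "documentation": "low", "simple_edit": "low",
    "config_change": "low", "typo_fix": "low",
    # none (no extended thinking)
    "info_retrieval": "none", "status_check": "none", "file_read": "none",
}

LEVEL_TOKENS = {"ultrathink": 65_536, "high": 16_384, "medium": 4_096, "low": 1_024, "none": 0}

def budget_for(task_type: str) -> int:
    """Look up the token budget for a task type, defaulting to medium for unknown tasks."""
    return LEVEL_TOKENS[TASK_BUDGET_LEVELS.get(task_type, "medium")]
```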

Complexity Adjustments

The system automatically adjusts budgets based on task complexity:

| Complexity | Adjustment | Example |
|---|---|---|
| critical | +2 levels | Production security issue → ultrathink |
| complex | +1 level | Multi-file refactoring → high |
| standard | No change | Normal bug fix → medium |
| simple | -1 level | Single-line fix → low |
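One way to implement this adjustment is as an index shift over the ordered levels, clamped at both ends. A sketch under that assumption (not necessarily the script's actual code):

```python
LEVELS = ["none", "low", "medium", "high", "ultrathink"]
ADJUSTMENT = {"critical": +2, "complex": +1, "standard": 0, "simple": -1}

def adjust_level(base_level: str, complexity: str) -> str:
    """Shift the base budget level by the complexity adjustment, clamped to valid levels."""
    i = LEVELS.index(base_level) + ADJUSTMENT[complexity]
    return LEVELS[max(0, min(i, len(LEVELS) - 1))]

print(adjust_level("medium", "complex"))  # bug_fix (medium) + complex → high
print(adjust_level("high", "critical"))   # near the top already, clamps to ultrathink
```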

Example:

# Standard bug fix → medium (4,096 tokens)
python3 scripts/thinking-budget-manager.py --task bug_fix --complexity standard

# Complex bug fix → high (16,384 tokens)
python3 scripts/thinking-budget-manager.py --task bug_fix --complexity complex

# Simple bug fix → low (1,024 tokens)
python3 scripts/thinking-budget-manager.py --task bug_fix --complexity simple

Agent Configuration

Agents can specify thinking budget preferences in their frontmatter:

---
name: qa-specialist
description: Quality assurance expert
tools: Read, Grep, Bash
model: sonnet

thinking_config:
  default_level: high      # Default budget
  max_level: ultrathink    # Maximum allowed
  min_level: medium        # Quality floor
  adaptive: true           # Enable auto-adjustment
  track_usage: true        # Enable analytics
---

Configuration fields:

  • default_level - Default budget for this agent's tasks
  • max_level - Maximum budget (cost control)
  • min_level - Minimum budget (quality assurance)
  • adaptive - Enable/disable complexity-based adjustment
  • track_usage - Enable/disable usage tracking
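Together, min_level and max_level act as a clamp on whatever level the complexity logic recommends. A sketch of how that constraint might be applied (illustrative, not the manager's actual implementation):

```python
LEVELS = ["none", "low", "medium", "high", "ultrathink"]

def apply_agent_constraints(level: str, min_level: str = "none",
                            max_level: str = "ultrathink") -> str:
    """Clamp a recommended budget level into an agent's [min_level, max_level] range."""
    i = LEVELS.index(level)
    i = max(i, LEVELS.index(min_level))  # quality floor
    i = min(i, LEVELS.index(max_level))  # cost ceiling
    return LEVELS[i]

# A doc agent capped at medium never escalates to ultrathink:
print(apply_agent_constraints("ultrathink", max_level="medium"))  # → medium
```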

See Agent Frontmatter: Thinking Configuration for complete documentation.

Usage Analytics

View Overall Statistics

python3 scripts/thinking-budget-manager.py --stats

Output:

======================================================================
THINKING BUDGET USAGE STATISTICS
======================================================================

📊 Overall Statistics:
Total Sessions: 125
Total Thinking Tokens: 2,450,000
Total Cost: $38.25
Avg Tokens/Session: 19,600
Avg Cost/Session: $0.31

📈 By Budget Level:
ultrathink - 12 sessions, 786,432 tokens, $11.8645
high - 45 sessions, 737,280 tokens, $11.1412
medium - 52 sessions, 213,248 tokens, $3.3237
low - 16 sessions, 16,384 tokens, $0.2518

🎯 Top Tasks by Usage:
code_review - 45 sessions, 737,280 tokens, $11.1412
feature_implementation - 35 sessions, 143,360 tokens, $2.2354
architecture_review - 12 sessions, 786,432 tokens, $11.8645
documentation - 20 sessions, 20,480 tokens, $0.3144

🤖 By Agent:
qa-specialist - 45 sessions, 737,280 tokens, $11.1412
backend-developer - 35 sessions, 143,360 tokens, $2.2354
system-architect - 12 sessions, 786,432 tokens, $11.8645
doc-writer - 20 sessions, 20,480 tokens, $0.3144
======================================================================

JSON Export

# Export statistics as JSON
python3 scripts/thinking-budget-manager.py --stats --json > thinking-stats.json

# Use in automation
cat thinking-stats.json | jq '.total_cost'

Usage Data Location

File: .coditect/thinking-usage.json

Schema:

{
  "total_sessions": 125,
  "total_thinking_tokens": 2450000,
  "total_cost": 38.25,
  "by_level": { ... },
  "by_task": { ... },
  "by_agent": { ... },
  "sessions": [ ... ]
}
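The per-session averages shown by --stats can be recomputed directly from these aggregate fields. An illustrative check using the totals from the example above:

```python
import json

# In practice this would be loaded from .coditect/thinking-usage.json
stats = json.loads('{"total_sessions": 125, "total_thinking_tokens": 2450000, "total_cost": 38.25}')

avg_tokens = stats["total_thinking_tokens"] / stats["total_sessions"]
avg_cost = stats["total_cost"] / stats["total_sessions"]
print(f"{avg_tokens:,.0f} tokens, ${avg_cost:.2f}/session")  # → 19,600 tokens, $0.31/session
```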

Cost Optimization Strategies

Strategy 1: Right-Size Default Budgets

Problem: All agents use medium budget by default

Solution:

# Documentation agent - reduce to low
thinking_config:
  default_level: low   # Save 75% on routine docs

# QA agent - increase to high
thinking_config:
  default_level: high  # Ensure thorough reviews

Impact: 40-60% cost reduction on routine tasks

Strategy 2: Set Maximum Budgets

Problem: Agents occasionally use ultrathink when not needed

Solution:

# Prevent doc agent from expensive thinking
thinking_config:
  max_level: medium  # Cap at $0.07/session

Impact: Prevents unexpected cost spikes

Strategy 3: Enable Adaptive Budgets

Problem: Variable workload complexity

Solution:

# Auto-adjust based on task complexity
thinking_config:
  adaptive: true  # Simple tasks → lower budgets

Impact: 20-30% cost savings on mixed workloads

Strategy 4: Set Quality Floors

Problem: Critical agents under-thinking

Solution:

# Security agent must use minimum high budget
thinking_config:
  min_level: high  # Never compromise security

Impact: Maintains quality on critical tasks

Strategy 5: Monitor and Adjust

Process:

  1. Enable tracking: track_usage: true
  2. Review weekly: --stats
  3. Identify inefficiencies
  4. Adjust agent configs
  5. Measure impact

Impact: Continuous 10-15% improvement over time

Integration Examples

Example 1: Budget Selection

from pathlib import Path
from thinking_budget_manager import ThinkingBudgetManager, ComplexityLevel

# Initialize manager
manager = ThinkingBudgetManager(Path.cwd())

# Get budget for task
level, tokens = manager.get_thinking_budget(
    task_type="code_review",
    complexity=ComplexityLevel.STANDARD,
    agent_name="qa-specialist",
)

print(f"Use {level} budget: {tokens:,} tokens")
# Output: Use high budget: 16,384 tokens

Example 2: Cost Estimation

# Estimate task cost
estimates = manager.estimate_task_cost(
    "comprehensive security audit of authentication system"
)

print(f"Recommended: {estimates['recommended_level']}")
print(f"Cost: ${estimates['recommended_cost']:.4f}")
# Output:
# Recommended: ultrathink
# Cost: $0.9890

Example 3: Usage Tracking

# Track actual usage
manager.track_usage(
    task_type="code_review",
    budget_level="high",
    tokens_used=12500,
    agent_name="qa-specialist",
)

# Get statistics
stats = manager.get_stats()
print(f"Total cost: ${stats['total_cost']:.2f}")

Example 4: Agent Orchestration

def execute_agent_task(agent_name: str, task: str, description: str):
    """Execute agent task with optimal thinking budget."""
    manager = ThinkingBudgetManager(Path.cwd())

    # Get recommended budget
    level, tokens = manager.get_thinking_budget(
        task_type=task,
        complexity=detect_complexity(description),
        agent_name=agent_name,
    )

    # Execute with budget
    result = run_agent(
        agent=agent_name,
        task=description,
        thinking_budget=tokens,
    )

    # Track usage
    manager.track_usage(
        task_type=task,
        budget_level=level,
        tokens_used=result.thinking_tokens,
        agent_name=agent_name,
    )

    return result

CLI Reference

Get Budget Recommendation

python3 scripts/thinking-budget-manager.py \
--task TASK_TYPE \
--complexity {simple|standard|complex|critical} \
[--agent AGENT_NAME] \
[--json]

Estimate Task Cost

python3 scripts/thinking-budget-manager.py \
--estimate "task description" \
[--json]

Track Usage

python3 scripts/thinking-budget-manager.py --track \
--task TASK_TYPE \
--budget-level LEVEL \
--tokens-used COUNT \
[--agent AGENT_NAME]

View Statistics

python3 scripts/thinking-budget-manager.py --stats [--json]

List Options

# List all task types
python3 scripts/thinking-budget-manager.py --list-tasks

# List all budget levels
python3 scripts/thinking-budget-manager.py --list-levels

Best Practices

1. Start with Defaults

Use recommended defaults from configuration:

# Check default for task type
python3 scripts/thinking-budget-manager.py --list-tasks | grep code_review
# Output: code_review → high

2. Adjust Based on Data

Enable tracking and review regularly:

# Weekly review
python3 scripts/thinking-budget-manager.py --stats

# Look for:
# - Agents consistently under/over budget
# - High-cost tasks that could be optimized
# - Budget efficiency opportunities

3. Set Appropriate Constraints

Configure agents with realistic limits:

# Routine agent
thinking_config:
default_level: low
max_level: medium # Prevent cost spikes

# Critical agent
thinking_config:
default_level: high
min_level: medium # Ensure quality

4. Use Complexity Markers

Explicitly specify complexity when known:

# Critical production issue
--complexity critical # Forces higher budget

# Simple typo fix
--complexity simple # Saves cost

5. Monitor Cost Trends

Track costs over time:

# Export monthly
python3 scripts/thinking-budget-manager.py --stats --json > monthly-stats.json

# Compare month-over-month
# Identify optimization opportunities
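Given two exported snapshots, the month-over-month comparison is a small diff over the totals. A sketch operating on the loaded JSON (the helper name is illustrative):

```python
def cost_change_pct(prev_stats: dict, curr_stats: dict) -> float:
    """Percent change in total cost between two exported stats snapshots."""
    prev, curr = prev_stats["total_cost"], curr_stats["total_cost"]
    return round((curr - prev) / prev * 100, 1)

# e.g. two monthly exports loaded via json.load():
print(cost_change_pct({"total_cost": 38.25}, {"total_cost": 32.50}))  # → -15.0
```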

Troubleshooting

Issue: High Costs

Symptoms:

  • Total cost growing unexpectedly
  • Agents using ultrathink frequently

Solutions:

  1. Review top cost drivers: --stats
  2. Set max_level constraints on agents
  3. Enable adaptive budgets for variable workloads
  4. Check for tasks incorrectly classified as complex

Issue: Low Quality

Symptoms:

  • Agent outputs insufficient
  • Reviews missing issues
  • Designs incomplete

Solutions:

  1. Increase default_level for agent
  2. Set min_level quality floor
  3. Verify task complexity detection
  4. Set adaptive: false for consistent quality

Issue: Tracking Not Working

Symptoms:

  • Usage file not updating
  • Statistics empty or incorrect

Solutions:

  1. Verify .coditect/ directory exists
  2. Check file permissions on thinking-usage.json
  3. Ensure track_usage: true in agent config
  4. Verify tracking command includes all required parameters
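Steps 1 and 2 can be scripted as a quick sanity check of the usage file. A sketch assuming the schema shown under Usage Data Location (function names are illustrative):

```python
import json
from pathlib import Path

REQUIRED_KEYS = {"total_sessions", "total_thinking_tokens", "total_cost",
                 "by_level", "by_task", "by_agent", "sessions"}

def usage_data_problems(data: dict) -> list:
    """Return a list of schema problems in loaded usage data (empty list means OK)."""
    missing = REQUIRED_KEYS - data.keys()
    return [f"missing keys: {sorted(missing)}"] if missing else []

def check_usage_file(path: str = ".coditect/thinking-usage.json") -> list:
    """Check that the usage file exists and matches the documented schema."""
    p = Path(path)
    if not p.exists():
        return [f"{path} does not exist (run a tracked session first)"]
    return usage_data_problems(json.loads(p.read_text()))
```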

Configuration Files

Main Config: config/thinking-budgets.json

  • Budget level definitions
  • Task-to-budget mappings
  • Cost estimation parameters
  • Agent overrides

Usage Data: .coditect/thinking-usage.json

  • Session history
  • Aggregate statistics
  • Cost tracking
  • Agent usage patterns

Last Updated: 2025-12-22 | Compliance: CODITECT Documentation Standard v1.0.0 | Version: 1.0.0