Token Cost Tracking Skill

How to Use This Skill

  1. Review the patterns and examples below
  2. Apply the relevant patterns to your implementation
  3. Follow the best practices outlined in this skill

Track token usage and costs across different LLM models to optimize AI operations and budget planning.

When to Use This Skill

Use this skill when:

  • Starting/ending a session (log token usage)
  • Comparing efficiency of different approaches (skill vs manual)
  • Monthly cost reporting for AI operations
  • Identifying token-heavy operations for optimization
  • Validating skill vs manual workflow savings
  • Budget planning for multi-model AI usage
  • Proven need: monitoring large sessions (200K+ tokens), where costs accumulate quickly

Don't use this skill when:

  • Using free local models only (LM Studio - no cost tracking needed)
  • Session < 10K tokens (overhead not worth tracking)
  • One-off experiments (not enough data for trends)
  • Token cost not a concern for the project

What It Provides

Token Tracking:

  • Input tokens (prompt + context)
  • Output tokens (generated responses)
  • Total tokens per session/task
  • Model-specific pricing

Cost Calculation:

Claude Sonnet 4: $3/1M input, $15/1M output
Claude Opus: $15/1M input, $75/1M output
GPT-4 Turbo: $10/1M input, $30/1M output
GPT-4o: $5/1M input, $15/1M output
Gemini 1.5 Pro: $1.25/1M input, $5/1M output
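The rates above can be kept in a single lookup table keyed by the model IDs used in the CLI examples, rather than hardcoded in multiple places. A sketch (the `MODEL_PRICING` name and structure are illustrative, not the script's actual internals):

```python
# Hypothetical pricing table: USD per 1M tokens, keyed by the model IDs
# accepted by the CLI. Update this table when providers change prices.
MODEL_PRICING = {
    "claude-sonnet-4": {"input": 3.00, "output": 15.00},
    "claude-opus": {"input": 15.00, "output": 75.00},
    "gpt-4-turbo": {"input": 10.00, "output": 30.00},
    "gpt-4o": {"input": 5.00, "output": 15.00},
    "gemini-1.5-pro": {"input": 1.25, "output": 5.00},
}
```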

Reporting:

  • Per-session costs
  • Daily/weekly/monthly summaries
  • Model usage breakdown
  • Cost trends over time

Usage

Log Session Usage

cd .claude/skills/token-cost-tracking

# Log current session
./core/track-tokens.py log \
  --session "Sprint 2 Build #19" \
  --model "claude-sonnet-4" \
  --input-tokens 50000 \
  --output-tokens 15000 \
  --notes "Build + deploy + documentation"

Generate Cost Report

# Today's usage
./core/track-tokens.py report --period today

# This week
./core/track-tokens.py report --period week

# This month
./core/track-tokens.py report --period month

# Specific model
./core/track-tokens.py report --model claude-sonnet-4 --period month

Compare Approaches

# Skill vs manual
./core/track-tokens.py compare \
  --approach-1 "Manual git commit" \
  --tokens-1 10000 \
  --approach-2 "git-workflow-automation skill" \
  --tokens-2 2500

Export Data

# Export to CSV
./core/track-tokens.py export --output token-usage-october.csv

# View summary
./core/track-tokens.py summary
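Conceptually, the export step is a JSON Lines to CSV conversion. A standard-library sketch of what `track-tokens.py export` might do (field names follow the log entry format in the Data Storage section; this is not the script's actual code):

```python
import csv
import json
from pathlib import Path

def export_to_csv(log_path: str, out_path: str) -> int:
    """Convert the JSON Lines log to CSV; return the number of rows written."""
    fields = ["timestamp", "session", "model", "input_tokens",
              "output_tokens", "total_tokens", "cost_usd", "notes"]
    lines = Path(log_path).read_text().splitlines()
    rows = [json.loads(line) for line in lines if line.strip()]
    with open(out_path, "w", newline="") as f:
        # Ignore unexpected keys; fill missing ones with an empty string
        writer = csv.DictWriter(f, fieldnames=fields,
                                extrasaction="ignore", restval="")
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```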

Model Pricing (as of Oct 2025)

Model             | Input ($/1M) | Output ($/1M) | Use Case
Claude Sonnet 4   | $3           | $15           | Production (balanced)
Claude Opus       | $15          | $75           | Complex reasoning
GPT-4 Turbo       | $10          | $30           | General purpose
GPT-4o            | $5           | $15           | Cost-effective
Gemini 1.5 Pro    | $1.25        | $5            | High volume
LM Studio (local) | $0           | $0            | Free (hardware cost only)

Data Storage

Log File: .coditect/token-usage.jsonl (JSON Lines format)

Example entry:

{
  "timestamp": "2025-10-19T20:30:00Z",
  "session": "Sprint 2 Build #19",
  "model": "claude-sonnet-4",
  "input_tokens": 50000,
  "output_tokens": 15000,
  "total_tokens": 65000,
  "cost_usd": 0.375,
  "notes": "Build + deploy + documentation"
}
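Appending a record in this format is a one-line JSON write per session. A minimal sketch (the `log_entry` helper is illustrative; prices are passed in as USD per 1M tokens rather than looked up):

```python
import json
from datetime import datetime, timezone
from pathlib import Path

def log_entry(path, session, model, input_tokens, output_tokens,
              input_price, output_price, notes=""):
    """Append one usage record to the JSON Lines log and return it."""
    cost = ((input_tokens / 1_000_000) * input_price
            + (output_tokens / 1_000_000) * output_price)
    entry = {
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "session": session,
        "model": model,
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "total_tokens": input_tokens + output_tokens,
        "cost_usd": round(cost, 3),
        "notes": notes,
    }
    p = Path(path)
    p.parent.mkdir(parents=True, exist_ok=True)
    with p.open("a") as f:  # append-only, one JSON object per line
        f.write(json.dumps(entry) + "\n")
    return entry
```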

Cost Calculation Formula

input_cost = (input_tokens / 1_000_000) * input_price_per_million
output_cost = (output_tokens / 1_000_000) * output_price_per_million
total_cost = input_cost + output_cost
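As a runnable Python sketch (prices are USD per million tokens; the example values reuse the Sonnet-class rates from the pricing table):

```python
def session_cost(input_tokens: int, output_tokens: int,
                 input_price: float, output_price: float) -> float:
    """Session cost in USD; prices are per 1M tokens."""
    input_cost = (input_tokens / 1_000_000) * input_price
    output_cost = (output_tokens / 1_000_000) * output_price
    return input_cost + output_cost

# 50K input + 15K output at $3/$15 per 1M
print(round(session_cost(50_000, 15_000, 3.0, 15.0), 3))  # → 0.375
```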

Example Workflows

Session Logging

# At end of session
python3 ./core/track-tokens.py log \
  --session "Feature implementation" \
  --model claude-sonnet-4 \
  --input-tokens 120000 \
  --output-tokens 35000 \
  --notes "User profile editing + tests"

# Output:
# ✓ Logged session: Feature implementation
# Total tokens: 155,000
# Cost: $0.885 USD

Monthly Report

python3 ./core/track-tokens.py report --period month

# Output:
# Token Usage Report - October 2025
#
# Total sessions: 45
# Total tokens: 3,250,000
# Total cost: $18.75 USD
#
# By model:
# Claude Sonnet 4: 2,500,000 tokens ($14.25)
# Claude Opus: 500,000 tokens ($3.75)
# GPT-4o: 250,000 tokens ($0.75)
#
# Top sessions:
# 1. Sprint 2 deployment (155K tokens, $0.885)
# 2. Architecture research (98K tokens, $0.564)

Skill Efficiency Analysis

# Compare manual vs skill approach
python3 ./core/track-tokens.py compare \
  --approach-1 "Manual deployment" \
  --tokens-1 45000 \
  --approach-2 "build-deploy-workflow skill" \
  --tokens-2 5000

# Output:
# Comparison Report
#
# Manual deployment:
# Tokens: 45,000
# Cost: $0.270 USD
#
# build-deploy-workflow skill:
# Tokens: 5,000
# Cost: $0.030 USD
#
# Savings: 40,000 tokens (89%), $0.240 USD (89%)
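The savings figures in the comparison report follow from simple arithmetic; a sketch (the `savings` helper is illustrative, not part of the script):

```python
def savings(tokens_manual: int, tokens_skill: int) -> tuple[int, int]:
    """Absolute token savings and percentage reduction (rounded)."""
    saved = tokens_manual - tokens_skill
    pct = round(100 * saved / tokens_manual)
    return saved, pct

print(savings(45_000, 5_000))  # → (40000, 89)
```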

Integration with Skills

Track skill usage automatically:

# Before using skill
# (placeholder: obtain the count from your client or API usage metadata)
START_TOKENS=$(get_current_token_count)

# Use skill
./core/deploy.sh --build-num=20 --changes="Feature X"

# After using skill
END_TOKENS=$(get_current_token_count)
USED_TOKENS=$((END_TOKENS - START_TOKENS))

# Log usage
./token-cost-tracking/core/track-tokens.py log \
  --session "Build #20" \
  --model claude-sonnet-4 \
  --input-tokens $USED_TOKENS \
  --notes "build-deploy-workflow skill"

Metrics Tracked

  1. Token Metrics:

    • Total tokens used
    • Input vs output ratio
    • Tokens per session
    • Tokens per task type
  2. Cost Metrics:

    • Total spend (daily/weekly/monthly)
    • Cost per session
    • Cost per model
    • Cost trends over time
  3. Efficiency Metrics:

    • Skill usage vs manual (token savings)
    • Model selection efficiency
    • Token waste identification

Safety & Privacy

Data stored locally only:

  • No external API calls
  • No cloud storage
  • Privacy-preserving (session names, notes are user-controlled)
  • Can be git-ignored if sensitive

Troubleshooting

Error: "Log file not found"

  • Fix: the file is created automatically on first use; run any log command once
  • Location: .coditect/token-usage.jsonl

Error: "Invalid model"

  • Check: Model name matches pricing table
  • Supported: claude-sonnet-4, claude-opus, gpt-4-turbo, gpt-4o, gemini-1.5-pro

Cost seems wrong

  • Verify: Token counts are accurate
  • Check: Using correct model pricing
  • Update: prices change periodically; refresh the pricing table in the script

Multi-Context Window Support

This skill supports long-running token tracking tasks across multiple context windows using Claude 4.5's enhanced state management capabilities.

State Tracking

Checkpoint State (JSON):

{
  "checkpoint_id": "token_tracking_20251129_151500",
  "tracking_period": "november_2025",
  "sessions_logged": [
    {"session": "Sprint 10 Build", "tokens": 65000, "cost_usd": 0.375, "status": "complete"},
    {"session": "Feature planning", "tokens": 28000, "cost_usd": 0.162, "status": "complete"},
    {"session": "Documentation update", "tokens": 12000, "cost_usd": 0.069, "status": "in_progress"}
  ],
  "total_sessions": 15,
  "total_tokens": 450000,
  "total_cost_usd": 2.625,
  "model_distribution": {
    "claude-sonnet-4": 380000,
    "gpt-4o": 70000
  },
  "token_usage": 2500,
  "created_at": "2025-11-29T15:15:00Z"
}

Progress Notes (Markdown):

# Token Cost Tracking Progress - 2025-11-29

## November 2025 Summary
- Total sessions tracked: 15
- Total tokens: 450,000
- Total cost: $2.625 USD
- Average per session: 30,000 tokens ($0.175)

## Model Distribution
- Claude Sonnet 4: 380,000 tokens (84%) - $2.16
- GPT-4o: 70,000 tokens (16%) - $0.465

## Top Cost Sessions
1. Sprint 10 Build: 65,000 tokens ($0.375)
2. Architecture design: 52,000 tokens ($0.30)
3. Multi-module refactor: 48,000 tokens ($0.277)

## Efficiency Insights
- Skills usage vs manual: 40% token savings identified
- High-value sessions (>50K): 3 (concentrate on these)
- Low-value sessions (<10K): 4 (consider batching)

## Next Actions
- Log remaining 2 sessions for November
- Generate monthly cost report
- Compare Oct vs Nov trends
- Identify optimization opportunities

Session Recovery

When starting a fresh context window after token tracking work:

  1. Load Checkpoint State: Read .coditect/checkpoints/token-tracking-latest.json
  2. Review Progress Notes: Check token-tracking-progress.md for monthly status
  3. Verify Log File: Use Read to confirm .coditect/token-usage.jsonl entries
  4. Check Totals: Calculate cumulative tokens and costs
  5. Resume Logging: Continue from last logged session

Recovery Commands:

# 1. Check latest token tracking checkpoint
cat .coditect/checkpoints/token-tracking-latest.json | jq '.sessions_logged'

# 2. Review progress notes
tail -30 token-tracking-progress.md

# 3. Verify log file
tail -10 .coditect/token-usage.jsonl | jq '.'

# 4. Calculate monthly totals
cat .coditect/token-usage.jsonl | jq -s 'map(select(.timestamp | startswith("2025-11"))) | map(.total_tokens) | add'

# 5. Check cost by model
python3 .claude/skills/token-cost-tracking/core/track-tokens.py report --period month
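The jq aggregation in step 4 has a straightforward Python equivalent when jq is not available (the `monthly_totals` helper is a sketch; the log path follows the Data Storage section):

```python
import json
from pathlib import Path

def monthly_totals(log_path: str, month_prefix: str) -> dict:
    """Sum tokens, cost, and session count for entries whose
    timestamp starts with a prefix such as '2025-11'."""
    totals = {"tokens": 0, "cost_usd": 0.0, "sessions": 0}
    for line in Path(log_path).read_text().splitlines():
        if not line.strip():
            continue
        entry = json.loads(line)
        if entry["timestamp"].startswith(month_prefix):
            totals["tokens"] += entry["total_tokens"]
            totals["cost_usd"] += entry["cost_usd"]
            totals["sessions"] += 1
    return totals
```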

State Management Best Practices

Checkpoint Files (JSON Schema):

  • Store in .coditect/checkpoints/token-tracking-{timestamp}.json
  • Track sessions logged vs remaining with per-session metrics
  • Record model distribution for cost optimization analysis
  • Include cumulative totals for budget monitoring
  • Document efficiency insights (skills vs manual comparisons)

Progress Tracking (Markdown Narrative):

  • Maintain token-tracking-progress.md with monthly summaries
  • Document high-cost sessions and their justification
  • Note efficiency patterns (which workflows save most tokens)
  • List budget alerts and threshold warnings
  • Track month-over-month trends

Git Integration:

  • Create checkpoint after logging batch of sessions
  • Commit token-usage.jsonl with monthly summary tags
  • Use conventional commits: docs(tokens): Add November 2025 tracking data
  • Tag monthly reports: git tag token-report-2025-11

Progress Checkpoints

Natural Breaking Points:

  1. After logging each session (5+ sessions = checkpoint)
  2. After generating monthly/weekly reports
  3. After identifying optimization opportunities
  4. After comparing period-over-period trends
  5. After budget threshold alerts

Checkpoint Creation Pattern:

# Automatic checkpoint creation at critical phases (illustrative pseudocode)
if sessions_logged > 5 or total_cost_usd > 1.0:
    create_checkpoint({
        "sessions": session_list,
        "totals": {
            "tokens": total_tokens,
            "cost_usd": total_cost,
        },
        "distribution": model_distribution,
        "tokens": current_token_usage,
    })
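The `create_checkpoint` helper referenced above is not defined in this document; a minimal sketch under the path conventions from the State Management section (the timestamped filename and the `-latest` copy are assumptions):

```python
import json
from datetime import datetime, timezone
from pathlib import Path

def create_checkpoint(state: dict, directory: str = ".coditect/checkpoints") -> Path:
    """Write checkpoint state to a timestamped JSON file and
    refresh the token-tracking-latest.json copy."""
    now = datetime.now(timezone.utc)
    ts = now.strftime("%Y%m%d_%H%M%S")
    # Generated fields; any keys already present in `state` take precedence
    full = {"checkpoint_id": f"token_tracking_{ts}",
            "created_at": now.strftime("%Y-%m-%dT%H:%M:%SZ"),
            **state}
    d = Path(directory)
    d.mkdir(parents=True, exist_ok=True)
    path = d / f"token-tracking-{ts}.json"
    path.write_text(json.dumps(full, indent=2))
    (d / "token-tracking-latest.json").write_text(json.dumps(full, indent=2))
    return path
```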

Example: Multi-Context Token Tracking

Context Window 1: Log First 10 Sessions

{
  "checkpoint_id": "token_tracking_batch1",
  "phase": "initial_logging",
  "sessions_logged": 10,
  "total_tokens": 320000,
  "total_cost_usd": 1.875,
  "next_action": "Log remaining 5 sessions for November",
  "token_usage": 2000
}

Context Window 2: Remaining Sessions + Monthly Report

# Resume from checkpoint
cat .coditect/checkpoints/token_tracking_batch1.json

# Continue logging
# (Context restored in <1 minute vs 8 minutes from scratch)

# Complete logging and generate report
{
  "checkpoint_id": "token_tracking_november_complete",
  "phase": "monthly_complete",
  "total_sessions": 15,
  "total_tokens": 450000,
  "total_cost_usd": 2.625,
  "report_generated": true,
  "token_usage": 1500
}

Token Savings: 2000 (first context) + 1500 (second context) = 3500 total vs. 6000 without checkpoint = 42% reduction

See docs/CLAUDE-4.5-BEST-PRACTICES.md for complete multi-context patterns.


Success Output

When this skill completes successfully, output:

✅ SKILL COMPLETE: token-cost-tracking

Completed:
- [x] Token usage logged for session
- [x] Cost calculated based on model pricing
- [x] Report generated (daily/weekly/monthly)
- [x] Efficiency comparison completed (skill vs manual)
- [x] Data exported to CSV/JSON

Outputs:
- Token log entry: .coditect/token-usage.jsonl
- Cost report: Total 450,000 tokens, $2.625 USD (November 2025)
- Model distribution: Claude Sonnet 4 (84%), GPT-4o (16%)
- Efficiency insights: Skills usage saves 40% tokens vs manual
- Top sessions identified: Sprint 10 Build (65K tokens, $0.375)

Completion Checklist

Before marking this skill as complete, verify:

  • Session logged with accurate token counts (input + output)
  • Model name matches pricing table (claude-sonnet-4, gpt-4o, etc.)
  • Cost calculated correctly using formula: (tokens/1M) × price
  • Log entry written to .coditect/token-usage.jsonl
  • Report generated for requested period (today/week/month)
  • Efficiency comparison includes skill vs manual token savings
  • Checkpoint created if tracking multiple sessions
  • Data exportable to CSV for external analysis

Failure Indicators

This skill has FAILED if:

  • ❌ Token counts are zero or obviously incorrect
  • ❌ Model name not recognized (invalid pricing lookup)
  • ❌ Cost calculation incorrect (wrong formula or pricing)
  • ❌ Log file not created or not appended to
  • ❌ Report shows no data for requested period
  • ❌ Efficiency comparison missing or incomplete
  • ❌ Exported CSV/JSON malformed or empty
  • ❌ Checkpoint not created after 5+ sessions logged

When NOT to Use

Do NOT use this skill when:

  • Using free local models only (LM Studio - no cost to track)
  • Session < 10K tokens (overhead not worth tracking)
  • One-off experiments (not enough data for meaningful trends)
  • Token cost is not a budget concern for the project
  • Already using external token tracking service (e.g., LangSmith)
  • Working with non-LLM operations (no tokens to track)
  • Need real-time token streaming (this is batch/session logging)

Use alternatives:

  • LangSmith - For real-time token streaming and tracing
  • OpenAI Usage Dashboard - For OpenAI-specific tracking
  • Cloud billing - For infrastructure cost tracking
  • Manual spreadsheet - For simple one-time tracking

Anti-Patterns (Avoid)

Anti-Pattern                  | Problem                                  | Solution
Logging after every message   | Too granular, excessive overhead         | Log once per session (at session end)
Not tracking input separately | Cannot analyze input/output ratio        | Track input_tokens and output_tokens separately
Hardcoded pricing             | Outdated when providers change pricing   | Update pricing table quarterly; use a config file
Missing session context       | Cannot identify high-cost operations     | Always include session name and notes
No model specification        | Cannot calculate cost correctly          | Always specify the model (claude-sonnet-4, etc.)
Ignoring checkpoint data      | Lose progress if context clears          | Create a checkpoint after 5+ sessions logged
Not exporting data            | Cannot analyze trends externally         | Export to CSV monthly for trend analysis
Tracking only total tokens    | Cannot optimize input vs output          | Track input/output separately for optimization insights

Principles

This skill embodies these CODITECT principles:

  • #1 Measure to Optimize - Track token usage to identify optimization opportunities
  • #2 Data-Driven Decisions - Use cost data to choose appropriate models and approaches
  • #3 Transparency - Clear cost breakdown by model, session, and time period
  • #5 Eliminate Ambiguity - Exact token counts and cost calculations, no estimates
  • #6 Clear, Understandable - Reports show costs in familiar units (USD, tokens)
  • #7 Continuous Improvement - Monthly trends reveal efficiency gains over time
  • #8 No Assumptions - Verify actual token usage, don't estimate

Reference: CODITECT-STANDARD-AUTOMATION.md


See Also