Token Cost Tracking Skill
How to Use This Skill
- Review the patterns and examples below
- Apply the relevant patterns to your implementation
- Follow the best practices outlined in this skill
Track token usage and costs across different LLM models to optimize AI operations and budget planning.
When to Use This Skill
✅ Use this skill when:
- Starting/ending a session (log token usage)
- Comparing efficiency of different approaches (skill vs manual)
- Monthly cost reporting for AI operations
- Identifying token-heavy operations for optimization
- Validating skill vs manual workflow savings
- Budget planning for multi-model AI usage
- Monitoring long sessions (200K+ tokens) where costs accumulate quickly
❌ Don't use this skill when:
- Using free local models only (LM Studio - no cost tracking needed)
- Session < 10K tokens (overhead not worth tracking)
- One-off experiments (not enough data for trends)
- Token cost not a concern for the project
What It Provides
Token Tracking:
- Input tokens (prompt + context)
- Output tokens (generated responses)
- Total tokens per session/task
- Model-specific pricing
Cost Calculation:
Claude Sonnet 4: $3/1M input, $15/1M output
Claude Opus: $15/1M input, $75/1M output
GPT-4 Turbo: $10/1M input, $30/1M output
GPT-4o: $5/1M input, $15/1M output
Gemini 1.5 Pro: $1.25/1M input, $5/1M output
Reporting:
- Per-session costs
- Daily/weekly/monthly summaries
- Model usage breakdown
- Cost trends over time
Usage
Log Session Usage
cd .claude/skills/token-cost-tracking
# Log current session
./core/track-tokens.py log \
--session "Sprint 2 Build #19" \
--model "claude-sonnet-4" \
--input-tokens 50000 \
--output-tokens 15000 \
--notes "Build + deploy + documentation"
Generate Cost Report
# Today's usage
./core/track-tokens.py report --period today
# This week
./core/track-tokens.py report --period week
# This month
./core/track-tokens.py report --period month
# Specific model
./core/track-tokens.py report --model claude-sonnet-4 --period month
Compare Approaches
# Skill vs manual
./core/track-tokens.py compare \
--approach-1 "Manual git commit" \
--tokens-1 10000 \
--approach-2 "git-workflow-automation skill" \
--tokens-2 2500
Export Data
# Export to CSV
./core/track-tokens.py export --output token-usage-october.csv
# View summary
./core/track-tokens.py summary
Model Pricing (as of Oct 2025)
| Model | Input ($/1M) | Output ($/1M) | Use Case |
|---|---|---|---|
| Claude Sonnet 4.5 | $3 | $15 | Production (balanced) |
| Claude Opus 3.5 | $15 | $75 | Complex reasoning |
| GPT-4 Turbo | $10 | $30 | General purpose |
| GPT-4o | $5 | $15 | Cost-effective |
| Gemini 1.5 Pro | $1.25 | $5 | High volume |
| LM Studio (local) | $0 | $0 | Free (hardware cost only) |
Data Storage
Log File: .coditect/token-usage.jsonl (JSON Lines format)
Example entry:
{
"timestamp": "2025-10-19T20:30:00Z",
"session": "Sprint 2 Build #19",
"model": "claude-sonnet-4",
"input_tokens": 50000,
"output_tokens": 15000,
"total_tokens": 65000,
"cost_usd": 0.375,
"notes": "Build + deploy + documentation"
}
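A session entry like the one above could be appended with a small helper. This is an illustrative sketch of the JSON Lines format, not the actual internals of `track-tokens.py`; the function name is hypothetical.

```python
import json
from datetime import datetime, timezone

def append_log_entry(path, session, model, input_tokens, output_tokens,
                     cost_usd, notes=""):
    """Append one usage record as a single JSON line (JSON Lines format)."""
    entry = {
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "session": session,
        "model": model,
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "total_tokens": input_tokens + output_tokens,
        "cost_usd": cost_usd,
        "notes": notes,
    }
    # Append mode keeps the file valid JSONL: one object per line.
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")
    return entry
```

Because each record is a self-contained line, the log can be tailed, filtered with jq, or re-read incrementally without parsing the whole file.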
Cost Calculation Formula
input_cost = (input_tokens / 1_000_000) * input_price_per_million
output_cost = (output_tokens / 1_000_000) * output_price_per_million
total_cost = input_cost + output_cost
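The formula above can be sketched as a small Python helper. The pricing dict mirrors this skill's table; verify current prices before relying on it:

```python
# Pricing per 1M tokens (USD), mirroring this skill's table; prices change, so verify.
PRICING = {
    "claude-sonnet-4": {"input": 3.00, "output": 15.00},
    "claude-opus": {"input": 15.00, "output": 75.00},
    "gpt-4-turbo": {"input": 10.00, "output": 30.00},
    "gpt-4o": {"input": 5.00, "output": 15.00},
    "gemini-1.5-pro": {"input": 1.25, "output": 5.00},
}

def session_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Apply the per-million pricing formula and round to 3 decimals."""
    price = PRICING[model]
    input_cost = (input_tokens / 1_000_000) * price["input"]
    output_cost = (output_tokens / 1_000_000) * price["output"]
    return round(input_cost + output_cost, 3)
```

For example, `session_cost("claude-sonnet-4", 50_000, 15_000)` yields 0.375, matching the example log entry.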
Example Workflows
Session Logging
# At end of session
python3 ./core/track-tokens.py log \
--session "Feature implementation" \
--model claude-sonnet-4 \
--input-tokens 120000 \
--output-tokens 35000 \
--notes "User profile editing + tests"
# Output:
# ✓ Logged session: Feature implementation
# Total tokens: 155,000
# Cost: $0.885 USD
Monthly Report
python3 ./core/track-tokens.py report --period month
# Output:
# Token Usage Report - October 2025
#
# Total sessions: 45
# Total tokens: 3,250,000
# Total cost: $18.75 USD
#
# By model:
# Claude Sonnet 4: 2,500,000 tokens ($14.25)
# Claude Opus: 500,000 tokens ($3.75)
# GPT-4o: 250,000 tokens ($0.75)
#
# Top sessions:
# 1. Sprint 2 deployment (155K tokens, $0.885)
# 2. Architecture research (98K tokens, $0.564)
Skill Efficiency Analysis
# Compare manual vs skill approach
python3 ./core/track-tokens.py compare \
--approach-1 "Manual deployment" \
--tokens-1 45000 \
--approach-2 "build-deploy-workflow skill" \
--tokens-2 5000
# Output:
# Comparison Report
#
# Manual deployment:
# Tokens: 45,000
# Cost: $0.270 USD
#
# build-deploy-workflow skill:
# Tokens: 5,000
# Cost: $0.030 USD
#
# Savings: 40,000 tokens (89%), $0.240 USD (89%)
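The savings figures in that report are plain arithmetic; a minimal sketch:

```python
def token_savings(manual_tokens: int, skill_tokens: int) -> tuple[int, int]:
    """Return (absolute token savings, savings percent rounded to whole %)."""
    saved = manual_tokens - skill_tokens
    pct = round(100 * saved / manual_tokens)
    return saved, pct
```

With the numbers above, `token_savings(45_000, 5_000)` gives 40,000 tokens saved at 89%. The cost savings percentage is identical whenever both approaches use the same model, since cost scales linearly with tokens.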
Integration with Skills
Track skill usage automatically:
# Before using skill (placeholder: substitute your actual token-count source)
START_TOKENS=$(get_current_token_count)
# Use skill
./core/deploy.sh --build-num=20 --changes="Feature X"
# After using skill
END_TOKENS=$(get_current_token_count)
USED_TOKENS=$((END_TOKENS - START_TOKENS))
# Log usage (the combined delta is recorded as input tokens here for simplicity)
./token-cost-tracking/core/track-tokens.py log \
--session "Build #20" \
--model claude-sonnet-4 \
--input-tokens $USED_TOKENS \
--notes "build-deploy-workflow skill"
Metrics Tracked
Token Metrics:
- Total tokens used
- Input vs output ratio
- Tokens per session
- Tokens per task type
Cost Metrics:
- Total spend (daily/weekly/monthly)
- Cost per session
- Cost per model
- Cost trends over time
Efficiency Metrics:
- Skill usage vs manual (token savings)
- Model selection efficiency
- Token waste identification
Safety & Privacy
Data stored locally only:
- No external API calls
- No cloud storage
- Privacy-preserving (session names, notes are user-controlled)
- Can be git-ignored if sensitive
Troubleshooting
Error: "Log file not found"
- Fix: The file is created automatically on first use
- Location: .coditect/token-usage.jsonl
Error: "Invalid model"
- Check: Model name matches pricing table
- Supported: claude-sonnet-4, claude-opus, gpt-4-turbo, gpt-4o, gemini-1.5-pro
Cost seems wrong
- Verify: Token counts are accurate
- Check: Using correct model pricing
- Update: Prices change periodically, update script
Multi-Context Window Support
This skill supports long-running token tracking tasks across multiple context windows using Claude 4.5's enhanced state management capabilities.
State Tracking
Checkpoint State (JSON):
{
"checkpoint_id": "token_tracking_20251129_151500",
"tracking_period": "november_2025",
"sessions_logged": [
{"session": "Sprint 10 Build", "tokens": 65000, "cost_usd": 0.375, "status": "complete"},
{"session": "Feature planning", "tokens": 28000, "cost_usd": 0.162, "status": "complete"},
{"session": "Documentation update", "tokens": 12000, "cost_usd": 0.069, "status": "in_progress"}
],
"total_sessions": 15,
"total_tokens": 450000,
"total_cost_usd": 2.625,
"model_distribution": {
"claude-sonnet-4": 380000,
"gpt-4o": 70000
},
"token_usage": 2500,
"created_at": "2025-11-29T15:15:00Z"
}
Progress Notes (Markdown):
# Token Cost Tracking Progress - 2025-11-29
## November 2025 Summary
- Total sessions tracked: 15
- Total tokens: 450,000
- Total cost: $2.625 USD
- Average per session: 30,000 tokens ($0.175)
## Model Distribution
- Claude Sonnet 4: 380,000 tokens (84%) - $2.16
- GPT-4o: 70,000 tokens (16%) - $0.465
## Top Cost Sessions
1. Sprint 10 Build: 65,000 tokens ($0.375)
2. Architecture design: 52,000 tokens ($0.30)
3. Multi-module refactor: 48,000 tokens ($0.277)
## Efficiency Insights
- Skills usage vs manual: 40% token savings identified
- High-value sessions (>50K): 3 (concentrate on these)
- Low-value sessions (<10K): 4 (consider batching)
## Next Actions
- Log remaining 2 sessions for November
- Generate monthly cost report
- Compare Oct vs Nov trends
- Identify optimization opportunities
Session Recovery
When starting a fresh context window after token tracking work:
- Load Checkpoint State: Read .coditect/checkpoints/token-tracking-latest.json
- Review Progress Notes: Check token-tracking-progress.md for monthly status
- Verify Log File: Use Read to confirm .coditect/token-usage.jsonl entries
- Check Totals: Calculate cumulative tokens and costs
- Resume Logging: Continue from last logged session
Recovery Commands:
# 1. Check latest token tracking checkpoint
cat .coditect/checkpoints/token-tracking-latest.json | jq '.sessions_logged'
# 2. Review progress notes
tail -30 token-tracking-progress.md
# 3. Verify log file
tail -10 .coditect/token-usage.jsonl | jq '.'
# 4. Calculate monthly totals
cat .coditect/token-usage.jsonl | jq -s 'map(select(.timestamp | startswith("2025-11"))) | map(.total_tokens) | add'
# 5. Check cost by model
python3 .claude/skills/token-cost-tracking/core/track-tokens.py report --period month
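The jq totals above can equally be computed in Python. A sketch assuming the JSONL schema shown earlier in this document:

```python
import json

def monthly_totals(jsonl_path: str, month_prefix: str) -> dict:
    """Sum sessions, tokens, and cost for entries whose timestamp starts with
    month_prefix (e.g. "2025-11")."""
    totals = {"sessions": 0, "tokens": 0, "cost_usd": 0.0}
    with open(jsonl_path, encoding="utf-8") as f:
        for line in f:
            entry = json.loads(line)
            if entry["timestamp"].startswith(month_prefix):
                totals["sessions"] += 1
                totals["tokens"] += entry["total_tokens"]
                totals["cost_usd"] += entry["cost_usd"]
    # Round to avoid float drift across many small per-session costs.
    totals["cost_usd"] = round(totals["cost_usd"], 3)
    return totals
```

Filtering on a timestamp prefix works because the log uses ISO 8601 timestamps, which sort and group lexicographically.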
State Management Best Practices
Checkpoint Files (JSON Schema):
- Store in .coditect/checkpoints/token-tracking-{timestamp}.json
- Track sessions logged vs remaining with per-session metrics
- Record model distribution for cost optimization analysis
- Include cumulative totals for budget monitoring
- Document efficiency insights (skills vs manual comparisons)
Progress Tracking (Markdown Narrative):
- Maintain token-tracking-progress.md with monthly summaries
- Document high-cost sessions and their justification
- Note efficiency patterns (which workflows save most tokens)
- List budget alerts and threshold warnings
- Track month-over-month trends
Git Integration:
- Create checkpoint after logging batch of sessions
- Commit token-usage.jsonl with monthly summary tags
- Use conventional commits: docs(tokens): Add November 2025 tracking data
- Tag monthly reports: git tag token-report-2025-11
Progress Checkpoints
Natural Breaking Points:
- After logging each session (5+ sessions = checkpoint)
- After generating monthly/weekly reports
- After identifying optimization opportunities
- After comparing period-over-period trends
- After budget threshold alerts
Checkpoint Creation Pattern:
# Automatic checkpoint creation at critical phases
if sessions_logged > 5 or total_cost_usd > 1.0:
create_checkpoint({
"sessions": session_list,
"totals": {
"tokens": total_tokens,
"cost_usd": total_cost
},
"distribution": model_distribution,
"tokens": current_token_usage
})
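A runnable sketch of that pattern, assuming the checkpoint directory and filename convention described in this document (`create_checkpoint` is illustrative, not an existing helper in the skill):

```python
import json
import os
from datetime import datetime, timezone

def create_checkpoint(state: dict, checkpoint_dir=".coditect/checkpoints") -> str:
    """Write a timestamped checkpoint JSON and refresh the 'latest' copy."""
    os.makedirs(checkpoint_dir, exist_ok=True)
    now = datetime.now(timezone.utc)
    ts = now.strftime("%Y%m%d_%H%M%S")
    record = {
        "checkpoint_id": f"token_tracking_{ts}",
        "created_at": now.strftime("%Y-%m-%dT%H:%M:%SZ"),
        **state,  # caller-supplied totals, distribution, session list, etc.
    }
    path = os.path.join(checkpoint_dir, f"token-tracking-{ts}.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump(record, f, indent=2)
    # Keep a stable "latest" pointer for fast session recovery.
    latest = os.path.join(checkpoint_dir, "token-tracking-latest.json")
    with open(latest, "w", encoding="utf-8") as f:
        json.dump(record, f, indent=2)
    return path
```

Writing both a timestamped file and a `-latest.json` copy means recovery never needs to sort directory listings: a fresh context window just reads the latest pointer.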
Example: Multi-Context Token Tracking
Context Window 1: Log First 10 Sessions
{
"checkpoint_id": "token_tracking_batch1",
"phase": "initial_logging",
"sessions_logged": 10,
"total_tokens": 320000,
"total_cost_usd": 1.875,
"next_action": "Log remaining 5 sessions for November",
"token_usage": 2000
}
Context Window 2: Remaining Sessions + Monthly Report
# Resume from checkpoint
cat .coditect/checkpoints/token_tracking_batch1.json
# Continue logging
# (Context restored in <1 minute vs 8 minutes from scratch)
# Complete logging and generate report
{
"checkpoint_id": "token_tracking_november_complete",
"phase": "monthly_complete",
"total_sessions": 15,
"total_tokens": 450000,
"total_cost_usd": 2.625,
"report_generated": true,
"token_usage": 1500
}
Token Savings: 2000 (first context) + 1500 (second context) = 3500 total vs. 6000 without checkpoint = 42% reduction
See docs/CLAUDE-4.5-BEST-PRACTICES.md for complete multi-context patterns.
Success Output
When this skill completes successfully, output:
✅ SKILL COMPLETE: token-cost-tracking
Completed:
- [x] Token usage logged for session
- [x] Cost calculated based on model pricing
- [x] Report generated (daily/weekly/monthly)
- [x] Efficiency comparison completed (skill vs manual)
- [x] Data exported to CSV/JSON
Outputs:
- Token log entry: .coditect/token-usage.jsonl
- Cost report: Total 450,000 tokens, $2.625 USD (November 2025)
- Model distribution: Claude Sonnet 4 (84%), GPT-4o (16%)
- Efficiency insights: Skills usage saves 40% tokens vs manual
- Top sessions identified: Sprint 10 Build (65K tokens, $0.375)
Completion Checklist
Before marking this skill as complete, verify:
- Session logged with accurate token counts (input + output)
- Model name matches pricing table (claude-sonnet-4, gpt-4o, etc.)
- Cost calculated correctly using formula: (tokens/1M) × price
- Log entry written to .coditect/token-usage.jsonl
- Report generated for requested period (today/week/month)
- Efficiency comparison includes skill vs manual token savings
- Checkpoint created if tracking multiple sessions
- Data exportable to CSV for external analysis
Failure Indicators
This skill has FAILED if:
- ❌ Token counts are zero or obviously incorrect
- ❌ Model name not recognized (invalid pricing lookup)
- ❌ Cost calculation incorrect (wrong formula or pricing)
- ❌ Log file not created or not appended to
- ❌ Report shows no data for requested period
- ❌ Efficiency comparison missing or incomplete
- ❌ Exported CSV/JSON malformed or empty
- ❌ Checkpoint not created after 5+ sessions logged
When NOT to Use
Do NOT use this skill when:
- Using free local models only (LM Studio - no cost to track)
- Session < 10K tokens (overhead not worth tracking)
- One-off experiments (not enough data for meaningful trends)
- Token cost is not a budget concern for the project
- Already using external token tracking service (e.g., LangSmith)
- Working with non-LLM operations (no tokens to track)
- Need real-time token streaming (this is batch/session logging)
Use alternatives:
- LangSmith - For real-time token streaming and tracing
- OpenAI Usage Dashboard - For OpenAI-specific tracking
- Cloud billing - For infrastructure cost tracking
- Manual spreadsheet - For simple one-time tracking
Anti-Patterns (Avoid)
| Anti-Pattern | Problem | Solution |
|---|---|---|
| Logging after every message | Too granular, excessive overhead | Log once per session (end of session) |
| Not tracking input separately | Cannot analyze input/output ratio | Track input_tokens and output_tokens separately |
| Hardcoded pricing | Outdated when providers change pricing | Update pricing table quarterly, use config file |
| Missing session context | Cannot identify high-cost operations | Always include session name and notes |
| No model specification | Cannot calculate cost correctly | Always specify model (claude-sonnet-4, etc.) |
| Ignoring checkpoint data | Lose progress if context clears | Create checkpoint after 5+ sessions logged |
| Not exporting data | Cannot analyze trends externally | Export to CSV monthly for trend analysis |
| Tracking only total tokens | Cannot optimize input vs output | Track input/output separately for optimization insights |
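For the "hardcoded pricing" anti-pattern, one approach is a user-maintained JSON config with an in-code fallback. This is a hedged sketch: the config path and the fallback entries are illustrative, and the defaults mirror this document's pricing table.

```python
import json
import os

# Illustrative fallback mirroring this document's pricing table (USD per 1M tokens).
DEFAULT_PRICING = {
    "claude-sonnet-4": {"input": 3.00, "output": 15.00},
    "gpt-4o": {"input": 5.00, "output": 15.00},
}

def load_pricing(config_path="pricing.json") -> dict:
    """Prefer a config file so quarterly price updates need no code change."""
    if os.path.exists(config_path):
        with open(config_path, encoding="utf-8") as f:
            return json.load(f)
    return DEFAULT_PRICING
```

Keeping prices in a config file also lets the same tracking script serve teams on different provider contracts without forking the code.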
Principles
This skill embodies these CODITECT principles:
- #1 Measure to Optimize - Track token usage to identify optimization opportunities
- #2 Data-Driven Decisions - Use cost data to choose appropriate models and approaches
- #3 Transparency - Clear cost breakdown by model, session, and time period
- #5 Eliminate Ambiguity - Exact token counts and cost calculations, no estimates
- #6 Clear, Understandable - Reports show costs in familiar units (USD, tokens)
- #7 Continuous Improvement - Monthly trends reveal efficiency gains over time
- #8 No Assumptions - Verify actual token usage, don't estimate
Reference: CODITECT-STANDARD-AUTOMATION.md
See Also
- multi-agent-workflow - Token budget management for complex workflows
- evaluation-framework - LLM-as-judge cost tracking
- Anthropic Pricing: https://www.anthropic.com/pricing
- OpenAI Pricing: https://openai.com/api/pricing