Skill Improvement Tracker

Continuously track, analyze, and improve skills based on real session outcomes. This skill implements the Continual Learning Loop, in which every session compounds into better future skills.

When to Use

Use this skill when:

  • Ending a significant work session (run retrospective)
  • A skill invocation failed or underperformed
  • During optimization sprints or quality reviews
  • Before major deployments (validate skill health)
  • After noticing repeated issues with a skill

Don't use when:

  • Quick 1-2 turn interactions
  • Non-skill-related tasks
  • Already in a retrospective session

Core Concepts

The Continual Learning Loop

SESSION START
      │
      ▼
Load Baseline Learnings ◀──────────────────────────────┐
      │                                                │
      ▼                                                │
Apply Known Optimizations                              │
      │                                                │
      ▼                                                │
Track Skill Invocations                                │
      │                                                │
      ▼                                                │
Detect Patterns ──▶ real-time alerts                   │
      │                                                │
SESSION END                                            │
      │                                                │
      ▼                                                │
Run Retrospective                                      │
      │                                                │
      ▼                                                │
Generate Improvements ──▶ Update skill-learnings.json ─┘

Skill Effectiveness Scoring

Skills are scored based on invocation outcomes:

| Metric | Weight | Description |
|--------|--------|-------------|
| Success Rate | 40% | % of invocations completing successfully |
| Error Rate | 30% | % of invocations with errors |
| Retry Rate | 20% | % requiring multiple attempts |
| User Satisfaction | 10% | Explicit feedback (if available) |

Score Thresholds:

  • 90-100%: Excellent - maintain current state
  • 70-89%: Good - minor optimizations
  • 50-69%: Needs Work - improve triggers and examples
  • <50%: Critical - major revision required
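
The weights and thresholds above can be sketched as a small scoring helper. This is a minimal illustration, not the actual implementation in `session-retrospective.py`; in particular, treating error and retry rates as penalties (scoring their complement) and redistributing the 10% satisfaction weight when no feedback exists are assumptions.

```python
def effectiveness_score(success_rate, error_rate, retry_rate, satisfaction=None):
    """Weighted skill effectiveness score in [0, 100]. All inputs are in [0, 1].

    Assumptions (not specified in this doc): error and retry rates are
    penalties, so their complements contribute to the score; when no explicit
    user feedback exists, the 10% satisfaction weight is redistributed
    proportionally across the remaining metrics.
    """
    parts = [
        (0.40, success_rate),       # Success Rate: 40%
        (0.30, 1.0 - error_rate),   # Error Rate: 30% (penalty)
        (0.20, 1.0 - retry_rate),   # Retry Rate: 20% (penalty)
    ]
    if satisfaction is not None:
        parts.append((0.10, satisfaction))  # User Satisfaction: 10%
    total_weight = sum(w for w, _ in parts)
    return 100 * sum(w * v for w, v in parts) / total_weight


def score_band(score):
    """Map a 0-100 score to the threshold bands listed above."""
    if score >= 90:
        return "Excellent"
    if score >= 70:
        return "Good"
    if score >= 50:
        return "Needs Work"
    return "Critical"
```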

Learning Loop Workflow

Phase 1: Session Start - Load Baselines

At session start, load accumulated learnings:

# Check skill health before starting
python3 hooks/session-retrospective.py --analyze-skills --json

What to Load:

  1. skill-learnings.json - Historical success/failure patterns
  2. skill-patterns.json - Known anti-patterns to avoid
  3. Recent session summaries - Context from recent work

Apply Optimizations:

  • If a skill has known failure modes, add explicit warnings
  • If a skill has declining performance, suggest alternatives
  • If a skill has recurring errors, include workarounds
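
The load-and-apply steps above could look roughly like this. The file names come from this doc; the per-skill JSON schema (a `"score"` key per skill) is a hypothetical example — adapt it to whatever `session-retrospective.py` actually writes.

```python
import json
from pathlib import Path


def load_baselines(storage_dir="context-storage"):
    """Load accumulated learnings at session start. Missing files (e.g. the
    very first session) are treated as empty histories."""
    baselines = {}
    for name in ("skill-learnings.json", "skill-patterns.json"):
        path = Path(storage_dir) / name
        baselines[name] = json.loads(path.read_text()) if path.exists() else {}
    return baselines


def warnings_for(baselines, threshold=70):
    """Surface skills whose historical score falls below the 'Good' band,
    assuming a per-skill record shaped like {"score": <0-100>}."""
    learnings = baselines.get("skill-learnings.json", {})
    return [
        f"{skill}: score {data['score']}% - apply known workarounds"
        for skill, data in learnings.items()
        if data.get("score", 100) < threshold
    ]
```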

Phase 2: Real-Time Tracking

During session, track skill invocations:

Signals to Monitor:

  • Skill invocation commands (/skill-name, Use X skill)
  • Success indicators (checkmarks, "completed", "done")
  • Failure indicators (errors, retries, "failed")
  • User frustration signals (restarts, clarifications)

Anti-Pattern Detection:

| Anti-Pattern | Signal | Intervention |
|--------------|--------|--------------|
| Excessive Retries | 3+ attempts | Suggest narrower scope |
| Context Confusion | Misunderstood task | Clarify skill boundaries |
| Tool Misuse | Wrong tool selected | Review trigger criteria |
| Incomplete Output | Missing elements | Add completion checklist |
| Hallucination Risk | Ungrounded claims | Add verification steps |

Phase 3: Session End - Retrospective

At session end, run full retrospective:

# Automatic retrospective
python3 hooks/session-retrospective.py --event session.end

# Manual with specific session
python3 hooks/session-retrospective.py --manual --session-id SESSION_ID

Retrospective Outputs:

  1. Skill Scores - Effectiveness rating per skill
  2. Anti-patterns - Detected problematic patterns
  3. Improvements - Specific recommendations
  4. Learnings Update - Updates to skill-learnings.json

Phase 4: Apply Improvements

For skills needing improvement:

# Generate improvement recommendations
python3 hooks/session-retrospective.py --optimize-now

# View specific skill learnings
python3 hooks/session-retrospective.py --analyze-skills --skill SKILL_NAME

Improvement Actions:

  1. Update "When to Use" Section

    • Add explicit triggers that worked
    • Add explicit anti-triggers for failures
    • Narrow scope if confusion occurred
  2. Add Examples

    • Working examples from successful invocations
    • Error cases with solutions
    • Edge cases discovered
  3. Add Troubleshooting

    • Common errors and solutions
    • Known limitations
    • Workarounds
  4. Update Metadata

    • Adjust token estimates
    • Update confidence scores
    • Add learned keywords

Anti-Pattern Detection

Detection Patterns

anti_patterns = {
    "excessive_retries": {
        "pattern": r'(?:retry|again|once more|try again)',
        "threshold": 3,
        "description": "Multiple retries indicate unclear instructions",
        "recommendation": "Add clearer examples and edge case handling"
    },
    "context_confusion": {
        "pattern": r'(?:confused|unclear|ambiguous|misunderstood)',
        "threshold": 1,
        "description": "Context confusion indicates poor skill scoping",
        "recommendation": "Narrow skill scope and add explicit boundaries"
    },
    "tool_misuse": {
        "pattern": r'(?:wrong tool|incorrect tool|should have used)',
        "threshold": 1,
        "description": "Tool selection issues indicate weak skill triggers",
        "recommendation": "Improve 'When to Use' section with clearer triggers"
    },
    "incomplete_output": {
        "pattern": r'(?:incomplete|partial|missing|forgot)',
        "threshold": 2,
        "description": "Incomplete outputs indicate missing checklist items",
        "recommendation": "Add completion checklist to skill"
    },
    "hallucination_risk": {
        "pattern": r'(?:assumed|guessed|made up|invented)',
        "threshold": 1,
        "description": "Assumptions indicate insufficient grounding",
        "recommendation": "Add verification steps and source references"
    }
}
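
A pattern table like the one above could be applied to a session transcript with a short scan function. This is an illustrative sketch (case-insensitive matching is an assumption), not the hook's actual detection code.

```python
import re


def detect_anti_patterns(transcript, anti_patterns):
    """Count occurrences of each anti-pattern signal in a session transcript
    and report those meeting their threshold. Matching is case-insensitive."""
    detected = {}
    for name, spec in anti_patterns.items():
        hits = len(re.findall(spec["pattern"], transcript, flags=re.IGNORECASE))
        if hits >= spec["threshold"]:
            detected[name] = {
                "occurrences": hits,
                "recommendation": spec["recommendation"],
            }
    return detected
```

Feed it the concatenated session text; the returned dict maps each triggered anti-pattern to its hit count and recommendation.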

Intervention Strategies

For Excessive Retries:

## Learned: Common Retry Scenarios

When this skill is invoked and requires retry:
1. Check if context is complete (all required files read)
2. Verify tool permissions are available
3. Confirm scope is narrow enough for single execution

If retry needed:
- Document what was missing
- Add to skill's prerequisites list

For Context Confusion:

## Skill Boundaries (Learned)

This skill handles:
- [Specific scope learned from success]

This skill does NOT handle:
- [Learned from confusion cases]

If unclear, use the /which command to find the correct skill.

Skill Update Protocol

When to Update Skills

  1. Automatic Updates (via hook)

    • Add learned anti-patterns
    • Update effectiveness scores
    • Append error solutions
  2. Manual Updates (via PR)

    • Major scope changes
    • New examples from sessions
    • Structural improvements

Update Format

## Learned Improvements

**Last Updated:** [ISO timestamp]
**Sessions Analyzed:** [count]
**Effectiveness Score:** [X]%

### Success Patterns (from learnings)
- [Pattern that worked well]
- [Context that led to success]

### Failure Patterns (avoid these)
- [Pattern that caused failures]
- [Context that led to confusion]

### Common Errors & Solutions
1. **Error:** [error message]
**Solution:** [learned solution]

2. **Error:** [error message]
**Solution:** [learned solution]
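
An automatic update (Phase 1 of the protocol above) might render this section from a learnings record. A minimal sketch; the record keys (`sessions`, `score`, `success_patterns`, `failure_patterns`) are illustrative, not a fixed schema.

```python
from datetime import datetime, timezone


def render_learned_improvements(record):
    """Render a 'Learned Improvements' section, in the update format above,
    from a learnings record. Record keys here are hypothetical."""
    lines = [
        "## Learned Improvements",
        "",
        f"**Last Updated:** {datetime.now(timezone.utc).isoformat()}",
        f"**Sessions Analyzed:** {record['sessions']}",
        f"**Effectiveness Score:** {record['score']}%",
        "",
        "### Success Patterns (from learnings)",
        *[f"- {p}" for p in record.get("success_patterns", [])],
        "",
        "### Failure Patterns (avoid these)",
        *[f"- {p}" for p in record.get("failure_patterns", [])],
    ]
    return "\n".join(lines)
```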

Integration Points

With /cx and /cxq

# Capture session for retrospective
/cx # Saves to context.db

# Query past learnings
/cxq --recall "skill optimization patterns"

With MoE Task Execution

# During /orient, load skill baselines
/orient
# Step 1: Check skill learnings
# Step 2: Apply known optimizations
# Step 3: Begin task execution

With Session Hooks

# Configure in settings.json
{
  "hooks": {
    "PostSession": [
      {
        "command": "python3 hooks/session-retrospective.py --event session.end",
        "async": true,
        "timeout": 60
      }
    ]
  }
}

Quick Reference

Commands

| Command | Purpose |
|---------|---------|
| /optimize-skills | Real-time skill optimization |
| /retrospective | Run session retrospective |
| python3 hooks/session-retrospective.py --analyze-skills | View all skill learnings |
| python3 hooks/session-retrospective.py --optimize-now | Get improvement recommendations |

Files

| File | Purpose |
|------|---------|
| context-storage/skill-learnings.json | Accumulated skill learnings |
| context-storage/skill-patterns.json | Detected patterns |
| hooks/session-retrospective.py | Main retrospective hook |

Metrics to Track

| Metric | Target | Warning |
|--------|--------|---------|
| Overall Success Rate | >85% | <70% |
| Anti-patterns/Session | <2 | >5 |
| Skills Needing Work | <20% | >40% |
| Improvement Backlog | <10 | >25 |
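
The target/warning thresholds above can be encoded as a simple health check. A sketch only; the metric key names are assumptions, and "warning" is given precedence when both bounds are crossed.

```python
def check_metrics(metrics):
    """Compare session metrics against the target/warning thresholds above.

    Returns a dict of metric -> "ok" | "below target" | "warning".
    Rates are expressed as fractions (0.85 = 85%).
    """
    # (target check, warning check) per metric; warning takes precedence
    rules = {
        "success_rate":        (lambda v: v > 0.85, lambda v: v < 0.70),
        "anti_patterns":       (lambda v: v < 2,    lambda v: v > 5),
        "skills_needing_work": (lambda v: v < 0.20, lambda v: v > 0.40),
        "improvement_backlog": (lambda v: v < 10,   lambda v: v > 25),
    }
    status = {}
    for name, (target, warning) in rules.items():
        v = metrics[name]
        status[name] = "warning" if warning(v) else ("ok" if target(v) else "below target")
    return status
```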

Example Session

# Session Retrospective Example

## Session: 2025-01-01-1430
Duration: 45 minutes
Skills Invoked: 8

## Skill Performance

| Skill | Invocations | Success | Score |
|-------|-------------|---------|-------|
| git-workflow | 3 | 3 | 100% |
| code-editor | 5 | 4 | 80% |
| deployment | 2 | 1 | 50% |

## Anti-patterns Detected

1. **excessive_retries** (3 occurrences)
- Skill: deployment
- Cause: Missing prerequisite context
- Action: Add pre-flight checklist

## Improvements Generated

1. [HIGH] deployment skill needs work
- Add Kubernetes context verification
- Include rollback examples
- Document common GKE errors

2. [MEDIUM] code-editor partial failure
- Add multi-file coordination examples
- Include undo/rollback patterns

## Learnings Applied

- Updated skill-learnings.json
- Added 3 new error solutions
- Tracked 2 new anti-pattern occurrences

Best Practices

  1. Run Retrospective After Significant Sessions

    • 30+ minute sessions
    • Multiple skill invocations
    • Any failures or confusion
  2. Act on High-Priority Improvements

    • Fix skills with <70% success rate
    • Address recurring anti-patterns
    • Update frequently-used skills first
  3. Maintain Learning History

    • Don't delete old learnings
    • Track trends over time
    • Build institutional knowledge
  4. Close the Loop

    • Apply learnings at session start
    • Verify improvements in next session
    • Measure impact of changes

Success Output

When successful, this skill MUST output:

✅ SKILL COMPLETE: skill-improvement-tracker

Completed:
- [x] Session retrospective executed
- [x] Skill performance metrics calculated
- [x] Anti-patterns detected and categorized
- [x] Improvement recommendations generated
- [x] skill-learnings.json updated
- [x] Skill health dashboard reviewed

Outputs:
- context-storage/skill-learnings.json (updated)
- Retrospective report with skill scores
- List of skills needing improvement
- Actionable improvement recommendations

Metrics:
- Skills analyzed: X
- Overall success rate: Y%
- Anti-patterns detected: Z
- Improvements recommended: N

Completion Checklist

Before marking this skill as complete, verify:

  • Retrospective hook executed successfully
  • All skill invocations analyzed
  • Success/failure markers detected
  • Skill scores calculated (0-100%)
  • Anti-patterns categorized by type
  • skill-learnings.json file updated
  • Improvement recommendations generated
  • High-priority skills identified (<70% score)
  • Trends analyzed across multiple sessions
  • Next session baseline prepared

Failure Indicators

This skill has FAILED if:

  • ❌ Cannot access context database (context.db)
  • ❌ skill-learnings.json is corrupted or missing
  • ❌ No skill invocations detected in session
  • ❌ Anti-pattern detection produces false positives
  • ❌ Retrospective hook not configured properly
  • ❌ Skill scores calculation errors
  • ❌ Cannot write to skill-learnings.json
  • ❌ Circular dependencies in skill improvement logic
  • ❌ Memory/performance issues analyzing large sessions

When NOT to Use

Do NOT use this skill when:

  • Quick 1-2 turn conversations (insufficient data)
  • No skills were invoked during session
  • Session focused on non-skill tasks (chat, exploration)
  • Already in a retrospective/analysis session (avoid recursion)
  • Testing new skills before baseline exists
  • Alternative: Use manual skill review for one-off debugging

Use alternative approaches for:

  • Real-time skill optimization → /optimize-skills command
  • Individual skill debugging → Manual skill inspection
  • Performance profiling → Separate profiling tools
  • Quality assessment → Skill validation checklist

Anti-Patterns (Avoid)

| Anti-Pattern | Problem | Solution |
|--------------|---------|----------|
| Running on every session | Noise, wasted compute on trivial sessions | Only run on 30+ min sessions with multiple skills |
| Ignoring recommendations | Skills never improve, same errors repeat | Act on high-priority improvements within 1 week |
| Over-optimizing metrics | Focus on scores instead of real improvement | Balance quantitative scores with qualitative feedback |
| Deleting old learnings | Lose trend data, can't track improvement | Archive old data, never delete |
| Manual skill updates without retrospective | Changes not data-driven, may worsen performance | Always run retrospective before major skill changes |
| Recursive analysis | Skill analyzing itself causes loops | Exclude skill-improvement-tracker from its own analysis |
| No verification loop | Changes not validated in next session | Always measure impact of improvements |
| Treating all anti-patterns equally | Miss critical vs minor issues | Prioritize by severity and frequency |

Principles

This skill embodies these CODITECT principles:

  • #10 Iterative Refinement - Continuous skill improvement through data
  • #6 Clear, Understandable, Explainable - Transparent reasoning chains for improvements
  • #8 No Assumptions - Data-driven decisions, not hunches
  • #5 Eliminate Ambiguity - Explicit success/failure markers
  • Automation - Autonomous learning loop with minimal human intervention
  • Compound Intelligence - Every session makes future sessions better


Status: Production-ready
Integration: Hooks, Commands, Skills
Automation Level: Semi-autonomous (human review for skill updates)