# Skill Improvement Tracker

Continuously track, analyze, and improve skills based on real session outcomes. This skill implements the Continual Learning Loop, in which every session compounds into better future skills.
## When to Use
Use this skill when:
- Ending a significant work session (run retrospective)
- A skill invocation failed or underperformed
- During optimization sprints or quality reviews
- Before major deployments (validate skill health)
- After noticing repeated issues with a skill
**Don't use when:**
- Quick 1-2 turn interactions
- Non-skill-related tasks
- Already in a retrospective session
## Core Concepts

### The Continual Learning Loop
```
                CONTINUAL LEARNING LOOP

SESSION START
        │
        ▼
┌───────────────┐
│ Load Baseline │◀──────────────────────────────┐
│   Learnings   │                               │
└───────┬───────┘                               │
        │                                       │
        ▼                                       │
┌───────────────┐                               │
│  Apply Known  │                               │
│ Optimizations │                               │
└───────┬───────┘                               │
        │                                       │
        ▼                                       │
┌───────────────┐                               │
│  Track Skill  │                               │
│  Invocations  │                               │
└───────┬───────┘                               │
        │                                       │
        ▼                                       │
┌───────────────┐                               │
│    Detect     │──▶ Real-time alerts           │
│   Patterns    │                               │
└───────┬───────┘                               │
        │                                       │
  SESSION END                                   │
        │                                       │
        ▼                                       │
┌───────────────┐                               │
│      Run      │                               │
│ Retrospective │                               │
└───────┬───────┘                               │
        │                                       │
        ▼                                       │
┌───────────────┐      ┌──────────────────────┐ │
│   Generate    │─────▶│        Update        │─┘
│ Improvements  │      │ skill-learnings.json │
└───────────────┘      └──────────────────────┘
```
### Skill Effectiveness Scoring
Skills are scored based on invocation outcomes:
| Metric | Weight | Description |
|---|---|---|
| Success Rate | 40% | % of invocations completing successfully |
| Error Rate | 30% | % of invocations with errors |
| Retry Rate | 20% | % requiring multiple attempts |
| User Satisfaction | 10% | Explicit feedback (if available) |
**Score Thresholds:**

- **90-100%**: Excellent - maintain current state
- **70-89%**: Good - minor optimizations
- **50-69%**: Needs Work - improve triggers and examples
- **<50%**: Critical - major revision required
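As a concrete illustration, the weighted score can be computed like this (a minimal sketch; the function names and rate inputs are illustrative, not the retrospective hook's actual API):

```python
def effectiveness_score(success_rate, error_rate, retry_rate, satisfaction=1.0):
    """Combine invocation outcomes into a 0-100 effectiveness score.

    All rates are fractions in [0, 1]. Error and retry rates count
    against the score, so they are inverted before weighting.
    """
    score = (0.40 * success_rate
             + 0.30 * (1 - error_rate)
             + 0.20 * (1 - retry_rate)
             + 0.10 * satisfaction)
    return round(score * 100, 1)

def score_band(score):
    """Map a 0-100 score to the threshold bands listed above."""
    if score >= 90:
        return "Excellent"
    if score >= 70:
        return "Good"
    if score >= 50:
        return "Needs Work"
    return "Critical"

# 80% success, 20% errors, 40% retries, neutral satisfaction:
# 0.4*0.8 + 0.3*0.8 + 0.2*0.6 + 0.1*1.0 = 0.78 -> 78.0, "Good"
```

With these weights, a skill that always succeeds but always retries still loses 20 points, which matches the intent of penalizing unclear instructions.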
## Learning Loop Workflow

### Phase 1: Session Start - Load Baselines

At session start, load accumulated learnings:

```bash
# Check skill health before starting
python3 hooks/session-retrospective.py --analyze-skills --json
```
**What to Load:**

- `skill-learnings.json` - Historical success/failure patterns
- `skill-patterns.json` - Known anti-patterns to avoid
- Recent session summaries - Context from recent work
**Apply Optimizations:**
- If a skill has known failure modes, add explicit warnings
- If a skill has declining performance, suggest alternatives
- If a skill has recurring errors, include workarounds
### Phase 2: Real-Time Tracking

During the session, track skill invocations:

**Signals to Monitor:**

- Skill invocation commands (`/skill-name`, "Use X skill")
- Success indicators (checkmarks, "completed", "done")
- Failure indicators (errors, retries, "failed")
- User frustration signals (restarts, clarifications)
**Anti-Pattern Detection:**
| Anti-Pattern | Signal | Intervention |
|---|---|---|
| Excessive Retries | 3+ attempts | Suggest narrower scope |
| Context Confusion | Misunderstood task | Clarify skill boundaries |
| Tool Misuse | Wrong tool selected | Review trigger criteria |
| Incomplete Output | Missing elements | Add completion checklist |
| Hallucination Risk | Ungrounded claims | Add verification steps |
### Phase 3: Session End - Retrospective

At session end, run the full retrospective:

```bash
# Automatic retrospective
python3 hooks/session-retrospective.py --event session.end

# Manual with specific session
python3 hooks/session-retrospective.py --manual --session-id SESSION_ID
```
**Retrospective Outputs:**

- **Skill Scores** - Effectiveness rating per skill
- **Anti-patterns** - Detected problematic patterns
- **Improvements** - Specific recommendations
- **Learnings Update** - Updates to `skill-learnings.json`
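To make the learnings update concrete, here is a sketch of merging one retrospective result into `skill-learnings.json`. The schema shown (a per-skill score history plus an error-to-solution map) is an assumption for illustration; the hook's real file layout may differ.

```python
import json
import tempfile
from pathlib import Path

def update_learnings(path, skill, score, error_solutions):
    """Merge one retrospective result into a learnings file.

    NOTE: illustrative schema, not the hook's documented format.
    """
    path = Path(path)
    data = json.loads(path.read_text()) if path.exists() else {}
    entry = data.setdefault(skill, {"scores": [], "error_solutions": {}})
    entry["scores"].append(score)                 # keep history; never delete
    entry["error_solutions"].update(error_solutions)
    path.write_text(json.dumps(data, indent=2))
    return data

# Example: record an 80% run for a hypothetical "code-editor" skill
with tempfile.TemporaryDirectory() as tmp:
    learnings = update_learnings(
        Path(tmp) / "skill-learnings.json",
        "code-editor", 80,
        {"stale buffer": "re-read the file before editing"},
    )
```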
### Phase 4: Apply Improvements

For skills needing improvement:

```bash
# Generate improvement recommendations
python3 hooks/session-retrospective.py --optimize-now

# View specific skill learnings
python3 hooks/session-retrospective.py --analyze-skills --skill SKILL_NAME
```
**Improvement Actions:**

1. **Update "When to Use" Section**
   - Add explicit triggers that worked
   - Add explicit anti-triggers for failures
   - Narrow scope if confusion occurred
2. **Add Examples**
   - Working examples from successful invocations
   - Error cases with solutions
   - Edge cases discovered
3. **Add Troubleshooting**
   - Common errors and solutions
   - Known limitations
   - Workarounds
4. **Update Metadata**
   - Adjust token estimates
   - Update confidence scores
   - Add learned keywords
## Anti-Pattern Detection

### Detection Patterns

```python
anti_patterns = {
    "excessive_retries": {
        "pattern": r"(?:retry|again|once more|try again)",
        "threshold": 3,
        "description": "Multiple retries indicate unclear instructions",
        "recommendation": "Add clearer examples and edge case handling"
    },
    "context_confusion": {
        "pattern": r"(?:confused|unclear|ambiguous|misunderstood)",
        "threshold": 1,
        "description": "Context confusion indicates poor skill scoping",
        "recommendation": "Narrow skill scope and add explicit boundaries"
    },
    "tool_misuse": {
        "pattern": r"(?:wrong tool|incorrect tool|should have used)",
        "threshold": 1,
        "description": "Tool selection issues indicate weak skill triggers",
        "recommendation": "Improve 'When to Use' section with clearer triggers"
    },
    "incomplete_output": {
        "pattern": r"(?:incomplete|partial|missing|forgot)",
        "threshold": 2,
        "description": "Incomplete outputs indicate missing checklist items",
        "recommendation": "Add completion checklist to skill"
    },
    "hallucination_risk": {
        "pattern": r"(?:assumed|guessed|made up|invented)",
        "threshold": 1,
        "description": "Assumptions indicate insufficient grounding",
        "recommendation": "Add verification steps and source references"
    }
}
```
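A minimal sketch of how these patterns could be applied to a session transcript (the `detect_anti_patterns` function and the inlined two-entry dict are illustrative, not the hook's internals):

```python
import re

# Two of the patterns above, inlined so the example is self-contained.
ANTI_PATTERNS = {
    "excessive_retries": {"pattern": r"(?:retry|again|once more|try again)",
                          "threshold": 3},
    "hallucination_risk": {"pattern": r"(?:assumed|guessed|made up|invented)",
                           "threshold": 1},
}

def detect_anti_patterns(transcript):
    """Return {name: count} for patterns whose signal count meets the threshold."""
    hits = {}
    for name, spec in ANTI_PATTERNS.items():
        count = len(re.findall(spec["pattern"], transcript, re.IGNORECASE))
        if count >= spec["threshold"]:
            hits[name] = count
    return hits

log = "Let me retry. Trying again... I'll try again once more. I assumed the path."
hits = detect_anti_patterns(log)  # both thresholds are met in this transcript
```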
### Intervention Strategies

**For Excessive Retries:**

```markdown
## Learned: Common Retry Scenarios

When this skill is invoked and requires retry:

1. Check if context is complete (all required files read)
2. Verify tool permissions are available
3. Confirm scope is narrow enough for single execution

If retry needed:

- Document what was missing
- Add to skill's prerequisites list
```
**For Context Confusion:**

```markdown
## Skill Boundaries (Learned)

This skill handles:

- [Specific scope learned from success]

This skill does NOT handle:

- [Learned from confusion cases]

If unclear, use the /which command to find the correct skill.
```
## Skill Update Protocol

### When to Update Skills

1. **Automatic Updates** (via hook)
   - Add learned anti-patterns
   - Update effectiveness scores
   - Append error solutions
2. **Manual Updates** (via PR)
   - Major scope changes
   - New examples from sessions
   - Structural improvements
### Update Format

```markdown
## Learned Improvements

**Last Updated:** [ISO timestamp]
**Sessions Analyzed:** [count]
**Effectiveness Score:** [X]%

### Success Patterns (from learnings)

- [Pattern that worked well]
- [Context that led to success]

### Failure Patterns (avoid these)

- [Pattern that caused failures]
- [Context that led to confusion]

### Common Errors & Solutions

1. **Error:** [error message]
   **Solution:** [learned solution]
2. **Error:** [error message]
   **Solution:** [learned solution]
```
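For automatic updates, a block like this can be rendered from learnings data; a hedged sketch (the function and field names are illustrative, and only part of the template is shown):

```python
from datetime import datetime, timezone

# Abridged version of the update-format template above.
TEMPLATE = """## Learned Improvements

**Last Updated:** {ts}
**Sessions Analyzed:** {sessions}
**Effectiveness Score:** {score}%

### Common Errors & Solutions

{errors}
"""

def render_update(sessions, score, error_solutions):
    """Fill the update template; error_solutions maps error -> learned fix."""
    errors = "\n".join(
        f"{i}. **Error:** {err}\n   **Solution:** {fix}"
        for i, (err, fix) in enumerate(error_solutions.items(), 1)
    )
    return TEMPLATE.format(
        ts=datetime.now(timezone.utc).isoformat(timespec="seconds"),
        sessions=sessions,
        score=score,
        errors=errors,
    )

block = render_update(12, 84, {"timeout on push": "increase retry backoff"})
```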
## Integration Points

### With `/cx` and `/cxq`

```bash
# Capture session for retrospective
/cx          # Saves to context.db

# Query past learnings
/cxq --recall "skill optimization patterns"
```
### With MoE Task Execution

```bash
# During /orient, load skill baselines
/orient

# Step 1: Check skill learnings
# Step 2: Apply known optimizations
# Step 3: Begin task execution
```
### With Session Hooks

Configure in `settings.json`:

```json
{
  "hooks": {
    "PostSession": [
      {
        "command": "python3 hooks/session-retrospective.py --event session.end",
        "async": true,
        "timeout": 60
      }
    ]
  }
}
```
## Quick Reference

### Commands

| Command | Purpose |
|---|---|
| `/optimize-skills` | Real-time skill optimization |
| `/retrospective` | Run session retrospective |
| `python3 hooks/session-retrospective.py --analyze-skills` | View all skill learnings |
| `python3 hooks/session-retrospective.py --optimize-now` | Get improvement recommendations |
### Files

| File | Purpose |
|---|---|
| `context-storage/skill-learnings.json` | Accumulated skill learnings |
| `context-storage/skill-patterns.json` | Detected patterns |
| `hooks/session-retrospective.py` | Main retrospective hook |
### Metrics to Track
| Metric | Target | Warning |
|---|---|---|
| Overall Success Rate | >85% | <70% |
| Anti-patterns/Session | <2 | >5 |
| Skills Needing Work | <20% | >40% |
| Improvement Backlog | <10 | >25 |
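The warning column can be checked mechanically at retrospective time; a small sketch (the metric names and value conventions are assumptions):

```python
# Warning thresholds from the table above; metric names are illustrative.
WARNINGS = {
    "overall_success_rate": lambda v: v < 0.70,      # warn below 70%
    "anti_patterns_per_session": lambda v: v > 5,
    "skills_needing_work_pct": lambda v: v > 0.40,
    "improvement_backlog": lambda v: v > 25,
}

def health_warnings(metrics):
    """Return the names of tracked metrics that are in their warning range."""
    return [name for name, in_warning in WARNINGS.items()
            if name in metrics and in_warning(metrics[name])]

warnings = health_warnings({
    "overall_success_rate": 0.65,    # below the 70% warning floor
    "anti_patterns_per_session": 2,  # within target
    "improvement_backlog": 30,       # over the 25-item warning level
})
```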
## Example Session

```markdown
# Session Retrospective Example

## Session: 2025-01-01-1430

Duration: 45 minutes
Skills Invoked: 8

## Skill Performance

| Skill | Invocations | Success | Score |
|-------|-------------|---------|-------|
| git-workflow | 3 | 3 | 100% |
| code-editor | 5 | 4 | 80% |
| deployment | 2 | 1 | 50% |

## Anti-patterns Detected

1. **excessive_retries** (3 occurrences)
   - Skill: deployment
   - Cause: Missing prerequisite context
   - Action: Add pre-flight checklist

## Improvements Generated

1. [HIGH] deployment skill needs work
   - Add Kubernetes context verification
   - Include rollback examples
   - Document common GKE errors
2. [MEDIUM] code-editor partial failure
   - Add multi-file coordination examples
   - Include undo/rollback patterns

## Learnings Applied

- Updated skill-learnings.json
- Added 3 new error solutions
- Tracked 2 new anti-pattern occurrences
```
## Best Practices

1. **Run Retrospective After Significant Sessions**
   - 30+ minute sessions
   - Multiple skill invocations
   - Any failures or confusion
2. **Act on High-Priority Improvements**
   - Fix skills with <70% success rate
   - Address recurring anti-patterns
   - Update frequently-used skills first
3. **Maintain Learning History**
   - Don't delete old learnings
   - Track trends over time
   - Build institutional knowledge
4. **Close the Loop**
   - Apply learnings at session start
   - Verify improvements in next session
   - Measure impact of changes
## Success Output

When successful, this skill MUST output:

```
✅ SKILL COMPLETE: skill-improvement-tracker

Completed:
- [x] Session retrospective executed
- [x] Skill performance metrics calculated
- [x] Anti-patterns detected and categorized
- [x] Improvement recommendations generated
- [x] skill-learnings.json updated
- [x] Skill health dashboard reviewed

Outputs:
- context-storage/skill-learnings.json (updated)
- Retrospective report with skill scores
- List of skills needing improvement
- Actionable improvement recommendations

Metrics:
- Skills analyzed: X
- Overall success rate: Y%
- Anti-patterns detected: Z
- Improvements recommended: N
```
## Completion Checklist

Before marking this skill as complete, verify:

- [ ] Retrospective hook executed successfully
- [ ] All skill invocations analyzed
- [ ] Success/failure markers detected
- [ ] Skill scores calculated (0-100%)
- [ ] Anti-patterns categorized by type
- [ ] skill-learnings.json file updated
- [ ] Improvement recommendations generated
- [ ] High-priority skills identified (<70% score)
- [ ] Trends analyzed across multiple sessions
- [ ] Next session baseline prepared
## Failure Indicators
This skill has FAILED if:
- ❌ Cannot access context database (context.db)
- ❌ skill-learnings.json is corrupted or missing
- ❌ No skill invocations detected in session
- ❌ Anti-pattern detection produces false positives
- ❌ Retrospective hook not configured properly
- ❌ Skill scores calculation errors
- ❌ Cannot write to skill-learnings.json
- ❌ Circular dependencies in skill improvement logic
- ❌ Memory/performance issues analyzing large sessions
## When NOT to Use
Do NOT use this skill when:
- Quick 1-2 turn conversations (insufficient data)
- No skills were invoked during session
- Session focused on non-skill tasks (chat, exploration)
- Already in a retrospective/analysis session (avoid recursion)
- Testing new skills before baseline exists
- **Alternative:** Use manual skill review for one-off debugging

Use alternative approaches for:

- Real-time skill optimization → `/optimize-skills` command
- Individual skill debugging → Manual skill inspection
- Performance profiling → Separate profiling tools
- Quality assessment → Skill validation checklist
## Anti-Patterns (Avoid)
| Anti-Pattern | Problem | Solution |
|---|---|---|
| Running on every session | Noise, wasted compute on trivial sessions | Only run on 30+ min sessions with multiple skills |
| Ignoring recommendations | Skills never improve, same errors repeat | Act on high-priority improvements within 1 week |
| Over-optimizing metrics | Focus on scores instead of real improvement | Balance quantitative scores with qualitative feedback |
| Deleting old learnings | Lose trend data, can't track improvement | Archive old data, never delete |
| Manual skill updates without retrospective | Changes not data-driven, may worsen performance | Always run retrospective before major skill changes |
| Recursive analysis | Skill analyzing itself causes loops | Exclude skill-improvement-tracker from its own analysis |
| No verification loop | Changes not validated in next session | Always measure impact of improvements |
| Treating all anti-patterns equally | Miss critical vs minor issues | Prioritize by severity and frequency |
## Principles
This skill embodies these CODITECT principles:
- #10 Iterative Refinement - Continuous skill improvement through data
- #6 Clear, Understandable, Explainable - Transparent reasoning chains for improvements
- #8 No Assumptions - Data-driven decisions, not hunches
- #5 Eliminate Ambiguity - Explicit success/failure markers
- Automation - Autonomous learning loop with minimal human intervention
- Compound Intelligence - Every session makes future sessions better
Related Standards:
**Status:** Production-ready
**Integration:** Hooks, Commands, Skills
**Automation Level:** Semi-autonomous (human review for skill updates)