# Agent Skills Framework Extension

## Debugging Patterns Skill
### When to Use This Skill

Use this skill when applying systematic debugging patterns in your codebase: systematic debugging strategies, root cause analysis, and production-grade troubleshooting.

### How to Use This Skill

- Review the patterns and examples below
- Apply the relevant patterns to your implementation
- Follow the best practices outlined in this skill
## Core Capabilities
- Root Cause Analysis - Systematic problem isolation and diagnosis
- Performance Profiling - CPU, memory, I/O bottleneck identification
- Memory Debugging - Leak detection, use-after-free, buffer overflows
- Distributed Debugging - Tracing, correlation, causality analysis
- Production Debugging - Live system troubleshooting, minimal impact
## Systematic Debugging Framework
```python
# debugging/systematic_approach.py
"""
Structured debugging methodology
"""
from dataclasses import dataclass
from enum import Enum
from typing import Any, Dict, List, Optional


class DebugPhase(Enum):
    REPRODUCE = "reproduce"
    ISOLATE = "isolate"
    ANALYZE = "analyze"
    FIX = "fix"
    VERIFY = "verify"


@dataclass
class DebugHypothesis:
    """Testable hypothesis about the bug's cause"""
    description: str
    likelihood: float  # 0.0 to 1.0
    test_procedure: str
    expected_outcome: str
    actual_outcome: Optional[str] = None
    validated: Optional[bool] = None


@dataclass
class DebugSession:
    """Track debugging session state"""
    issue_description: str
    reproduction_steps: List[str]
    phase: DebugPhase
    hypotheses: List[DebugHypothesis]
    observations: List[str]
    fix_applied: Optional[str] = None


class SystematicDebugger:
    """Framework for systematic debugging"""

    def __init__(self):
        self.session: Optional[DebugSession] = None

    def start_session(self, issue: str) -> DebugSession:
        """Initialize a debugging session"""
        self.session = DebugSession(
            issue_description=issue,
            reproduction_steps=[],
            phase=DebugPhase.REPRODUCE,
            hypotheses=[],
            observations=[],
        )
        return self.session

    def reproduce(self, steps: List[str]) -> bool:
        """Attempt to reproduce the issue"""
        self.session.reproduction_steps = steps
        self.session.phase = DebugPhase.ISOLATE
        print("Reproduction steps:")
        for i, step in enumerate(steps, 1):
            print(f"  {i}. {step}")
        # Execute the reproduction; return True if successfully reproduced
        return True

    def isolate_component(self) -> List[str]:
        """Narrow down which component is failing"""
        self.session.phase = DebugPhase.ISOLATE
        # Binary search through system components:
        # divide and conquer to find the failing subsystem
        isolation_strategies = [
            "Remove external dependencies",
            "Run with minimal configuration",
            "Test individual modules in isolation",
            "Check with known-good data",
            "Bisect git history (git bisect)",
        ]
        return isolation_strategies

    def form_hypotheses(self, observations: List[str]) -> List[DebugHypothesis]:
        """Generate testable hypotheses"""
        self.session.observations.extend(observations)
        self.session.phase = DebugPhase.ANALYZE

        hypotheses = []
        # Based on the observations, form hypotheses.
        # Example: if seeing a null pointer error:
        hypotheses.append(DebugHypothesis(
            description="Uninitialized variable accessed",
            likelihood=0.7,
            test_procedure="Add null checks, examine initialization",
            expected_outcome="Crash at initialization site",
        ))
        self.session.hypotheses.extend(hypotheses)
        return hypotheses

    def test_hypothesis(self, hypothesis: DebugHypothesis) -> bool:
        """Execute a hypothesis test"""
        print(f"Testing: {hypothesis.description}")
        print(f"Procedure: {hypothesis.test_procedure}")
        # Execute the test, record the actual outcome,
        # then validate or reject the hypothesis
        return True

    def apply_fix(self, fix_description: str, code_changes: str):
        """Apply and verify the fix"""
        self.session.phase = DebugPhase.FIX
        self.session.fix_applied = fix_description
        print(f"Applying fix: {fix_description}")
        print(f"Changes:\n{code_changes}")

    def verify_fix(self) -> bool:
        """Verify the fix resolves the issue"""
        self.session.phase = DebugPhase.VERIFY
        # Re-run the reproduction steps: the issue should no longer reproduce.
        # Run regression tests to ensure no new issues were introduced.
        return True

    def generate_report(self) -> Dict[str, Any]:
        """Generate a debugging session report"""
        return {
            'issue': self.session.issue_description,
            'reproduction_steps': self.session.reproduction_steps,
            'observations': self.session.observations,
            'hypotheses': [
                {
                    'description': h.description,
                    'validated': h.validated,
                    'likelihood': h.likelihood,
                }
                for h in self.session.hypotheses
            ],
            'fix': self.session.fix_applied,
        }
```
## Performance Profiling
```rust
// src/profiling/performance.rs
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// CPU profiling with flame graph generation
pub struct Profiler {
    samples: Vec<StackSample>,
    sampling_rate: Duration,
}

struct StackSample {
    timestamp: Instant,
    stack_trace: Vec<String>,
}

impl Profiler {
    pub fn new(sampling_rate_ms: u64) -> Self {
        Self {
            samples: Vec::new(),
            sampling_rate: Duration::from_millis(sampling_rate_ms),
        }
    }

    pub fn start_sampling(&mut self) {
        // Start a background thread to sample stack traces.
        // Use the `backtrace` crate for stack unwinding.
    }

    pub fn stop_and_analyze(&self) -> ProfilingReport {
        // Aggregate samples per function
        let mut function_time: HashMap<String, usize> = HashMap::new();
        for sample in &self.samples {
            for func in &sample.stack_trace {
                *function_time.entry(func.clone()).or_insert(0) += 1;
            }
        }

        // Convert counts to percentages of total samples
        let total_samples = self.samples.len() as f64;
        let mut hot_spots: Vec<_> = function_time
            .into_iter()
            .map(|(func, count)| {
                let percentage = (count as f64 / total_samples) * 100.0;
                (func, percentage)
            })
            .collect();
        hot_spots.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());

        ProfilingReport {
            total_samples: self.samples.len(),
            hot_spots,
        }
    }
}

pub struct ProfilingReport {
    total_samples: usize,
    hot_spots: Vec<(String, f64)>,
}

impl ProfilingReport {
    pub fn print(&self) {
        println!("=== Profiling Report ===");
        println!("Total samples: {}", self.total_samples);
        println!("\nTop 10 hot spots:");
        for (i, (func, pct)) in self.hot_spots.iter().take(10).enumerate() {
            println!("  {}. {:<50} {:>6.2}%", i + 1, func, pct);
        }
    }
}

/// Memory leak detection
pub struct MemoryTracker {
    allocations: HashMap<usize, AllocationInfo>,
}

struct AllocationInfo {
    size: usize,
    stack_trace: Vec<String>,
    timestamp: Instant,
}

impl MemoryTracker {
    pub fn new() -> Self {
        Self {
            allocations: HashMap::new(),
        }
    }

    pub fn track_allocation(&mut self, ptr: usize, size: usize) {
        let stack_trace = vec!["TODO: capture backtrace".to_string()];
        self.allocations.insert(ptr, AllocationInfo {
            size,
            stack_trace,
            timestamp: Instant::now(),
        });
    }

    pub fn track_deallocation(&mut self, ptr: usize) {
        self.allocations.remove(&ptr);
    }

    pub fn report_leaks(&self) -> Vec<String> {
        let mut leaks = Vec::new();
        for (ptr, info) in &self.allocations {
            let age = info.timestamp.elapsed();
            // Report allocations older than the threshold
            if age > Duration::from_secs(300) { // 5 minutes
                leaks.push(format!(
                    "Leak: {:?} - {} bytes - age: {:?}",
                    ptr, info.size, age
                ));
            }
        }
        leaks
    }
}

// Benchmark macro for performance testing
#[macro_export]
macro_rules! benchmark {
    ($name:expr, $code:block) => {{
        let start = std::time::Instant::now();
        let result = $code;
        let duration = start.elapsed();
        println!("{}: {:?}", $name, duration);
        result
    }};
}
```
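The aggregation in `stop_and_analyze` — count how often each function appears across sampled stacks, then convert counts to percentages — is easy to replicate for offline analysis of captured samples. A minimal Python sketch (the sample data is hypothetical):

```python
from collections import Counter
from typing import List, Tuple


def hot_spots(samples: List[List[str]]) -> List[Tuple[str, float]]:
    """Aggregate sampled stack traces into per-function sample percentages.

    Each sample is the list of function names on the stack at that instant.
    Like the Rust version, every frame occurrence is counted.
    """
    counts = Counter(func for stack in samples for func in stack)
    total = len(samples)
    return sorted(
        ((func, 100.0 * n / total) for func, n in counts.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )


samples = [
    ["main", "parse"],
    ["main", "parse", "tokenize"],
    ["main", "render"],
    ["main", "parse"],
]
top = hot_spots(samples)
```

Because counting is per frame occurrence, a function that appears in every sampled stack (such as `main` here) reports 100%, which is the inclusive time a flame graph would show for the root frame.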
## Production Debugging Tools
```bash
#!/bin/bash
# scripts/production_debug.sh
# Safe production debugging toolkit

set -euo pipefail

# Configuration
NAMESPACE="production"
POD_SELECTOR="app=myapp"
LOG_LINES=1000

print_section() {
    echo "======================================"
    echo "$1"
    echo "======================================"
}

# 1. Health Check
health_check() {
    print_section "Health Check"
    kubectl get pods -n "$NAMESPACE" -l "$POD_SELECTOR"
    # Check pod status
    kubectl get pods -n "$NAMESPACE" -l "$POD_SELECTOR" \
        -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.phase}{"\n"}{end}'
}

# 2. Resource Usage
resource_usage() {
    print_section "Resource Usage"
    # Live usage comes from the metrics API
    kubectl top pods -n "$NAMESPACE" -l "$POD_SELECTOR"
    # Configured requests (live usage is not part of pod status)
    kubectl get pods -n "$NAMESPACE" -l "$POD_SELECTOR" \
        -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}CPU req: {.spec.containers[0].resources.requests.cpu}{"\t"}Memory req: {.spec.containers[0].resources.requests.memory}{"\n"}{end}'
}

# 3. Recent Logs
recent_logs() {
    print_section "Recent Logs"
    POD=$(kubectl get pods -n "$NAMESPACE" -l "$POD_SELECTOR" \
        -o jsonpath='{.items[0].metadata.name}')
    kubectl logs -n "$NAMESPACE" "$POD" --tail="$LOG_LINES"
    # Error analysis (grep may match nothing; don't abort under pipefail)
    echo -e "\nError Summary:"
    kubectl logs -n "$NAMESPACE" "$POD" --tail="$LOG_LINES" | \
        { grep -i error || true; } | sort | uniq -c | sort -rn | head -10
}

# 4. Application Metrics
app_metrics() {
    print_section "Application Metrics"
    POD=$(kubectl get pods -n "$NAMESPACE" -l "$POD_SELECTOR" \
        -o jsonpath='{.items[0].metadata.name}')
    # Scrape the /metrics endpoint
    kubectl exec -n "$NAMESPACE" "$POD" -- \
        curl -s http://localhost:8080/metrics
}

# 5. Network Connectivity
network_check() {
    print_section "Network Connectivity"
    POD=$(kubectl get pods -n "$NAMESPACE" -l "$POD_SELECTOR" \
        -o jsonpath='{.items[0].metadata.name}')
    # Check DNS
    kubectl exec -n "$NAMESPACE" "$POD" -- nslookup google.com
    # Check service endpoints
    kubectl get endpoints -n "$NAMESPACE"
}

# 6. Configuration Verification
config_check() {
    print_section "Configuration Check"
    # Check ConfigMaps
    kubectl get configmaps -n "$NAMESPACE"
    # Check Secrets (names only, not values)
    kubectl get secrets -n "$NAMESPACE"
}

# 7. Event Log
events() {
    print_section "Recent Events"
    kubectl get events -n "$NAMESPACE" --sort-by='.lastTimestamp' | tail -20
}

# Main execution
main() {
    health_check
    resource_usage
    recent_logs
    app_metrics
    network_check
    config_check
    events
}

main "$@"
```
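The error-summary pipeline in `recent_logs` (`grep -i error | sort | uniq -c | sort -rn | head`) is also useful offline, against a log file pulled from the cluster. A minimal Python equivalent (the sample log is hypothetical):

```python
from collections import Counter
from typing import List, Tuple


def error_summary(log_text: str, top: int = 10) -> List[Tuple[str, int]]:
    """Count identical error lines and return the most frequent first,
    mirroring `grep -i error | sort | uniq -c | sort -rn | head`."""
    errors = [line for line in log_text.splitlines() if "error" in line.lower()]
    return Counter(errors).most_common(top)


log = """\
INFO request served
ERROR db timeout
ERROR db timeout
WARN slow query
ERROR cache miss storm
"""
summary = error_summary(log)
```

Grouping identical lines surfaces the dominant failure mode quickly; for logs with variable fields (timestamps, request ids), normalize those fields first or the counts will fragment.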
## Distributed Tracing Debug
```rust
// src/tracing/distributed_debug.rs
use opentelemetry::global;
use tracing::{info, info_span, instrument};
use uuid::Uuid;

/// Distributed request tracing for debugging.
/// Spans are exported through an OpenTelemetry-backed subscriber
/// (e.g. a tracing-opentelemetry layer installed at startup).
pub struct DistributedDebugger {
    tracer: global::BoxedTracer,
}

impl DistributedDebugger {
    #[instrument(skip(self))]
    pub async fn trace_request(
        &self,
        request_id: Uuid,
        service_name: &str,
    ) -> Result<(), String> {
        // Create a span for this operation.
        // Note: holding an entered span guard across .await is discouraged;
        // prefer `Future::instrument(span)` in real async code.
        let span = info_span!(
            "process_request",
            request_id = %request_id,
            service = %service_name
        );
        let _guard = span.enter();

        info!("Processing request in {}", service_name);

        // Simulate work against downstream dependencies
        self.call_dependency("database", request_id).await?;
        self.call_dependency("cache", request_id).await?;

        info!("Request completed successfully");
        Ok(())
    }

    #[instrument(skip(self))]
    async fn call_dependency(
        &self,
        dependency: &str,
        request_id: Uuid,
    ) -> Result<(), String> {
        info!("Calling dependency: {}", dependency);
        // Propagate trace context to the downstream service
        // via HTTP headers (W3C Trace Context)
        Ok(())
    }
}
```
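`call_dependency` notes that trace context is propagated to downstream services via W3C Trace Context headers. A minimal sketch of building and parsing the `traceparent` header that carries that context (`version-trace_id-parent_id-flags`):

```python
import re
import secrets
from typing import Optional, Tuple


def make_traceparent(sampled: bool = True) -> str:
    """Build a W3C Trace Context `traceparent` header:
    version (2 hex) - trace_id (32 hex) - parent_id (16 hex) - flags (2 hex)."""
    trace_id = secrets.token_hex(16)   # 16 random bytes -> 32 hex chars
    parent_id = secrets.token_hex(8)   # 8 random bytes -> 16 hex chars
    flags = "01" if sampled else "00"  # bit 0 = sampled
    return f"00-{trace_id}-{parent_id}-{flags}"


def parse_traceparent(header: str) -> Optional[Tuple[str, str, str, str]]:
    """Split a traceparent header into its fields; None if malformed."""
    m = re.fullmatch(
        r"([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})", header
    )
    return m.groups() if m else None


hdr = make_traceparent()
version, trace_id, parent_id, flags = parse_traceparent(hdr)
```

In practice each service attaches this header to outbound requests and continues the trace from the incoming one, which is what lets correlated spans from different services be stitched into a single causal timeline.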
## Usage Examples

### Systematic Debugging

Apply the debugging-patterns skill to debug a production memory leak using a systematic approach.

### Performance Profiling

Apply the debugging-patterns skill to profile CPU hot spots and generate a flame graph.

### Production Debugging

Apply the debugging-patterns skill to diagnose a production issue with minimal system impact.
## Integration Points
- error-debugging-patterns - Error analysis
- rust-development-patterns - Rust-specific debugging
- rust-qa-patterns - Testing integration
## Success Output
When successful, this skill MUST output:
✅ SKILL COMPLETE: debugging-patterns
Completed:
- [x] Issue reproduced with documented steps
- [x] Component isolation completed (failing subsystem identified)
- [x] Hypotheses formed and tested systematically
- [x] Root cause identified with evidence
- [x] Fix applied and verified with regression tests
- [x] Debugging session report generated
Outputs:
- debug-session-report.json (issue, hypotheses, observations, fix)
- reproduction-steps.md (exact steps to reproduce issue)
- profiling-report.txt (performance hot spots if applicable)
- memory-leak-report.txt (allocation tracking if applicable)
Debug Session Summary:
Issue: [Description of original issue]
Root Cause: [Identified root cause with evidence]
Fix Applied: [Description of fix with code changes]
Verification: [Test results showing fix resolves issue]
Performance Metrics (if applicable):
- Top hot spot: [Function name] - XX.X% CPU time
- Memory leak: [Allocation site] - XXX bytes leaked
## Completion Checklist
Before marking this skill as complete, verify:
- Issue successfully reproduced with documented steps
- Reproduction steps run consistently (not intermittent)
- Component isolation performed (binary search through subsystems)
- Hypotheses documented with likelihood estimates
- At least one hypothesis tested and validated/rejected
- Root cause identified with concrete evidence
- Fix applied with code changes documented
- Fix verified by re-running reproduction steps (no longer reproduces)
- Regression tests added to prevent recurrence
- Debugging session report generated with all findings
## Failure Indicators
This skill has FAILED if:
- ❌ Issue cannot be reproduced consistently (intermittent or environmental)
- ❌ Component isolation incomplete (failing subsystem not identified)
- ❌ No hypotheses formed (skipped systematic analysis)
- ❌ Root cause not identified (guessing at fixes)
- ❌ Fix not verified (assumed fixed without re-testing)
- ❌ Regression tests not added (issue will recur)
- ❌ Debugging session report missing (no documentation of findings)
- ❌ Production debugging performed without safety checks (system impacted)
## When NOT to Use
Do NOT use this skill when:
- Issue is a known bug with documented fix (apply fix directly)
- Problem is configuration error (use configuration validation instead)
- Issue is user error or misuse (provide documentation/training)
- Bug is in third-party library (report upstream, apply workaround)
- Use error-debugging-patterns for simple error message analysis
- Use performance-optimization-patterns for optimization without bugs
Alternative skills for different debugging needs:
- error-debugging-patterns - Error message analysis and stack trace interpretation
- performance-profiling-patterns - CPU/memory profiling for optimization
- distributed-tracing-patterns - Debugging across microservices
- production-debugging-patterns - Safe production troubleshooting
## Anti-Patterns (Avoid)
| Anti-Pattern | Problem | Solution |
|---|---|---|
| Skipping reproduction | Cannot verify fix works | Always reproduce issue first before debugging |
| Guessing at fixes | Wastes time, introduces new bugs | Form testable hypotheses, validate systematically |
| Not isolating components | Debug entire system, inefficient | Binary search to isolate failing subsystem |
| Changing multiple things at once | Cannot determine what fixed issue | Change one variable at a time, test hypothesis |
| Assuming fix works | Issue recurs later | Always verify fix with reproduction steps |
| No regression tests | Bug returns in future | Add tests to prevent recurrence |
| Debugging in production without backups | Data loss, system crashes | Use read-only debugging, staging environments |
| Skipping documentation | Knowledge lost, issue repeats | Document findings, root cause, and fix |
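The "binary search to isolate failing subsystem" remedy above (and the `git bisect` strategy listed earlier) reduces a linear scan over candidates to O(log n) checks. A minimal sketch with hypothetical commit ids:

```python
from typing import Callable, List


def bisect_first_bad(commits: List[str], is_bad: Callable[[str], bool]) -> str:
    """Binary search for the first failing commit, as `git bisect` does.

    Assumes commits are ordered oldest-to-newest and that once a commit
    is bad, every later commit is also bad (a monotone failure)."""
    lo, hi = 0, len(commits) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid          # first bad commit is at mid or earlier
        else:
            lo = mid + 1      # first bad commit is after mid
    return commits[lo]


commits = ["c1", "c2", "c3", "c4", "c5", "c6"]
first_bad = bisect_first_bad(commits, lambda c: c >= "c4")
```

The monotonicity assumption is what makes bisection valid: if the failure is intermittent, each `is_bad` check must be repeated enough times to be trustworthy, or bisection will converge on the wrong commit.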
## Principles

This skill embodies CODITECT foundational principles:

### #2 First Principles Thinking
- Systematic debugging methodology (reproduce → isolate → analyze → fix → verify)
- Binary search for component isolation (divide and conquer)
- Hypothesis-driven testing (not random trial and error)
### #5 Eliminate Ambiguity
- Explicit reproduction steps (no "it just crashes sometimes")
- Testable hypotheses with expected outcomes
- Clear root cause identification with evidence
### #6 Clear, Understandable, Explainable
- Debugging session report documents entire investigation
- Reproduction steps written for any developer to follow
- Root cause explained with reasoning chain
### #7 Measurable Outcomes
- Performance profiling with quantitative metrics (CPU %, memory usage)
- Memory leak detection with byte counts and allocation sites
- Fix verification with before/after test results
### #8 No Assumptions
- Reproduce issue before debugging (don't assume cause)
- Test hypotheses systematically (don't assume first guess is correct)
- Verify fix works (don't assume it's fixed without testing)
Full Principles: CODITECT-STANDARD-AUTOMATION.md
Version: 1.1.0 | Updated: 2026-01-04 | Author: CODITECT Team