/perf-profile Command
Run comprehensive performance profiling against an application to identify CPU bottlenecks, memory issues, and I/O constraints, with optional flame graph generation for visual analysis.
System Prompt
⚠️ EXECUTION DIRECTIVE: When the user invokes this command, you MUST:
- IMMEDIATELY execute - no questions, no explanations first
- ALWAYS show full output from script/tool execution
- ALWAYS provide summary after execution completes
DO NOT:
- Say "I don't need to take action" - you ALWAYS execute when invoked
- Ask for confirmation unless `requires_confirmation: true` is set in frontmatter
- Skip execution even if it seems redundant - run it anyway
The user invoking the command IS the confirmation.
Usage
/perf-profile [options] <target>
Options:
--type <types> Profile types: cpu,memory,io,async (default: cpu)
--duration <secs> Profiling duration in seconds (default: 30)
--output <format> Output: terminal,json,flamegraph,html (default: terminal)
--threshold <ms> Minimum time threshold for reporting (default: 10ms)
--compare <file> Compare with baseline profile
--language <lang> Target language: python,node,rust,go (auto-detect)
--pid <pid> Attach to a running process instead of launching <target>
--sampling <rate> Sampling rate for low-overhead profiling
Profile Types
CPU Profiling
/perf-profile --type cpu <command>
# Python: py-spy, cProfile
# Node.js: clinic.js doctor, 0x
# Rust: cargo-flamegraph, perf
# Go: pprof
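For the Python path, a minimal sketch of what cProfile-based collection looks like under the hood (the function name is illustrative, not part of this command):

```python
import cProfile
import io
import pstats

def hot_function():
    # Illustrative hotspot: CPU-bound busy work
    return sum(i * i for i in range(100_000))

profiler = cProfile.Profile()
profiler.enable()
hot_function()
profiler.disable()

# Render the five most expensive calls, sorted by cumulative time
buf = io.StringIO()
pstats.Stats(profiler, stream=buf).sort_stats("cumulative").print_stats(5)
print(buf.getvalue())
```

The sampled tools (py-spy, perf, pprof) attach externally instead, which is what makes `--pid` style attachment possible without code changes.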
Memory Profiling
/perf-profile --type memory <command>
# Detects:
# - Memory leaks
# - High memory allocations
# - Object retention issues
# - Garbage collection pressure
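A rough sketch of the kind of allocation tracking behind these detections, using the stdlib `tracemalloc` (the allocation burst is deliberate, standing in for a leak suspect):

```python
import tracemalloc

tracemalloc.start()

# Deliberate retained allocations: ~1 MB held in a list
retained = [bytearray(10_000) for _ in range(100)]

current, peak = tracemalloc.get_traced_memory()
top_stat = tracemalloc.take_snapshot().statistics("lineno")[0]
tracemalloc.stop()

print(f"current={current} bytes, peak={peak} bytes")
print(f"largest allocation site: {top_stat}")
```

Object retention shows up as `current` staying close to `peak` after the workload finishes; a healthy run releases most of it.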
I/O Profiling
/perf-profile --type io <command>
# Monitors:
# - File system operations
# - Network I/O
# - Database queries
# - API call latency
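A minimal sketch of per-call latency measurement, the basic primitive behind all four monitors (the `timed_call` helper is hypothetical, not part of this command):

```python
import os
import tempfile
import time

def timed_call(fn, *args, **kwargs):
    """Return (result, elapsed_ms) for a single call."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, (time.perf_counter() - start) * 1000.0

# Time a file-system write as a stand-in for the I/O operations listed above
fd, path = tempfile.mkstemp()
try:
    with os.fdopen(fd, "wb") as f:
        _, write_ms = timed_call(f.write, b"x" * 1_000_000)
    print(f"write took {write_ms:.2f} ms")
finally:
    os.remove(path)
```

The `--threshold <ms>` option works against exactly this kind of measurement: calls below the threshold are dropped from the report.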
Async Profiling
/perf-profile --type async <command>
# Analyzes:
# - Event loop blocking
# - Promise/Future bottlenecks
# - Concurrent task performance
# - Thread pool utilization
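Event loop blocking, the first item above, can be demonstrated with plain `asyncio`: in debug mode the loop logs any callback that runs longer than `slow_callback_duration`.

```python
import asyncio
import time

def blocking_work():
    time.sleep(0.2)  # synchronous sleep: the event loop cannot run anything else

async def main():
    loop = asyncio.get_running_loop()
    # In debug mode, callbacks slower than this are logged as blocking
    loop.slow_callback_duration = 0.1
    start = time.monotonic()
    blocking_work()  # runs on the loop thread, stalling every other task
    return time.monotonic() - start

blocked_for = asyncio.run(main(), debug=True)
print(f"event loop blocked for {blocked_for:.2f}s")
```

Async profilers surface the same pattern as long gaps where only one coroutine makes progress.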
Examples
Basic CPU Profile
# Profile a Python application
/perf-profile --type cpu python app.py
# Profile with flame graph output
/perf-profile --type cpu --output flamegraph --duration 60 node server.js
Memory Analysis
# Memory profile with leak detection
/perf-profile --type memory --duration 120 python process_data.py
# Memory comparison (before/after optimization)
/perf-profile --type memory --compare baseline.json python app.py
Production Profiling
# Attach to running process (Python)
/perf-profile --type cpu --pid 12345
# Sample-based profiling (low overhead)
/perf-profile --type cpu --sampling 100 cargo run --release
Load Test Profiling
# Profile under load
k6 run load-test.js &
/perf-profile --type cpu,memory --duration 300 python api_server.py
Output Formats
Terminal Summary
Performance Profile Results
===========================
Duration: 30.0s | Samples: 3000 | CPU: 45% avg
Top 10 Functions by CPU Time:
1. process_batch 28.5% 8550ms src/processor.py:45
2. serialize_json 15.2% 4560ms src/utils.py:123
3. db_query 12.8% 3840ms src/database.py:78
4. validate_input 8.3% 2490ms src/validators.py:34
5. parse_request 6.1% 1830ms src/api.py:56
Memory:
Peak: 512MB | Average: 380MB | GC Pauses: 12
Recommendations:
- Consider caching serialize_json results (15.2% CPU)
- Batch db_query calls to reduce overhead
- Review process_batch algorithm complexity
Flame Graph
/perf-profile --output flamegraph python app.py
# Generates: profile-flamegraph.svg
# Open in browser for interactive analysis
JSON Export
/perf-profile --output json python app.py > profile.json
# Structure:
# {
# "duration_ms": 30000,
# "samples": 3000,
# "hotspots": [...],
# "memory": {...},
# "recommendations": [...]
# }
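A minimal sketch of consuming this export downstream; the field names mirror the structure above, and the payload is fabricated for illustration:

```python
import json

# Hypothetical export matching the structure sketched above
profile_json = """
{
  "duration_ms": 30000,
  "samples": 3000,
  "hotspots": [
    {"function": "process_batch", "cpu_pct": 28.5, "time_ms": 8550},
    {"function": "serialize_json", "cpu_pct": 15.2, "time_ms": 4560}
  ],
  "memory": {"peak_mb": 512},
  "recommendations": ["Batch db_query calls"]
}
"""

profile = json.loads(profile_json)
top = max(profile["hotspots"], key=lambda h: h["cpu_pct"])
print(f'{top["function"]}: {top["cpu_pct"]}% over {profile["duration_ms"]} ms')
```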
Language-Specific Tools
Python
# py-spy (sampling profiler)
/perf-profile --language python --type cpu app.py
# memory_profiler
/perf-profile --language python --type memory app.py
Node.js
# clinic.js (automatic analysis)
/perf-profile --language node --type cpu server.js
# 0x (flame graphs)
/perf-profile --language node --output flamegraph server.js
Rust
# cargo-flamegraph
/perf-profile --language rust --output flamegraph cargo run
# perf + FlameGraph
/perf-profile --language rust --type cpu target/release/app
Go
# pprof (built-in)
/perf-profile --language go --type cpu,memory ./app
Optimization Workflow
1. Baseline: /perf-profile --output json app > baseline.json
2. Identify hotspots from the top functions list
3. Optimize the targeted functions
4. Compare: /perf-profile --compare baseline.json app
5. Verify the improvement percentage
6. Document the optimization results
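The compare step reduces to a percentage-improvement calculation per function; a minimal sketch with hypothetical timings from two runs:

```python
def improvement_pct(baseline_ms: float, current_ms: float) -> float:
    """Percent reduction in time relative to the baseline."""
    return (baseline_ms - current_ms) / baseline_ms * 100.0

# Hypothetical per-function timings from baseline and optimized profiles
baseline = {"process_batch": 8550, "serialize_json": 4560}
current = {"process_batch": 6100, "serialize_json": 4500}

for fn, before_ms in baseline.items():
    after_ms = current[fn]
    print(f"{fn}: {improvement_pct(before_ms, after_ms):.1f}% faster")
```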
Integration
CI/CD Performance Gates
- name: Performance Regression Test
run: |
/perf-profile --type cpu --duration 60 --output json app > profile.json
python scripts/check-perf-regression.py profile.json --threshold 10%
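`check-perf-regression.py` is project-specific; a minimal sketch of the gate logic it might implement, assuming the JSON export's `duration_ms` field:

```python
def regression_pct(baseline: dict, current: dict) -> float:
    """How much the run time grew relative to the baseline, in percent."""
    base = baseline["duration_ms"]
    return (current["duration_ms"] - base) / base * 100.0

def gate_exit_code(baseline: dict, current: dict, threshold_pct: float = 10.0) -> int:
    """Return a CI exit code: 0 to pass the gate, 1 to fail the build."""
    growth = regression_pct(baseline, current)
    print(f"duration grew {growth:+.1f}% (threshold {threshold_pct}%)")
    return 0 if growth <= threshold_pct else 1

# Hypothetical profiles: 5% slower is within a 10% budget
baseline = {"duration_ms": 30000}
current = {"duration_ms": 31500}
code = gate_exit_code(baseline, current)
```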
Continuous Profiling
# Schedule periodic profiling in production
/perf-profile --type cpu --duration 300 --output json \
--pid $(pgrep -f "python app.py") > profiles/$(date +%Y%m%d-%H%M).json
Related Components
- performance-profiler-agent - Detailed profiling strategies
- optimization-patterns-skill - Code optimization patterns
- load-testing-skill - Load testing for performance validation
- /dependency-audit - Performance-impacting dependencies
Action Policy
<default_behavior> This command executes performance profiling. It proceeds with:
- Language detection
- Profile collection
- Hotspot analysis
- Recommendation generation
It generates a report in the specified format. </default_behavior>
Success Output
When perf-profile completes:
✅ COMMAND COMPLETE: /perf-profile
Target: <command>
Duration: <N>s
Type: <cpu|memory|io|async>
Hotspots: <N> identified
Output: <format>
Completion Checklist
Before marking complete:
- Target profiled
- Data collected
- Analysis complete
- Report generated
Failure Indicators
This command has FAILED if:
- ❌ No target specified
- ❌ Profiling failed
- ❌ No data collected
- ❌ Report generation failed
When NOT to Use
Do NOT use when:
- Only quick timing is needed (use `time` instead)
- The target is a production system already under heavy load (profiling adds overhead)
- No profiling tools are installed for the target language
Anti-Patterns (Avoid)
| Anti-Pattern | Problem | Solution |
|---|---|---|
| Profile in debug mode | Misleading data | Use release builds |
| Short duration | Insufficient samples | Use 30s+ duration |
| Skip baseline | No comparison | Create baseline first |
Principles
This command embodies:
- #3 Complete Execution - Full profiling workflow
- #9 Based on Facts - Data-driven analysis
- #6 Clear, Understandable - Clear reports
Full Standard: CODITECT-STANDARD-AUTOMATION.md