/flaky-tests - Flaky Test Analyzer
Detect non-deterministic tests from CI run history, classify the root cause of flakiness (timing, order, resource, external, concurrency, data), and propose targeted stabilization fixes.
System Prompt
EXECUTION DIRECTIVE: When the user invokes this command, you MUST:
- IMMEDIATELY execute - no questions first
- Load the flaky-test-analyzer agent
- Collect CI run data for the analysis window
- Correlate test results across runs on the same code
- Detect inconsistent outcomes (pass/fail on identical code)
- Classify flakiness type from code analysis
- Propose targeted fix for each flaky test
- Output prioritized analysis report
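The correlation and detection steps above can be sketched roughly as follows. This is a minimal illustration, not the agent's actual implementation: the `RunRecord` type, its fields, and the `find_flaky` function are hypothetical names, and "same code" is assumed to mean "same commit SHA".

```python
from collections import defaultdict
from typing import NamedTuple


class RunRecord(NamedTuple):
    """One test outcome from one CI run (hypothetical shape)."""
    test: str     # test identifier
    commit: str   # commit SHA the run executed against (assumed proxy for "same code")
    passed: bool


def find_flaky(records: list[RunRecord]) -> dict[str, float]:
    """Return {test: failure_rate} for tests that both passed and failed on one commit."""
    # Collect the distinct outcomes seen for each (test, commit) pair.
    outcomes = defaultdict(set)
    for r in records:
        outcomes[(r.test, r.commit)].add(r.passed)

    # A test is flagged when a single commit produced both True and False.
    flaky = {test for (test, _), seen in outcomes.items() if len(seen) == 2}

    # Failure rate is computed over all runs of each flagged test.
    runs = defaultdict(int)
    failures = defaultdict(int)
    for r in records:
        if r.test in flaky:
            runs[r.test] += 1
            if not r.passed:
                failures[r.test] += 1
    return {test: failures[test] / runs[test] for test in flaky}
```

A test that fails on one commit and passes on a different one is not flagged here, since the code changed between runs and the failure may be genuine.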
Usage
# Default: last 7 days
/flaky-tests
# Shorter window
/flaky-tests --window 3d
# Specific workflow
/flaky-tests --workflow "integration-tests"
# Stricter threshold (report tests failing in at least 3% of runs)
/flaky-tests --flake-threshold 0.03
# Include code-level fix examples
/flaky-tests --include-fix-code
Options
| Option | Description | Default |
|---|---|---|
| --window | Analysis time window | 7d |
| --min-runs | Minimum runs to classify | 5 |
| --flake-threshold | Minimum failure rate to report | 0.05 |
| --workflow | Filter to specific workflow | all |
| --include-fix-code | Include code-level fix examples | true |
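As a rough sketch of how `--window`, `--min-runs`, and `--flake-threshold` might interact, the snippet below parses a window string and decides whether a test is reported. The helper names are hypothetical. Only the `d` suffix appears in this document; the `h` and `w` suffixes, and treating the threshold as inclusive, are assumptions.

```python
import re
from datetime import timedelta

# Only "d" is documented; "h" and "w" are assumed for illustration.
_UNITS = {"h": "hours", "d": "days", "w": "weeks"}


def parse_window(text: str) -> timedelta:
    """Convert a window string such as '7d' into a timedelta."""
    match = re.fullmatch(r"(\d+)([hdw])", text)
    if match is None:
        raise ValueError(f"unrecognized window: {text!r}")
    amount, unit = int(match.group(1)), match.group(2)
    return timedelta(**{_UNITS[unit]: amount})


def should_report(failures: int, runs: int,
                  min_runs: int = 5, flake_threshold: float = 0.05) -> bool:
    """Report a test only with enough runs and a failure rate at or above the threshold."""
    if runs < min_runs:
        return False  # too few runs to classify
    return failures / runs >= flake_threshold
```

With the defaults, a test that failed once in 20 runs sits exactly at the threshold and is reported, while one that failed once in 4 runs is skipped for lack of data.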
Related Commands
- /ci-analyze - Broader CI failure analysis including flakes
- /regression-check - Check if new changes introduced flakiness
- /test-generate - Generate stable replacement tests
Success Output
COMMAND COMPLETE: /flaky-tests
Period: last 7 days
Runs Analyzed: 120
Flaky Tests: 5
Top Flake: test_concurrent_write (35% failure rate, timing-dependent)
CI Time Wasted: ~3.2h on retries
Completion Checklist
- CI runs collected for window
- Test results correlated across runs
- Flaky tests identified with scores
- Flakiness type classified for each
- Fix proposals generated
- Impact assessment computed
Failure Indicators
- Insufficient CI runs for analysis (< 5)
- Cannot access CI run logs
- No inconsistent test outcomes found
When NOT to Use
- Genuinely failing tests (use /ci-analyze)
- Test generation (use /test-generate)
- Performance testing (use /perf-profile)
Anti-Patterns
| Anti-Pattern | Problem | Solution |
|---|---|---|
| Just add retries | Hides real issues | Fix root cause |
| Quarantine forever | Lost coverage | Time-box, fix within sprint |
| Increase all timeouts | Slow CI | Fix timing dependency |
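One common way to fix a timing dependency, rather than lengthening a fixed sleep, is to poll for the condition the test actually needs. The `wait_until` helper below is a hypothetical sketch of that pattern, not something this command is documented to generate.

```python
import time
from typing import Callable


def wait_until(condition: Callable[[], bool],
               timeout: float = 5.0, interval: float = 0.05) -> bool:
    """Poll `condition` until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True   # condition met; no time wasted waiting further
        if time.monotonic() >= deadline:
            return False  # gave up; caller should fail the test explicitly
        time.sleep(interval)
```

A test would then replace something like `time.sleep(2)` followed by an assertion with `assert wait_until(lambda: job.is_done())`, where `job` stands in for whatever the test is waiting on. The test finishes as soon as the condition holds and only pays the full timeout when something is actually wrong.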
Principles
- #3 Complete Execution - Full analysis from detection to fix proposals
- #9 Based on Facts - Classification from code evidence, not guesses
Full Standard: CODITECT-STANDARD-AUTOMATION.md