/postmortem - Blameless Incident Postmortem

Generates structured, blameless incident postmortems. The command guides you through timeline construction, 5-Whys root cause analysis, and corrective action categorization (detect/prevent/mitigate/process), then produces a professional postmortem document.

System Prompt

Usage

# Basic postmortem
/postmortem --incident "Database outage on 2026-02-01"

# With specific template
/postmortem --incident INC-1234 --template security

# With an explicit timeline start
/postmortem --incident "API degradation" --timeline-from "2026-02-01T14:30:00Z"

# Output format selection
/postmortem --incident INC-5678 --output confluence

# Action items only (for tracking)
/postmortem --incident INC-9999 --action-items-only

Options

| Option | Description | Default |
|--------|-------------|---------|
| `--incident` | Incident ID or description | Interactive prompt |
| `--template` | Template: standard\|abbreviated\|security | standard |
| `--timeline-from` | Timeline start timestamp (ISO 8601) | Auto-detect |
| `--output` | Output format: markdown\|confluence\|notion | markdown |
| `--action-items-only` | Generate only corrective actions (no full postmortem) | false |
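
The options above map naturally onto a standard argument parser. The following is a minimal sketch of that mapping using Python's `argparse`, not the command's actual implementation: the function names `build_parser` and `parse_iso8601` are hypothetical, and representing "Interactive prompt" and "Auto-detect" as `None` is an assumption made for illustration.

```python
import argparse
from datetime import datetime


def parse_iso8601(value: str) -> datetime:
    """Parse an ISO 8601 timestamp such as 2026-02-01T14:30:00Z."""
    # datetime.fromisoformat only accepts a trailing "Z" from Python 3.11 on,
    # so normalize it to an explicit UTC offset first.
    try:
        return datetime.fromisoformat(value.replace("Z", "+00:00"))
    except ValueError as exc:
        raise argparse.ArgumentTypeError(
            f"not an ISO 8601 timestamp: {value!r}"
        ) from exc


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented options and defaults."""
    parser = argparse.ArgumentParser(prog="/postmortem")
    # None stands in for "Interactive prompt" (assumption).
    parser.add_argument("--incident", default=None,
                        help="Incident ID or description")
    parser.add_argument("--template", default="standard",
                        choices=["standard", "abbreviated", "security"])
    # None stands in for "Auto-detect" (assumption).
    parser.add_argument("--timeline-from", type=parse_iso8601, default=None,
                        help="Timeline start timestamp (ISO 8601)")
    parser.add_argument("--output", default="markdown",
                        choices=["markdown", "confluence", "notion"])
    parser.add_argument("--action-items-only", action="store_true",
                        help="Generate only corrective actions")
    return parser


args = build_parser().parse_args(["--incident", "INC-1234", "--template", "security"])
print(args.template, args.output)  # security markdown
```

Restricting `--template` and `--output` with `choices` rejects unsupported values before any postmortem work starts, which matches the closed value lists in the table.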

Related Commands

Success Output

✅ Postmortem Generated
━━━━━━━━━━━━━━━━━━━━━━━━━━

📋 Incident Summary
ID: INC-1234
Title: Database Connection Pool Exhaustion
Date: 2026-02-01
Duration: 2h 15m
Severity: S1 (Critical)
Impact: 100% of API requests failed, ~$45K revenue loss

⏱️ Timeline
14:30 - First alert: API error rate spike (PagerDuty)
14:32 - On-call engineer acknowledged
14:45 - Identified: Database connection pool exhausted
15:00 - Mitigation attempted: Restart application pods
15:15 - Mitigation failed: Pool exhausted again within 5m
15:30 - Root cause found: Connection leak in payment service
15:45 - Fix deployed: Patch connection handling + increase pool size
16:00 - Monitoring: Error rate dropping
16:45 - Resolved: All metrics returned to baseline

🔍 5-Whys Root Cause Analysis
1. Why did the API fail? → Database connection pool exhausted
2. Why was the pool exhausted? → Connections were never returned to it
3. Why were connections not returned? → Payment service was leaking them
4. Why was the service leaking? → Missing connection.close() in error path
5. Why was the leak not detected? → No connection pool monitoring

Root Cause: Missing connection cleanup in error handling path, undetected due to lack of connection pool monitoring.

📝 Corrective Actions

🔍 DETECT (Monitoring & Alerting)
- [ ] Add connection pool utilization metrics (@alice, Due: 2026-02-08)
- [ ] Alert on pool >80% utilization (@bob, Due: 2026-02-08)
- [ ] Dashboard: connection pool health per service (@charlie, Due: 2026-02-15)

🛡️ PREVENT (Safeguards & Design)
- [ ] Code review checklist: verify connection cleanup (@dave, Due: 2026-02-05)
- [ ] Linter rule: flag missing close() in try-catch (@eve, Due: 2026-02-12)
- [ ] Connection pool sizing: calculate based on pod count (@alice, Due: 2026-02-15)

🚨 MITIGATE (Incident Response)
- [ ] Runbook: connection pool exhaustion diagnosis (@bob, Due: 2026-02-10)
- [ ] Auto-scaling: trigger on connection pool metric (@frank, Due: 2026-02-20)

📋 PROCESS (Organizational)
- [ ] Load testing: mandate connection pool stress test (@charlie, Due: 2026-02-12)
- [ ] Postmortem review: share learnings in engineering all-hands (@dave, Due: 2026-02-05)

📄 Document: postmortems/INC-1234-database-connection-pool-2026-02-01.md
🔗 Action Items: Exported to Linear/Jira
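
Two parts of the sample output are mechanical enough to sketch in code: the Duration field, which follows from the first and last timeline entries, and the corrective actions, which are grouped under the four categories. The sketch below is illustrative only. `Category`, `ActionItem`, `format_duration`, and `group_by_category` are hypothetical names, not the command's real data model.

```python
from dataclasses import dataclass
from datetime import date, datetime
from enum import Enum


class Category(Enum):
    # The four corrective-action categories used in the sample output.
    DETECT = "DETECT"
    PREVENT = "PREVENT"
    MITIGATE = "MITIGATE"
    PROCESS = "PROCESS"


@dataclass
class ActionItem:
    """Hypothetical record for one corrective action."""
    category: Category
    description: str
    owner: str
    due: date

    def render(self) -> str:
        # Same checklist shape as the sample output.
        return f"- [ ] {self.description} (@{self.owner}, Due: {self.due.isoformat()})"


def format_duration(start: datetime, end: datetime) -> str:
    """Format the elapsed time between two timeline entries as 'Xh Ym'."""
    total_minutes = int((end - start).total_seconds()) // 60
    hours, minutes = divmod(total_minutes, 60)
    return f"{hours}h {minutes}m"


def group_by_category(items: list[ActionItem]) -> dict[Category, list[str]]:
    """Group rendered action items by category, in enum declaration order."""
    grouped: dict[Category, list[str]] = {category: [] for category in Category}
    for item in items:
        grouped[item.category].append(item.render())
    return grouped


items = [
    ActionItem(Category.DETECT, "Alert on pool >80% utilization", "bob", date(2026, 2, 8)),
    ActionItem(Category.PREVENT, "Code review checklist: verify connection cleanup",
               "dave", date(2026, 2, 5)),
]
for category, lines in group_by_category(items).items():
    if lines:
        print(category.value)
        print("\n".join(lines))

# First alert at 14:30, resolved at 16:45 on the same day.
print(format_duration(datetime(2026, 2, 1, 14, 30), datetime(2026, 2, 1, 16, 45)))  # 2h 15m
```

Keeping the category as an enum rather than a free-form string means an action item cannot be filed under a category outside the detect/prevent/mitigate/process set.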
