WF-066: Incident Response Flow
Priority: P0 (Critical) | Phase: Phase 1D - Security & Operations | Effort: 14 hours
Overview
Manages the incident lifecycle: incident status updates, public status page communication, notifications to affected users, and SLA tracking.
Trigger: POST /incident/{id}/update | Duration: ~6-10 seconds
SLA Thresholds
| Severity | Resolution SLA | Breach Alert |
|---|---|---|
| Critical | 1 hour | Immediate escalation |
| High | 4 hours | Alert at 3 hours |
| Medium | 24 hours | Alert at 20 hours |
| Low | 72 hours | Alert at 60 hours |
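The thresholds above could be encoded as a small lookup table with a helper that classifies an open incident. This is a minimal sketch, not the workflow's actual implementation: `SlaPolicy`, `SLA_POLICIES`, `sla_status`, and the state names `"ok"`, `"warning"`, and `"breached"` are all hypothetical. It also assumes that "Immediate escalation" for Critical means there is no early-warning window, so the only transition is straight to a breach at 1 hour.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class SlaPolicy:
    """Hypothetical container for one row of the SLA table."""
    resolution: timedelta              # time allowed before the SLA is breached
    warn_after: Optional[timedelta]    # early alert point; None means no early alert


# Values copied from the SLA Thresholds table.
# Assumption: Critical has no early-warning window ("Immediate escalation").
SLA_POLICIES = {
    "critical": SlaPolicy(timedelta(hours=1), None),
    "high": SlaPolicy(timedelta(hours=4), timedelta(hours=3)),
    "medium": SlaPolicy(timedelta(hours=24), timedelta(hours=20)),
    "low": SlaPolicy(timedelta(hours=72), timedelta(hours=60)),
}


def sla_status(severity: str, opened_at: datetime, now: datetime) -> str:
    """Classify an unresolved incident as 'ok', 'warning', or 'breached'."""
    policy = SLA_POLICIES[severity.lower()]
    elapsed = now - opened_at
    if elapsed >= policy.resolution:
        return "breached"
    if policy.warn_after is not None and elapsed >= policy.warn_after:
        return "warning"
    return "ok"


if __name__ == "__main__":
    opened = datetime(2024, 1, 1, tzinfo=timezone.utc)
    # A High incident that has been open for 3.5 hours is past its 3-hour alert point.
    print(sla_status("High", opened, opened + timedelta(hours=3, minutes=30)))
```

Keeping the thresholds in data rather than in branching logic means a change to the table only requires editing `SLA_POLICIES`.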
Flow
- Update incident details in database
- Assess impact based on severity
- Update public status page (status.coditect.ai)
- Determine affected users (all users for Critical; the affected subset for High and Medium)
- Email affected users with ETA
- Track SLA compliance (breach detection)
- Publish incident update event
- Return status
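The steps above can be sketched as a single handler that calls injected services in order. Everything named here is an assumption made for illustration: `IncidentUpdate`, `Services`, `handle_incident_update`, the `incident.updated` event name, and the shape of the returned dictionary do not come from the spec. The spec does not say who is notified for Low severity, so this sketch notifies nobody in that case. It also reduces the impact assessment to choosing the notification audience.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class IncidentUpdate:
    """Hypothetical payload for POST /incident/{id}/update."""
    incident_id: str
    severity: str   # "critical", "high", "medium", or "low"
    status: str
    message: str
    eta: str


@dataclass
class Services:
    """Hypothetical dependencies, injected so the flow can be tested with fakes."""
    save_incident: Callable[[IncidentUpdate], None]
    update_status_page: Callable[[IncidentUpdate], None]
    all_users: Callable[[], List[str]]
    users_for_incident: Callable[[str], List[str]]
    send_email: Callable[[str, IncidentUpdate], None]
    check_sla: Callable[[IncidentUpdate], str]
    publish_event: Callable[[str, Dict], None]


def handle_incident_update(update: IncidentUpdate, svc: Services) -> Dict:
    # 1. Update incident details in the database.
    svc.save_incident(update)

    # 2. Assess impact based on severity (here: only the notification scope).
    severity = update.severity.lower()

    # 3. Update the public status page.
    svc.update_status_page(update)

    # 4. Get affected users: everyone for critical, a subset for high/medium.
    if severity == "critical":
        recipients = svc.all_users()
    elif severity in ("high", "medium"):
        recipients = svc.users_for_incident(update.incident_id)
    else:
        recipients = []  # assumption: the spec does not cover low severity

    # 5. Email affected users with the ETA carried on the update.
    for address in recipients:
        svc.send_email(address, update)

    # 6. Track SLA compliance (breach detection is delegated to the service).
    sla_state = svc.check_sla(update)

    # 7. Publish the incident update event.
    result = {
        "incident_id": update.incident_id,
        "status": update.status,
        "notified": len(recipients),
        "sla": sla_state,
    }
    svc.publish_event("incident.updated", result)

    # 8. Return status.
    return result
```

Passing the services in as plain callables keeps the ordering of the flow visible in one place and lets each item in a test checklist be verified by substituting a recording fake.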
Testing
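- Incident status is updated correctly in the database
- Status page reflects the update in real time
- Affected users are notified by email
- SLA compliance is tracked for each severity
- SLA breach is detected at the configured thresholds
- Incident update event is published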
Status: ✅ Ready for Implementation