Scalability and Resilience Test Plan
Goals
- Validate throughput and latency under load.
- Prove stability over time with soak tests.
- Validate recovery from Redis, queue, and network failures.
Load Tests
- API throughput test with mixed GET/POST endpoints.
- Event burst test with high fan-out subscriptions.
- Stream subscription test with high concurrent clients.
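
The API throughput test could be prototyped with a small driver like the one below before it is ported to a dedicated load generator. This is a minimal sketch, not the plan's actual harness: the endpoint mix, the `run_load` function, and the injectable `send` coroutine are all hypothetical names, and `fake_send` stands in for a real HTTP client call.

```python
import asyncio
import random
import time
from typing import Awaitable, Callable, List, Tuple

# Hypothetical endpoint mix as (method, path, weight); not taken from the plan.
ENDPOINT_MIX: List[Tuple[str, str, float]] = [
    ("GET", "/items", 0.7),
    ("POST", "/items", 0.3),
]

# A sender takes (method, path) and returns an HTTP status code.
Send = Callable[[str, str], Awaitable[int]]


async def run_load(send: Send, total_requests: int, concurrency: int,
                   seed: int = 0) -> List[Tuple[str, int, float]]:
    """Issue weighted-random requests with at most `concurrency` in flight.

    Returns one (method, status, latency_seconds) tuple per request.
    """
    rng = random.Random(seed)
    weights = [weight for _, _, weight in ENDPOINT_MIX]
    limiter = asyncio.Semaphore(concurrency)
    results: List[Tuple[str, int, float]] = []

    async def one_request() -> None:
        method, path, _ = rng.choices(ENDPOINT_MIX, weights=weights, k=1)[0]
        async with limiter:
            started = time.perf_counter()
            status = await send(method, path)
            results.append((method, status, time.perf_counter() - started))

    await asyncio.gather(*(one_request() for _ in range(total_requests)))
    return results


async def fake_send(method: str, path: str) -> int:
    """Stand-in for a real HTTP call: short delay, then a success status."""
    await asyncio.sleep(0.001)
    return 200 if method == "GET" else 201
```

Calling `asyncio.run(run_load(fake_send, total_requests=1000, concurrency=50))` exercises the driver without a network; swapping `fake_send` for a real client call would turn it into an actual load source.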
Soak Tests
- 24-hour steady-state run with periodic spikes.
- Verify memory growth stays within limits and no leaks are observed.
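
One way to make the memory check concrete is to fit a straight line through periodic memory samples and compare its slope against a growth budget. The sketch below assumes resident set size is sampled in bytes with timestamps in seconds; the function names and both thresholds are hypothetical, since the plan does not state specific limits.

```python
from typing import Sequence


def growth_per_hour(timestamps_s: Sequence[float],
                    rss_bytes: Sequence[float]) -> float:
    """Least-squares slope of memory over time, in bytes per hour."""
    n = len(timestamps_s)
    if n != len(rss_bytes) or n < 2:
        raise ValueError("need at least two paired samples")
    mean_t = sum(timestamps_s) / n
    mean_m = sum(rss_bytes) / n
    var_t = sum((t - mean_t) ** 2 for t in timestamps_s)
    if var_t == 0:
        raise ValueError("timestamps must not all be identical")
    cov = sum((t - mean_t) * (m - mean_m)
              for t, m in zip(timestamps_s, rss_bytes))
    return cov / var_t * 3600.0


def memory_within_limits(timestamps_s: Sequence[float],
                         rss_bytes: Sequence[float],
                         max_rss_bytes: float,
                         max_growth_per_hour: float) -> bool:
    """True if peak memory and the fitted growth rate are both within budget."""
    return (max(rss_bytes) <= max_rss_bytes
            and growth_per_hour(timestamps_s, rss_bytes) <= max_growth_per_hour)
```

A fitted slope smooths over the periodic spikes, so a short burst does not trip the check while a steady upward drift does.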
Failure Injection
- Redis unavailability and recovery during active flows.
- Event adapter outages and reconnect behavior.
- Network latency and packet loss simulation.
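
For the latency and packet-loss case, the following sketch builds `tc netem` command lines without executing them. The helper names, the interface name, and the chosen values are assumptions for illustration; running the commands requires root on a Linux host.

```python
from typing import List


def netem_add_command(interface: str, delay_ms: int, jitter_ms: int,
                      loss_percent: float) -> List[str]:
    """Build (but do not run) a tc command adding delay, jitter, and loss."""
    if delay_ms < 0 or jitter_ms < 0 or not 0 <= loss_percent <= 100:
        raise ValueError("delay and jitter must be >= 0, loss within 0..100")
    return ["tc", "qdisc", "add", "dev", interface, "root", "netem",
            "delay", f"{delay_ms}ms", f"{jitter_ms}ms",
            "loss", f"{loss_percent:g}%"]


def netem_clear_command(interface: str) -> List[str]:
    """Build the command that removes the netem qdisc again."""
    return ["tc", "qdisc", "del", "dev", interface, "root", "netem"]
```

A test would typically pass the first list to `subprocess.run(cmd, check=True)` and run the clear command in a `finally` block, so an aborted test does not leave the host's network impaired.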
Success Criteria
- p95 latency within SLO targets during steady-state load.
- No data loss, and no duplicate processing beyond what the system's delivery semantics allow.
- Automatic recovery without manual intervention for transient failures.
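
The latency and data-integrity criteria above can be evaluated mechanically once a run has produced raw samples. This sketch uses the nearest-rank definition of p95 (other percentile definitions give slightly different values) and assumes every message carries a unique ID; the function names are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List, Sequence


def p95(latencies_ms: Sequence[float]) -> float:
    """Nearest-rank 95th percentile of the given latency samples."""
    if not latencies_ms:
        raise ValueError("no latency samples")
    ordered = sorted(latencies_ms)
    # 1-based rank = ceil(0.95 * n), computed in integers to avoid float error.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def delivery_report(sent_ids: Iterable[str],
                    processed_ids: Iterable[str]) -> Dict[str, List[str]]:
    """List message IDs that were never processed or processed more than once."""
    sent = set(sent_ids)
    counts = Counter(processed_ids)
    return {
        "lost": sorted(sent - set(counts)),
        "duplicated": sorted(i for i, c in counts.items() if c > 1),
    }
```

A run would then pass when `p95(samples)` is at or below the SLO target and `delivery_report(...)["lost"]` is empty; whether entries under `"duplicated"` are acceptable depends on the delivery guarantees of the system under test.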
Tooling
- Load generator: k6 or Artillery.
- Chaos injection: Toxiproxy or tc netem.
- Metrics collection: Prometheus-compatible exporters.
Reporting
- Publish reports with throughput, latency, error rate, and resource usage.
- Keep the test scripts reproducible so they can be integrated into CI.
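
A report entry covering the four published quantities could be assembled as sketched below. The field names, the `summarize_run` helper, and the choice to count only 5xx responses as errors are assumptions, not something the plan specifies.

```python
import json
from typing import Dict, Sequence


def summarize_run(duration_s: float, statuses: Sequence[int],
                  latencies_ms: Sequence[float],
                  peak_rss_bytes: int) -> Dict[str, float]:
    """Condense one test run into the metrics named in the report."""
    if duration_s <= 0 or not statuses or len(statuses) != len(latencies_ms):
        raise ValueError("need a positive duration and paired, non-empty samples")
    total = len(statuses)
    # Assumption: only server-side (5xx) responses count as errors.
    errors = sum(1 for status in statuses if status >= 500)
    ordered = sorted(latencies_ms)
    # Nearest-rank p95 using integer arithmetic.
    p95_ms = ordered[(95 * total + 99) // 100 - 1]
    return {
        "requests": total,
        "throughput_rps": total / duration_s,
        "error_rate": errors / total,
        "latency_p95_ms": p95_ms,
        "peak_rss_bytes": peak_rss_bytes,
    }


def to_json(summary: Dict[str, float]) -> str:
    """Serialize with stable key order so CI diffs stay readable."""
    return json.dumps(summary, indent=2, sort_keys=True)
```

Writing the output of `to_json` to a file per run gives CI a stable artifact that can be compared across builds.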