Scalability and Resilience Test Plan

Goals

  • Validate throughput and latency under load.
  • Prove stability over time with soak tests.
  • Validate recovery from Redis, queue, and network failures.

Load Tests

  • API throughput test with mixed GET/POST endpoints.
  • Event burst test with high fan-out subscriptions.
  • Stream subscription test with high concurrent clients.
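
As a rough illustration of the mixed GET/POST workload, the sketch below builds a deterministic request schedule with a fixed POST share. The function name, the 20% ratio, and the seed are assumptions made for this example; the actual traffic would be sent by k6 or Artillery, not by this code.

```python
import random
from collections import Counter


def build_request_mix(total, post_ratio, seed=0):
    """Return a shuffled list of 'GET'/'POST' methods with the requested POST share.

    Hypothetical helper: the name and the parameters are illustrative only.
    """
    if total < 0:
        raise ValueError("total must be non-negative")
    if not 0.0 <= post_ratio <= 1.0:
        raise ValueError("post_ratio must be between 0 and 1")
    n_post = round(total * post_ratio)
    methods = ["POST"] * n_post + ["GET"] * (total - n_post)
    # A seeded RNG keeps the schedule identical across runs, so results are comparable.
    random.Random(seed).shuffle(methods)
    return methods


if __name__ == "__main__":
    mix = build_request_mix(total=1000, post_ratio=0.2)
    print(Counter(mix))
```

Fixing the seed is a deliberate choice here: two runs with the same schedule can be compared directly, which is harder when the mix is drawn fresh each time.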

Soak Tests

  • 24-hour steady-state run with periodic spikes.
  • Verify that memory growth stays within the agreed limits and that no leaks are observed.
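
One possible way to turn the memory-growth check into a pass/fail signal is to fit a straight line to periodic memory samples and compare its slope with a budget. This is a minimal sketch under assumptions: the sample format (timestamp in seconds, resident memory in bytes), the function names, and the budget value are all hypothetical, and a linear fit is only one of several reasonable leak heuristics.

```python
def memory_growth_slope(samples):
    """Least-squares slope in bytes per second from (timestamp_s, rss_bytes) pairs."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_m = sum(m for _, m in samples) / n
    covariance = sum((t - mean_t) * (m - mean_m) for t, m in samples)
    variance = sum((t - mean_t) ** 2 for t, _ in samples)
    if variance == 0:
        raise ValueError("all samples share the same timestamp")
    return covariance / variance


def within_growth_budget(samples, max_bytes_per_hour):
    """True if the fitted growth rate does not exceed the hourly budget."""
    return memory_growth_slope(samples) * 3600 <= max_bytes_per_hour


if __name__ == "__main__":
    # Illustrative samples: one reading per hour, growing by 100 bytes each hour.
    readings = [(0, 100), (3600, 200), (7200, 300)]
    print(within_growth_budget(readings, max_bytes_per_hour=150))
```

Periodic spikes will pull individual samples above the trend, so a fit over the whole 24-hour window is likely to be more stable than comparing the first and last readings.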

Failure Injection

  • Redis unavailability and recovery during active flows.
  • Event adapter outages and reconnect behavior.
  • Network latency and packet loss simulation.
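
The outage and reconnect scenarios can be rehearsed in a unit test before any real fault injection. The sketch below simulates a dependency that fails for a fixed number of calls and a client that retries with capped exponential backoff. `FlakyDependency`, `reconnect_with_backoff`, and all delay values are hypothetical stand-ins; this is not the API of Redis, Toxiproxy, or any event adapter.

```python
import time


class FlakyDependency:
    """Hypothetical stand-in for Redis or an event adapter.

    It raises ConnectionError for the first `outage_calls` calls, then recovers.
    """

    def __init__(self, outage_calls):
        self.outage_calls = outage_calls
        self.calls = 0

    def ping(self):
        self.calls += 1
        if self.calls <= self.outage_calls:
            raise ConnectionError("injected outage")
        return "PONG"


def reconnect_with_backoff(dep, max_attempts=6, base_delay=0.1,
                           max_delay=2.0, sleep=time.sleep):
    """Retry dep.ping() with capped exponential backoff.

    Returns (result, attempts_used); re-raises ConnectionError once
    max_attempts is exhausted.
    """
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return dep.ping(), attempt
        except ConnectionError:
            if attempt == max_attempts:
                raise
            sleep(delay)
            delay = min(delay * 2, max_delay)


if __name__ == "__main__":
    recorded = []
    # Passing recorded.append as `sleep` records the delays instead of waiting.
    print(reconnect_with_backoff(FlakyDependency(3), sleep=recorded.append), recorded)
```

Injecting the `sleep` function keeps the test fast and lets it assert on the exact backoff schedule.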

Success Criteria

  • p95 latency within SLO targets during steady-state load.
  • No data loss, and no duplicate processing beyond what the delivery semantics allow (for example, at-least-once delivery permits redelivery).
  • Automatic recovery without manual intervention for transient failures.
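
The p95 criterion can be checked mechanically once latency samples are exported. The following sketch uses the nearest-rank percentile definition; load tools may interpolate differently, so their reported p95 can differ slightly. The function names and the SLO value are assumptions for illustration only.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: the smallest value with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


def meets_latency_slo(latencies_ms, slo_ms, pct=95):
    """True if the chosen percentile of the samples is within the SLO."""
    return percentile(latencies_ms, pct) <= slo_ms


if __name__ == "__main__":
    # Illustrative samples: latencies of 1..100 ms, checked against a hypothetical 95 ms SLO.
    samples = list(range(1, 101))
    print(percentile(samples, 95), meets_latency_slo(samples, slo_ms=95))
```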

Tooling

  • Load generator: k6 or Artillery.
  • Chaos injection: Toxiproxy or tc netem.
  • Metrics collection: Prometheus-compatible exporters.

Reporting

  • Publish reports with throughput, latency, error rate, and resource usage.
  • Capture the test scripts in reproducible form so they can run in CI.
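
A report of this kind could be produced as a small JSON document that CI can archive and compare between runs. The sketch below is one possible shape, not a prescribed format: the field names, the `summarize_run` and `render_report` helpers, and the use of peak resident memory as the resource metric are all assumptions.

```python
import json


def summarize_run(durations_ms, error_count, wall_clock_s, peak_rss_bytes):
    """Build a summary dict from raw results (hypothetical field names)."""
    total = len(durations_ms)
    if total == 0 or wall_clock_s <= 0:
        raise ValueError("need at least one request and a positive duration")
    return {
        "requests": total,
        "throughput_rps": total / wall_clock_s,
        "error_rate": error_count / total,
        "latency_ms": {
            "mean": sum(durations_ms) / total,
            "max": max(durations_ms),
        },
        "peak_rss_bytes": peak_rss_bytes,
    }


def render_report(summary):
    """Serialize with sorted keys so reports from different runs diff cleanly."""
    return json.dumps(summary, indent=2, sort_keys=True)


if __name__ == "__main__":
    summary = summarize_run([10.0, 20.0, 30.0, 40.0], error_count=1,
                            wall_clock_s=2.0, peak_rss_bytes=1024)
    print(render_report(summary))
```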