Load Testing Skill
How to Use This Skill
- Review the patterns and examples below
- Apply the relevant patterns to your implementation
- Follow the best practices outlined in this skill
Production-ready load testing skill covering major tools (k6, Artillery, Locust), test scenario design, threshold configuration, and CI/CD integration for continuous performance validation.
When to Use This Skill
Use load-testing when:
- Validating application performance under expected load
- Stress testing to find breaking points
- Capacity planning for infrastructure scaling
- Performance regression testing in CI/CD
- Simulating realistic user behavior patterns
- Benchmarking API endpoints
Don't use load-testing when:
- Profiling code-level bottlenecks (use performance-profiler agent)
- Unit testing functionality (use standard test frameworks)
- Security testing (use penetration-testing agent)
- Only need simple benchmarks (use wrk or hey directly)
Load Test Types
| Type | Purpose | Duration | Load Pattern |
|---|---|---|---|
| Load | Validate normal traffic | 10-30 min | Constant VUs |
| Stress | Find breaking point | 30-60 min | Ramping up |
| Spike | Test sudden traffic | 5-15 min | Sharp increase |
| Soak | Find memory leaks | 1-8 hours | Constant extended |
| Breakpoint | Find max capacity | Until failure | Incremental |
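These load patterns map directly onto k6 stage configurations. A minimal sketch, assuming k6 as the tool; the durations, VU counts, and the /api/products endpoint are illustrative placeholders, not recommendations:
import http from 'k6/http';
import { sleep } from 'k6';
// Two of the patterns above expressed as k6 stage lists (numbers are illustrative).
const spikeStages = [
  { duration: '1m', target: 50 },   // normal baseline
  { duration: '30s', target: 500 }, // sharp increase
  { duration: '3m', target: 500 },  // hold the spike
  { duration: '1m', target: 50 },   // recovery
];
const soakStages = [
  { duration: '5m', target: 100 },  // ramp up
  { duration: '4h', target: 100 },  // constant extended load, watch for leaks
  { duration: '5m', target: 0 },    // ramp down
];
// Export whichever pattern you are running; k6 reads `options`.
export const options = { stages: spikeStages };
export default function () {
  http.get(`${__ENV.BASE_URL || 'http://localhost:3000'}/api/products`); // placeholder endpoint
  sleep(1);
}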
Instructions
Phase 1: Test Planning
Objective: Define test scenarios and success criteria.
- Identify test targets:
Target Endpoints:
- GET /api/users - List users (high traffic)
- POST /api/orders - Create order (critical path)
- GET /api/products/:id - Product detail (cacheable)
Success Criteria:
- P95 latency < 200ms
- Error rate < 1%
- Throughput > 1000 RPS
- Define user scenarios:
Scenario: Browse and Purchase
1. User visits homepage (GET /)
2. Browses products (GET /api/products?page=1)
3. Views product detail (GET /api/products/:id)
4. Adds to cart (POST /api/cart)
5. Checkout (POST /api/orders)
Think time: 1-5 seconds between actions
Distribution: 70% browse, 20% add-to-cart, 10% purchase
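Phase 2 implements these flows as separate k6 scenarios; a more compact alternative is a single iteration with a weighted branch. A rough sketch, assuming the endpoints above (prod_123 and cart_123 are placeholder IDs):
import http from 'k6/http';
import { sleep } from 'k6';
const BASE_URL = __ENV.BASE_URL || 'http://localhost:3000';
const HEADERS = { headers: { 'Content-Type': 'application/json' } };
export default function () {
  const roll = Math.random();
  // Every virtual user browses; ~30% continue to the cart and ~10% to checkout,
  // which approximates the 70/20/10 split described above.
  http.get(`${BASE_URL}/api/products?page=1`);
  if (roll < 0.3) {
    http.post(`${BASE_URL}/api/cart`, JSON.stringify({ productId: 'prod_123', quantity: 1 }), HEADERS);
  }
  if (roll < 0.1) {
    http.post(`${BASE_URL}/api/orders`, JSON.stringify({ cartId: 'cart_123' }), HEADERS);
  }
  sleep(Math.random() * 4 + 1); // 1-5 second think time
}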
Phase 2: k6 Implementation
Objective: Write and run k6 load tests.
- Basic k6 script:
// load-test.js
import http from 'k6/http';
import { check, sleep } from 'k6';
import { Rate, Trend } from 'k6/metrics';
// Custom metrics
const errorRate = new Rate('errors');
const responseTime = new Trend('response_time');
// Test configuration
export const options = {
stages: [
{ duration: '2m', target: 50 }, // Ramp up
{ duration: '5m', target: 50 }, // Stay at 50 VUs
{ duration: '2m', target: 100 }, // Ramp to 100
{ duration: '5m', target: 100 }, // Stay at 100
{ duration: '2m', target: 0 }, // Ramp down
],
thresholds: {
http_req_duration: ['p(95)<200', 'p(99)<500'],
http_req_failed: ['rate<0.01'],
errors: ['rate<0.05'],
},
};
export default function () {
const res = http.get('http://api.example.com/users');
// Check response
const success = check(res, {
'status is 200': (r) => r.status === 200,
'response time < 200ms': (r) => r.timings.duration < 200,
});
// Record metrics
errorRate.add(!success);
responseTime.add(res.timings.duration);
sleep(Math.random() * 3 + 1); // 1-4 second think time
}
- k6 scenario with multiple endpoints:
import http from 'k6/http';
import { check, group, sleep } from 'k6';
const BASE_URL = __ENV.BASE_URL || 'http://localhost:3000';
export const options = {
scenarios: {
browse: {
executor: 'ramping-vus',
startVUs: 0,
stages: [
{ duration: '2m', target: 100 },
{ duration: '5m', target: 100 },
{ duration: '2m', target: 0 },
],
exec: 'browseProducts',
},
purchase: {
executor: 'constant-arrival-rate',
rate: 10,
timeUnit: '1s',
duration: '9m',
preAllocatedVUs: 50,
exec: 'purchaseFlow',
},
},
thresholds: {
'http_req_duration{scenario:browse}': ['p(95)<150'],
'http_req_duration{scenario:purchase}': ['p(95)<300'],
},
};
export function browseProducts() {
group('Browse Flow', () => {
const products = http.get(`${BASE_URL}/api/products`);
check(products, { 'products loaded': (r) => r.status === 200 });
if (products.status === 200) {
const data = JSON.parse(products.body);
if (data.length > 0) {
const productId = data[Math.floor(Math.random() * data.length)].id;
http.get(`${BASE_URL}/api/products/${productId}`);
}
}
});
sleep(2);
}
export function purchaseFlow() {
group('Purchase Flow', () => {
const cart = http.post(`${BASE_URL}/api/cart`, JSON.stringify({
productId: 'prod_123',
quantity: 1,
}), { headers: { 'Content-Type': 'application/json' } });
check(cart, { 'added to cart': (r) => r.status === 201 });
if (cart.status !== 201) return; // skip checkout if the cart call failed rather than parsing an error body
const order = http.post(`${BASE_URL}/api/orders`, JSON.stringify({
cartId: JSON.parse(cart.body).cartId,
}), { headers: { 'Content-Type': 'application/json' } });
check(order, { 'order created': (r) => r.status === 201 });
});
}
- Run k6 test:
# Basic run
k6 run load-test.js
# With environment variables
k6 run -e BASE_URL=https://api.example.com load-test.js
# Output to JSON
k6 run --out json=results.json load-test.js
# Output to InfluxDB for Grafana
k6 run --out influxdb=http://localhost:8086/k6 load-test.js
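A handleSummary hook is an alternative to the CLI output flags: the script itself writes the end-of-test summary to disk, which is the file the CI threshold check in Phase 5 parses. A minimal sketch (the target URL is a placeholder):
import http from 'k6/http';
export default function () {
  http.get(__ENV.BASE_URL || 'http://localhost:3000');
}
// handleSummary replaces the default end-of-test summary. Returning a map of
// filename -> content writes the full metrics object (including
// metrics.http_req_failed.values.rate) to results.json.
export function handleSummary(data) {
  return {
    'results.json': JSON.stringify(data, null, 2),
  };
}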
Phase 3: Artillery Implementation
Objective: Create Artillery test scenarios.
- Basic Artillery config:
# artillery.yml
config:
target: "http://api.example.com"
phases:
- duration: 120
arrivalRate: 10
name: "Warm up"
- duration: 300
arrivalRate: 50
name: "Sustained load"
- duration: 120
arrivalRate: 100
name: "Peak load"
defaults:
headers:
Content-Type: "application/json"
ensure:
p95: 200
maxErrorRate: 1
scenarios:
- name: "Browse products"
weight: 7
flow:
- get:
url: "/api/products"
capture:
- json: "$[0].id"
as: "productId"
- think: 2
- get:
url: "/api/products/{{ productId }}"
- name: "Create order"
weight: 3
flow:
- post:
url: "/api/cart"
json:
productId: "prod_123"
quantity: 1
capture:
- json: "$.cartId"
as: "cartId"
- think: 1
- post:
url: "/api/orders"
json:
cartId: "{{ cartId }}" -
Run Artillery:
# Run test
artillery run artillery.yml
# Generate report
artillery run --output report.json artillery.yml
artillery report report.json --output report.html
Phase 4: Locust Implementation
Objective: Create Locust test with Python.
- Locust script:
# locustfile.py
from locust import HttpUser, task, between
import random
class WebsiteUser(HttpUser):
    weight = 10  # 10x more common than AdminUser (weight = 1 below)
    wait_time = between(1, 5)
def on_start(self):
"""Login on start"""
self.client.post("/api/login", json={
"email": "test@example.com",
"password": "password123"
})
@task(10)
def browse_products(self):
"""High frequency: browse products"""
with self.client.get("/api/products", catch_response=True) as response:
if response.status_code == 200:
products = response.json()
if products:
product_id = random.choice(products)["id"]
self.client.get(f"/api/products/{product_id}")
else:
response.failure(f"Got status {response.status_code}")
@task(3)
def add_to_cart(self):
"""Medium frequency: add to cart"""
self.client.post("/api/cart", json={
"productId": "prod_123",
"quantity": random.randint(1, 3)
})
@task(1)
def checkout(self):
"""Low frequency: checkout"""
cart = self.client.post("/api/cart", json={
"productId": "prod_456",
"quantity": 1
})
if cart.status_code == 201:
cart_id = cart.json()["cartId"]
self.client.post("/api/orders", json={"cartId": cart_id})
class AdminUser(HttpUser):
wait_time = between(5, 15)
weight = 1 # 1/10 of WebsiteUser
@task
def view_dashboard(self):
self.client.get("/api/admin/dashboard")
@task
def view_reports(self):
self.client.get("/api/admin/reports") -
Run Locust:
# Web UI mode
locust -f locustfile.py --host=http://api.example.com
# Headless mode
locust -f locustfile.py --host=http://api.example.com \
--headless -u 100 -r 10 -t 5m
# Distributed mode
locust -f locustfile.py --master
locust -f locustfile.py --worker --master-host=192.168.1.100
Phase 5: CI/CD Integration
Objective: Integrate load tests into CI/CD pipeline.
- GitHub Actions workflow:
# .github/workflows/load-test.yml
name: Load Testing
on:
pull_request:
branches: [main]
schedule:
- cron: '0 2 * * *' # Daily at 2 AM
jobs:
load-test:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Install k6
run: |
curl -s https://dl.k6.io/key.gpg | sudo apt-key add -
echo "deb https://dl.k6.io/deb stable main" | sudo tee /etc/apt/sources.list.d/k6.list
sudo apt-get update
sudo apt-get install k6
- name: Run load test
run: |
# Assumes tests/load/api-test.js defines a handleSummary hook that writes results.json
# (see the Phase 2 sketch); a plain --out json=... emits streamed data points, not this summary.
k6 run tests/load/api-test.js
env:
BASE_URL: ${{ secrets.STAGING_URL }}
- name: Check thresholds
run: |
FAILED=$(jq '.metrics.http_req_failed.values.rate' results.json)
if (( $(echo "$FAILED > 0.01" | bc -l) )); then
echo "Error rate too high: $FAILED"
exit 1
fi
- name: Upload results
uses: actions/upload-artifact@v4
with:
name: load-test-results
path: results.json
Threshold Configuration
| Metric | Good | Warning | Critical |
|---|---|---|---|
| P50 Latency | <100ms | <200ms | >200ms |
| P95 Latency | <200ms | <500ms | >500ms |
| P99 Latency | <500ms | <1000ms | >1000ms |
| Error Rate | <0.1% | <1% | >1% |
| Throughput | >1000 RPS | >500 RPS | <500 RPS |
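In k6, the "Good" column translates into a thresholds block along these lines (a sketch; tune the numbers to your own SLAs, and the target URL is a placeholder):
import http from 'k6/http';
export const options = {
  thresholds: {
    http_req_duration: ['p(50)<100', 'p(95)<200', 'p(99)<500'], // latency targets in ms
    http_req_failed: ['rate<0.001'],                            // error rate below 0.1%
    http_reqs: ['rate>1000'],                                   // sustain more than 1000 requests/s
  },
};
export default function () {
  http.get(__ENV.BASE_URL || 'http://localhost:3000');
}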
Examples
Example 1: Quick API Benchmark
# Using wrk (simple, fast)
wrk -t12 -c400 -d30s http://api.example.com/health
# Using hey (Go-based, detailed)
hey -n 10000 -c 100 http://api.example.com/api/users
# Using vegeta (rate-controlled)
echo "GET http://api.example.com/api/users" | vegeta attack -duration=30s -rate=100 | vegeta report
Example 2: WebSocket Load Test (k6)
import ws from 'k6/ws';
import { check } from 'k6';
export default function () {
const url = 'ws://echo.websocket.org';
const res = ws.connect(url, {}, function (socket) {
socket.on('open', () => {
socket.send('Hello from k6!');
});
socket.on('message', (data) => {
check(data, { 'message received': (d) => d === 'Hello from k6!' });
socket.close();
});
socket.setTimeout(() => socket.close(), 5000);
});
check(res, { 'connected successfully': (r) => r && r.status === 101 });
}
Troubleshooting
| Issue | Solution |
|---|---|
| Connection refused | Check target server is running and accessible |
| High error rate | Reduce VUs, check server logs |
| Inconsistent results | Increase test duration, use warm-up phase |
| Resource exhaustion | Distribute load across multiple machines |
| SSL errors | Add --insecure-skip-tls-verify or configure certs |
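For the SSL row, k6 can also skip TLS verification from the script options instead of the CLI flag. A small sketch for staging environments with self-signed certificates (the host is a placeholder; never use this where certificate validation matters):
import http from 'k6/http';
export const options = {
  insecureSkipTLSVerify: true, // equivalent to the --insecure-skip-tls-verify flag
};
export default function () {
  http.get(__ENV.BASE_URL || 'https://staging.example.internal/health');
}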
Best Practices
- Define clear success criteria before testing
- Use realistic think times between requests
- Include warm-up and cool-down phases
- Test from multiple geographic locations
- Monitor server-side metrics during tests
- Version control test scripts
- Run tests in dedicated environments
- Store historical results for trending
Success Output
When successful, this skill MUST output:
✅ SKILL COMPLETE: load-testing
Completed:
- [x] Test scenarios defined (load/stress/spike/soak/breakpoint)
- [x] Success criteria established (latency, error rate, throughput)
- [x] Load test scripts written (k6/Artillery/Locust)
- [x] Tests executed successfully
- [x] Results analyzed against thresholds
- [x] Performance report generated
Outputs:
- tests/load/ (test scripts: load-test.js, artillery.yml, locustfile.py)
- results/load-test-results.json (k6 JSON output with metrics)
- results/report.html (Artillery HTML report)
- docs/load-test-report.md (analysis: pass/fail, bottlenecks, recommendations)
Metrics:
- P95 latency: X ms (threshold: <200ms)
- Error rate: Y% (threshold: <1%)
- Throughput: Z RPS (threshold: >1000 RPS)
- Test result: PASS/FAIL
Completion Checklist
Before marking this skill as complete, verify:
- Test targets identified (endpoints, scenarios)
- Success criteria defined (latency percentiles, error rates, throughput)
- User scenarios documented (browse, purchase, admin flows)
- Think times configured (realistic user behavior)
- Load test tool selected and configured (k6, Artillery, or Locust)
- Test stages defined (ramp-up, sustained, peak, ramp-down)
- Thresholds configured in test script
- Test executed without errors
- Results exported to JSON/HTML
- Thresholds validated (all passed or failures documented)
- Bottlenecks identified (if any threshold failed)
- Recommendations provided (scaling, optimization, code fixes)
Failure Indicators
This skill has FAILED if:
- ❌ Test script has syntax errors (cannot execute)
- ❌ Connection refused to target server (server not running)
- ❌ Error rate >50% (test setup issue, not server issue)
- ❌ No results exported (test ran but no output captured)
- ❌ Thresholds not configured (no pass/fail criteria)
- ❌ Resource exhaustion on load generator (need distributed mode)
- ❌ Test duration <2 minutes (not long enough for steady state)
When NOT to Use
Do NOT use this skill when:
- Need code-level profiling (use performance-profiler agent instead)
- Unit testing functionality (use standard test frameworks)
- Security vulnerability testing (use penetration-testing agent)
- Simple benchmarks sufficient (use wrk or hey directly)
- No performance requirements defined (establish SLAs first)
- Production environment (use staging/dedicated test environment)
- Single-endpoint latency check (use curl with timing)
Use alternative skills:
- performance-profiler - When you need CPU/memory profiling at the code level
- chaos-engineering - When testing failure scenarios and resilience
- penetration-testing - When security is the primary concern
Anti-Patterns (Avoid)
| Anti-Pattern | Problem | Solution |
|---|---|---|
| Testing production directly | Risk of outage, data corruption | Use staging or dedicated load test environment |
| No warm-up phase | Cold start skews results | Include 2-minute ramp-up before sustained load |
| Ignoring think times | Unrealistic traffic pattern | Add 1-5 second sleep between requests |
| Fixed duration too short | Miss memory leaks, gradual degradation | Soak tests: 1-8 hours; others: 10-30 minutes |
| No threshold configuration | No objective pass/fail | Define P95 latency, error rate, throughput thresholds |
| Running from single machine | Resource limits on load generator | Use distributed mode for >500 VUs |
| Inconsistent test data | Caching artifacts skew results | Randomize test data or reset between runs |
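For the test-data row, k6's SharedArray loads a data file once per test and shares it read-only across VUs, so each iteration can pick a random record instead of hammering a single cache key. A sketch, assuming a hypothetical local users.json file containing objects with an id field:
import http from 'k6/http';
import { SharedArray } from 'k6/data';
// Loaded once in the init context and shared across all VUs.
const users = new SharedArray('users', function () {
  return JSON.parse(open('./users.json')); // hypothetical data file
});
export default function () {
  const user = users[Math.floor(Math.random() * users.length)];
  http.get(`${__ENV.BASE_URL || 'http://localhost:3000'}/api/users/${user.id}`);
}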
Principles
This skill embodies:
- #2 First Principles - Understand expected load before designing test
- #5 Eliminate Ambiguity - Clear thresholds eliminate subjective performance assessment
- #6 Clear, Understandable, Explainable - Metrics and graphs make performance explicit
- #8 No Assumptions - Measure actual performance, don't assume scalability
- #10 Complete Execution - Automated test → analysis → report, no manual steps
Full Standard: CODITECT-STANDARD-AUTOMATION.md
Status: Production-ready
Tools: k6, Artillery, Locust, wrk, hey, vegeta
Test Types: Load, Stress, Spike, Soak, Breakpoint
CI Integration: GitHub Actions, GitLab CI, Jenkins