Load Testing Skill

How to Use This Skill

  1. Review the patterns and examples below
  2. Apply the relevant patterns to your implementation
  3. Follow the best practices outlined in this skill

Production-ready load testing skill covering major tools (k6, Artillery, Locust), test scenario design, threshold configuration, and CI/CD integration for continuous performance validation.

When to Use This Skill

Use load-testing when:

  • Validating application performance under expected load
  • Stress testing to find breaking points
  • Capacity planning for infrastructure scaling
  • Performance regression testing in CI/CD
  • Simulating realistic user behavior patterns
  • Benchmarking API endpoints

Don't use load-testing when:

  • Profiling code-level bottlenecks (use performance-profiler agent)
  • Unit testing functionality (use standard test frameworks)
  • Security testing (use penetration-testing agent)
  • Only need simple benchmarks (use wrk or hey directly)

Load Test Types

| Type | Purpose | Duration | Load Pattern |
|------|---------|----------|--------------|
| Load | Validate normal traffic | 10-30 min | Constant VUs |
| Stress | Find breaking point | 30-60 min | Ramping up |
| Spike | Test sudden traffic | 5-15 min | Sharp increase |
| Soak | Find memory leaks | 1-8 hours | Constant extended |
| Breakpoint | Find max capacity | Until failure | Incremental |
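The patterns in the table differ mainly in the shape of their ramp stages. As a sketch (the helper names and specific durations here are illustrative defaults, not mandated by any tool), k6-style `stages` arrays for a spike and a soak test could be built like this:

```javascript
// Build k6-style "stages" arrays for two of the test types above.
// Durations and VU ratios are illustrative assumptions, not tool defaults.
function spikeStages(peakVUs) {
  return [
    { duration: '1m', target: Math.floor(peakVUs / 10) },  // baseline traffic
    { duration: '10s', target: peakVUs },                  // sharp increase
    { duration: '3m', target: peakVUs },                   // hold the spike
    { duration: '10s', target: Math.floor(peakVUs / 10) }, // recover
  ];
}

function soakStages(vus, hours) {
  return [
    { duration: '5m', target: vus },              // ramp up
    { duration: `${hours * 60}m`, target: vus },  // constant extended load
    { duration: '5m', target: 0 },                // ramp down
  ];
}

console.log(JSON.stringify(spikeStages(500), null, 2));
console.log(JSON.stringify(soakStages(100, 4), null, 2));
```

Either array can be dropped into a k6 `options.stages` field; the point is that only the stage shape distinguishes the test types.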

Instructions

Phase 1: Test Planning

Objective: Define test scenarios and success criteria.

  1. Identify test targets:

    Target Endpoints:
    - GET /api/users - List users (high traffic)
    - POST /api/orders - Create order (critical path)
    - GET /api/products/:id - Product detail (cacheable)

    Success Criteria:
    - P95 latency < 200ms
    - Error rate < 1%
    - Throughput > 1000 RPS
  2. Define user scenarios:

    Scenario: Browse and Purchase
    1. User visits homepage (GET /)
    2. Browses products (GET /api/products?page=1)
    3. Views product detail (GET /api/products/:id)
    4. Adds to cart (POST /api/cart)
    5. Checkout (POST /api/orders)

    Think time: 1-5 seconds between actions
    Distribution: 70% browse, 20% add-to-cart, 10% purchase
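The scenario distribution and think times above can be sketched tool-agnostically. A minimal sketch, assuming a weighted random pick and uniform think times (the helper names are illustrative, not part of any load-testing API):

```javascript
// Weighted scenario selection (70% browse, 20% add-to-cart, 10% purchase)
// and a uniform 1-5 s think time, mirroring the plan above.
const SCENARIOS = [
  { name: 'browse', weight: 0.7 },
  { name: 'add-to-cart', weight: 0.2 },
  { name: 'purchase', weight: 0.1 },
];

function pickScenario(rand = Math.random()) {
  let cumulative = 0;
  for (const s of SCENARIOS) {
    cumulative += s.weight;
    if (rand < cumulative) return s.name;
  }
  return SCENARIOS[SCENARIOS.length - 1].name; // guard against float rounding
}

function thinkTimeSeconds() {
  return 1 + Math.random() * 4; // uniform in [1, 5)
}

console.log(pickScenario(), thinkTimeSeconds().toFixed(1));
```

The same weighting shows up later as k6 scenario executors, Artillery `weight` fields, and Locust `@task(n)` ratios.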

Phase 2: k6 Implementation

Objective: Write and run k6 load tests.

  1. Basic k6 script:

    // load-test.js
    import http from 'k6/http';
    import { check, sleep } from 'k6';
    import { Rate, Trend } from 'k6/metrics';

    // Custom metrics
    const errorRate = new Rate('errors');
    const responseTime = new Trend('response_time');

    // Test configuration
    export const options = {
      stages: [
        { duration: '2m', target: 50 },   // Ramp up
        { duration: '5m', target: 50 },   // Stay at 50 VUs
        { duration: '2m', target: 100 },  // Ramp to 100
        { duration: '5m', target: 100 },  // Stay at 100
        { duration: '2m', target: 0 },    // Ramp down
      ],
      thresholds: {
        http_req_duration: ['p(95)<200', 'p(99)<500'],
        http_req_failed: ['rate<0.01'],
        errors: ['rate<0.05'],
      },
    };

    export default function () {
      const res = http.get('http://api.example.com/users');

      // Check response
      const success = check(res, {
        'status is 200': (r) => r.status === 200,
        'response time < 200ms': (r) => r.timings.duration < 200,
      });

      // Record metrics
      errorRate.add(!success);
      responseTime.add(res.timings.duration);

      sleep(Math.random() * 3 + 1); // 1-4 second think time
    }
  2. k6 scenario with multiple endpoints:

    import http from 'k6/http';
    import { check, group, sleep } from 'k6';

    const BASE_URL = __ENV.BASE_URL || 'http://localhost:3000';

    export const options = {
      scenarios: {
        browse: {
          executor: 'ramping-vus',
          startVUs: 0,
          stages: [
            { duration: '2m', target: 100 },
            { duration: '5m', target: 100 },
            { duration: '2m', target: 0 },
          ],
          exec: 'browseProducts',
        },
        purchase: {
          executor: 'constant-arrival-rate',
          rate: 10,
          timeUnit: '1s',
          duration: '9m',
          preAllocatedVUs: 50,
          exec: 'purchaseFlow',
        },
      },
      thresholds: {
        'http_req_duration{scenario:browse}': ['p(95)<150'],
        'http_req_duration{scenario:purchase}': ['p(95)<300'],
      },
    };

    export function browseProducts() {
      group('Browse Flow', () => {
        const products = http.get(`${BASE_URL}/api/products`);
        check(products, { 'products loaded': (r) => r.status === 200 });

        if (products.status === 200) {
          const data = JSON.parse(products.body);
          if (data.length > 0) {
            const productId = data[Math.floor(Math.random() * data.length)].id;
            http.get(`${BASE_URL}/api/products/${productId}`);
          }
        }
      });
      sleep(2);
    }

    export function purchaseFlow() {
      group('Purchase Flow', () => {
        const cart = http.post(`${BASE_URL}/api/cart`, JSON.stringify({
          productId: 'prod_123',
          quantity: 1,
        }), { headers: { 'Content-Type': 'application/json' } });

        check(cart, { 'added to cart': (r) => r.status === 201 });

        const order = http.post(`${BASE_URL}/api/orders`, JSON.stringify({
          cartId: JSON.parse(cart.body).cartId,
        }), { headers: { 'Content-Type': 'application/json' } });

        check(order, { 'order created': (r) => r.status === 201 });
      });
    }
  3. Run k6 test:

    # Basic run
    k6 run load-test.js

    # With environment variables
    k6 run -e BASE_URL=https://api.example.com load-test.js

    # Output to JSON
    k6 run --out json=results.json load-test.js

    # Output to InfluxDB for Grafana
    k6 run --out influxdb=http://localhost:8086/k6 load-test.js

Phase 3: Artillery Implementation

Objective: Create Artillery test scenarios.

  1. Basic Artillery config:

    # artillery.yml
    config:
      target: "http://api.example.com"
      phases:
        - duration: 120
          arrivalRate: 10
          name: "Warm up"
        - duration: 300
          arrivalRate: 50
          name: "Sustained load"
        - duration: 120
          arrivalRate: 100
          name: "Peak load"

      defaults:
        headers:
          Content-Type: "application/json"

      ensure:
        p95: 200
        maxErrorRate: 1

    scenarios:
      - name: "Browse products"
        weight: 7
        flow:
          - get:
              url: "/api/products"
              capture:
                - json: "$[0].id"
                  as: "productId"
          - think: 2
          - get:
              url: "/api/products/{{ productId }}"

      - name: "Create order"
        weight: 3
        flow:
          - post:
              url: "/api/cart"
              json:
                productId: "prod_123"
                quantity: 1
              capture:
                - json: "$.cartId"
                  as: "cartId"
          - think: 1
          - post:
              url: "/api/orders"
              json:
                cartId: "{{ cartId }}"
  2. Run Artillery:

    # Run test
    artillery run artillery.yml

    # Generate report
    artillery run --output report.json artillery.yml
    artillery report report.json --output report.html

Phase 4: Locust Implementation

Objective: Create Locust test with Python.

  1. Locust script:

    # locustfile.py
    from locust import HttpUser, task, between
    import random

    class WebsiteUser(HttpUser):
        wait_time = between(1, 5)
        weight = 10  # 10x more common than AdminUser

        def on_start(self):
            """Login on start"""
            self.client.post("/api/login", json={
                "email": "test@example.com",
                "password": "password123"
            })

        @task(10)
        def browse_products(self):
            """High frequency: browse products"""
            with self.client.get("/api/products", catch_response=True) as response:
                if response.status_code == 200:
                    products = response.json()
                    if products:
                        product_id = random.choice(products)["id"]
                        self.client.get(f"/api/products/{product_id}")
                else:
                    response.failure(f"Got status {response.status_code}")

        @task(3)
        def add_to_cart(self):
            """Medium frequency: add to cart"""
            self.client.post("/api/cart", json={
                "productId": "prod_123",
                "quantity": random.randint(1, 3)
            })

        @task(1)
        def checkout(self):
            """Low frequency: checkout"""
            cart = self.client.post("/api/cart", json={
                "productId": "prod_456",
                "quantity": 1
            })
            if cart.status_code == 201:
                cart_id = cart.json()["cartId"]
                self.client.post("/api/orders", json={"cartId": cart_id})

    class AdminUser(HttpUser):
        wait_time = between(5, 15)
        weight = 1  # 1/10 of WebsiteUser

        @task
        def view_dashboard(self):
            self.client.get("/api/admin/dashboard")

        @task
        def view_reports(self):
            self.client.get("/api/admin/reports")
  2. Run Locust:

    # Web UI mode
    locust -f locustfile.py --host=http://api.example.com

    # Headless mode
    locust -f locustfile.py --host=http://api.example.com \
    --headless -u 100 -r 10 -t 5m

    # Distributed mode
    locust -f locustfile.py --master
    locust -f locustfile.py --worker --master-host=192.168.1.100

Phase 5: CI/CD Integration

Objective: Integrate load tests into CI/CD pipeline.

  1. GitHub Actions workflow:

    # .github/workflows/load-test.yml
    name: Load Testing

    on:
      pull_request:
        branches: [main]
      schedule:
        - cron: '0 2 * * *' # Daily at 2 AM

    jobs:
      load-test:
        runs-on: ubuntu-latest

        steps:
          - uses: actions/checkout@v4

          - name: Install k6
            run: |
              curl -s https://dl.k6.io/key.gpg | sudo apt-key add -
              echo "deb https://dl.k6.io/deb stable main" | sudo tee /etc/apt/sources.list.d/k6.list
              sudo apt-get update
              sudo apt-get install -y k6

          - name: Run load test
            run: |
              k6 run --out json=results.json --summary-export=summary.json tests/load/api-test.js
            env:
              BASE_URL: ${{ secrets.STAGING_URL }}

          # --out json emits newline-delimited data points; the end-of-test
          # summary needed for this check comes from --summary-export.
          - name: Check thresholds
            run: |
              FAILED=$(jq '.metrics.http_req_failed.value' summary.json)
              if (( $(echo "$FAILED > 0.01" | bc -l) )); then
                echo "Error rate too high: $FAILED"
                exit 1
              fi

          - name: Upload results
            uses: actions/upload-artifact@v4
            with:
              name: load-test-results
              path: |
                results.json
                summary.json

Threshold Configuration

| Metric | Good | Warning | Critical |
|--------|------|---------|----------|
| P50 Latency | <100ms | <200ms | >200ms |
| P95 Latency | <200ms | <500ms | >500ms |
| P99 Latency | <500ms | <1000ms | >1000ms |
| Error Rate | <0.1% | <1% | >1% |
| Throughput | >1000 RPS | >500 RPS | <500 RPS |
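The percentile rows above come from aggregating per-request latency samples. A minimal sketch of that aggregation, assuming the nearest-rank percentile method (real tools may interpolate between samples instead):

```javascript
// Nearest-rank percentile over collected latency samples (in ms): the kind
// of aggregation behind the P50/P95/P99 rows above. The method choice is an
// assumption for illustration.
function percentile(samples, p) {
  if (samples.length === 0) throw new Error('no samples');
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(0, rank - 1)];
}

// Evaluate samples against the P95 < 200ms and error rate < 1% thresholds.
function evaluate(samples, errorCount) {
  const errorRate = errorCount / samples.length;
  return {
    p95: percentile(samples, 95),
    p99: percentile(samples, 99),
    errorRate,
    pass: percentile(samples, 95) < 200 && errorRate < 0.01,
  };
}

const latencies = Array.from({ length: 1000 }, (_, i) => 50 + (i % 150));
console.log(evaluate(latencies, 3));
```

Note that P99 needs roughly 100+ samples to be meaningful at all; short runs with few requests make the Critical column trip on noise.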

Examples

Example 1: Quick API Benchmark

# Using wrk (simple, fast)
wrk -t12 -c400 -d30s http://api.example.com/health

# Using hey (Go-based, detailed)
hey -n 10000 -c 100 http://api.example.com/api/users

# Using vegeta (rate-controlled)
echo "GET http://api.example.com/api/users" | vegeta attack -duration=30s -rate=100 | vegeta report

Example 2: WebSocket Load Test (k6)

import ws from 'k6/ws';
import { check } from 'k6';

export default function () {
const url = 'ws://echo.websocket.org';
const res = ws.connect(url, {}, function (socket) {
socket.on('open', () => {
socket.send('Hello from k6!');
});

socket.on('message', (data) => {
check(data, { 'message received': (d) => d === 'Hello from k6!' });
socket.close();
});

socket.setTimeout(() => socket.close(), 5000);
});

check(res, { 'connected successfully': (r) => r && r.status === 101 });
}

Troubleshooting

| Issue | Solution |
|-------|----------|
| Connection refused | Check that the target server is running and accessible |
| High error rate | Reduce VUs, check server logs |
| Inconsistent results | Increase test duration, use a warm-up phase |
| Resource exhaustion | Distribute load across multiple machines |
| SSL errors | Add --insecure-skip-tls-verify or configure certs |

Best Practices

  • Define clear success criteria before testing
  • Use realistic think times between requests
  • Include warm-up and cool-down phases
  • Test from multiple geographic locations
  • Monitor server-side metrics during tests
  • Version control test scripts
  • Run tests in dedicated environments
  • Store historical results for trending

Success Output

When successful, this skill MUST output:

✅ SKILL COMPLETE: load-testing

Completed:
- [x] Test scenarios defined (load/stress/spike/soak/breakpoint)
- [x] Success criteria established (latency, error rate, throughput)
- [x] Load test scripts written (k6/Artillery/Locust)
- [x] Tests executed successfully
- [x] Results analyzed against thresholds
- [x] Performance report generated

Outputs:
- tests/load/ (test scripts: load-test.js, artillery.yml, locustfile.py)
- results/load-test-results.json (k6 JSON output with metrics)
- results/report.html (Artillery HTML report)
- docs/load-test-report.md (analysis: pass/fail, bottlenecks, recommendations)

Metrics:
- P95 latency: X ms (threshold: <200ms)
- Error rate: Y% (threshold: <1%)
- Throughput: Z RPS (threshold: >1000 RPS)
- Test result: PASS/FAIL

Completion Checklist

Before marking this skill as complete, verify:

  • Test targets identified (endpoints, scenarios)
  • Success criteria defined (latency percentiles, error rates, throughput)
  • User scenarios documented (browse, purchase, admin flows)
  • Think times configured (realistic user behavior)
  • Load test tool selected and configured (k6, Artillery, or Locust)
  • Test stages defined (ramp-up, sustained, peak, ramp-down)
  • Thresholds configured in test script
  • Test executed without errors
  • Results exported to JSON/HTML
  • Thresholds validated (all passed or failures documented)
  • Bottlenecks identified (if any threshold failed)
  • Recommendations provided (scaling, optimization, code fixes)

Failure Indicators

This skill has FAILED if:

  • ❌ Test script has syntax errors (cannot execute)
  • ❌ Connection refused to target server (server not running)
  • ❌ Error rate >50% (test setup issue, not server issue)
  • ❌ No results exported (test ran but no output captured)
  • ❌ Thresholds not configured (no pass/fail criteria)
  • ❌ Resource exhaustion on load generator (need distributed mode)
  • ❌ Test duration <2 minutes (not long enough for steady state)

When NOT to Use

Do NOT use this skill when:

  • Need code-level profiling (use performance-profiler agent instead)
  • Unit testing functionality (use standard test frameworks)
  • Security vulnerability testing (use penetration-testing agent)
  • Simple benchmarks sufficient (use wrk or hey directly)
  • No performance requirements defined (establish SLAs first)
  • Production environment (use staging/dedicated test environment)
  • Single-endpoint latency check (use curl with timing)

Use alternative skills:

  • performance-profiler - When need CPU/memory profiling at code level
  • chaos-engineering - When testing failure scenarios and resilience
  • penetration-testing - When security is primary concern

Anti-Patterns (Avoid)

| Anti-Pattern | Problem | Solution |
|--------------|---------|----------|
| Testing production directly | Risk of outage, data corruption | Use staging or a dedicated load test environment |
| No warm-up phase | Cold start skews results | Include a 2-minute ramp-up before sustained load |
| Ignoring think times | Unrealistic traffic pattern | Add 1-5 second sleeps between requests |
| Fixed duration too short | Misses memory leaks and gradual degradation | Soak tests: 1-8 hours; others: 10-30 minutes |
| No threshold configuration | No objective pass/fail | Define P95 latency, error rate, and throughput thresholds |
| Running from a single machine | Resource limits on the load generator | Use distributed mode for >500 VUs |
| Inconsistent test data | Caching artifacts skew results | Randomize test data or reset between runs |

Principles

This skill embodies:

  • #2 First Principles - Understand expected load before designing test
  • #5 Eliminate Ambiguity - Clear thresholds eliminate subjective performance assessment
  • #6 Clear, Understandable, Explainable - Metrics and graphs make performance explicit
  • #8 No Assumptions - Measure actual performance, don't assume scalability
  • #10 Complete Execution - Automated test → analysis → report, no manual steps

Full Standard: CODITECT-STANDARD-AUTOMATION.md


Status: Production-ready
Tools: k6, Artillery, Locust, wrk, hey, vegeta
Test Types: Load, Stress, Spike, Soak, Breakpoint
CI Integration: GitHub Actions, GitLab CI, Jenkins