Work Order QMS Module — Testing & Validation Strategy
Classification: Internal — Quality Engineering Date: 2026-02-13 Artifact: 65 of WO System Series Prompt Section: v8.0 §9 — Testing & Validation Strategy
1. Test Pyramid
1.1 Pyramid Definition
```text
               ╱╲
              ╱  ╲   Compliance Validation (IQ/OQ/PQ)
             ╱ 3% ╲  Full regulatory workflow evidence
            ╱──────╲
           ╱        ╲   E2E Tests
          ╱    5%    ╲  Multi-container, browser-driven
         ╱────────────╲
        ╱              ╲   Contract Tests
       ╱       8%       ╲  API contracts, message schemas
      ╱──────────────────╲
     ╱                    ╲   Integration Tests
    ╱         19%          ╲  Database, NATS, Vault, external
   ╱────────────────────────╲
  ╱                          ╲   Unit Tests
 ╱             65%            ╲  Pure logic, deterministic, fast
╱──────────────────────────────╲
```
1.2 Layer Details
| Layer | Scope | Target Count | Speed | Run When | Tooling | Compliance Role |
|---|---|---|---|---|---|---|
| Unit | Single function, class, or module | ~650 | <10ms each | Every commit (pre-push hook) | Vitest (TS), pytest (Python) | Logic correctness evidence |
| Integration | Component + real dependency | ~190 | <2s each | Every PR | Vitest + Testcontainers, pytest + Testcontainers | Data integrity, query correctness |
| Contract | API endpoint shape, message schema | ~80 | <500ms each | Every PR | Pact (consumer-driven), JSON Schema validation | Interface compliance |
| E2E | Full workflow, multi-container | ~50 | <30s each | Pre-merge + nightly | Playwright (browser), supertest (API) | Workflow correctness |
| Compliance | Regulatory evidence generation | ~30 | <60s each | Pre-release + quarterly | Custom validation harness | IQ/OQ/PQ evidence |
| Total | — | ~1,000 | — | — | — | — |
1.3 Unit Test Coverage Targets
| Component | Coverage Target | Critical Paths (100% required) |
|---|---|---|
| State machine guards (T1–T8) | 100% | All guard functions, all transition paths |
| Model Router | 95% | All routing rules, all fallback paths |
| RBAC permission checks | 100% | All role × permission combinations |
| SOD enforcement | 100% | All conflict detection rules |
| DAG cycle detection | 95% | Kahn's algorithm, all edge cases |
| Optimistic locking | 95% | Version check, conflict detection, retry logic |
| Audit trail generation | 100% | All entity types × all action types |
| Hash chain computation | 100% | Hash generation, chain verification, break detection |
| Token budget controller | 95% | Budget allocation, threshold enforcement, hard stop |
| Circuit breaker | 95% | State transitions (closed → open → half-open), recovery |
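The circuit breaker row above calls for 100% of its state transitions under test. A minimal sketch of the closed → open → half-open cycle those unit tests exercise — class name, thresholds, and the injectable clock are illustrative assumptions, not the production API:

```typescript
// Illustrative circuit breaker; thresholds and names are assumptions.
type BreakerState = "closed" | "open" | "half-open";

class CircuitBreaker {
  private openedAt = 0;
  private failures = 0;
  private state: BreakerState = "closed";

  constructor(
    private readonly failureThreshold = 5,
    private readonly cooldownMs = 30_000,
    private readonly now: () => number = Date.now, // injectable for deterministic tests
  ) {}

  getState(): BreakerState {
    // Cooldown elapsed: allow one trial call through.
    if (this.state === "open" && this.now() - this.openedAt >= this.cooldownMs) {
      this.state = "half-open";
    }
    return this.state;
  }

  recordSuccess(): void {
    this.failures = 0;
    this.state = "closed"; // half-open trial succeeded → recover
  }

  recordFailure(): void {
    this.failures += 1;
    if (this.state === "half-open" || this.failures >= this.failureThreshold) {
      this.state = "open"; // trip (or re-trip after a failed trial)
      this.openedAt = this.now();
      this.failures = 0;
    }
  }
}
```

Injecting the clock is what keeps these unit tests in the <10ms, deterministic tier: recovery is driven by an advanced counter, not a real `setTimeout`.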
1.4 Integration Test Scope
| Test Category | What's Tested | Real Dependencies | Mock Dependencies |
|---|---|---|---|
| Database operations | CRUD, RLS enforcement, triggers, migrations | PostgreSQL (Testcontainers) | None |
| Event bus | Publish, subscribe, ordering, backpressure | NATS (Testcontainers) | None |
| Audit trail immutability | Trigger blocks UPDATE/DELETE | PostgreSQL (Testcontainers) | None |
| Cross-tenant isolation | RLS prevents cross-tenant reads | PostgreSQL (Testcontainers) | None |
| Agent message contracts | Message serialization, validation, routing | NATS (Testcontainers) | AI model (deterministic stub) |
| Vault integration | Secret retrieval, rotation | Vault dev mode (Testcontainers) | None |
| API endpoint behavior | Request → response, auth, rate limiting | Express (in-process) | Database (seeded Testcontainer) |
1.5 Contract Tests
| Contract | Provider | Consumer | Schema Source |
|---|---|---|---|
| WO REST API | WO Service | Frontend, Agent Workers | OpenAPI 3.1 spec |
| Agent messages | Agent Workers | Orchestrator, Compliance Engine | TypeScript interfaces (26-agent-message-contracts.md) |
| Audit trail events | WO Service | Compliance Engine, SIEM connector | Event schema (JSON Schema) |
| Webhook payloads | WO Service | External subscribers | Webhook schema (JSON Schema) |
| Approval/signature flow | Signature Service | WO Service, Frontend | Signature API contract |
Contract test verification: provider publishes contract → consumer tests against contract → breaking changes detected before merge.
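A stripped-down sketch of what a schema-driven contract test asserts — the field names mirror the audit-trail event contract above, but this hand-rolled checker is only illustrative (the real suite uses Pact and JSON Schema validators):

```typescript
// Minimal structural contract check; field names are illustrative.
interface FieldSpec { type: "string" | "number" | "boolean"; required: boolean }
type Contract = Record<string, FieldSpec>;

const auditEventContract: Contract = {
  entityType: { type: "string", required: true },
  action:     { type: "string", required: true },
  tenantId:   { type: "string", required: true },
  sequence:   { type: "number", required: true },
  reason:     { type: "string", required: false },
};

function violations(payload: Record<string, unknown>, contract: Contract): string[] {
  const errs: string[] = [];
  for (const [field, spec] of Object.entries(contract)) {
    const value = payload[field];
    if (value === undefined) {
      if (spec.required) errs.push(`missing required field: ${field}`);
    } else if (typeof value !== spec.type) {
      errs.push(`wrong type for ${field}: expected ${spec.type}`);
    }
  }
  return errs;
}
```

A provider change that renames or retypes a field surfaces as a non-empty violation list in the consumer's CI run, before merge.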
2. Test Data Management
2.1 Synthetic Data Strategy
Production data is never used in non-production environments. All test data is synthetic.
| Data Category | Generation Strategy | Regulatory Constraint | Tooling |
|---|---|---|---|
| Person records | Faker-generated names, emails, phone numbers | Must not match any real individual; must pass format validation | @faker-js/faker with deterministic seed |
| Work orders | Template-based generation covering all WO types, statuses, and regulatory flags | Must cover every state machine transition path | Custom seed script (seed-work-orders.ts) |
| Approval chains | Combinatorial generation of all role × decision paths | Must include SOD-compliant and SOD-violating scenarios | Custom generator with constraint solver |
| Audit trails | Generated from WO lifecycle execution (not fabricated independently) | Must be internally consistent — audit entries match WO transitions | Generated as side-effect of WO test execution |
| Asset/tool catalog | Realistic bioscience equipment names and categories | No patient/subject identifiers | Static fixtures in test/fixtures/assets.json |
| Multi-tenant data | Identical schema, different tenant_id values | Must test RLS isolation between tenants | Seed script creates 3 test tenants |
| Edge cases | Property-based generation (boundary values, null fields, max-length strings) | Must test validation boundaries | fast-check property-based testing |
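The determinism requirement (same seed → same data) can be sketched without faker itself — here a tiny seeded PRNG (mulberry32) stands in for `faker.seed(42)`; the name pools and email domain are invented, deliberately matching no real person:

```typescript
// Deterministic synthetic-data sketch: seeded PRNG in place of faker.seed(42).
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Invented name pools — synthetic by construction, no real individuals.
const FIRST = ["Avery", "Blake", "Casey", "Devon", "Emery"];
const LAST = ["Stone", "Rivers", "Hale", "Frost", "Lane"];

function syntheticPerson(rand: () => number): { name: string; email: string } {
  const first = FIRST[Math.floor(rand() * FIRST.length)];
  const last = LAST[Math.floor(rand() * LAST.length)];
  return {
    name: `${first} ${last}`,
    email: `${first}.${last}@example.test`.toLowerCase(),
  };
}
```

Because generation is a pure function of the seed, the CI check in §2.3 ("same seed → same data") reduces to regenerating and comparing.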
2.2 Seed Data Structure
```text
test/
├── fixtures/
│   ├── assets.json                   # 50 bioscience assets
│   ├── tools.json                    # 30 tools with calibration data
│   ├── experiences.json              # 20 experience/certification types
│   ├── materials.json                # 15 material types
│   └── persons.json                  # 25 persons across all roles
├── factories/
│   ├── work-order.factory.ts         # Creates WOs with configurable complexity
│   ├── approval.factory.ts           # Creates approval chains (valid and invalid)
│   ├── job-plan.factory.ts           # Creates job plans with requirements
│   └── tenant.factory.ts             # Creates isolated tenant contexts
├── scenarios/
│   ├── happy-path.scenario.ts        # WO: draft → completed → closed
│   ├── regulatory.scenario.ts        # Full Part 11 workflow with signatures
│   ├── master-linked.scenario.ts     # Master WO with 5 linked WOs + dependencies
│   ├── vendor.scenario.ts            # Vendor assignment → execution → evidence
│   ├── rejection.scenario.ts         # WO rejected → revision → re-approval
│   ├── cancellation.scenario.ts      # WO cancelled at various stages
│   └── concurrent.scenario.ts        # Optimistic locking conflict resolution
└── seeds/
    ├── seed-all.ts                   # Master seed script (deterministic)
    ├── seed-minimal.ts               # Minimum viable data for unit tests
    └── seed-performance.ts           # 10,000 WOs for load testing
```
2.3 Data Isolation Rules
| Rule | Implementation | Verification |
|---|---|---|
| No production data in test environments | Network isolation (test env cannot reach production DB) | Monthly audit of test DB contents |
| Deterministic seed data | All factories use seeded PRNG (faker.seed(42)) | Same seed → same data (verified in CI) |
| PHI-free certification | PHI scanner runs on test fixtures before commit | CI gate blocks commits with PHI patterns |
| Tenant isolation in tests | Each test suite creates its own tenant context | RLS verification test: cross-tenant query returns 0 rows |
| Test data cleanup | Transactional rollback per test (unit/integration); DB reset per suite (E2E) | Post-suite assertion: no orphaned test data |
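The PHI-free certification gate can be sketched as a pattern scan over fixture text. The three patterns below are illustrative examples only, not the full rule set the CI gate runs:

```typescript
// Sketch of the pre-commit PHI scanner; patterns are illustrative, not exhaustive.
const PHI_PATTERNS: Array<{ name: string; re: RegExp }> = [
  { name: "ssn", re: /\b\d{3}-\d{2}-\d{4}\b/ },            // US SSN format
  { name: "mrn", re: /\bMRN[-:\s]?\d{6,}\b/i },            // medical record number
  { name: "dob", re: /\bDOB[-:\s]\s*\d{1,2}\/\d{1,2}\/\d{4}\b/i }, // date of birth tag
];

// Returns the names of all patterns that match; non-empty → block the commit.
function findPhi(fixtureText: string): string[] {
  return PHI_PATTERNS.filter(({ re }) => re.test(fixtureText)).map(({ name }) => name);
}
```

In CI this would run over every file under `test/fixtures/` and fail the build on any hit, satisfying the "CI gate blocks commits with PHI patterns" rule above.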
3. Performance Testing
3.1 Performance Budgets
| Operation | P50 Target | P95 Target | P99 Target | Measurement |
|---|---|---|---|---|
| Create WO (single) | <50ms | <200ms | <500ms | API response time |
| State transition | <100ms | <300ms | <1s | API response time (excludes approval wait) |
| List WOs (paginated, 50/page) | <100ms | <300ms | <1s | API response time |
| Dependency graph query | <50ms | <150ms | <500ms | API response time |
| Critical path calculation | <200ms | <500ms | <2s | API response time |
| Audit trail query (paginated) | <100ms | <300ms | <1s | API response time |
| E-signature creation | <200ms | <500ms | <1s | API response time (includes hash computation) |
| Agent dispatch (task → first action) | <500ms | <1s | <3s | Orchestrator metric |
| Batch WO creation (100 WOs) | <2s | <5s | <10s | API response time |
| Dashboard render (50 WOs) | <1s | <2s | <5s | Frontend time-to-interactive |
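Checking a latency sample set against these budgets reduces to a percentile computation. A nearest-rank sketch (the production metrics stack — k6, Grafana — does this internally; this is only to make the check concrete):

```typescript
// Nearest-rank percentile over latency samples (ms).
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((x, y) => x - y);
  // Nearest-rank definition: ceil(p/100 * N), 1-indexed.
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
  return sorted[rank - 1];
}

// True iff all three budget rows are met for this operation's samples.
function meetsBudget(
  samples: number[],
  budget: { p50: number; p95: number; p99: number },
): boolean {
  return (
    percentile(samples, 50) < budget.p50 &&
    percentile(samples, 95) < budget.p95 &&
    percentile(samples, 99) < budget.p99
  );
}
```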
3.2 Load Profiles
| Profile | Concurrent Users | WO Volume | Duration | Purpose |
|---|---|---|---|---|
| Baseline | 25 | 500 WOs pre-seeded, 50 new/hr | 1 hour | Establish performance baseline |
| Normal load | 100 | 5,000 WOs pre-seeded, 200 new/hr | 4 hours | Typical enterprise usage |
| Peak load | 500 | 10,000 WOs pre-seeded, 1,000 new/hr | 2 hours | Quarter-end audit preparation |
| Stress | 2,000 | 50,000 WOs pre-seeded, 5,000 new/hr | 30 min | Find breaking point |
| Soak | 100 (sustained) | Continuous creation + transition | 24 hours | Memory leak detection |
| Spike | 0 → 1,000 → 0 (in 60s) | Burst of 10,000 transitions | 10 min | Auto-scaling validation |
3.3 Performance Test Implementation
```javascript
// k6 load test: WO lifecycle
import http from 'k6/http';
import { check, sleep } from 'k6';

// Target and credentials come from the environment, e.g.:
//   k6 run -e BASE_URL=https://staging.example.com -e TOKEN=... wo-lifecycle.js
const BASE_URL = __ENV.BASE_URL;
const TOKEN = __ENV.TOKEN;

export const options = {
  stages: [
    { duration: '2m', target: 100 }, // Ramp to 100 users
    { duration: '5m', target: 100 }, // Hold at 100
    { duration: '2m', target: 500 }, // Ramp to peak
    { duration: '5m', target: 500 }, // Hold at peak
    { duration: '2m', target: 0 },   // Ramp down
  ],
  thresholds: {
    'http_req_duration{endpoint:create_wo}': ['p(95)<200'],
    'http_req_duration{endpoint:transition}': ['p(95)<300'],
    'http_req_duration{endpoint:list_wos}': ['p(95)<300'],
    http_req_failed: ['rate<0.01'], // <1% error rate
  },
};

export default function () {
  const headers = { 'Content-Type': 'application/json', Authorization: `Bearer ${TOKEN}` };

  // Create WO
  const createRes = http.post(
    `${BASE_URL}/api/v1/work-orders`,
    JSON.stringify({
      summary: `Perf test WO ${Date.now()}`,
      type: 'AUTOMATION',
      regulatory: false,
      systemCategoryId: 'test-category',
    }),
    { headers, tags: { endpoint: 'create_wo' } },
  );
  check(createRes, { 'WO created': (r) => r.status === 201 });
  const woId = createRes.json('id');

  // Transition: DRAFT → PLANNED
  const transRes = http.patch(
    `${BASE_URL}/api/v1/work-orders/${woId}/status`,
    JSON.stringify({ status: 'PLANNED', reason: 'Performance test' }),
    { headers, tags: { endpoint: 'transition' } },
  );
  check(transRes, { 'Transitioned': (r) => r.status === 200 });

  sleep(1);
}
```
3.4 Performance Test Schedule
| Test Type | Frequency | Environment | Gate | Owner |
|---|---|---|---|---|
| Baseline benchmark | Every release | Staging | Regression > 10% blocks release | Engineering |
| Normal load | Weekly (automated) | Staging | P95 violations create P3 ticket | SRE |
| Stress test | Monthly | Dedicated perf environment | Document capacity ceiling, update scaling guidelines | SRE |
| Soak test | Quarterly | Dedicated perf environment | Memory growth > 5%/hr blocks release | SRE |
| Spike test | Per release (if auto-scaling changed) | Staging | Recovery within 30s | SRE |
4. Chaos Engineering
4.1 Experiment Catalog
| Experiment | Method | Expected Behavior | Recovery Target | Compliance Impact |
|---|---|---|---|---|
| Kill Agent Worker | Terminate container (random agent) | Circuit breaker opens, task re-routed to healthy worker | < 30s | No audit trail gaps; in-flight WO transitions preserved |
| PostgreSQL primary failure | Force replica promotion | Reads continue immediately; writes resume after promotion | < 60s | Zero data loss (synchronous replication for L4); audit trail intact |
| NATS partition | Network partition between NATS nodes | Buffered delivery, no message loss; publishers see backpressure | < 120s | Audit events delayed but not lost; compliance evidence intact |
| Vault unavailable | Block vault endpoint | Cached credentials used (30min TTL); alert fired; new secret requests fail gracefully | < 10s (cache), < 5min (fresh) | Agent executions paused if fresh credentials needed |
| AI model API timeout | Inject 30s latency on model endpoint | Fallback to secondary model; timeout after 10s; circuit breaker opens | < 5s | Agent switches model tier; task continues with different model |
| Disk full on State Store | Fill ephemeral storage to 95% | Alert fires; oldest temp files purged; WAL archiving prioritized | < 30s | No data loss; new writes may temporarily fail |
| API Gateway crash | Kill API Gateway pod | Kubernetes restarts pod; load balancer routes to healthy instances | < 15s | Brief API unavailability; no data loss |
| DNS resolution failure | Drop DNS for model provider | Cached DNS used; circuit breaker opens for model calls; queued for retry | < 5s | Agent execution paused; human checkpoint triggered |
| Clock skew | Inject 5-minute clock offset on one node | Timestamps from skewed node detected via NTP monitoring; affected audit entries flagged | Detection < 60s | Compliance alert: server-side timestamps (§11.10(e)) potentially affected |
| Tenant isolation breach attempt | Inject cross-tenant RLS bypass attempt | Query returns 0 rows; security alert triggered; attacker session terminated | Immediate | P1 incident; forensic investigation initiated |
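The "AI model API timeout" experiment expects a deterministic degradation path: primary tier, then secondary, then a human checkpoint. A routing sketch — tier names and the rule itself are illustrative assumptions derived from the table above:

```typescript
// Sketch of the model-fallback decision exercised by the timeout experiment.
type Tier = "primary" | "secondary" | "human-checkpoint";

interface TierHealth { open: boolean } // open = that tier's circuit breaker is tripped

function routeModelCall(primary: TierHealth, secondary: TierHealth): Tier {
  if (!primary.open) return "primary";     // healthy fast path
  if (!secondary.open) return "secondary"; // degrade one tier, task continues
  return "human-checkpoint";               // both tripped → pause for a human
}
```

A chaos run that injects latency on the primary endpoint should observe exactly this sequence in the orchestrator's routing decisions, with no branch that drops the task silently.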
4.2 Chaos Test Schedule
| Cadence | Experiments | Environment | Stakeholder Notification |
|---|---|---|---|
| Weekly (automated) | Kill Agent Worker, API Gateway crash | Staging | SRE team (automated report) |
| Monthly | All single-component failures | Staging | Engineering + SRE review |
| Quarterly | Multi-component failures (e.g., DB + NATS simultaneously) | Dedicated chaos environment | Engineering + SRE + compliance review |
| Annually | Full DR exercise (see operational-readiness.md) | Production replica | All stakeholders + regulatory |
4.3 Chaos Engineering Maturity Model
| Level | Description | WO System Status |
|---|---|---|
| L0: No chaos | Only reactive incident response | Passed ✓ |
| L1: GameDay | Manual fault injection with team present | Current ← |
| L2: Automated | Scheduled automated fault injection in staging | Target (Q3 2026) |
| L3: Continuous | Continuous fault injection in production with auto-rollback | Future (Q1 2027) |
| L4: Self-healing | System detects and corrects faults autonomously | Future |
5. Compliance Validation Automation (IQ/OQ/PQ)
5.1 Qualification Overview
| Qualification | Purpose | Automation Level | Evidence Output | Trigger |
|---|---|---|---|---|
| IQ (Installation) | Verify correct installation of WO system | 100% automated | Deployment manifest, config verification, dependency check | Every deployment |
| OQ (Operational) | Verify correct operation under normal conditions | 95% automated | Test execution report, expected vs. actual, screenshot evidence | Every release |
| PQ (Performance) | Verify correct operation under real-world conditions | 80% automated | Performance report, SLA compliance, capacity analysis | Quarterly + major release |
5.2 IQ (Installation Qualification) Test Cases
| IQ Test | Verification | Pass Criteria | Evidence |
|---|---|---|---|
| IQ-001: Database schema | Compare deployed schema hash against expected | Hash match | Schema diff report (empty = pass) |
| IQ-002: RLS policies | Verify all 22 tables have active RLS | All policies active | pg_policies query result |
| IQ-003: Audit trail triggers | Verify trigger prevents UPDATE/DELETE on audit_trail | Trigger exists and active | Trigger test execution log |
| IQ-004: Service connectivity | Verify all containers can reach dependencies | All health checks pass | Health check response log |
| IQ-005: TLS configuration | Verify TLS 1.3 on all endpoints | No TLS < 1.3 accepted | SSL scan report (testssl.sh) |
| IQ-006: Vault connectivity | Verify service can retrieve test secret from vault | Secret retrieved successfully | Vault audit log entry |
| IQ-007: NATS connectivity | Verify publish/subscribe on test channel | Message round-trip < 100ms | NATS monitoring metrics |
| IQ-008: Configuration verification | Compare deployed config against approved config | Config hash match | Config diff report |
| IQ-009: Container image verification | Verify Cosign signature on all deployed images | All signatures valid | Sigstore verification log |
| IQ-010: Version verification | Verify deployed version matches release manifest | Version match | Version endpoint response |
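IQ-008's "config hash match" can be sketched as hashing a canonicalized config and comparing against the approved value. The sort-keys canonicalization (top-level only) is an assumption for illustration:

```typescript
import { createHash } from "node:crypto";

// Sketch of IQ-008: hash the canonicalized deployed config and compare
// against the approved hash. Top-level key sorting is an assumed canonical form.
function configHash(config: Record<string, unknown>): string {
  const canonical = JSON.stringify(
    Object.fromEntries(Object.entries(config).sort(([a], [b]) => a.localeCompare(b))),
  );
  return createHash("sha256").update(canonical).digest("hex");
}

function verifyConfig(deployed: Record<string, unknown>, approvedHash: string): boolean {
  return configHash(deployed) === approvedHash;
}
```

Canonicalizing before hashing means two serializations of the same config (keys in a different order) verify as identical, while any drifted value fails, producing the config diff evidence the table requires.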
5.3 OQ (Operational Qualification) Test Cases
| OQ Test | Scenario | Pass Criteria | Evidence |
|---|---|---|---|
| OQ-001: WO lifecycle (happy path) | Create → Plan → Schedule → Execute → Review → Approve → Complete → Close | All transitions succeed; audit trail complete | Full audit trail export |
| OQ-002: Regulatory WO with dual approval | Create regulatory WO → SO approval → QA approval → completion | Both signatures captured with meaning; hash binding valid | Signature records + hash verification |
| OQ-003: SOD enforcement | Assignee attempts to approve own WO | Transition blocked; guard violation logged | Guard violation audit entry |
| OQ-004: Master/Linked WO hierarchy | Create master + 5 linked WOs with dependencies | DAG valid; critical path calculated; progress tracking accurate | Dependency graph export + progress report |
| OQ-005: Cross-tenant isolation | Tenant A queries for Tenant B's WOs | 0 results returned; access logged | Query result + audit log |
| OQ-006: Optimistic locking conflict | Two users update same WO simultaneously | One succeeds, one gets 409 Conflict with diff | API response logs |
| OQ-007: Agent WO creation | Agent creates and transitions WO within scope | WO created; agent attribution in audit trail; scope enforced | Audit trail with agent session ID |
| OQ-008: Vendor portal scoping | Vendor queries WOs outside assignment | 403 Forbidden; access logged | API response + audit log |
| OQ-009: Audit trail immutability | Attempt UPDATE on audit_trail via direct SQL | UPDATE blocked by trigger; error logged | Database error log |
| OQ-010: E-signature re-authentication | User signs after session timeout | Re-auth required; new attestation created; signature valid | Re-auth attestation + signature record |
| OQ-011: WO cancellation | Cancel WO at each stage | Correct transitions; proper audit trail; linked WOs notified | Audit trail + notification log |
| OQ-012: Resource matching | Create WO with experience requirements → agent matches person | Qualified person identified; assignment recommendation valid | Match result + audit log |
5.4 PQ (Performance Qualification) Test Cases
| PQ Test | Scenario | Pass Criteria | Evidence |
|---|---|---|---|
| PQ-001: Normal load performance | 100 concurrent users, 200 WOs/hr for 4 hours | All P95 targets met; error rate < 0.1% | k6 report + Grafana dashboard snapshot |
| PQ-002: Peak load performance | 500 concurrent users, 1,000 WOs/hr for 2 hours | P95 < 500ms; error rate < 1%; auto-scaling triggers | k6 report + scaling event log |
| PQ-003: Audit trail volume | 100,000 audit entries; query performance | Paginated query P95 < 300ms | Query execution plan + timing |
| PQ-004: Concurrent approvals | 50 simultaneous approval signing events | All signatures valid; no hash collision; no race condition | Signature verification report |
| PQ-005: Agent execution under load | 20 concurrent agent WO executions | Token budget respected; circuit breakers functional; no agent starvation | Agent monitoring dashboard snapshot |
5.5 Evidence Package Generation
Every qualification run generates a standardized evidence package:
```text
evidence/
├── iq/
│   ├── IQ-execution-report.json      # Machine-readable results
│   ├── IQ-execution-report.pdf       # Human-readable with screenshots
│   ├── IQ-traceability-matrix.csv    # Requirement → test case → result
│   └── IQ-signature-page.pdf         # E-signed by QA reviewer
├── oq/
│   ├── OQ-execution-report.json
│   ├── OQ-execution-report.pdf
│   ├── OQ-traceability-matrix.csv
│   ├── OQ-screenshots/               # UI workflow evidence
│   └── OQ-signature-page.pdf
├── pq/
│   ├── PQ-execution-report.json
│   ├── PQ-execution-report.pdf
│   ├── PQ-grafana-snapshots/         # Performance dashboard captures
│   ├── PQ-k6-reports/                # Load test detailed results
│   └── PQ-signature-page.pdf
└── summary/
    ├── qualification-summary.pdf     # Executive summary of all qualifications
    ├── deviation-report.pdf          # Any failures and disposition
    └── release-authorization.pdf     # Final sign-off for deployment
```
6. CI/CD Integration
6.1 Pipeline Gates
```text
Developer Commit
      │
      ▼
Pre-Push Hook (local)
  ├── Lint (ESLint/Ruff)
  ├── Type check (tsc --noEmit)
  └── Unit tests (affected files only)
      │
      ▼
Pull Request (CI)
  ├── Full unit test suite (~650 tests, <2min)
  ├── Integration tests (~190 tests, <5min)
  ├── Contract tests (~80 tests, <1min)
  ├── Security scan (Snyk/Trivy, block on critical)
  ├── License check (FOSSA)
  ├── PHI scan on test data (block if detected)
  └── Coverage gate (overall ≥85%, critical paths = 100%)
      │
      ▼ (all pass → merge allowed)
      │
Main Branch (CI)
  ├── Full test suite (unit + integration + contract)
  ├── E2E tests (~50 tests, <15min)
  ├── Container image build + Cosign signing
  ├── SBOM generation (CycloneDX)
  └── Performance baseline (quick benchmark, 5min)
      │
      ▼ (all pass → deploy to staging)
      │
Staging Deployment
  ├── IQ automation (full installation qualification)
  ├── OQ automation (operational qualification)
  ├── Smoke tests (critical path only, <2min)
  └── Weekly: chaos experiments (automated)
      │
      ▼ (all pass + QA sign-off → production eligible)
      │
Production Deployment
  ├── Blue-green deployment
  ├── IQ automation (production installation verification)
  ├── Smoke tests (critical path)
  ├── Canary metrics monitoring (15min)
  └── Rollback trigger: error rate >1% or P95 >2s
```
6.2 Test Failure Response
| Failure Type | Automated Response | Human Response | SLA |
|---|---|---|---|
| Unit test failure | PR blocked | Developer fixes | Before merge |
| Integration test failure | PR blocked | Developer + reviewer investigate | Before merge |
| Contract test failure | PR blocked + alert to API owner | Breaking change review | Before merge |
| E2E test failure | Deploy blocked + alert to team | Team investigation | Within 4 hours |
| Security scan critical | PR blocked + P2 ticket created | Security team review | Within 24 hours |
| Performance regression >10% | Deploy blocked + alert to SRE | Performance investigation | Within 48 hours |
| IQ failure | Deploy rolled back | SRE + QA investigation | Within 1 hour |
| OQ failure | Release held | QA team investigation + deviation report | Within 24 hours |
| PQ failure | Release held + capacity review | SRE + QA + engineering review | Within 48 hours |
7. Coverage Requirements
| Metric | Target | Enforcement | Measurement |
|---|---|---|---|
| Line coverage (overall) | ≥85% | CI gate | Istanbul/NYC (TS), coverage.py (Python) |
| Branch coverage (overall) | ≥80% | CI gate | Same tooling |
| Critical path coverage | 100% | CI gate (stricter threshold for critical modules) | Module-level coverage config |
| State machine guard coverage | 100% | CI gate | Custom coverage report for guard functions |
| RBAC permission coverage | 100% (all role × permission × entity combinations) | CI gate | Custom matrix test generator |
| API endpoint coverage | 100% (every endpoint has ≥1 integration test) | CI gate | OpenAPI spec cross-reference |
| Agent message contract coverage | 100% (every message type has ≥1 contract test) | CI gate | Message schema cross-reference |
| Mutation testing score | ≥70% (stretch: 80%) | Weekly report (not a gate) | Stryker (TS), mutmut (Python) |
| Compliance test coverage | 100% of regulatory requirements in matrix | Pre-release gate | Traceability matrix cross-reference |
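The two-tier coverage gate above (overall thresholds plus a stricter per-module rule for critical paths) can be sketched as follows — the module names in the critical set are illustrative stand-ins for the critical-path components listed in §1.3:

```typescript
// Sketch of the CI coverage gate; critical-module names are illustrative.
interface Coverage { line: number; branch: number } // percentages

const CRITICAL_MODULES = new Set([
  "state-machine-guards",
  "rbac",
  "audit-trail",
  "hash-chain",
]);

function gatePasses(module: string, cov: Coverage): boolean {
  if (CRITICAL_MODULES.has(module)) {
    return cov.line === 100 && cov.branch === 100; // critical paths: no gaps allowed
  }
  return cov.line >= 85 && cov.branch >= 80;       // overall thresholds from the table
}
```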
In regulated environments, tests aren't just quality checks — they're compliance evidence. Every test that runs in the IQ/OQ/PQ pipeline produces an auditable artifact. Every test failure is a potential deviation that requires documented disposition. The testing strategy is not separate from the compliance strategy — it IS the compliance strategy's execution arm.
Copyright 2026 AZ1.AI Inc. All rights reserved. Developer: Hal Casteel, CEO/CTO Product: CODITECT-BIO-QMS | Part of the CODITECT Product Suite Classification: Internal - Confidential