
Work Order QMS Module — Testing & Validation Strategy

Classification: Internal — Quality Engineering
Date: 2026-02-13
Artifact: 65 of WO System Series
Prompt Section: v8.0 §9 — Testing & Validation Strategy


1. Test Pyramid

1.1 Pyramid Definition

                   ╱╲
                  ╱  ╲   Compliance Validation (IQ/OQ/PQ)
                 ╱ 3% ╲  Full regulatory workflow evidence
                ╱──────╲
               ╱        ╲    E2E Tests
              ╱    5%    ╲   Multi-container, browser-driven
             ╱────────────╲
            ╱              ╲    Contract Tests
           ╱       8%       ╲   API contracts, message schemas
          ╱──────────────────╲
         ╱                    ╲    Integration Tests
        ╱         19%          ╲   Database, NATS, Vault, external
       ╱────────────────────────╲
      ╱                          ╲    Unit Tests
     ╱            65%             ╲   Pure logic, deterministic, fast
    ╱──────────────────────────────╲

1.2 Layer Details

| Layer | Scope | Target Count | Speed | Run When | Tooling | Compliance Role |
|---|---|---|---|---|---|---|
| Unit | Single function, class, or module | ~650 | <10ms each | Every commit (pre-push hook) | Vitest (TS), pytest (Python) | Logic correctness evidence |
| Integration | Component + real dependency | ~190 | <2s each | Every PR | Vitest + Testcontainers, pytest + Testcontainers | Data integrity, query correctness |
| Contract | API endpoint shape, message schema | ~80 | <500ms each | Every PR | Pact (consumer-driven), JSON Schema validation | Interface compliance |
| E2E | Full workflow, multi-container | ~50 | <30s each | Pre-merge + nightly | Playwright (browser), supertest (API) | Workflow correctness |
| Compliance | Regulatory evidence generation | ~30 | <60s each | Pre-release + quarterly | Custom validation harness | IQ/OQ/PQ evidence |
| Total | | ~1,000 | | | | |

1.3 Unit Test Coverage Targets

| Component | Coverage Target | Critical Paths (100% required) |
|---|---|---|
| State machine guards (T1–T8) | 100% | All guard functions, all transition paths |
| Model Router | 95% | All routing rules, all fallback paths |
| RBAC permission checks | 100% | All role × permission combinations |
| SOD enforcement | 100% | All conflict detection rules |
| DAG cycle detection | 95% | Kahn's algorithm, all edge cases |
| Optimistic locking | 95% | Version check, conflict detection, retry logic |
| Audit trail generation | 100% | All entity types × all action types |
| Hash chain computation | 100% | Hash generation, chain verification, break detection |
| Token budget controller | 95% | Budget allocation, threshold enforcement, hard stop |
| Circuit breaker | 95% | State transitions (closed → open → half-open), recovery |
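The DAG cycle-detection row above names Kahn's algorithm explicitly. As a minimal sketch of the logic the unit suite exercises (the function name and types here are illustrative, not the module's actual API):

```typescript
// Hypothetical sketch of the cycle detector behind the 95% target.
// Kahn's algorithm: repeatedly remove zero-in-degree nodes; any leftovers
// after the topological pass imply a dependency cycle.
type Edge = [from: string, to: string];

function hasCycle(nodes: string[], edges: Edge[]): boolean {
  const inDegree = new Map<string, number>(nodes.map((n) => [n, 0]));
  const adjacency = new Map<string, string[]>(nodes.map((n) => [n, []]));
  for (const [from, to] of edges) {
    adjacency.get(from)!.push(to);
    inDegree.set(to, (inDegree.get(to) ?? 0) + 1);
  }
  // Seed the queue with nodes that have no incoming dependencies.
  const queue = nodes.filter((n) => inDegree.get(n) === 0);
  let visited = 0;
  while (queue.length > 0) {
    const node = queue.shift()!;
    visited++;
    for (const next of adjacency.get(node) ?? []) {
      const remaining = inDegree.get(next)! - 1;
      inDegree.set(next, remaining);
      if (remaining === 0) queue.push(next);
    }
  }
  // If the topological pass consumed every node, the graph is acyclic.
  return visited !== nodes.length;
}
```

The "all edge cases" requirement then reduces to feeding this function empty graphs, self-loops, long chains, and diamond-shaped dependency patterns.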

1.4 Integration Test Scope

| Test Category | What's Tested | Real Dependencies | Mock Dependencies |
|---|---|---|---|
| Database operations | CRUD, RLS enforcement, triggers, migrations | PostgreSQL (Testcontainers) | None |
| Event bus | Publish, subscribe, ordering, backpressure | NATS (Testcontainers) | None |
| Audit trail immutability | Trigger blocks UPDATE/DELETE | PostgreSQL (Testcontainers) | None |
| Cross-tenant isolation | RLS prevents cross-tenant reads | PostgreSQL (Testcontainers) | None |
| Agent message contracts | Message serialization, validation, routing | NATS (Testcontainers) | AI model (deterministic stub) |
| Vault integration | Secret retrieval, rotation | Vault dev mode (Testcontainers) | None |
| API endpoint behavior | Request → response, auth, rate limiting | Express (in-process) | Database (seeded Testcontainer) |

1.5 Contract Tests

| Contract | Provider | Consumer | Schema Source |
|---|---|---|---|
| WO REST API | WO Service | Frontend, Agent Workers | OpenAPI 3.1 spec |
| Agent messages | Agent Workers | Orchestrator, Compliance Engine | TypeScript interfaces (26-agent-message-contracts.md) |
| Audit trail events | WO Service | Compliance Engine, SIEM connector | Event schema (JSON Schema) |
| Webhook payloads | WO Service | External subscribers | Webhook schema (JSON Schema) |
| Approval/signature flow | Signature Service | WO Service, Frontend | Signature API contract |

Contract test verification: provider publishes contract → consumer tests against contract → breaking changes detected before merge.
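The real gates use Pact and JSON Schema; the core idea they enforce can be illustrated with a minimal, hand-rolled shape check (the `woContract` fields below are illustrative, not the actual WO API schema):

```typescript
// Minimal illustration of a consumer-driven contract check: the consumer pins
// the response shape it depends on, and any provider change that breaks that
// shape fails before merge. Real suites use Pact / JSON Schema, not this.
type FieldSpec = { name: string; type: 'string' | 'number' | 'boolean' };

// Hypothetical subset of the WO REST API response contract.
const woContract: FieldSpec[] = [
  { name: 'id', type: 'string' },
  { name: 'summary', type: 'string' },
  { name: 'regulatory', type: 'boolean' },
];

function satisfiesContract(
  payload: Record<string, unknown>,
  contract: FieldSpec[],
): string[] {
  // Returns a list of violations; an empty array means the contract holds.
  const violations: string[] = [];
  for (const field of contract) {
    if (!(field.name in payload)) {
      violations.push(`missing field: ${field.name}`);
    } else if (typeof payload[field.name] !== field.type) {
      violations.push(`wrong type for ${field.name}: expected ${field.type}`);
    }
  }
  return violations;
}
```

A breaking provider change (renamed field, changed type) then surfaces as a non-empty violation list in the consumer's CI run rather than as a production incident.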


2. Test Data Management

2.1 Synthetic Data Strategy

Production data is never used in non-production environments. All test data is synthetic.

| Data Category | Generation Strategy | Regulatory Constraint | Tooling |
|---|---|---|---|
| Person records | Faker-generated names, emails, phone numbers | Must not match any real individual; must pass format validation | @faker-js/faker with deterministic seed |
| Work orders | Template-based generation covering all WO types, statuses, and regulatory flags | Must cover every state machine transition path | Custom seed script (seed-work-orders.ts) |
| Approval chains | Combinatorial generation of all role × decision paths | Must include SOD-compliant and SOD-violating scenarios | Custom generator with constraint solver |
| Audit trails | Generated from WO lifecycle execution (not fabricated independently) | Must be internally consistent — audit entries match WO transitions | Generated as side-effect of WO test execution |
| Asset/tool catalog | Realistic bioscience equipment names and categories | No patient/subject identifiers | Static fixtures in test/fixtures/assets.json |
| Multi-tenant data | Identical schema, different tenant_id values | Must test RLS isolation between tenants | Seed script creates 3 test tenants |
| Edge cases | Property-based generation (boundary values, null fields, max-length strings) | Must test validation boundaries | fast-check property-based testing |
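fast-check generates the edge cases automatically; the boundary values it is expected to surface can be sketched by hand. In the sketch below, `validateSummary` and `MAX_SUMMARY_LENGTH` are stand-ins for the real field validator and schema limit, not the actual WO validation code:

```typescript
// Hand-rolled sketch of the boundary-value checks fast-check automates.
// The validator, field, and limit are illustrative assumptions.
const MAX_SUMMARY_LENGTH = 255; // illustrative limit, not the real schema value

function validateSummary(value: string | null): boolean {
  return value !== null && value.length > 0 && value.length <= MAX_SUMMARY_LENGTH;
}

// The classic boundaries a property-based run should hit: null, empty string,
// exactly at the limit, and one character past it.
const boundaryCases: Array<[string | null, boolean]> = [
  [null, false],
  ['', false],
  ['x'.repeat(MAX_SUMMARY_LENGTH), true],
  ['x'.repeat(MAX_SUMMARY_LENGTH + 1), false],
];

function runBoundaryCases(): number {
  let failures = 0;
  for (const [input, expected] of boundaryCases) {
    if (validateSummary(input) !== expected) failures++;
  }
  return failures;
}
```

The advantage of the property-based tool over this fixed list is that it also explores values no one thought to enumerate, then shrinks any failure to a minimal counterexample.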

2.2 Seed Data Structure

test/
├── fixtures/
│   ├── assets.json                 # 50 bioscience assets
│   ├── tools.json                  # 30 tools with calibration data
│   ├── experiences.json            # 20 experience/certification types
│   ├── materials.json              # 15 material types
│   └── persons.json                # 25 persons across all roles
├── factories/
│   ├── work-order.factory.ts       # Creates WOs with configurable complexity
│   ├── approval.factory.ts         # Creates approval chains (valid and invalid)
│   ├── job-plan.factory.ts         # Creates job plans with requirements
│   └── tenant.factory.ts           # Creates isolated tenant contexts
├── scenarios/
│   ├── happy-path.scenario.ts      # WO: draft → completed → closed
│   ├── regulatory.scenario.ts      # Full Part 11 workflow with signatures
│   ├── master-linked.scenario.ts   # Master WO with 5 linked WOs + dependencies
│   ├── vendor.scenario.ts          # Vendor assignment → execution → evidence
│   ├── rejection.scenario.ts       # WO rejected → revision → re-approval
│   ├── cancellation.scenario.ts    # WO cancelled at various stages
│   └── concurrent.scenario.ts      # Optimistic locking conflict resolution
└── seeds/
    ├── seed-all.ts                 # Master seed script (deterministic)
    ├── seed-minimal.ts             # Minimum viable data for unit tests
    └── seed-performance.ts         # 10,000 WOs for load testing

2.3 Data Isolation Rules

| Rule | Implementation | Verification |
|---|---|---|
| No production data in test environments | Network isolation (test env cannot reach production DB) | Monthly audit of test DB contents |
| Deterministic seed data | All factories use seeded PRNG (faker.seed(42)) | Same seed → same data (verified in CI) |
| PHI-free certification | PHI scanner runs on test fixtures before commit | CI gate blocks commits with PHI patterns |
| Tenant isolation in tests | Each test suite creates its own tenant context | RLS verification test: cross-tenant query returns 0 rows |
| Test data cleanup | Transactional rollback per test (unit/integration); DB reset per suite (E2E) | Post-suite assertion: no orphaned test data |
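The "same seed → same data" guarantee rests on the PRNG beneath the factories: a given seed must always reproduce the same sequence. faker.seed(42) relies on exactly this property; the tiny self-contained mulberry32 generator below just demonstrates it in isolation (it is not faker's actual internal algorithm):

```typescript
// Demonstration of deterministic seeding — the property the CI check
// "same seed → same data" verifies. mulberry32 is a well-known tiny PRNG;
// it stands in here for whatever faker uses internally.
function mulberry32(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296; // uniform in [0, 1)
  };
}

function sequence(seed: number, n: number): number[] {
  const rng = mulberry32(seed);
  return Array.from({ length: n }, () => rng());
}
```

Because every factory draws from one seeded generator, a CI job can re-run the seed script twice and assert byte-identical output, which is what makes test failures reproducible across machines.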

3. Performance Testing

3.1 Performance Budgets

| Operation | P50 Target | P95 Target | P99 Target | Measurement |
|---|---|---|---|---|
| Create WO (single) | <50ms | <200ms | <500ms | API response time |
| State transition | <100ms | <300ms | <1s | API response time (excludes approval wait) |
| List WOs (paginated, 50/page) | <100ms | <300ms | <1s | API response time |
| Dependency graph query | <50ms | <150ms | <500ms | API response time |
| Critical path calculation | <200ms | <500ms | <2s | API response time |
| Audit trail query (paginated) | <100ms | <300ms | <1s | API response time |
| E-signature creation | <200ms | <500ms | <1s | API response time (includes hash computation) |
| Agent dispatch (task → first action) | <500ms | <1s | <3s | Orchestrator metric |
| Batch WO creation (100 WOs) | <2s | <5s | <10s | API response time |
| Dashboard render (50 WOs) | <1s | <2s | <5s | Frontend time-to-interactive |

3.2 Load Profiles

| Profile | Concurrent Users | WO Volume | Duration | Purpose |
|---|---|---|---|---|
| Baseline | 25 | 500 WOs pre-seeded, 50 new/hr | 1 hour | Establish performance baseline |
| Normal load | 100 | 5,000 WOs pre-seeded, 200 new/hr | 4 hours | Typical enterprise usage |
| Peak load | 500 | 10,000 WOs pre-seeded, 1,000 new/hr | 2 hours | Quarter-end audit preparation |
| Stress | 2,000 | 50,000 WOs pre-seeded, 5,000 new/hr | 30 min | Find breaking point |
| Soak | 100 (sustained) | Continuous creation + transition | 24 hours | Memory leak detection |
| Spike | 0 → 1,000 → 0 (in 60s) | Burst of 10,000 transitions | 10 min | Auto-scaling validation |

3.3 Performance Test Implementation

// k6 load test: WO lifecycle
import http from 'k6/http';
import { check, sleep } from 'k6';

// Target and credentials are injected via environment variables.
const BASE_URL = __ENV.BASE_URL;
const TOKEN = __ENV.TOKEN;

export const options = {
  stages: [
    { duration: '2m', target: 100 }, // Ramp to 100 users
    { duration: '5m', target: 100 }, // Hold at 100
    { duration: '2m', target: 500 }, // Ramp to peak
    { duration: '5m', target: 500 }, // Hold at peak
    { duration: '2m', target: 0 },   // Ramp down
  ],
  thresholds: {
    'http_req_duration{endpoint:create_wo}': ['p(95)<200'],
    'http_req_duration{endpoint:transition}': ['p(95)<300'],
    'http_req_duration{endpoint:list_wos}': ['p(95)<300'],
    http_req_failed: ['rate<0.01'], // <1% error rate
  },
};

export default function () {
  // Create WO
  const createRes = http.post(
    `${BASE_URL}/api/v1/work-orders`,
    JSON.stringify({
      summary: `Perf test WO ${Date.now()}`,
      type: 'AUTOMATION',
      regulatory: false,
      systemCategoryId: 'test-category',
    }),
    {
      headers: { 'Content-Type': 'application/json', Authorization: `Bearer ${TOKEN}` },
      tags: { endpoint: 'create_wo' },
    },
  );

  check(createRes, { 'WO created': (r) => r.status === 201 });

  const woId = createRes.json('id');

  // Transition: DRAFT → PLANNED
  const transRes = http.patch(
    `${BASE_URL}/api/v1/work-orders/${woId}/status`,
    JSON.stringify({ status: 'PLANNED', reason: 'Performance test' }),
    {
      headers: { 'Content-Type': 'application/json', Authorization: `Bearer ${TOKEN}` },
      tags: { endpoint: 'transition' },
    },
  );

  check(transRes, { Transitioned: (r) => r.status === 200 });
  sleep(1);
}

3.4 Performance Test Schedule

| Test Type | Frequency | Environment | Gate | Owner |
|---|---|---|---|---|
| Baseline benchmark | Every release | Staging | Regression > 10% blocks release | Engineering |
| Normal load | Weekly (automated) | Staging | P95 violations create P3 ticket | SRE |
| Stress test | Monthly | Dedicated perf environment | Document capacity ceiling, update scaling guidelines | SRE |
| Soak test | Quarterly | Dedicated perf environment | Memory growth > 5%/hr blocks release | SRE |
| Spike test | Per release (if auto-scaling changed) | Staging | Recovery within 30s | SRE |

4. Chaos Engineering

4.1 Experiment Catalog

| Experiment | Method | Expected Behavior | Recovery Target | Compliance Impact |
|---|---|---|---|---|
| Kill Agent Worker | Terminate container (random agent) | Circuit breaker opens, task re-routed to healthy worker | < 30s | No audit trail gaps; in-flight WO transitions preserved |
| PostgreSQL primary failure | Force replica promotion | Reads continue immediately; writes resume after promotion | < 60s | Zero data loss (synchronous replication for L4); audit trail intact |
| NATS partition | Network partition between NATS nodes | Buffered delivery, no message loss; publishers see backpressure | < 120s | Audit events delayed but not lost; compliance evidence intact |
| Vault unavailable | Block vault endpoint | Cached credentials used (30min TTL); alert fired; new secret requests fail gracefully | < 10s (cache), < 5min (fresh) | Agent executions paused if fresh credentials needed |
| AI model API timeout | Inject 30s latency on model endpoint | Fallback to secondary model; timeout after 10s; circuit breaker opens | < 5s | Agent switches model tier; task continues with different model |
| Disk full on State Store | Fill ephemeral storage to 95% | Alert fires; oldest temp files purged; WAL archiving prioritized | < 30s | No data loss; new writes may temporarily fail |
| API Gateway crash | Kill API Gateway pod | Kubernetes restarts pod; load balancer routes to healthy instances | < 15s | Brief API unavailability; no data loss |
| DNS resolution failure | Drop DNS for model provider | Cached DNS used; circuit breaker opens for model calls; queued for retry | < 5s | Agent execution paused; human checkpoint triggered |
| Clock skew | Inject 5-minute clock offset on one node | Timestamps from skewed node detected via NTP monitoring; affected audit entries flagged | Detection < 60s | Compliance alert: server-side timestamps (§11.10(e)) potentially affected |
| Tenant isolation breach attempt | Inject cross-tenant RLS bypass attempt | Query returns 0 rows; security alert triggered; attacker session terminated | Immediate | P1 incident; forensic investigation initiated |

4.2 Chaos Test Schedule

| Cadence | Experiments | Environment | Stakeholder Notification |
|---|---|---|---|
| Weekly (automated) | Kill Agent Worker, API Gateway crash | Staging | SRE team (automated report) |
| Monthly | All single-component failures | Staging | Engineering + SRE review |
| Quarterly | Multi-component failures (e.g., DB + NATS simultaneously) | Dedicated chaos environment | Engineering + SRE + compliance review |
| Annually | Full DR exercise (see operational-readiness.md) | Production replica | All stakeholders + regulatory |

4.3 Chaos Engineering Maturity Model

| Level | Description | WO System Status |
|---|---|---|
| L0: No chaos | Only reactive incident response | Passed ✓ |
| L1: GameDay | Manual fault injection with team present | Current ← |
| L2: Automated | Scheduled automated fault injection in staging | Target (Q3 2026) |
| L3: Continuous | Continuous fault injection in production with auto-rollback | Future (Q1 2027) |
| L4: Self-healing | System detects and corrects faults autonomously | Future |

5. Compliance Validation Automation (IQ/OQ/PQ)

5.1 Qualification Overview

| Qualification | Purpose | Automation Level | Evidence Output | Trigger |
|---|---|---|---|---|
| IQ (Installation) | Verify correct installation of WO system | 100% automated | Deployment manifest, config verification, dependency check | Every deployment |
| OQ (Operational) | Verify correct operation under normal conditions | 95% automated | Test execution report, expected vs. actual, screenshot evidence | Every release |
| PQ (Performance) | Verify correct operation under real-world conditions | 80% automated | Performance report, SLA compliance, capacity analysis | Quarterly + major release |

5.2 IQ (Installation Qualification) Test Cases

| IQ Test | Verification | Pass Criteria | Evidence |
|---|---|---|---|
| IQ-001: Database schema | Compare deployed schema hash against expected | Hash match | Schema diff report (empty = pass) |
| IQ-002: RLS policies | Verify all 22 tables have active RLS | All policies active | pg_policies query result |
| IQ-003: Audit trail triggers | Verify trigger prevents UPDATE/DELETE on audit_trail | Trigger exists and active | Trigger test execution log |
| IQ-004: Service connectivity | Verify all containers can reach dependencies | All health checks pass | Health check response log |
| IQ-005: TLS configuration | Verify TLS 1.3 on all endpoints | No TLS < 1.3 accepted | SSL scan report (testssl.sh) |
| IQ-006: Vault connectivity | Verify service can retrieve test secret from vault | Secret retrieved successfully | Vault audit log entry |
| IQ-007: NATS connectivity | Verify publish/subscribe on test channel | Message round-trip < 100ms | NATS monitoring metrics |
| IQ-008: Configuration verification | Compare deployed config against approved config | Config hash match | Config diff report |
| IQ-009: Container image verification | Verify Cosign signature on all deployed images | All signatures valid | Sigstore verification log |
| IQ-010: Version verification | Verify deployed version matches release manifest | Version match | Version endpoint response |
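The hash-comparison pattern behind IQ-001 and IQ-008 can be sketched in a few lines with Node's crypto module. The normalization step, baseline handling, and function names are assumptions for illustration, not the actual qualification harness:

```typescript
// Sketch of an IQ-001-style check: hash the deployed schema dump and compare
// it against the approved baseline recorded at release approval.
import { createHash } from 'node:crypto';

function schemaHash(schemaDump: string): string {
  // Normalize whitespace so formatting-only differences don't fail IQ.
  // (Whether the real harness normalizes at all is an assumption.)
  const normalized = schemaDump.replace(/\s+/g, ' ').trim();
  return createHash('sha256').update(normalized).digest('hex');
}

function verifySchema(deployedDump: string, approvedHash: string): boolean {
  return schemaHash(deployedDump) === approvedHash;
}
```

The same compare-against-approved-hash shape covers config verification (IQ-008); only the input artifact changes.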

5.3 OQ (Operational Qualification) Test Cases

| OQ Test | Scenario | Pass Criteria | Evidence |
|---|---|---|---|
| OQ-001: WO lifecycle (happy path) | Create → Plan → Schedule → Execute → Review → Approve → Complete → Close | All transitions succeed; audit trail complete | Full audit trail export |
| OQ-002: Regulatory WO with dual approval | Create regulatory WO → SO approval → QA approval → completion | Both signatures captured with meaning; hash binding valid | Signature records + hash verification |
| OQ-003: SOD enforcement | Assignee attempts to approve own WO | Transition blocked; guard violation logged | Guard violation audit entry |
| OQ-004: Master/Linked WO hierarchy | Create master + 5 linked WOs with dependencies | DAG valid; critical path calculated; progress tracking accurate | Dependency graph export + progress report |
| OQ-005: Cross-tenant isolation | Tenant A queries for Tenant B's WOs | 0 results returned; access logged | Query result + audit log |
| OQ-006: Optimistic locking conflict | Two users update same WO simultaneously | One succeeds, one gets 409 Conflict with diff | API response logs |
| OQ-007: Agent WO creation | Agent creates and transitions WO within scope | WO created; agent attribution in audit trail; scope enforced | Audit trail with agent session ID |
| OQ-008: Vendor portal scoping | Vendor queries WOs outside assignment | 403 Forbidden; access logged | API response + audit log |
| OQ-009: Audit trail immutability | Attempt UPDATE on audit_trail via direct SQL | UPDATE blocked by trigger; error logged | Database error log |
| OQ-010: E-signature re-authentication | User signs after session timeout | Re-auth required; new attestation created; signature valid | Re-auth attestation + signature record |
| OQ-011: WO cancellation | Cancel WO at each stage | Correct transitions; proper audit trail; linked WOs notified | Audit trail + notification log |
| OQ-012: Resource matching | Create WO with experience requirements → agent matches person | Qualified person identified; assignment recommendation valid | Match result + audit log |
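The version check at the heart of OQ-006 fits in a few lines. This is a sketch of the pattern only — the result type, field names, and status handling are illustrative assumptions, not the WO service's actual API:

```typescript
// Sketch of the optimistic-locking check OQ-006 exercises: an update carrying
// a stale version number gets a 409 instead of silently overwriting.
type UpdateResult =
  | { status: 200; version: number }
  | { status: 409; currentVersion: number };

function applyUpdate(storedVersion: number, requestVersion: number): UpdateResult {
  if (requestVersion !== storedVersion) {
    // Stale read: the client must refetch, merge, and retry.
    return { status: 409, currentVersion: storedVersion };
  }
  // Accepted: bump the version so any concurrent writer now fails the check.
  return { status: 200, version: storedVersion + 1 };
}
```

With two concurrent writers both holding version 3, whichever commit lands first bumps the row to 4, and the second writer's stale `requestVersion: 3` is rejected — exactly the "one succeeds, one gets 409" pass criterion.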

5.4 PQ (Performance Qualification) Test Cases

| PQ Test | Scenario | Pass Criteria | Evidence |
|---|---|---|---|
| PQ-001: Normal load performance | 100 concurrent users, 200 WOs/hr for 4 hours | All P95 targets met; error rate < 0.1% | k6 report + Grafana dashboard snapshot |
| PQ-002: Peak load performance | 500 concurrent users, 1,000 WOs/hr for 2 hours | P95 < 500ms; error rate < 1%; auto-scaling triggers | k6 report + scaling event log |
| PQ-003: Audit trail volume | 100,000 audit entries; query performance | Paginated query P95 < 300ms | Query execution plan + timing |
| PQ-004: Concurrent approvals | 50 simultaneous approval signing events | All signatures valid; no hash collision; no race condition | Signature verification report |
| PQ-005: Agent execution under load | 20 concurrent agent WO executions | Token budget respected; circuit breakers functional; no agent starvation | Agent monitoring dashboard snapshot |

5.5 Evidence Package Generation

Every qualification run generates a standardized evidence package:

evidence/
├── iq/
│   ├── IQ-execution-report.json    # Machine-readable results
│   ├── IQ-execution-report.pdf     # Human-readable with screenshots
│   ├── IQ-traceability-matrix.csv  # Requirement → test case → result
│   └── IQ-signature-page.pdf       # E-signed by QA reviewer
├── oq/
│   ├── OQ-execution-report.json
│   ├── OQ-execution-report.pdf
│   ├── OQ-traceability-matrix.csv
│   ├── OQ-screenshots/             # UI workflow evidence
│   └── OQ-signature-page.pdf
├── pq/
│   ├── PQ-execution-report.json
│   ├── PQ-execution-report.pdf
│   ├── PQ-grafana-snapshots/       # Performance dashboard captures
│   ├── PQ-k6-reports/              # Load test detailed results
│   └── PQ-signature-page.pdf
└── summary/
    ├── qualification-summary.pdf   # Executive summary of all qualifications
    ├── deviation-report.pdf        # Any failures and disposition
    └── release-authorization.pdf   # Final sign-off for deployment

6. CI/CD Integration

6.1 Pipeline Gates

Developer Commit
        │
        ▼
Pre-Push Hook (local)
├── Lint (ESLint/Ruff)
├── Type check (tsc --noEmit)
└── Unit tests (affected files only)
        │
        ▼
Pull Request (CI)
├── Full unit test suite (~650 tests, <2min)
├── Integration tests (~190 tests, <5min)
├── Contract tests (~80 tests, <1min)
├── Security scan (Snyk/Trivy, block on critical)
├── License check (FOSSA)
├── PHI scan on test data (block if detected)
└── Coverage gate (overall ≥85%, critical paths = 100%)
        │
        ▼ (all pass → merge allowed)
Main Branch (CI)
├── Full test suite (unit + integration + contract)
├── E2E tests (~50 tests, <15min)
├── Container image build + Cosign signing
├── SBOM generation (CycloneDX)
└── Performance baseline (quick benchmark, 5min)
        │
        ▼ (all pass → deploy to staging)
Staging Deployment
├── IQ automation (full installation qualification)
├── OQ automation (operational qualification)
├── Smoke tests (critical path only, <2min)
└── Weekly: chaos experiments (automated)
        │
        ▼ (all pass + QA sign-off → production eligible)
Production Deployment
├── Blue-green deployment
├── IQ automation (production installation verification)
├── Smoke tests (critical path)
├── Canary metrics monitoring (15min)
└── Rollback trigger: error rate >1% or P95 >2s

6.2 Test Failure Response

| Failure Type | Automated Response | Human Response | SLA |
|---|---|---|---|
| Unit test failure | PR blocked | Developer fixes | Before merge |
| Integration test failure | PR blocked | Developer + reviewer investigate | Before merge |
| Contract test failure | PR blocked + alert to API owner | Breaking change review | Before merge |
| E2E test failure | Deploy blocked + alert to team | Team investigation | Within 4 hours |
| Security scan critical | PR blocked + P2 ticket created | Security team review | Within 24 hours |
| Performance regression >10% | Deploy blocked + alert to SRE | Performance investigation | Within 48 hours |
| IQ failure | Deploy rolled back | SRE + QA investigation | Within 1 hour |
| OQ failure | Release held | QA team investigation + deviation report | Within 24 hours |
| PQ failure | Release held + capacity review | SRE + QA + engineering review | Within 48 hours |

7. Coverage Requirements

| Metric | Target | Enforcement | Measurement |
|---|---|---|---|
| Line coverage (overall) | ≥85% | CI gate | Istanbul/NYC (TS), coverage.py (Python) |
| Branch coverage (overall) | ≥80% | CI gate | Same tooling |
| Critical path coverage | 100% | CI gate (stricter threshold for critical modules) | Module-level coverage config |
| State machine guard coverage | 100% | CI gate | Custom coverage report for guard functions |
| RBAC permission coverage | 100% (all role × permission × entity combinations) | CI gate | Custom matrix test generator |
| API endpoint coverage | 100% (every endpoint has ≥1 integration test) | CI gate | OpenAPI spec cross-reference |
| Agent message contract coverage | 100% (every message type has ≥1 contract test) | CI gate | Message schema cross-reference |
| Mutation testing score | ≥70% (stretch: 80%) | Weekly report (not a gate) | Stryker (TS), mutmut (Python) |
| Compliance test coverage | 100% of regulatory requirements in matrix | Pre-release gate | Traceability matrix cross-reference |
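The "stricter threshold for critical modules" row can be expressed directly in Vitest's coverage configuration, which accepts glob-keyed per-module thresholds alongside the global ones. A sketch — the module paths are assumed, not the repository's actual layout:

```typescript
// vitest.config.ts (sketch) — global gates plus 100% gates for critical paths.
// The src/** paths below are illustrative assumptions about the repo layout.
import { defineConfig } from 'vitest/config';

export default defineConfig({
  test: {
    coverage: {
      provider: 'istanbul',
      thresholds: {
        lines: 85,     // overall line coverage gate
        branches: 80,  // overall branch coverage gate
        // Stricter per-glob gates for the 100%-required modules:
        'src/state-machine/**': { lines: 100, branches: 100 },
        'src/rbac/**': { lines: 100, branches: 100 },
        'src/audit/**': { lines: 100, branches: 100 },
      },
    },
  },
});
```

Encoding the table in config this way means a drop below a critical-path threshold fails the CI run itself, rather than relying on a reviewer to read the coverage report.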

In regulated environments, tests aren't just quality checks — they're compliance evidence. Every test that runs in the IQ/OQ/PQ pipeline produces an auditable artifact. Every test failure is a potential deviation that requires documented disposition. The testing strategy is not separate from the compliance strategy — it IS the compliance strategy's execution arm.