Work Order QMS Module — Security Architecture
Classification: Internal — Security Engineering
Date: 2026-02-13
Artifact: 64 of WO System Series
Prompt Section: v8.0 §5 — Security Architecture
1. Threat Model (STRIDE)
1.1 System Boundary
The WO system's attack surface spans six boundaries: the API Gateway (external-facing), the Agent Orchestrator (internal, trusted), the Compliance Engine (internal, trusted), Agent Workers (internal, semi-trusted), the Vendor Portal (external-facing, limited trust), and the State Store (internal, highest trust).
1.2 STRIDE Analysis
Spoofing (Identity)
| Attack Vector | Target | Likelihood | Impact | Mitigation | Detection | Status |
|---|---|---|---|---|---|---|
| Stolen JWT used to access API | API Gateway | Medium | High — attacker acts as authenticated user | Short-lived JWTs (1hr), refresh token rotation, device fingerprinting | Failed auth monitoring, IP anomaly detection | ✅ Designed |
| Agent token reuse across WO executions | Agent Workers | Low | High — agent acts outside intended scope | Ephemeral per-execution tokens scoped to WO ID, token invalidated on WO completion | Token reuse detection in audit trail | ✅ Designed |
| Vendor impersonation via shared credentials | Vendor Portal | Medium | Medium — unauthorized WO modifications | Per-vendor unique credentials, MFA required, IP allowlisting optional | Login anomaly detection, geo-mismatch alerts | ✅ Designed |
| Forged e-signature (identity claim) | Signature Service | Low | Critical — invalidates regulatory compliance | Re-authentication at signing time (§11.100(b)), cryptographic hash binding (§11.70) | Hash verification on every signature read, chain integrity audit | ⚠️ Partial (G02, G05) |
Tampering (Integrity)
| Attack Vector | Target | Likelihood | Impact | Mitigation | Detection | Status |
|---|---|---|---|---|---|---|
| Direct DB modification bypassing application | State Store | Low (requires DB admin access) | Critical — violates audit trail integrity | PostgreSQL triggers prevent UPDATE/DELETE on audit_trail; separate DB credentials for app vs. admin | Hash chain verification (nightly job), chain break = immediate P1 alert | ⚠️ Partial (G03) |
| WO field modification after approval | WO Service | Low | High — approved record no longer matches what was approved | Optimistic locking (version field), post-approval fields immutable (application-enforced) | Version mismatch detection, audit trail diff on every mutation | ✅ Designed |
| Agent message tampering in transit | Event Bus (NATS) | Low | Medium — agent acts on false instructions | HMAC-SHA256 message signing, nonce-based replay prevention | Signature verification on receipt, sequence gap detection | ⚠️ Partial (G04) |
| Malicious schema migration | State Store | Very Low | Critical — corrupts regulated data | Migration requires approval (ADR link), pre-migration snapshot, tested rollback | Schema hash comparison, migration audit log | ✅ Designed |
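
The nightly hash chain verification listed as the detection control for direct DB modification can be pictured as follows. This is a minimal sketch, assuming each audit entry stores the SHA-256 hash of its predecessor; the `AuditEntry` shape and all function names are hypothetical, since the `audit_trail` schema is not shown in this document.

```typescript
import { createHash } from "node:crypto";

// Hypothetical audit entry shape (assumption, not the real audit_trail schema).
interface AuditEntry {
  id: number;
  payload: string;  // serialized audit event
  prevHash: string; // hash of the previous entry ("" for the first entry)
  hash: string;     // SHA-256 over prevHash followed by payload
}

function computeHash(prevHash: string, payload: string): string {
  return createHash("sha256").update(prevHash).update(payload).digest("hex");
}

// Append a new entry linked to the current tail of the chain.
function appendEntry(chain: AuditEntry[], payload: string): AuditEntry[] {
  const last = chain[chain.length - 1];
  const prevHash = last ? last.hash : "";
  const entry: AuditEntry = {
    id: chain.length + 1,
    payload,
    prevHash,
    hash: computeHash(prevHash, payload),
  };
  return [...chain, entry];
}

// Returns the id of the first entry that breaks the chain, or null if the chain is intact.
function findChainBreak(chain: AuditEntry[]): number | null {
  let expectedPrev = "";
  for (const entry of chain) {
    const linkOk = entry.prevHash === expectedPrev;
    const hashOk = entry.hash === computeHash(entry.prevHash, entry.payload);
    if (!linkOk || !hashOk) {
      return entry.id;
    }
    expectedPrev = entry.hash;
  }
  return null;
}
```

A non-null result from `findChainBreak` corresponds to the "chain break = immediate P1 alert" condition in the table.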
Repudiation (Non-repudiation)
| Attack Vector | Target | Likelihood | Impact | Mitigation | Detection | Status |
|---|---|---|---|---|---|---|
| User denies approving WO | Approval/Signature | Medium | High — regulatory compliance failure | ElectronicSignature with re-auth attestation, cryptographic hash binding, immutable audit trail | Signature chain verification, re-auth log correlation | ⚠️ Partial (G02, G05) |
| Agent denies performing action | Agent Workers | Low | Medium — audit gap | Agent session ID in every audit trail entry, message signing, correlation ID chain | Agent execution trace in observability stack | ✅ Designed |
| Admin denies configuration change | Tenant Settings | Low | Medium — accountability gap | Admin actions generate L4 audit entries with re-authentication | Admin audit trail review (weekly) | ✅ Designed |
Information Disclosure (Confidentiality)
| Attack Vector | Target | Likelihood | Impact | Mitigation | Detection | Status |
|---|---|---|---|---|---|---|
| Cross-tenant data leakage | State Store | Low (RLS enforced) | Critical — regulatory violation, trust destruction | PostgreSQL RLS on every table, tenant_id set at connection pool level, RLS penetration tested quarterly | Cross-tenant access attempt logging, automated RLS policy verification | ✅ Implemented |
| PHI exposure in WO descriptions | WO Service | Medium | High — HIPAA violation | PHI detection scanner on WO creation/update, confidence-based response (block/flag/log) | PHI scan results dashboard, false negative review | ⚠️ Design Only (G09) |
| Vendor sees non-assigned WO data | Vendor Portal | Low | Medium — confidentiality breach | Vendor role scoped to assigned WOs only (RBAC + application-level filtering) | Vendor access audit (monthly), access pattern anomaly | ✅ Implemented |
| Credential exposure in job plan | JobPlan Service | Medium | High — lateral movement risk | Vault references only (vault://path), never plaintext; PHI scanner catches credential patterns | Credential pattern detection in L2+ fields | ⚠️ Design Only (G01) |
| Audit trail data exfiltration | API / Export | Low | Critical — bulk regulated data exposure | Export requires AUDITOR or ADMIN role, rate limited, logged, paginated (no bulk dump) | Export volume anomaly detection | ✅ Designed |
Denial of Service (Availability)
| Attack Vector | Target | Likelihood | Impact | Mitigation | Detection | Status |
|---|---|---|---|---|---|---|
| API rate abuse | API Gateway | High | Medium — service degradation | Per-tenant token bucket rate limiting, burst + sustained rates | Rate limit hit monitoring, auto-scaling triggers | ✅ Designed |
| Agent execution storm (infinite loop) | Agent Orchestrator | Medium | High — token budget exhaustion, system overload | Token budget controller (hard stop at 95%), max iteration limits, circuit breakers per agent | Budget threshold alerts (80%), iteration count monitoring | ✅ Implemented |
| Database connection exhaustion | State Store | Low | Critical — system-wide outage | Connection pooling (PgBouncer), per-tenant connection limits, query timeout (30s) | Connection pool saturation alerts, slow query logging | ✅ Designed |
| Event bus flood | NATS | Low | Medium — message processing delay | Per-agent publish rate limits, message size limits (1MB), backpressure signaling | Queue depth monitoring, consumer lag alerts | ✅ Designed |
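
The per-tenant token bucket named as the mitigation for API rate abuse could look roughly like this. It is a sketch under assumptions: the class names, the capacity, and the refill rate are illustrative and do not come from the WO system.

```typescript
// One bucket per tenant. Capacity bounds the burst; refillPerSecond sets the sustained rate.
class TokenBucket {
  private tokens: number;
  private lastRefillMs: number;
  private readonly capacity: number;
  private readonly refillPerSecond: number;

  constructor(capacity: number, refillPerSecond: number, nowMs: number) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.tokens = capacity;
    this.lastRefillMs = nowMs;
  }

  // Returns true if the request is admitted, false if it should be rejected.
  tryConsume(nowMs: number): boolean {
    const elapsedSeconds = Math.max(0, nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.refillPerSecond);
    this.lastRefillMs = nowMs;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}

class TenantRateLimiter {
  private readonly buckets = new Map<string, TokenBucket>();
  private readonly capacity: number;
  private readonly refillPerSecond: number;

  constructor(capacity: number, refillPerSecond: number) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
  }

  // Lazily creates a bucket for each tenant so one tenant cannot drain another's budget.
  allow(tenantId: string, nowMs: number): boolean {
    let bucket = this.buckets.get(tenantId);
    if (!bucket) {
      bucket = new TokenBucket(this.capacity, this.refillPerSecond, nowMs);
      this.buckets.set(tenantId, bucket);
    }
    return bucket.tryConsume(nowMs);
  }
}
```

Passing the clock in as `nowMs` keeps the limiter deterministic in tests; a production version would more likely read a monotonic clock and share state across gateway instances.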
Elevation of Privilege (Authorization)
| Attack Vector | Target | Likelihood | Impact | Mitigation | Detection | Status |
|---|---|---|---|---|---|---|
| Agent attempts to approve WO (self-elevation) | Agent Workers → Approval | Low (architectural constraint) | Critical — bypasses human control | Agents NEVER hold SYSTEM_OWNER, QA, or ADMIN roles; approval endpoints reject agent tokens | Agent-attempted-approval alert (immediate P1) | ✅ Implemented |
| ASSIGNEE approves their own WO | Approval Service | Medium (user error or intent) | High — SOD violation, Part 11 breach | SOD guard: ASSIGNEE ≠ APPROVER enforced in state machine guard T5 | SOD violation audit log, blocked transition logged | ✅ Implemented |
| Admin bypasses approval chain | Admin Console | Low | High — undermines regulatory workflow | ADMIN role can cancel but cannot approve/reject; documented in RBAC matrix | Admin action audit (all admin operations logged at L4) | ✅ Implemented |
| Break-glass abuse (over-broad emergency access) | Break-Glass System | Low | Medium — unauthorized access under emergency cover | 4-hour time limit, enhanced audit logging, mandatory post-incident review within 72 hours, break-glass does not bypass SOD | Break-glass activation alert (immediate), usage pattern analysis | ⚠️ Design Only (G10) |
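
The break-glass limits in the last row (4-hour window, 72-hour review, no SOD bypass) can be expressed as a few small checks. This is only a sketch of the intended design, since break-glass is still Design Only (G10). The `BreakGlassGrant` shape and function names are hypothetical, and the code assumes the 72-hour review clock starts at activation, which the table does not specify.

```typescript
const BREAK_GLASS_MAX_MS = 4 * 60 * 60 * 1000;  // 4-hour time limit
const REVIEW_DEADLINE_MS = 72 * 60 * 60 * 1000; // post-incident review window

// Hypothetical grant record; field names are assumptions.
interface BreakGlassGrant {
  actorId: string;
  scope: string[];             // permissions explicitly granted for the emergency
  activatedAtMs: number;
  reviewedAtMs: number | null; // null until the post-incident review is recorded
}

function isGrantActive(grant: BreakGlassGrant, nowMs: number): boolean {
  const ageMs = nowMs - grant.activatedAtMs;
  return ageMs >= 0 && ageMs < BREAK_GLASS_MAX_MS;
}

// Break-glass can stand in for a missing RBAC permission, but only inside the
// granted scope and time window. SOD is evaluated separately and is not affected.
function rbacAllowedWithBreakGlass(
  rbacAllowed: boolean,
  grant: BreakGlassGrant | null,
  permission: string,
  nowMs: number,
): boolean {
  if (rbacAllowed) {
    return true;
  }
  return grant !== null && isGrantActive(grant, nowMs) && grant.scope.includes(permission);
}

// Assumption: the review deadline is measured from activation.
function isReviewOverdue(grant: BreakGlassGrant, nowMs: number): boolean {
  return grant.reviewedAtMs === null && nowMs - grant.activatedAtMs > REVIEW_DEADLINE_MS;
}
```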
1.3 Threat Model Summary
| STRIDE Category | Total Vectors | ✅ Implemented | ✅ Designed | ⚠️ Partial/Design | Coverage |
|---|---|---|---|---|---|
| Spoofing | 4 | 0 | 3 | 1 | 75% |
| Tampering | 4 | 0 | 2 | 2 | 50% |
| Repudiation | 3 | 0 | 2 | 1 | 67% |
| Information Disclosure | 5 | 2 | 1 | 2 | 60% |
| Denial of Service | 4 | 1 | 3 | 0 | 100% |
| Elevation of Privilege | 4 | 3 | 0 | 1 | 75% |
| Total | 24 | 6 | 11 | 7 | 71% |
The 7 partial/design-only items map directly to gap closure prompts G01–G05, G09, G10.
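
The Coverage column appears to be (Implemented + Designed) / Total Vectors, rounded to the nearest percent. The document does not state the formula, so treat that as an assumption; the sketch below reproduces every value in the table under it.

```typescript
interface CategoryCounts {
  category: string;
  implemented: number;
  designed: number;
  partial: number;
}

// Counts copied from the summary table above.
const strideSummary: CategoryCounts[] = [
  { category: "Spoofing", implemented: 0, designed: 3, partial: 1 },
  { category: "Tampering", implemented: 0, designed: 2, partial: 2 },
  { category: "Repudiation", implemented: 0, designed: 2, partial: 1 },
  { category: "Information Disclosure", implemented: 2, designed: 1, partial: 2 },
  { category: "Denial of Service", implemented: 1, designed: 3, partial: 0 },
  { category: "Elevation of Privilege", implemented: 3, designed: 0, partial: 1 },
];

// Assumed formula: implemented and designed vectors count as covered.
function coveragePercent(c: CategoryCounts): number {
  const total = c.implemented + c.designed + c.partial;
  return Math.round(((c.implemented + c.designed) / total) * 100);
}

// Sum the per-category counts into the "Total" row.
function totals(rows: CategoryCounts[]): CategoryCounts {
  return rows.reduce(
    (acc, r) => ({
      category: "Total",
      implemented: acc.implemented + r.implemented,
      designed: acc.designed + r.designed,
      partial: acc.partial + r.partial,
    }),
    { category: "Total", implemented: 0, designed: 0, partial: 0 },
  );
}
```

For the Total row this gives 17 covered vectors out of 24, which rounds to 71%.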
2. Authentication Architecture
2.1 Authentication Flow
            ┌──────────────────────┐
            │  Identity Provider   │
            │  (Okta / Azure AD /  │
            │   Auth0 / Cognito)   │
            └──────────┬───────────┘
                       │ OIDC / SAML 2.0
            ┌──────────▼───────────┐
            │     API Gateway      │
            │ ┌─────────────────┐  │
            │ │ Token Validator │  │
            │ │   (JWT RS256)   │  │
            │ └─────────────────┘  │
            └──────────┬───────────┘
                       │ Validated Claims
      ┌────────────────┼────────────────┐
      ▼                ▼                ▼
┌───────────┐   ┌────────────┐   ┌────────────┐
│ Human     │   │ Service    │   │ Agent      │
│ Sessions  │   │ Accounts   │   │ Tokens     │
│           │   │            │   │            │
│ JWT +     │   │ mTLS +     │   │ Scoped,    │
│ Refresh   │   │ API Key    │   │ Ephemeral  │
└───────────┘   └────────────┘   └────────────┘
2.2 Authentication Types
| Auth Type | Mechanism | Lifetime | Scope | Rotation | WO System Use |
|---|---|---|---|---|---|
| Human session | JWT (RS256) + refresh token | 1hr access / 7d refresh / 30min idle timeout | Tenant + roles | Refresh on use | All human API calls |
| E-signature re-auth | Re-authentication attestation | 5 minutes | Single signature | Per-signature | Approval signing events |
| Service-to-service | mTLS + API key | Certificate: 90d | Service identity | Auto-rotate at 60d | Compliance Engine ↔ State Store |
| Agent execution | Scoped ephemeral token | WO execution duration | WO ID + agent role | Per-execution | All agent API calls |
| Vendor portal | JWT (RS256) + MFA | 1hr access / no refresh (re-login required) | Assigned WOs only | Per-session | Vendor interactions |
| Break-glass | Emergency override token | 4 hours max | Specified scope (not SOD bypass) | Single-use | Emergency access only |
2.3 Session Management
| Parameter | Value | Regulatory Requirement |
|---|---|---|
| Idle timeout | 30 minutes (configurable: 5–120 min) | HIPAA §164.312(a)(2)(iii) |
| Absolute timeout | 12 hours | Security best practice |
| E-signature window | 5 minutes | FDA §11.100(b) |
| Concurrent sessions | 3 max per user | Security best practice |
| Grace warning | 2 minutes before idle timeout | UX requirement |
| Failed login lockout | 5 attempts → 15 minute lockout | HIPAA §164.312(a)(1) |
| Re-auth for signatures | Every signature event | FDA §11.100(b), §11.200 |
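
The timeout parameters above combine into a small state check. The sketch below uses the default values from the table. The `SessionState` and `ReauthAttestation` shapes are hypothetical, and treating the re-authentication attestation as single-use follows the "Single signature" scope in section 2.2.

```typescript
const MINUTE_MS = 60 * 1000;
const IDLE_TIMEOUT_MS = 30 * MINUTE_MS;          // default; configurable 5 to 120 min
const ABSOLUTE_TIMEOUT_MS = 12 * 60 * MINUTE_MS; // 12 hours
const GRACE_WARNING_MS = 2 * MINUTE_MS;          // warn 2 minutes before idle timeout
const ESIGN_WINDOW_MS = 5 * MINUTE_MS;           // re-auth attestation lifetime

// Hypothetical shapes; field names are assumptions.
interface SessionState {
  createdAtMs: number;
  lastActivityMs: number;
}

interface ReauthAttestation {
  issuedAtMs: number;
  used: boolean; // an attestation covers a single signature
}

type SessionStatus = "active" | "idle_warning" | "expired_idle" | "expired_absolute";

function sessionStatus(session: SessionState, nowMs: number): SessionStatus {
  if (nowMs - session.createdAtMs >= ABSOLUTE_TIMEOUT_MS) {
    return "expired_absolute";
  }
  const idleMs = nowMs - session.lastActivityMs;
  if (idleMs >= IDLE_TIMEOUT_MS) {
    return "expired_idle";
  }
  if (idleMs >= IDLE_TIMEOUT_MS - GRACE_WARNING_MS) {
    return "idle_warning";
  }
  return "active";
}

// Signing needs a live session plus a fresh, unused re-authentication attestation.
function canSign(session: SessionState, attestation: ReauthAttestation | null, nowMs: number): boolean {
  const status = sessionStatus(session, nowMs);
  if (status === "expired_idle" || status === "expired_absolute") {
    return false;
  }
  if (attestation === null || attestation.used) {
    return false;
  }
  const ageMs = nowMs - attestation.issuedAtMs;
  return ageMs >= 0 && ageMs <= ESIGN_WINDOW_MS;
}
```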
3. Authorization Architecture
3.1 Layered Model
Layer 1: RBAC ──→ "Does this role have this permission?"
│ 8 roles: ORIGINATOR, ASSIGNER, ASSIGNEE, SYSTEM_OWNER,
│ QA, VENDOR, ADMIN, AUDITOR
│ 6 agent roles: AGENT_ORCHESTRATOR, AGENT_ASSET_MGMT,
│ AGENT_SCHEDULER, AGENT_VENDOR_COORD, AGENT_DOCUMENTATION,
│ AGENT_QA_ASSIST
▼
Layer 2: RLS ──→ "Can this tenant see this row?"
│ PostgreSQL RLS on all 22 tables
│ tenant_id = current_setting('app.tenant_id')
▼
Layer 3: SOD ──→ "Does this create a conflict of interest?"
│ ASSIGNEE ≠ APPROVER
│ Both SYSTEM_OWNER + QA required for regulatory WOs
│ Agents never approve
▼
Layer 4: Scope ──→ "Can this actor access THIS specific resource?"
│ Vendors: only assigned WOs
│ Agents: only current execution scope
│ Auditors: read-only everything in tenant
▼
Layer 5: Context ─→ "Do special conditions apply?"
Break-glass: bypasses RBAC (not SOD)
Training expiration: blocks assignment
Certification lapse: blocks execution
3.2 Policy Decision Flow
async function authorize(request: AuthzRequest): Promise<AuthzDecision> {
  // Layer 1 (RBAC): does the actor's role hold the requested permission?
  const rolePermission = await checkRBAC(request.actorRole, request.permission);
  if (!rolePermission.allowed) {
    return deny('RBAC', `Role ${request.actorRole} lacks ${request.permission}`);
  }
  // Layer 2 (RLS): application-level tenant check alongside the database RLS policy.
  const tenantMatch = request.actorTenantId === request.resourceTenantId;
  if (!tenantMatch) {
    return deny('RLS', 'Cross-tenant access denied');
  }
  // Layer 3 (SOD): an assignee may not approve their own WO.
  if (request.permission === 'APPROVE_WO') {
    const isAssignee = await isActorAssignee(request.actorId, request.resourceId);
    if (isAssignee) {
      return deny('SOD', 'Assignee cannot approve own WO (§11.10(g))');
    }
  }
  // Layer 4 (Scope): vendors may only access WOs assigned to them.
  if (request.actorRole === 'VENDOR') {
    const isAssignedVendor = await isVendorAssigned(request.actorId, request.resourceId);
    if (!isAssignedVendor) {
      return deny('SCOPE', 'Vendor not assigned to this WO');
    }
  }
  // Layer 5 (Context): break-glass use is logged before the request is allowed.
  if (request.contextFlags?.breakGlass) {
    await logBreakGlassAccess(request);
  }
  return allow(request, [rolePermission]);
}
3.3 Agent Permission Boundaries
| Constraint | Enforcement Point | Consequence of Violation |
|---|---|---|
| Agents cannot hold SYSTEM_OWNER, QA, or ADMIN roles | Token issuer (Orchestrator) | Token rejected at API Gateway |
| Agents cannot approve or reject WOs | State machine guard (T5) | Guard violation, human checkpoint triggered |
| Agents cannot sign electronically | Signature service | Request rejected, P1 alert |
| Agent scope limited to current WO execution | Token claims include WO ID | Requests outside scope return 403 |
| Agent actions always attributed | Audit trail includes agent session ID | Agent actions auditable end-to-end |
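
The agent constraints above can be combined into a single scope check. This is a hedged sketch: the claim and request shapes are hypothetical, `REJECT_WO` and `SIGN_RECORD` are assumed action names (only `APPROVE_WO` appears in section 3.2), and returning 401 for an expired token is an assumption. The 403 for out-of-scope requests comes from the table.

```typescript
// Roles an agent token must never carry (from the constraint table).
const FORBIDDEN_AGENT_ROLES: readonly string[] = ["SYSTEM_OWNER", "QA", "ADMIN"];
// Actions reserved for humans; all names except APPROVE_WO are assumptions.
const HUMAN_ONLY_ACTIONS: readonly string[] = ["APPROVE_WO", "REJECT_WO", "SIGN_RECORD"];

// Hypothetical token claims and request shapes.
interface AgentTokenClaims {
  agentSessionId: string;
  role: string;
  workOrderId: string;
  expiresAtMs: number;
}

interface AgentRequest {
  workOrderId: string;
  action: string;
}

interface ScopeDecision {
  allowed: boolean;
  httpStatus: number;
  reason: string;
}

function rejectScope(httpStatus: number, reason: string): ScopeDecision {
  return { allowed: false, httpStatus, reason };
}

function checkAgentScope(claims: AgentTokenClaims, request: AgentRequest, nowMs: number): ScopeDecision {
  if (nowMs >= claims.expiresAtMs) {
    return rejectScope(401, "Agent token expired");
  }
  if (FORBIDDEN_AGENT_ROLES.includes(claims.role)) {
    return rejectScope(403, "Agent token carries a human-only role");
  }
  if (HUMAN_ONLY_ACTIONS.includes(request.action)) {
    return rejectScope(403, "Action is reserved for human actors");
  }
  if (claims.workOrderId !== request.workOrderId) {
    return rejectScope(403, "Request is outside the token's WO scope");
  }
  return { allowed: true, httpStatus: 200, reason: "In scope" };
}
```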
4. Secrets Management
4.1 Secret Inventory
| Secret | Classification | Storage | Rotation | Current Status |
|---|---|---|---|---|
| Database connection strings | L3 | Vault (HashiCorp / GCP Secret Manager) | 90 days | ⚠️ Gap G01 — currently env vars |
| AI model API keys | L3 | Vault | Per provider policy (90d default) | ⚠️ Gap G01 |
| JWT signing keys | L3 | KMS (cloud-native) | Annual + on-demand | ✅ Designed |
| E-signature hash keys | L4 | HSM / Cloud KMS | Versioned (never rotated — new version created) | ⚠️ Gap G02 |
| Agent execution tokens | L2 | In-memory (ephemeral) | Per-execution | ✅ Designed |
| mTLS certificates | L2 | Cert-manager (automated) | 90 days (Let's Encrypt) | ✅ Designed |
| Encryption keys (at-rest) | L4 | KMS (cloud-native) | Annual | ✅ Designed |
| NATS credentials | L2 | Vault | 90 days | ⚠️ Gap G01 |
| Vendor portal OAuth client secrets | L3 | Vault | 180 days | ⚠️ Gap G01 |
4.2 Vault Integration Pattern (Gap G01)
Application Code → Vault Sidecar → Vault Server → Secret Value
Job Plan credential reference:
Before (gap): { "db_password": "plaintextvalue" }
After (G01): { "db_password": "vault://secret/wo-system/db/prod#password" }
Resolution flow:
1. Agent needs credential → reads vault reference from JobPlan
2. Agent requests scoped token from Orchestrator (includes WO ID + credential path)
3. Vault sidecar resolves reference → returns value in memory
4. Value used for operation → never persisted outside vault
5. Vault audit log records: who accessed, when, which secret, from which WO
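
Parsing the `vault://` reference format and spotting leftover plaintext values might look like the following. This is a sketch only: the accepted character set in the pattern is an assumption, the function names are hypothetical, and actual secret resolution by the Vault sidecar is not modeled.

```typescript
// Parsed form of a reference such as vault://secret/wo-system/db/prod#password
interface VaultRef {
  path: string;  // e.g. "secret/wo-system/db/prod"
  field: string; // e.g. "password"
}

// Assumed syntax: vault://<path>#<field>, with a conservative character set.
const VAULT_REF_PATTERN = /^vault:\/\/([A-Za-z0-9_\-\/]+)#([A-Za-z0-9_\-]+)$/;

// Returns the parsed reference, or null if the value is not a vault reference.
function parseVaultRef(value: string): VaultRef | null {
  const match = VAULT_REF_PATTERN.exec(value);
  if (!match) {
    return null;
  }
  return { path: match[1], field: match[2] };
}

// Given the credential fields of a JobPlan, list the keys that still hold plaintext values.
function plaintextCredentialKeys(credentials: Record<string, string>): string[] {
  return Object.keys(credentials).filter((key) => parseVaultRef(credentials[key]) === null);
}
```

A check like `plaintextCredentialKeys` could run at JobPlan save time so that the "Before (gap)" form is rejected before it reaches storage.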
4.3 Key Management
| Key Type | Algorithm | Key Size | Storage | Rotation Trigger |
|---|---|---|---|---|
| JWT signing | RSA | 2048-bit | KMS | Annual or compromise |
| E-signature hash | SHA-256 | 256-bit | KMS/HSM | Version-based (new key per year, old keys retained for verification) |
| Audit trail hash chain | SHA-256 | 256-bit | KMS | Never rotated (chain integrity) |
| At-rest encryption | AES-256-GCM | 256-bit | KMS | Annual (envelope encryption, rotate DEK) |
| Message signing (agent-to-agent) | HMAC-SHA256 | 256-bit | Vault (ephemeral per session) | Per agent session |
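
For the agent-to-agent message signing row, HMAC-SHA256 with nonce-based replay prevention could be sketched with Node's built-in `crypto` module. The `SignedMessage` envelope and the in-memory nonce set are assumptions made for illustration. Gap G04 may settle on a different envelope, and a real verifier would bound the nonce set by session lifetime.

```typescript
import { createHmac, randomBytes, timingSafeEqual } from "node:crypto";

// Hypothetical message envelope.
interface SignedMessage {
  body: string;
  nonce: string;     // random per message, hex encoded
  sequence: number;  // lets the receiver detect gaps
  signature: string; // hex HMAC-SHA256 over sequence, nonce and body
}

function computeSignature(key: Buffer, body: string, nonce: string, sequence: number): string {
  return createHmac("sha256", key).update(`${sequence}.${nonce}.${body}`).digest("hex");
}

function signMessage(key: Buffer, body: string, sequence: number): SignedMessage {
  const nonce = randomBytes(16).toString("hex");
  return { body, nonce, sequence, signature: computeSignature(key, body, nonce, sequence) };
}

type VerifyResult = "ok" | "bad_signature" | "replay";

class MessageVerifier {
  private readonly key: Buffer;
  private readonly seenNonces = new Set<string>();

  constructor(key: Buffer) {
    this.key = key;
  }

  verify(msg: SignedMessage): VerifyResult {
    const expected = Buffer.from(computeSignature(this.key, msg.body, msg.nonce, msg.sequence), "hex");
    const actual = Buffer.from(msg.signature, "hex");
    // timingSafeEqual throws on length mismatch, so compare lengths first.
    if (expected.length !== actual.length || !timingSafeEqual(expected, actual)) {
      return "bad_signature";
    }
    if (this.seenNonces.has(msg.nonce)) {
      return "replay";
    }
    this.seenNonces.add(msg.nonce);
    return "ok";
  }
}
```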
5. Network Security
5.1 Network Boundaries
┌───────────────────────────────────────────────┐
│                PUBLIC INTERNET                │
│ ┌───────────────────────────────────────────┐ │
│ │  DMZ (WAF + DDoS Protection)              │ │
│ │  ┌──────────────────────────────────────┐ │ │
│ │  │  API Gateway (TLS termination)       │ │ │
│ │  │  Vendor Portal (TLS termination)     │ │ │
│ │  └──────────────┬───────────────────────┘ │ │
│ └─────────────────┼─────────────────────────┘ │
│                   │ mTLS                      │
│ ┌─────────────────▼─────────────────────────┐ │
│ │  PRIVATE NETWORK (VPC)                    │ │
│ │ ┌──────────┐  ┌──────────┐  ┌───────────┐ │ │
│ │ │  Agent   │  │Compliance│  │  Observ.  │ │ │
│ │ │Orchestr. │  │  Engine  │  │   Stack   │ │ │
│ │ └────┬─────┘  └────┬─────┘  └───────────┘ │ │
│ │      │ mTLS        │ mTLS                 │ │
│ │ ┌────▼─────────────▼────────────────────┐ │ │
│ │ │  DATA PLANE (most restricted)         │ │ │
│ │ │  ┌───────────┐   ┌──────────┐         │ │ │
│ │ │  │PostgreSQL │   │   NATS   │         │ │ │
│ │ │  │(encrypted │   │(TLS +    │         │ │ │
│ │ │  │ at rest)  │   │ authz)   │         │ │ │
│ │ │  └───────────┘   └──────────┘         │ │ │
│ │ └───────────────────────────────────────┘ │ │
│ └───────────────────────────────────────────┘ │
└───────────────────────────────────────────────┘
5.2 Network Policies
| Rule | Source | Destination | Protocol | Port | Justification |
|---|---|---|---|---|---|
| Internet → API Gateway | Any | API Gateway | HTTPS | 443 | Public API access |
| Internet → Vendor Portal | Vendor IP allowlist (optional) | Vendor Portal | HTTPS | 443 | Vendor access |
| API Gateway → Orchestrator | API Gateway | Agent Orchestrator | gRPC over mTLS | 8443 | Internal routing |
| Orchestrator → Agent Workers | Agent Orchestrator | Agent Workers | gRPC over mTLS | 8444 | Agent dispatch |
| Any service → PostgreSQL | Private VPC services | PostgreSQL | TLS | 5432 | Data access |
| Any service → NATS | Private VPC services | NATS cluster | TLS | 4222 | Event bus |
| PostgreSQL → External | PostgreSQL | None | — | — | No outbound (data plane isolated) |
| Agent Workers → AI Models | Agent Workers | Anthropic/OpenAI API | HTTPS | 443 | Model calls (via egress proxy) |
5.3 Zero Trust Principles
| Principle | WO System Implementation |
|---|---|
| Never trust, always verify | Every request authenticated + authorized, even internal |
| Least privilege | Tokens scoped to minimum required access; agent tokens scoped to WO |
| Assume breach | Audit everything; hash chains detect tampering; circuit breakers limit blast radius |
| Explicit verification | mTLS between all services; no implicit trust based on network position |
| Encrypt everything | TLS 1.3 in transit; AES-256-GCM at rest; field-level for L3+ |
6. Supply Chain Security
6.1 Dependency Management
| Control | Tool | Frequency | Gate Type |
|---|---|---|---|
| Vulnerability scanning | Snyk / Trivy | Every PR + daily scan | Block on critical/high CVEs |
| License compliance | FOSSA | Every PR | Block copyleft in proprietary components |
| Dependency pinning | Lock files (package-lock.json, poetry.lock) | Always | Exact versions only |
| Controlled updates | Renovate (configured for grouped weekly PRs) | Weekly | PR with changelog + test results |
| Transitive dependency audit | Snyk deep scan | Monthly | Review report, create WO for remediation |
6.2 Build Artifact Security
| Artifact | Signing | Storage | Verification |
|---|---|---|---|
| Container images | Cosign (Sigstore) | Private registry (GCR/ECR) | Admission controller verifies signature before deployment |
| Helm charts | GPG signed | Private chart repository | Signature verified before helm install |
| Database migrations | SHA-256 hash in migration manifest | Git (source of truth) | Hash verified before execution |
| SBOM | Auto-generated (CycloneDX format) | Stored alongside build artifact | Included in IQ evidence package |
6.3 Base Image Policy
| Allowed Base | Use Case | Update Cadence |
|---|---|---|
| gcr.io/distroless/cc-debian12 | Service containers (Go, compiled languages) | Monthly rebuild |
| gcr.io/distroless/python3-debian12 | Python services (Orchestrator, Compliance Engine) | Monthly rebuild |
| node:22-slim | TypeScript services (API Gateway, IDE) | Monthly rebuild |
| postgres:16-alpine | Database (dev/test only; managed service in production) | Quarterly |
Rejected: Ubuntu/Debian full images, latest tags, unverified third-party images.
7. Incident Response Integration
7.1 Security Event Taxonomy
| Event Category | Source | Severity | Response |
|---|---|---|---|
| Authentication failure (≥5 in 5min) | API Gateway | P3 | Auto-lockout + alert to security team |
| Cross-tenant access attempt | RLS / Application | P1 | Immediate block + forensic investigation |
| SOD violation attempt | State machine guard | P2 | Block + log + notify compliance officer |
| Hash chain integrity failure | Nightly verification job | P1 | Freeze affected records + forensic investigation |
| Agent attempted approval | Signature service | P1 | Block + alert + review agent configuration |
| PHI detected in non-PHI field | PHI scanner | P2 | Flag record + notify data owner + quarantine |
| Token budget exhaustion | Token Budget Controller | P3 | Hard stop agent execution + alert orchestrator |
| Circuit breaker open | Agent Worker monitoring | P3 | Route around failed worker + alert SRE |
| Break-glass activation | Break-glass system | P2 | Enhanced audit logging + mandatory 72-hour review |
| Credential rotation failure | Vault integration | P2 | Retry with backoff + alert security team + use cached credential |
7.2 Security Event → WO Creation (Gap G14)
P1 and P2 security events automatically generate incident Work Orders:
Security Event (P1/P2)
→ Incident WO created automatically
→ Type: MANUAL (source_type override: SECURITY_INCIDENT)
→ Priority: EMERGENCY
→ Assigned to: Security Team (pre-configured)
→ Regulatory flag: true (all security incidents are regulatory-relevant)
→ JobPlan: pre-populated from incident template
→ Mandatory QA review before closure
→ Correlation: incident WO linked to triggering event via correlationId
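
The mapping above can be written as a small pure function. This is a sketch of the G14 design under assumptions: the `SecurityEvent` and `IncidentWorkOrder` shapes and their field names are hypothetical, and `securityTeamId` stands in for the pre-configured assignee.

```typescript
type Severity = "P1" | "P2" | "P3";

// Hypothetical event shape.
interface SecurityEvent {
  category: string; // e.g. "Cross-tenant access attempt"
  severity: Severity;
  correlationId: string;
}

// Hypothetical incident WO shape, mirroring the fields listed above.
interface IncidentWorkOrder {
  type: "MANUAL";
  sourceType: "SECURITY_INCIDENT";
  priority: "EMERGENCY";
  assignedTo: string;
  regulatoryFlag: boolean;
  requiresQaReview: boolean;
  correlationId: string;
  title: string;
}

// Returns an incident WO for P1/P2 events, or null for P3 events (alert only).
function toIncidentWorkOrder(event: SecurityEvent, securityTeamId: string): IncidentWorkOrder | null {
  if (event.severity === "P3") {
    return null;
  }
  return {
    type: "MANUAL",
    sourceType: "SECURITY_INCIDENT",
    priority: "EMERGENCY",
    assignedTo: securityTeamId,
    regulatoryFlag: true,   // all security incidents are regulatory-relevant
    requiresQaReview: true, // mandatory QA review before closure
    correlationId: event.correlationId,
    title: `[${event.severity}] ${event.category}`,
  };
}
```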
8. Residual Risk Register
| Risk ID | Description | STRIDE Category | Severity | Probability | Mitigation Status | Acceptance Criteria | Review Date |
|---|---|---|---|---|---|---|---|
| SR-001 | Plaintext credentials in JobPlan JSONB | Disclosure | Critical | Medium | ⚠️ Gap G01 | Resolved when vault integration complete | Immediate |
| SR-002 | No cryptographic hash binding on e-signatures | Repudiation | High | Low | ⚠️ Gap G02 | Resolved when hash function implemented | Immediate |
| SR-003 | Audit trail hash chain not implemented | Tampering | High | Low | ⚠️ Gap G03 | Resolved when chain verification active | Immediate |
| SR-004 | Agent messages unsigned | Tampering | Medium | Low | ⚠️ Gap G04 | Resolved when HMAC signing active | Next sprint |
| SR-005 | No PHI scanner on WO fields | Disclosure | High | Medium | ⚠️ Gap G09 | Resolved when scanner operational | Next sprint |
| SR-006 | Break-glass not implemented | Privilege | Medium | Low | ⚠️ Gap G10 | Resolved when break-glass system live | Next quarter |
| SR-007 | AI model provider processes L4 data | Disclosure | Medium | Low | Contractual (BAA/DPA) | BAA/DPA signed with all model providers | Annually |
| SR-008 | Single-region deployment (no DR) | Availability | Medium | Low | ⚠️ DR gap | Resolved when multi-region deployed | Next quarter |
| SR-009 | Insider threat (malicious admin) | All categories | Medium | Very Low | Admin audit trail + SOD + no admin approval | Accepted — residual risk with quarterly access review | Quarterly |
Risk review cadence: Monthly for Critical/High, quarterly for Medium, annually for Low.
Security is not a feature — it's a property of the system. Every new endpoint, every new agent capability, every new data flow must pass through this STRIDE analysis and authorization framework before deployment. The gap closure series (G01–G10) addresses the 7 partial items identified in this threat model.
Copyright 2026 AZ1.AI Inc. All rights reserved.
Developer: Hal Casteel, CEO/CTO
Product: CODITECT-BIO-QMS | Part of the CODITECT Product Suite
Classification: Internal - Confidential