ADR-001: Adopt Work Order Management as CODITECT Control Plane Subsystem

Status

Proposed

Context

CODITECT operates in regulated environments (FDA 21 CFR Part 11, HIPAA, SOC 2) where every change to validated systems requires formal Change Control records. Currently, there is no structured mechanism for agents or humans to create, track, and approve change records. This is a blocking gap for regulated industry deployments.

Decision

Implement a Work Order Management subsystem within CODITECT's control plane. The WO system will be the mandatory gateway for all changes to validated systems — both agent-initiated and human-initiated. It will use PostgreSQL as its state store (consistent with CODITECT's existing architecture) and integrate with the Agent Orchestrator, Compliance Engine, and Event Bus.

Consequences

  • Positive: Regulatory compliance becomes structural rather than procedural. Audit trails are generated automatically. Approval gates are enforced at the system level.
  • Positive: Agent-initiated changes gain the same compliance rigor as human-initiated changes.
  • Negative: Every change operation has additional latency (WO creation + approval). Estimated overhead: 50-100ms per operation.
  • Negative: Development effort: ~12 weeks, 2-3 engineers.
  • Neutral: Agents must be taught WO lifecycle semantics — adds complexity to agent prompting but improves auditability.

Alternatives Considered

  1. External CMMS integration (ServiceNow, Maximo): Rejected — adds an external dependency, latency, cost, and an ongoing integration maintenance burden, and limits control over multi-tenancy.
  2. Lightweight audit-only approach: Rejected — insufficient for FDA 21 CFR Part 11 which requires formal change control records, not just audit logs.
  3. Per-tenant pluggable WO systems: Rejected — too much surface area variation makes compliance certification impossible.

ADR-002: Master/Linked WO Hierarchy with DAG Dependencies

Status

Proposed

Context

Complex changes (e.g., upgrading a lab instrument's workstation from Windows 10 to Windows 11) involve multiple logically independent tasks that must be coordinated. Some tasks have sequential dependencies (vendor install requires workstation build), while others can run in parallel (user account setup alongside documentation updates).

Decision

Implement a two-level hierarchy: Master WOs contain no execution logic — they serve as coordination records. Linked WOs are the unit of execution, each with its own lifecycle, assignee, job plan, and approval chain. Dependencies between linked WOs form a Directed Acyclic Graph (DAG), validated on creation and periodically checked for integrity.
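A minimal sketch of the creation-time DAG check, assuming dependencies are held as a mapping from each linked WO ID to the IDs it depends on. The function name and WO IDs are hypothetical, and the standard-library `graphlib` module stands in for whatever the WO service would actually use.

```python
from graphlib import CycleError, TopologicalSorter


def validate_dependencies(deps: dict[str, set[str]]) -> list[str]:
    """Return one valid execution order, or raise ValueError on a cycle.

    deps maps each linked WO ID to the set of linked WO IDs it depends on.
    """
    try:
        return list(TopologicalSorter(deps).static_order())
    except CycleError as exc:
        # exc.args[1] holds the nodes that form the detected cycle.
        raise ValueError(f"dependency cycle: {exc.args[1]}") from exc


# Hypothetical linked WOs for the Windows 11 upgrade example.
deps = {
    "workstation_build": set(),
    "vendor_install": {"workstation_build"},
    "user_accounts": set(),
    "doc_updates": set(),
}
order = validate_dependencies(deps)
```

In this sketch `workstation_build` always precedes `vendor_install` in `order`, while the two independent WOs may appear anywhere. The same check could be re-run periodically against stored dependencies to detect corruption.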

Consequences

  • Positive: Maps directly to CODITECT's orchestrator-workers pattern. Master WO = Orchestrator, Linked WOs = Workers.
  • Positive: Parallel execution of independent linked WOs is naturally supported.
  • Positive: Critical path analysis is computable from the dependency DAG.
  • Negative: DAG validation adds complexity. Must prevent cycles at creation time and detect corruption.
  • Negative: Two-level only — deeply nested hierarchies not supported. If needed, a linked WO could itself become a master, but this creates a tree, not a DAG.
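The critical path mentioned above can be computed as the longest duration-weighted chain through the dependency DAG. The sketch below assumes each linked WO carries an estimated duration; the durations, WO IDs, and function name are illustrative only.

```python
from graphlib import TopologicalSorter


def critical_path(durations: dict[str, float], deps: dict[str, set[str]]) -> tuple[float, list[str]]:
    """Return (total duration, WO IDs) of the longest dependency chain."""
    finish = {}  # earliest finish time per WO
    prev = {}    # predecessor on the longest chain into each WO
    graph = {wo: deps.get(wo, set()) for wo in durations}
    for wo in TopologicalSorter(graph).static_order():
        # A WO can start only when its slowest dependency has finished.
        slowest = max(graph[wo], key=lambda d: finish[d], default=None)
        prev[wo] = slowest
        finish[wo] = durations[wo] + (finish[slowest] if slowest is not None else 0.0)
    node = max(finish, key=lambda w: finish[w])
    total = finish[node]
    path = []
    while node is not None:
        path.append(node)
        node = prev[node]
    return total, path[::-1]


# Hypothetical estimates in hours.
durations = {"workstation_build": 4, "vendor_install": 2, "user_accounts": 1, "doc_updates": 3}
deps = {"vendor_install": {"workstation_build"}}
total, path = critical_path(durations, deps)
# total is 6.0 hours along ['workstation_build', 'vendor_install']
```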

Alternatives Considered

  1. Flat WO list with manual ordering: Rejected — no dependency enforcement, no critical path analysis, no parallelization support.
  2. Unlimited nesting depth: Rejected — complexity explosion. Two levels cover all known use cases from the source document.
  3. External workflow engine (Temporal, Airflow): Rejected — adds infrastructure dependency. WO dependencies are simple enough for in-process DAG management.

ADR-003: PostgreSQL as WO State Store with Append-Only Audit Trail

Status

Proposed

Context

WO data must be durable, consistent, and auditable. FDA 21 CFR Part 11 requires that audit trails cannot be modified or deleted. CODITECT already uses PostgreSQL as its primary state store.

Decision

Store all WO data in PostgreSQL with row-level security for tenant isolation. The wo_audit_trail table uses an append-only pattern enforced by a database trigger that raises an exception on UPDATE or DELETE. Optimistic locking via a version column prevents lost updates. The audit trail is partitioned by month for query performance and archival.
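The sketch below illustrates two of the mechanisms in this decision: trigger-enforced append-only writes and optimistic locking on a version column. So that it runs standalone, it uses Python's built-in SQLite module as a stand-in for PostgreSQL; in PostgreSQL the trigger would instead call a function that raises an exception. The `work_orders` table and all column names other than `wo_audit_trail` and `version` are assumptions, and row-level security and partitioning are not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE work_orders (
    id INTEGER PRIMARY KEY,
    status TEXT NOT NULL,
    version INTEGER NOT NULL DEFAULT 1
);
CREATE TABLE wo_audit_trail (
    id INTEGER PRIMARY KEY,
    wo_id INTEGER NOT NULL,
    action TEXT NOT NULL
);
-- Append-only: reject any UPDATE or DELETE on the audit table.
CREATE TRIGGER audit_no_update BEFORE UPDATE ON wo_audit_trail
BEGIN SELECT RAISE(ABORT, 'wo_audit_trail is append-only'); END;
CREATE TRIGGER audit_no_delete BEFORE DELETE ON wo_audit_trail
BEGIN SELECT RAISE(ABORT, 'wo_audit_trail is append-only'); END;
INSERT INTO work_orders (id, status) VALUES (1, 'draft');
INSERT INTO wo_audit_trail (wo_id, action) VALUES (1, 'created');
""")


def update_status(conn, wo_id, new_status, expected_version):
    """Optimistic locking: the UPDATE matches only if the version is unchanged."""
    cur = conn.execute(
        "UPDATE work_orders SET status = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_status, wo_id, expected_version),
    )
    return cur.rowcount == 1  # False means another writer got there first


def audit_write_blocked(conn, sql):
    """Return True if the trigger rejects the given statement."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.DatabaseError:
        return True


first = update_status(conn, 1, "approved", expected_version=1)  # version 1 -> 2
stale = update_status(conn, 1, "rejected", expected_version=1)  # stale version, no rows match
```

A caller that receives `False` would typically reload the WO, re-apply its change against the current version, and retry.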

Consequences

  • Positive: Consistent with CODITECT's existing data strategy — no new database technology.
  • Positive: Append-only enforcement at the database level means application bugs cannot corrupt audit data.
  • Positive: RLS provides strong tenant isolation without application-level filtering.
  • Negative: Audit trail storage grows monotonically. Must implement archival (cold storage after retention period).
  • Negative: PostgreSQL is not optimized for high-frequency append-only writes. At extreme scale (>100K audit entries/minute), may need to consider write-ahead logging optimizations or partitioned inserts.

Alternatives Considered

  1. Event sourcing with Kafka/NATS: Rejected for primary storage — adds operational complexity. Events are still emitted to Event Bus for real-time consumption, but PostgreSQL is the source of truth.
  2. Separate audit database: Rejected — split storage creates consistency risks. Single database with partitioning handles the scale requirements.
  3. Blockchain for audit immutability: Rejected — unnecessary complexity. Database triggers provide equivalent immutability guarantees for this use case.

ADR-004: Deterministic Resource Matching Over ML-Based Assignment

Status

Proposed

Context

WO Job Plans specify required tools, experience levels, and person counts. The system must match these requirements against available resources to suggest or auto-assign WO executors. ML-based matching could optimize for complex criteria but is opaque to auditors.

Decision

Use deterministic, rule-based resource matching: filter by hard requirements (tools, minimum experience rating, certification status) → rank by availability → break ties by experience rating → final tie-break by cost. The ranking logic is configurable per tenant but always auditable — every match decision can be explained by listing the rules applied.
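The filter-then-rank pipeline above can be sketched as a single sort with a composite key. The `Resource` fields, the sample people, and the final tie-break on name (added so the ordering is total) are assumptions made for illustration, not part of the ADR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    name: str
    tools: frozenset
    experience: int         # rating, higher is better
    certified: bool
    hours_available: float  # higher is better
    hourly_cost: float      # lower is better


def match(resources, required_tools, min_experience, count):
    """Filter by hard requirements, then rank deterministically."""
    eligible = [
        r for r in resources
        if required_tools <= r.tools and r.experience >= min_experience and r.certified
    ]
    ranked = sorted(
        eligible,
        # availability, then experience, then cost, then name (assumed extra tie-break)
        key=lambda r: (-r.hours_available, -r.experience, r.hourly_cost, r.name),
    )
    return ranked[:count]


both = frozenset({"hplc", "torque_wrench"})
people = [
    Resource("ana", both, 4, True, 8, 90),
    Resource("ben", both, 5, True, 8, 120),
    Resource("cho", frozenset({"hplc"}), 5, True, 10, 80),  # missing a required tool
    Resource("dev", both, 5, False, 8, 70),                 # not certified
    Resource("eli", both, 5, True, 8, 100),
]
chosen = [r.name for r in match(people, both, min_experience=3, count=2)]
# -> ['eli', 'ben']
```

Because the sort key is a plain tuple, the explanation for any assignment is the list of key components compared, which is what makes the decision inspectable.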

Consequences

  • Positive: Fully auditable — regulators can inspect exactly why a person was assigned to a task.
  • Positive: Predictable — same inputs always produce same outputs.
  • Positive: Simple to test and validate.
  • Negative: May not find globally optimal assignments in complex scenarios (e.g., 50 WOs competing for 20 people).
  • Negative: Cannot learn from historical assignment quality.

Alternatives Considered

  1. ML-based or constraint-solver optimization: Rejected for V1 — adds explainability challenges in regulated environments. Could be added as an optional layer in V2 with human-in-the-loop verification.
  2. Manual-only assignment: Rejected — doesn't support autonomous agent workflows.

ADR-005: Vault Integration for Job Plan Credentials

Status

Proposed

Context

Job Plans may require account credentials for executing tasks on devices/computers (Admin, Vendor, Superuser, User accounts). The source document indicates credentials are part of the Job Plan. Storing credentials in PostgreSQL JSONB is a critical security risk.

Decision

Job Plans store credential references (vault paths), not credential values. At execution time, the agent worker or human executor retrieves credentials from HashiCorp Vault (or GCP Secret Manager) using the reference. The vault lookup is audited. Credentials have TTL and are rotated independently of the WO lifecycle.
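A minimal sketch of the reference-only pattern, assuming a small `SecretStore` interface in front of the real secret backend. `CredentialRef`, `SecretStore`, `InMemoryStore`, and the vault path are all hypothetical; the in-memory store only imitates a secret manager so the example runs without one.

```python
import json
from dataclasses import asdict, dataclass
from typing import Protocol


class SecretStore(Protocol):
    def read(self, path: str) -> str: ...


@dataclass(frozen=True)
class CredentialRef:
    role: str        # e.g. "Admin", "Vendor", "Superuser", "User"
    vault_path: str  # a reference only, never the secret value


class InMemoryStore:
    """Stand-in for Vault or Secret Manager, for illustration only."""

    def __init__(self, secrets):
        self._secrets = dict(secrets)
        self.access_log = []

    def read(self, path):
        self.access_log.append(path)  # every lookup is recorded
        return self._secrets[path]


def resolve(ref: CredentialRef, store: SecretStore) -> str:
    """Fetch the secret at execution time; the result is never persisted."""
    return store.read(ref.vault_path)


store = InMemoryStore({"secret/wo/lab-pc-07/admin": "s3cr3t"})
ref = CredentialRef(role="Admin", vault_path="secret/wo/lab-pc-07/admin")

# What would be persisted with the Job Plan: the path, not the password.
job_plan_json = json.dumps({"credentials": [asdict(ref)]})
password = resolve(ref, store)
```

Rotating the secret behind `vault_path` changes what `resolve` returns without touching the stored Job Plan.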

Consequences

  • Positive: No secrets in the database. Compromise of PostgreSQL does not expose credentials.
  • Positive: Credential rotation doesn't require WO updates.
  • Positive: Vault audit log provides additional compliance evidence.
  • Negative: Runtime dependency on Vault availability. Must handle Vault outages gracefully (circuit breaker).
  • Negative: Adds operational complexity — Vault must be deployed, configured, and maintained.

Alternatives Considered

  1. Encrypted JSONB in PostgreSQL: Rejected — key management is complex, doesn't support rotation well, and credential access isn't auditable at the database level.
  2. Inline credentials with application-level encryption: Rejected — same issues as above plus increases blast radius of application compromise.

ADR-006: Event-Driven WO State Notifications

Status

Proposed

Context

Multiple CODITECT subsystems need to react to WO lifecycle events: Compliance Engine (audit validation), Agent Orchestrator (dependency resolution), Observability (metrics), Schedule Service (actuals tracking), Notification Service (alerts).

Decision

Every WO state mutation publishes an event to the CODITECT Event Bus (NATS). Events are typed and include full context (WO ID, previous state, new state, actor, reason). Subscribers are decoupled: the WO service does not know who consumes events. To keep events consistent with stored state, the service uses the transactional outbox pattern: each event is written to an outbox table in the same database transaction as the state change, and a relay publishes it to the Event Bus only after that transaction commits.
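The outbox flow can be sketched as follows, with SQLite standing in for PostgreSQL and a plain callable standing in for the NATS publisher so the example runs on its own. Table names, column names, and function names are assumptions.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE work_orders (id TEXT PRIMARY KEY, state TEXT NOT NULL);
CREATE TABLE wo_outbox (
    seq INTEGER PRIMARY KEY AUTOINCREMENT,
    payload TEXT NOT NULL,
    published INTEGER NOT NULL DEFAULT 0
);
INSERT INTO work_orders VALUES ('WO-1', 'draft');
""")


def transition(conn, wo_id, new_state, actor, reason):
    """Change state and record the event in one transaction."""
    with conn:  # commits on success, rolls back both writes on error
        (prev,) = conn.execute(
            "SELECT state FROM work_orders WHERE id = ?", (wo_id,)
        ).fetchone()
        conn.execute("UPDATE work_orders SET state = ? WHERE id = ?", (new_state, wo_id))
        event = {"wo_id": wo_id, "previous_state": prev, "new_state": new_state,
                 "actor": actor, "reason": reason}
        conn.execute("INSERT INTO wo_outbox (payload) VALUES (?)", (json.dumps(event),))


def relay(conn, publish):
    """Publish committed, unpublished events in order; return how many were sent."""
    rows = conn.execute(
        "SELECT seq, payload FROM wo_outbox WHERE published = 0 ORDER BY seq"
    ).fetchall()
    for seq, payload in rows:
        publish(json.loads(payload))  # if this raises, the row stays unpublished
        with conn:
            conn.execute("UPDATE wo_outbox SET published = 1 WHERE seq = ?", (seq,))
    return len(rows)


sent = []  # stand-in for the Event Bus
transition(conn, "WO-1", "approved", actor="qa.lead", reason="review complete")
delivered = relay(conn, sent.append)
```

Because a row is marked published only after the publish call returns, a crash between the two steps leads to a redelivery rather than a lost event, so consumers in this sketch would need to tolerate duplicates.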

Consequences

  • Positive: Loose coupling between WO system and consumers.
  • Positive: New consumers can be added without modifying the WO service.
  • Positive: Compliance monitoring is real-time, not batch.
  • Negative: Eventual consistency between WO state and consumer state. Acceptable for notifications and metrics; Compliance Engine uses direct database queries for enforcement.
  • Negative: Must handle event delivery failures (dead letter queue, retry).

Alternatives Considered

  1. Synchronous service-to-service calls: Rejected — tight coupling, cascade failure risk.
  2. Database polling: Rejected — higher latency, wasteful resource usage.
  3. CDC (Change Data Capture) from PostgreSQL WAL: Considered for V2 — provides guaranteed delivery but adds infrastructure complexity.

ADR-007: Two-Phase Approval with E-Signature for Regulatory WOs

Status

Proposed

Context

FDA 21 CFR Part 11 requires electronic signatures on change control records. Signatures must include signer identity, meaning of signature, and timestamp. Signatures must be linked to the record they sign.

Decision

Regulatory WOs require two-phase approval: (1) System Owner approval, (2) QA approval. Each approval generates an e-signature: SHA-256 hash of (approver_id ∥ wo_id ∥ wo_version ∥ timestamp ∥ role), linked to a PKI-signed certificate. The WO cannot transition to 'closed' until all required signatures are collected. Non-regulatory WOs can use simplified approval (single approver, no e-signature required).
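A sketch of the digest computation, assuming UTF-8 string fields and an ISO 8601 timestamp. The ADR specifies plain concatenation; the length prefix on each field is an added assumption that keeps field boundaries unambiguous. Signing the digest with the approver's PKI key is not shown.

```python
import hashlib
from datetime import datetime, timezone


def signature_digest(approver_id, wo_id, wo_version, timestamp, role):
    """SHA-256 over the five signature fields, in the order given in the ADR."""
    fields = [approver_id, wo_id, str(wo_version), timestamp.isoformat(), role]
    h = hashlib.sha256()
    for value in fields:
        data = value.encode("utf-8")
        # Length prefix (assumption): ("ab", "c") and ("a", "bc") must not collide.
        h.update(len(data).to_bytes(4, "big"))
        h.update(data)
    return h.hexdigest()


signed_at = datetime(2025, 1, 15, 9, 30, tzinfo=timezone.utc)  # illustrative value
owner_sig = signature_digest("u-102", "WO-1", 3, signed_at, "system_owner")
qa_sig = signature_digest("u-205", "WO-1", 3, signed_at, "qa")

# If the WO is edited after signing, its version changes and the digest no longer matches.
after_edit = signature_digest("u-102", "WO-1", 4, signed_at, "system_owner")
```

Verification would recompute the digest from the stored WO version and compare it to the signed value; a mismatch such as `owner_sig != after_edit` indicates a post-signature modification.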

Consequences

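  • Positive: Addresses the FDA 21 CFR Part 11 electronic signature requirements listed above: signer identity, meaning of signature, timestamp, and linkage to the signed record.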
  • Positive: Signature is cryptographically bound to specific WO version — any post-signature modification is detectable.
  • Positive: Flexible — additional approvers can be added per tenant configuration.
  • Negative: PKI infrastructure required — certificate provisioning, revocation checking, key management.
  • Negative: Approval latency impacts WO cycle time. SLA monitoring and escalation required.

Alternatives Considered

  1. Simple boolean approval (approved/rejected): Rejected — insufficient for FDA compliance. No cryptographic binding.
  2. External e-signature service (DocuSign, Adobe Sign): Rejected for V1 — adds cost and an external dependency. Could be offered as an integration option for tenants who prefer it.