
Work Order QMS Module — Integration & API Strategy

Classification: Internal — Architecture & Engineering
Date: 2026-02-13
Artifact: 67 of WO System Series
Status: Proposed
Source Artifacts: 12-sdd.md §External Systems, 13-tdd.md §1 APIs, 14-c4-architecture.md §C1/C2, 25-agent-orchestration-spec.md, 27-coditect-impact.md


1. Integration Tier Classification

Not all integrations are equal. Each external system connecting to the WO module is classified by strategic importance, which determines investment level, SLA requirements, and ownership model.

1.1 Tier Matrix

| Tier | Classification | Coupling | SLA | Ownership | WO System Examples |
| --- | --- | --- | --- | --- | --- |
| T1: Core | Platform-defining, revenue-critical | Bidirectional, real-time, deep | 99.99% | Build + own | PostgreSQL (state store), NATS (event bus), AI model providers (Claude/OpenAI) |
| T2: Strategic | Competitive advantage, customer-facing | Bidirectional, near-real-time | 99.9% | Build adapter + maintain | EHR/LIMS (clinical data), CMMS (Maximo/SAP PM), IdP (Auth0/Okta/Azure AD), Vault (secrets) |
| T3: Standard | Expected capability, replaceable | Unidirectional or webhook | 99.5% | Adapter pattern, community | Slack/Teams notifications, SIEM (Splunk/Sentinel), Email (SMTP), Calendaring |
| T4: Commodity | Undifferentiated utility | Configuration-only | Best effort | Config, not code | SMS (Twilio), File storage (GCS), CDN |

1.2 Classification Decision Criteria

For each external system, ask:

1. Revenue dependency: If this integration breaks, do we lose customers?
YES → T1 or T2

2. Compliance dependency: Is this required for regulatory compliance?
YES → Minimum T2

3. Replaceability: Can we swap providers in < 1 sprint?
YES → T3 or T4

4. Competitive moat: Does this integration create switching costs?
YES → T1 or T2

5. Data flow: Does regulated data (L3/L4) cross this boundary?
YES → Minimum T2 (per 63-data-architecture.md §1)
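
The five questions can be read as a small decision function. The sketch below is one possible encoding, not the project's actual tooling: the `ClassificationAnswers` shape and the `candidateTiers` name are hypothetical, and the precedence (any answer that points to T1/T2 overrides replaceability) is an assumption, since the criteria do not say how to resolve conflicting answers.

```typescript
type Tier = "T1" | "T2" | "T3" | "T4";

// Hypothetical answer record: one boolean per question in the criteria list.
interface ClassificationAnswers {
  revenueDependency: boolean;            // Q1: do we lose customers if it breaks?
  complianceDependency: boolean;         // Q2: required for regulatory compliance?
  replaceableWithinSprint: boolean;      // Q3: can we swap providers in < 1 sprint?
  competitiveMoat: boolean;              // Q4: does it create switching costs?
  regulatedDataCrossesBoundary: boolean; // Q5: does L3/L4 data cross the boundary?
}

// Returns the tiers still allowed by the answers; a human picks within the set.
function candidateTiers(a: ClassificationAnswers): Tier[] {
  // Q1, Q2, Q4 and Q5 all narrow the result to T1 or T2 ("Minimum T2" = T2 or T1).
  const pullsUp =
    a.revenueDependency ||
    a.complianceDependency ||
    a.competitiveMoat ||
    a.regulatedDataCrossesBoundary;
  if (pullsUp) return ["T1", "T2"];

  // Assumption: replaceability only decides the tier when nothing pulls it up.
  if (a.replaceableWithinSprint) return ["T3", "T4"];

  // No question answered YES: the criteria give no verdict, so nothing is excluded.
  return ["T1", "T2", "T3", "T4"];
}
```

For example, an EHR/LIMS import (compliance-relevant, L4 data) lands in `["T1", "T2"]`, while a notification channel that is only replaceable lands in `["T3", "T4"]`.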

1.3 Current Integration Inventory

| System | Tier | Direction | Data Classification | Protocol | Status |
| --- | --- | --- | --- | --- | --- |
| PostgreSQL | T1 | Bidirectional | L4 (all WO data) | SQL/Wire | ✅ Implemented |
| NATS | T1 | Bidirectional | L2–L4 (events) | NATS protocol | ✅ Implemented |
| Anthropic Claude | T1 | Outbound (API calls) | L2 (prompts, no PHI) | HTTPS/REST | ✅ Implemented |
| OpenAI (fallback) | T1 | Outbound (API calls) | L2 (prompts, no PHI) | HTTPS/REST | ✅ Implemented |
| Auth0/Okta (IdP) | T2 | Bidirectional (OIDC) | L3 (user identity) | OIDC/SAML | 🔧 Architecture ready |
| HashiCorp Vault | T2 | Bidirectional | L3 (secrets) | HTTPS/REST | ⚠️ Gap G01 |
| ServiceNow | T2 | Bidirectional (sync) | L2–L3 (tickets, assets) | REST API | 🔧 Adapter designed |
| Maximo / SAP PM | T2 | Bidirectional (sync) | L2 (asset data) | REST/OData | 📋 Planned V2 |
| EHR/LIMS | T2 | Read-only import | L4 (regulated records) | HL7 FHIR / REST | 📋 Planned V2 |
| Slack/Teams | T3 | Outbound (notifications) | L1 (notification text) | Webhook | 🔧 Plugin ready |
| SIEM (Splunk/Sentinel) | T3 | Outbound (streaming) | L2 (security events) | Syslog/CEF | ⚠️ Gap G13 |
| Email (SMTP/SendGrid) | T4 | Outbound | L1 (notifications) | SMTP | 🔧 Config ready |

2. API Design Philosophy

2.1 Principles

api_principles:
  style: REST (JSON:API for external, internal events via NATS)
  versioning: URL path (/api/v1/, /api/v2/), explicit and discoverable
  authentication: OAuth 2.0 + JWT (external); mTLS (service-to-service)
  pagination: Cursor-based (opaque cursor, not offset), consistent under writes
  filtering: Query parameter expressions (field[op]=value)
  sorting: sort=field:asc,field2:desc
  rate_limiting: Token bucket per tenant (burst + sustained)
  idempotency: Idempotency-Key header on all POST/PATCH
  error_format: RFC 7807 Problem Details
  content_type: application/json (default), application/vnd.coditect.v1+json (versioned)

2.2 Error Response Format (RFC 7807)

{
  "type": "https://api.coditect.ai/errors/wo/invalid-transition",
  "title": "Invalid State Transition",
  "status": 422,
  "detail": "Cannot transition WO-2024-0042 from DRAFT to COMPLETED. Required path: DRAFT → PLANNED → APPROVED → ASSIGNED → IN_PROGRESS → COMPLETED.",
  "instance": "/api/v1/work-orders/wo-2024-0042/status",
  "extensions": {
    "current_state": "DRAFT",
    "requested_state": "COMPLETED",
    "valid_transitions": ["PLANNED"],
    "correlation_id": "req-a1b2c3d4",
    "compliance_rule": "STATE_MACHINE_GUARD_T1"
  }
}

Every error includes a machine-parseable type URI, a human-readable detail, a correlation_id for support, and contextual extensions that let clients show actionable recovery steps.
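
On the client side, this error shape can be consumed with a small type guard and a helper that turns the extensions into a recovery hint. This is a sketch under assumptions: `WoProblemDetails`, `isProblemDetails`, and `recoveryHint` are illustrative names, not part of any published SDK, and only the fields shown in the example response are modeled.

```typescript
// Illustrative client-side model of the error body shown in this section.
interface WoProblemDetails {
  type: string;
  title: string;
  status: number;
  detail: string;
  instance: string;
  extensions?: {
    valid_transitions?: string[];
    correlation_id?: string;
    [key: string]: unknown;
  };
}

// Narrow an unknown response body to the problem shape (checks core fields only).
function isProblemDetails(body: unknown): body is WoProblemDetails {
  if (typeof body !== "object" || body === null) return false;
  const b = body as Record<string, unknown>;
  return (
    typeof b["type"] === "string" &&
    typeof b["title"] === "string" &&
    typeof b["status"] === "number"
  );
}

// Build a short, user-facing hint from the contextual extensions.
function recoveryHint(p: WoProblemDetails): string {
  const next = p.extensions?.valid_transitions ?? [];
  const ref = p.extensions?.correlation_id ?? "n/a";
  return next.length > 0
    ? `${p.title}: try ${next.join(" or ")} (ref ${ref})`
    : `${p.title} (ref ${ref})`;
}
```

Applied to the example response, `recoveryHint` yields `Invalid State Transition: try PLANNED (ref req-a1b2c3d4)`.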

2.3 Pagination

# Cursor-based (preferred — stable under concurrent writes)
GET /api/v1/work-orders?limit=25&cursor=eyJpZCI6MTIzfQ

Response:
{
  "data": [...],
  "pagination": {
    "cursor": "eyJpZCI6MTQ4fQ",   // Opaque, base64-encoded
    "has_more": true,
    "total_count": 1247           // Only when ?include_count=true (expensive)
  }
}

Offset pagination is explicitly not supported — it breaks under concurrent inserts/deletes and creates compliance risks when audit trail records shift between pages.
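
To make the cursor mechanics concrete, here is a minimal keyset-pagination sketch over an in-memory array. It assumes the cursor is unpadded base64url JSON holding the last seen `id`; the sample cursors above happen to decode to that shape, but clients should still treat cursors as opaque, and the real server-side encoding may carry more. `encodeCursor`, `decodeCursor`, and `pageAfter` are hypothetical names.

```typescript
import { Buffer } from "node:buffer";

interface CursorPosition {
  id: number; // last id the client has already seen
}

function encodeCursor(pos: CursorPosition): string {
  return Buffer.from(JSON.stringify(pos), "utf8").toString("base64url");
}

function decodeCursor(cursor: string): CursorPosition {
  const parsed: unknown = JSON.parse(Buffer.from(cursor, "base64url").toString("utf8"));
  if (
    typeof parsed !== "object" ||
    parsed === null ||
    typeof (parsed as { id?: unknown }).id !== "number"
  ) {
    throw new Error("invalid cursor");
  }
  return { id: (parsed as { id: number }).id };
}

// Return rows strictly after the cursor position, ordered by id.
// Rows inserted or deleted before the cursor cannot shift later pages.
function pageAfter<T extends { id: number }>(rows: T[], limit: number, cursor?: string) {
  const afterId = cursor ? decodeCursor(cursor).id : -Infinity;
  const candidates = rows.filter((r) => r.id > afterId).sort((a, b) => a.id - b.id);
  const data = candidates.slice(0, limit);
  const last = data[data.length - 1];
  return {
    data,
    pagination: {
      cursor: last ? encodeCursor({ id: last.id }) : null,
      has_more: candidates.length > limit,
    },
  };
}
```

Because each page is defined by "ids greater than the cursor" rather than by a row count to skip, a concurrent delete on an earlier page does not cause records to be skipped or repeated.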

2.4 Rate Limiting

| Tier | Requests/sec (sustained) | Burst | Concurrent | Response on Limit |
| --- | --- | --- | --- | --- |
| Starter | 10 req/s | 30 | 5 | 429 + Retry-After header |
| Professional | 50 req/s | 150 | 25 | 429 + Retry-After header |
| Enterprise | 200 req/s | 600 | 100 | 429 + Retry-After header |
| Agent (internal) | 500 req/s | 1500 | 200 | Circuit breaker + backpressure |

Rate limit headers on every response:

X-RateLimit-Limit: 50
X-RateLimit-Remaining: 42
X-RateLimit-Reset: 1708300800
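
The sustained and burst columns map naturally onto a token bucket: the burst value is the bucket capacity and the sustained rate is the refill rate. The class below is an illustrative sketch, not the gateway's implementation; `TokenBucket` and its method names are assumptions, time is injected so the behavior is deterministic, and the concurrent-request limit is not modeled.

```typescript
interface RateDecision {
  allowed: boolean;
  remaining: number;     // whole tokens left, suitable for X-RateLimit-Remaining
  retryAfterSec: number; // suitable for Retry-After on a 429
}

class TokenBucket {
  private tokens: number;
  private lastRefillMs: number;
  private readonly ratePerSec: number;
  private readonly burst: number;

  constructor(ratePerSec: number, burst: number, nowMs: number) {
    this.ratePerSec = ratePerSec;
    this.burst = burst;
    this.tokens = burst; // start full so a new tenant can burst immediately
    this.lastRefillMs = nowMs;
  }

  tryConsume(nowMs: number): RateDecision {
    // Refill proportionally to elapsed time, capped at the burst capacity.
    const elapsedSec = Math.max(0, nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(this.burst, this.tokens + elapsedSec * this.ratePerSec);
    this.lastRefillMs = nowMs;

    if (this.tokens >= 1) {
      this.tokens -= 1;
      return { allowed: true, remaining: Math.floor(this.tokens), retryAfterSec: 0 };
    }
    // Time until one full token is available, rounded up to whole seconds.
    const retryAfterSec = Math.ceil((1 - this.tokens) / this.ratePerSec);
    return { allowed: false, remaining: 0, retryAfterSec };
  }
}
```

With the Professional values (`new TokenBucket(50, 150, now)`), 150 back-to-back requests pass, the 151st is rejected, and one second later 50 more tokens are available.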

2.5 Idempotency

All mutating operations accept an Idempotency-Key header:

POST /api/v1/work-orders
Idempotency-Key: client-generated-uuid-v4
Content-Type: application/json

{ "title": "Upgrade HPLC Firmware", ... }

Behavior: the first request processes normally; subsequent requests with the same key within 24 hours return the original response without re-execution. Idempotency keys are stored in Redis with a 24-hour TTL. This is critical for agent retry patterns and network failure recovery.
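
The replay behavior can be sketched with an in-memory map standing in for Redis. Everything here is an assumption for illustration: the `IdempotencyStore` class, the synchronous handler, and the injected clock. A production version would also need to handle two requests with the same key arriving concurrently, which this sketch does not.

```typescript
interface HandlerResult {
  status: number;
  body: unknown;
}

interface StoredResult extends HandlerResult {
  expiresAtMs: number;
}

class IdempotencyStore {
  private readonly entries = new Map<string, StoredResult>();
  private readonly ttlMs: number;

  constructor(ttlMs: number = 24 * 60 * 60 * 1000) {
    this.ttlMs = ttlMs; // 24-hour window, as described above
  }

  // Run the handler once per key; replay the stored response while the key is live.
  execute(key: string, nowMs: number, handler: () => HandlerResult) {
    const hit = this.entries.get(key);
    if (hit && hit.expiresAtMs > nowMs) {
      return { status: hit.status, body: hit.body, replayed: true };
    }
    const result = handler();
    this.entries.set(key, { ...result, expiresAtMs: nowMs + this.ttlMs });
    return { ...result, replayed: false };
  }
}
```

A retried `POST /api/v1/work-orders` with the same key therefore creates one work order, and the retry receives the original response.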


3. API Versioning & Lifecycle

3.1 Versioning Strategy

| Level | Mechanism | Scope | Example |
| --- | --- | --- | --- |
| Major | URL path | Breaking changes | /api/v1/, /api/v2/ |
| Minor | Request header | Additive changes | API-Version: 2026-03-01 |

Breaking change definition for the WO API:

  • Removing a field from response body
  • Changing a field's type or validation rules
  • Removing an endpoint
  • Changing authentication requirements
  • Modifying state machine transitions (compliance impact)

Non-breaking (always safe):

  • Adding optional request field
  • Adding response field
  • Adding new endpoint
  • Adding new enum value to non-exhaustive enums
  • Adding new event type
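
Two of the breaking-change rules (removed response fields and changed field types) are mechanical enough to check automatically. The following is a rough sketch of such a check, assuming a response schema is summarized as a flat map from field name to type name; `FieldTypes` and `breakingChanges` are hypothetical, and the remaining rules (endpoints, authentication, state machine transitions) would need separate checks.

```typescript
// Flat summary of a response schema: field name -> type name.
type FieldTypes = Record<string, string>;

// List the changes between two schema versions that count as breaking.
function breakingChanges(before: FieldTypes, after: FieldTypes): string[] {
  const problems: string[] = [];
  for (const [field, type] of Object.entries(before)) {
    if (!(field in after)) {
      problems.push(`removed field: ${field}`);
    } else if (after[field] !== type) {
      problems.push(`changed type: ${field} (${type} -> ${after[field]})`);
    }
  }
  // Fields that exist only in `after` are additive and therefore not reported.
  return problems;
}
```

An empty result means the diff is additive as far as these two rules are concerned; it does not prove the change is safe overall.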

3.2 Lifecycle Stages

ALPHA     →  BETA     →  GA     →  DEPRECATED  →  SUNSET
  ↓            ↓          ↓            ↓             ↓
Internal     Partner     All        12-month       410 Gone
only         access      SLA        warning        + migration
                         full       Sunset:        guide
                                    header

| Stage | Stability | SLA | Documentation | Support |
| --- | --- | --- | --- | --- |
| Alpha | May break without notice | None | Internal only | Engineering only |
| Beta | Additive changes only | Best-effort | Public with beta banner | Standard |
| GA | Versioned, backward compatible | Full tier SLA | Full reference docs + examples | Full tier support |
| Deprecated | No new features, bug fixes only | Maintained | Migration guide published | Migration support |
| Sunset | Endpoint returns 410 Gone | None | Archived | None |

3.3 Deprecation Policy

  • Minimum notice: 12 months from deprecation announcement to sunset
  • Communication: Sunset header on every response, documentation banner, email to all API consumers, in-app notification for dashboard users
  • Migration support: Published migration guide with before/after examples, SDK migration helpers, automated compatibility checker
  • Compliance constraint: State transitions and audit trail endpoints may never be sunset — they can only be superseded by newer versions that maintain full audit trail continuity

Example response from a deprecated endpoint:

HTTP/1.1 200 OK
Sunset: Wed, 01 Mar 2028 00:00:00 GMT
Deprecation: true
Link: <https://docs.coditect.ai/migration/v1-to-v2>; rel="successor-version"
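
The 12-month minimum notice and the response headers can be produced by one helper, which also avoids hand-typed dates. This is a sketch only: `deprecationHeaders` and `addMonthsUtc` are hypothetical names, and the header set simply mirrors the example response in this section.

```typescript
// Add calendar months in UTC. Day overflow rolls forward (e.g. 29 Feb + 12 months -> 1 Mar).
function addMonthsUtc(d: Date, months: number): Date {
  const result = new Date(d.getTime());
  result.setUTCMonth(result.getUTCMonth() + months);
  return result;
}

// Build the deprecation headers, refusing sunset dates that violate the notice period.
function deprecationHeaders(
  announcedAt: Date,
  sunsetAt: Date,
  migrationGuideUrl: string,
): Record<string, string> {
  if (sunsetAt.getTime() < addMonthsUtc(announcedAt, 12).getTime()) {
    throw new Error("sunset must be at least 12 months after the deprecation announcement");
  }
  return {
    Sunset: sunsetAt.toUTCString(), // HTTP-date format, e.g. "Wed, 01 Mar 2028 00:00:00 GMT"
    Deprecation: "true",
    Link: `<${migrationGuideUrl}>; rel="successor-version"`,
  };
}
```

Generating the `Sunset` value with `toUTCString()` keeps the weekday consistent with the date.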

4. Webhook & Outbound Event Architecture

4.1 Event Catalog

| Event Type | Trigger | Payload Summary | Compliance Relevance |
| --- | --- | --- | --- |
| wo.created | New WO created | WO ID, type, creator, regulatory flag | Audit trail start |
| wo.transition | State machine transition | WO ID, from_state, to_state, actor, guards passed | Core audit event |
| wo.approval.requested | Approval gate reached | WO ID, required approvers, deadline | §11.10(g) checkpoint |
| wo.approval.completed | Approval submitted | WO ID, approver, decision, signature hash | §11.50 signature event |
| wo.blocked | WO enters BLOCKED state | WO ID, reason, blocking dependencies | Escalation trigger |
| wo.completed | WO reaches COMPLETED | WO ID, duration, all approvals summary | Lifecycle completion |
| wo.compliance.violation | Guard function rejects action | WO ID, violation type, rule, actor | §11.10 violation event |
| wo.agent.dispatched | Agent assigned to WO task | WO ID, agent type, model, token budget | Agent audit |
| wo.agent.completed | Agent finishes WO task | WO ID, agent type, result, tokens consumed | Agent audit |
| wo.dependency.resolved | Blocking dependency cleared | WO ID, dependency WO ID, resolution type | DAG progression |

4.2 Webhook Envelope

interface WebhookPayload {
  id: string;                        // UUID v7 (time-ordered)
  type: string;                      // e.g., 'wo.transition'
  version: '1.0';                    // Event schema version
  timestamp: string;                 // ISO 8601 UTC
  tenant_id: string;
  source: 'wo-lifecycle-engine' | 'compliance-engine' | 'agent-orchestrator';
  data: Record<string, unknown>;     // Event-specific payload
  metadata: {
    correlation_id: string;          // Request trace ID
    causation_id: string;            // ID of event that caused this event
    actor_id: string;                // Person ID or agent session ID
    actor_type: 'PERSON' | 'AGENT' | 'SYSTEM';
    wo_id: string;                   // Always present for WO events
    regulatory: boolean;             // Whether this WO is regulatory
  };
}

4.3 Delivery Guarantees

| Guarantee | Implementation |
| --- | --- |
| At-least-once | Retry on failure; receivers MUST be idempotent (use id for dedup) |
| Ordered per WO | Events for same wo_id delivered in sequence; cross-WO ordering not guaranteed |
| Signed | X-Coditect-Signature: sha256=HMAC(webhook_secret, raw_body) |
| Retry policy | 1s, 2s, 4s, 8s, 16s, 32s, 64s, 128s, 256s, 512s (10 retries, ~17 min) |
| Dead letter | After max retries: stored in tenant's dead letter queue for 30 days; manual replay available |
| Timeout | 10 second response timeout; 2xx = success, anything else = retry |
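
A receiver has to do two things to honor these guarantees: verify the signature over the raw body and deduplicate on the event id. The sketch below uses Node's built-in `crypto` module. It assumes the HMAC digest is hex-encoded, which the table does not state, and the function names are illustrative. `retryDelaysSec` reproduces the documented backoff schedule.

```typescript
import { Buffer } from "node:buffer";
import { createHmac, timingSafeEqual } from "node:crypto";

// Compute the header value for a body. Assumption: the digest is hex-encoded.
function signBody(secret: string, rawBody: string): string {
  return "sha256=" + createHmac("sha256", secret).update(rawBody, "utf8").digest("hex");
}

// Constant-time comparison of the received header against the expected value.
function verifySignature(secret: string, rawBody: string, header: string): boolean {
  const expected = Buffer.from(signBody(secret, rawBody), "utf8");
  const received = Buffer.from(header, "utf8");
  // timingSafeEqual throws on length mismatch, so check lengths first.
  return expected.length === received.length && timingSafeEqual(expected, received);
}

// Exponential backoff: 1s, 2s, 4s, ... doubling on each retry.
function retryDelaysSec(maxRetries: number = 10): number[] {
  return Array.from({ length: maxRetries }, (_, i) => 2 ** i);
}

// At-least-once delivery means duplicates are possible; track seen event ids.
function makeDeduplicator(): (eventId: string) => boolean {
  const seen = new Set<string>();
  return (eventId) => {
    if (seen.has(eventId)) return false; // duplicate, skip processing
    seen.add(eventId);
    return true;
  };
}
```

The ten delays sum to 1023 seconds, which matches the "~17 min" figure in the table. Verification must run on the raw request bytes, before any JSON parsing or re-serialization changes them.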

4.4 Webhook Management API

POST   /api/v1/webhooks                             # Register webhook endpoint
GET    /api/v1/webhooks                             # List registered webhooks
PATCH  /api/v1/webhooks/:id                         # Update (URL, events, active status)
DELETE /api/v1/webhooks/:id                         # Deactivate webhook
GET    /api/v1/webhooks/:id/deliveries              # Delivery history (30 days)
POST   /api/v1/webhooks/:id/deliveries/:did/retry   # Replay failed delivery
POST   /api/v1/webhooks/:id/test                    # Send test event

5. Plugin & Extension Architecture

5.1 Extension Point Registry

The WO system exposes five extension categories, each with defined boundaries and security constraints:

| Category | Extension Point | Mechanism | Sandboxing | Audit |
| --- | --- | --- | --- | --- |
| Custom Fields | Additional WO/JobPlan metadata | JSONB + tenant JSON Schema | Schema validation | Field changes in audit trail |
| Approval Chains | Custom approval logic and routing | Tenant config rules engine | Rule evaluation sandbox | All rule evaluations logged |
| Job Plan Templates | Reusable task definitions | Template library (JSONB) | Schema validation + param bounds | Template usage tracked |
| Notification Channels | Custom delivery (Slack, PagerDuty, etc.) | Event subscriber plugins (NATS) | Outbound-only; no data read access | Delivery attempts logged |
| Resource Matching | Custom prioritization algorithms | Strategy pattern interface | Read-only access to resource data | Match decisions logged |

5.2 Plugin Security Model

Plugin Permissions:
  READ:   Can read specific entity types (declared in manifest)
  EVENT:  Can subscribe to specific event types (declared in manifest)
  NOTIFY: Can send outbound notifications (scoped endpoints)

  NEVER: Cannot write to regulated entities (WO, Approval, AuditTrail)
  NEVER: Cannot bypass compliance guards
  NEVER: Cannot access other tenants' data
  NEVER: Cannot modify state machine transitions

Plugin Lifecycle:
  INSTALL → REVIEW (tenant admin) → ACTIVATE → MONITOR → DEACTIVATE

  All plugin actions logged in tenant audit trail
  Plugin failures isolated by circuit breaker (per 64-security-architecture.md §5)
  Plugin resource consumption metered (API calls, compute time)
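
The permission model amounts to a deny-by-default check against the plugin manifest. The sketch below shows one way that check might look; the `PluginManifest` fields and the `isAllowed` function are hypothetical, since the manifest format is not defined in this document. Because no write permission exists in the model, every write request is denied, which covers the regulated entities (WO, Approval, AuditTrail) by construction.

```typescript
// Hypothetical manifest: what the plugin declared at install time.
interface PluginManifest {
  readEntities: string[];    // entity types the plugin may read
  eventTypes: string[];      // event types the plugin may subscribe to
  notifyEndpoints: string[]; // outbound endpoints the plugin may call
}

type PluginAction =
  | { kind: "READ"; entity: string }
  | { kind: "EVENT"; eventType: string }
  | { kind: "NOTIFY"; endpoint: string }
  | { kind: "WRITE"; entity: string };

// Allow only what the manifest declares; anything else is denied.
function isAllowed(manifest: PluginManifest, action: PluginAction): boolean {
  switch (action.kind) {
    case "READ":
      return manifest.readEntities.includes(action.entity);
    case "EVENT":
      return manifest.eventTypes.includes(action.eventType);
    case "NOTIFY":
      return manifest.notifyEndpoints.includes(action.endpoint);
    case "WRITE":
      // No write permission exists in the model, so writes are always denied.
      return false;
  }
}
```

Tenant isolation and compliance-guard enforcement sit outside this check and would be enforced by the platform regardless of what a manifest declares.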

5.3 Future: Marketplace Architecture (V3)

Planned for product roadmap Phase 3 (per 11-product-roadmap.md):

Extension Marketplace:
├── Compliance Packs (FDA, ISO 13485, ICH Q10)
├── Integration Connectors (Veeva, SAP, ServiceNow)
├── Report Templates (audit reports, compliance dashboards)
├── Agent Skill Packs (specialized AI agent capabilities)
└── Industry Templates (pharma, meddev, CRO, fintech)

Revenue model: 70/30 split (developer/platform)
Review: Security audit required before listing
Isolation: Each plugin runs in isolated WASM sandbox or container

6. Integration Adapter Architecture

6.1 Adapter Pattern

All T2 and T3 integrations use a common adapter pattern to ensure consistency:

interface IntegrationAdapter<TConfig, TInbound, TOutbound> {
  readonly tier: 'T1' | 'T2' | 'T3' | 'T4';
  readonly name: string;
  readonly version: string;

  // Lifecycle
  initialize(config: TConfig): Promise<void>;
  healthCheck(): Promise<HealthStatus>;
  shutdown(): Promise<void>;

  // Inbound (external → WO system)
  transformInbound(external: TInbound): Promise<WOEntityPartial>;
  validateInbound(data: WOEntityPartial): ValidationResult;

  // Outbound (WO system → external)
  transformOutbound(internal: WOEntity): Promise<TOutbound>;
  send(data: TOutbound): Promise<DeliveryResult>;

  // Error handling
  handleError(error: IntegrationError): Promise<ErrorResolution>;
  getRetryPolicy(): RetryPolicy;

  // Observability
  getMetrics(): IntegrationMetrics;
}

// Example: ServiceNow adapter
class ServiceNowAdapter implements IntegrationAdapter<
  ServiceNowConfig,
  ServiceNowIncident,
  ServiceNowChangeRequest
> {
  readonly tier = 'T2';
  readonly name = 'servicenow';
  readonly version = '1.0.0';
  // ... implementation
}

6.2 Data Mapping Rules

| Rule | Rationale | Enforcement |
| --- | --- | --- |
| WO system is source of truth for WO state | Prevent external systems from corrupting state machine | Inbound writes rejected if state mismatch |
| External IDs stored as references, never as primary keys | Decouple from external system ID schemes | external_ref field on relevant entities |
| Data classification preserved across boundary | L4 data must maintain controls when synced | Adapter validates classification before send |
| Conflict resolution: last-write-wins with audit | Simple, auditable, deterministic | Version field + audit trail entry |
| Sync failures never block WO transitions | WO lifecycle is primary; sync is secondary | Async sync with retry queue |
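
Two of these rules interact on every inbound write: the state-mismatch rejection and last-write-wins with an audit entry. The sketch below combines them in a single function. The record shapes and the `applyInbound` name are assumptions made for illustration; in particular, the idea that the external system sends the state it believes the WO is in (`expectedState`) is one possible way to detect a mismatch, not a documented contract.

```typescript
interface WorkOrderRecord {
  state: string;
  version: number;
  fields: Record<string, string>;
}

interface InboundWrite {
  source: string;        // adapter name, e.g. "servicenow"
  expectedState: string; // state the external system believes the WO is in
  fields: Record<string, string>;
}

type InboundResult =
  | { applied: true; record: WorkOrderRecord; audit: string }
  | { applied: false; reason: string };

function applyInbound(current: WorkOrderRecord, write: InboundWrite): InboundResult {
  // Rule 1: the WO system owns state; reject writes based on a stale view of it.
  if (write.expectedState !== current.state) {
    return {
      applied: false,
      reason: `state mismatch: expected ${write.expectedState}, actual ${current.state}`,
    };
  }
  // Rule 4: last write wins on fields; bump the version and record an audit entry.
  const record: WorkOrderRecord = {
    state: current.state, // inbound writes never change the state itself
    version: current.version + 1,
    fields: { ...current.fields, ...write.fields },
  };
  const changed = Object.keys(write.fields).join(", ");
  const audit = `${write.source} updated [${changed}] v${current.version} -> v${record.version}`;
  return { applied: true, record, audit };
}
```

The function returns a new record rather than mutating the current one, so a rejected write leaves the stored WO untouched.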

6.3 Sync Patterns

| Pattern | Use Case | Implementation | Example |
| --- | --- | --- | --- |
| Event-driven push | Real-time outbound notifications | NATS → Adapter → External API | WO status → ServiceNow ticket update |
| Polling pull | Import external data on schedule | Cron → Adapter → WO API | Asset inventory sync from Maximo |
| Webhook receive | Real-time inbound notifications | External → API Gateway → Adapter → NATS | ServiceNow closure → WO status check |
| Batch sync | Bulk data reconciliation | Scheduled job → Adapter → Diff → Apply | Nightly asset reconciliation |

7. Migration Playbooks

7.1 Migration from MasterControl QMS

| Phase | Activities | Duration | Success Criteria |
| --- | --- | --- | --- |
| Assessment | Map MasterControl change control → WO model; identify custom fields; inventory approval chains; count validated systems | 2 weeks | Gap analysis complete, data mapping doc approved |
| Data Migration | Export change records → transform → import as WOs (historical, read-only); map users → Person entities; map equipment → Asset entities | 4 weeks | 100% records migrated, validation checksums match |
| Parallel Run | Both systems active for new changes; CODITECT gets copy of all new change requests; compare outcomes | 4–8 weeks | 95% outcome parity on parallel changes |
| Cutover | Freeze MasterControl writes; final sync; redirect users; DNS/SSO update | 1 day | All users on CODITECT, no new MasterControl entries |
| Decommission | MasterControl to read-only; archive after retention period; update validation docs | 4 weeks | MasterControl access revoked, data archived |

Compliance considerations: FDA-regulated migration requires a validation protocol (VP), execution evidence, and a deviation report for any discrepancies. The migration itself is a change that requires a WO in the new system.

7.2 Migration from ServiceNow ITSM

| Phase | Key Differences from MasterControl | Duration |
| --- | --- | --- |
| Assessment | ServiceNow data model is ticket-centric (not WO-centric); map incident/change/task → WO hierarchy | 2 weeks |
| Data Migration | ServiceNow has rich CMDB; map CI → Asset; map assignment groups → Teams | 3 weeks |
| Parallel Run | ServiceNow likely stays for non-QMS tickets; integration adapter handles coexistence | 6–12 weeks |
| Cutover | Only QMS-related change control moves; ServiceNow remains for IT operations | 1 day |

7.3 Greenfield (No Existing QMS)

| Phase | Activities | Duration |
| --- | --- | --- |
| Discovery | Inventory validated systems, current paper/spreadsheet processes, approval chains | 1 week |
| Configuration | Set up tenant, roles, approval chains, asset registry, job plan templates | 2 weeks |
| Validation | Execute IQ/OQ with representative WO scenarios | 2 weeks |
| Go-Live | First production WO, monitored closely | 1 week |
| Optimization | PM automation setup, agent training, vendor portal activation | 4 weeks |

7.4 Data Migration Toolkit

interface MigrationJob {
  id: string;
  source: 'mastercontrol' | 'servicenow' | 'veeva' | 'csv' | 'custom';
  target_tenant_id: string;
  status: 'PENDING' | 'VALIDATING' | 'MIGRATING' | 'VERIFYING' | 'COMPLETE' | 'FAILED';
  config: {
    field_mapping: Record<string, string>;        // source_field → wo_field
    value_transforms: Record<string, Transform>;  // field → transformation function
    skip_rules: SkipRule[];                       // conditions to skip records
    dedup_strategy: 'SKIP' | 'UPDATE' | 'VERSION';
  };
  progress: {
    total_records: number;
    processed: number;
    succeeded: number;
    failed: number;
    skipped: number;
  };
  validation: {
    checksum_match: boolean;
    record_count_match: boolean;
    field_coverage: number;        // % of source fields mapped
    compliance_verified: boolean;  // audit trail continuity verified
  };
}
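
Before a job moves from VERIFYING to COMPLETE, the progress and validation blocks can be checked against the playbook's success criteria (100% of records migrated, checksums match). The function below is a sketch of such a gate. It redeclares only the two sub-shapes it needs so it stays self-contained; `migrationBlockers` is a hypothetical name, and the field coverage threshold is an assumption because the document does not specify one.

```typescript
interface MigrationProgress {
  total_records: number;
  processed: number;
  succeeded: number;
  failed: number;
  skipped: number;
}

interface MigrationValidation {
  checksum_match: boolean;
  record_count_match: boolean;
  field_coverage: number; // % of source fields mapped
  compliance_verified: boolean;
}

// Return every reason the job may not be marked COMPLETE; empty means it can be.
function migrationBlockers(
  progress: MigrationProgress,
  validation: MigrationValidation,
  minFieldCoverage: number = 100, // assumed threshold, tune per migration
): string[] {
  const blockers: string[] = [];
  if (progress.processed !== progress.total_records) {
    blockers.push(`${progress.total_records - progress.processed} records not processed`);
  }
  if (progress.succeeded + progress.failed + progress.skipped !== progress.processed) {
    blockers.push("progress counters do not add up");
  }
  if (progress.failed > 0) blockers.push(`${progress.failed} records failed`);
  if (!validation.checksum_match) blockers.push("checksum mismatch");
  if (!validation.record_count_match) blockers.push("record count mismatch");
  if (!validation.compliance_verified) blockers.push("audit trail continuity not verified");
  if (validation.field_coverage < minFieldCoverage) {
    blockers.push(`field coverage ${validation.field_coverage}% below ${minFieldCoverage}%`);
  }
  return blockers;
}
```

Returning all blockers at once, rather than failing on the first, gives the list of discrepancies that a deviation report would need to cover.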

8. Cross-Reference to Other Artifacts

| Topic | Primary Artifact | Integration Strategy Section |
| --- | --- | --- |
| API endpoint list | 13-tdd.md §1.1 | §2 (philosophy layer on top of endpoint list) |
| Event topics | 13-tdd.md §1.2 | §4 (webhook delivery on top of internal events) |
| Extension points | 13-tdd.md §1.3 | §5 (security model + marketplace vision) |
| External system context | 12-sdd.md §1, 14-c4-architecture.md §C1 | §1 (tier classification) |
| Agent message contracts | 26-agent-message-contracts.md | §4 (agent events in webhook catalog) |
| Security boundaries | 64-security-architecture.md §5 | §5.2 (plugin security) |
| Data classification at boundary | 63-data-architecture.md §1 | §6.2 (classification preserved across boundary) |
| Competitive migration | 08-competitive-moat-analysis.md | §7 (migration playbooks per competitor) |

Integration strategy is not a technical appendix — it's a business capability. Every new customer asks: "Does it work with our existing systems?" Every retention conversation includes: "How hard would it be to leave?" The answers to both questions live in this document.