SOC 2 Continuous Monitoring Controls
Document ID: CODITECT-BIO-SOC2-MON-001
Version: 1.0.0
Effective Date: 2026-02-16
Classification: Internal - Restricted
Owner: Chief Information Security Officer (CISO) / Chief Compliance Officer (CCO)
Document Control
Approval History
| Role | Name | Signature | Date |
|---|---|---|---|
| Chief Information Security Officer | [Pending] | [Digital Signature] | YYYY-MM-DD |
| Chief Compliance Officer | [Pending] | [Digital Signature] | YYYY-MM-DD |
| VP Engineering | [Pending] | [Digital Signature] | YYYY-MM-DD |
| VP Quality Assurance | [Pending] | [Digital Signature] | YYYY-MM-DD |
| Internal Audit Manager | [Pending] | [Digital Signature] | YYYY-MM-DD |
Revision History
| Version | Date | Author | Changes | Approval Status |
|---|---|---|---|---|
| 1.0.0 | 2026-02-16 | CISO Office | Initial release | Draft |
Distribution List
- Executive Leadership Team
- Information Security Team
- Compliance Team
- Internal Audit
- External SOC 2 Auditor (during audit engagement)
- Quality Assurance Team
- Engineering Leadership
- DevOps/SRE Team
Review Schedule
| Review Type | Frequency | Next Review Date | Responsible Party |
|---|---|---|---|
| Annual Review | 12 months | 2027-02-16 | CISO |
| Quarterly Control Effectiveness Review | 3 months | 2026-05-16 | Compliance Team |
| Post-Audit Review | After each SOC 2 audit | N/A | CCO + CISO |
| Technology Stack Review | 12 months | 2027-02-16 | VP Engineering |
| KPI Threshold Review | 6 months | 2026-08-16 | Compliance Team |
1. Executive Summary
1.1 Purpose
This SOC 2 Continuous Monitoring Controls specification establishes comprehensive, automated monitoring and control validation capabilities for the CODITECT Biosciences Quality Management System (BIO-QMS) Platform to ensure:
- Real-Time Control Effectiveness - Continuous validation that SOC 2 controls are operating as designed
- Proactive Risk Detection - Immediate identification and alerting on control failures or security events
- Audit Readiness - Automated evidence collection and control test results always available for SOC 2 Type II audits
- Operational Excellence - System health, availability, and security metrics meet or exceed SLA commitments
- Regulatory Compliance - Full conformance with AICPA Trust Services Criteria (TSC) 2017 monitoring requirements
1.2 Scope
This specification applies to:
In Scope:
- All SOC 2 Trust Services Criteria controls (CC1-CC9, A1, PI1, C1)
- Centralized logging infrastructure (application, audit, access, error, system logs)
- Real-time security event alerting system
- Automated control health checks and validation
- SOC 2 control effectiveness dashboard and reporting
- Compliance KPI tracking and trending
- Integration with HIPAA audit logging (D.3.4) for PHI access monitoring
- Evidence collection and retention for SOC 2 Type II audits
- Incident response integration and escalation management
Out of Scope:
- Application performance monitoring (APM) for optimization (covered in observability architecture)
- Business intelligence and analytics (covered in reporting architecture)
- Cost optimization monitoring (covered in FinOps processes)
- Penetration testing and vulnerability scanning (covered in D.4.1 - separate specification)
1.3 Regulatory Context
SOC 2 Trust Services Criteria (TSC) 2017 - Monitoring Requirements:
| TSC Category | Criteria | Monitoring Requirement | Implementation |
|---|---|---|---|
| CC7.1 | System Operations | Monitor system components and quality of operations | Real-time infrastructure health checks |
| CC7.2 | System Operations | Detect and resolve incidents/anomalies | Automated alerting + incident response integration |
| CC7.3 | System Operations | Evaluate and respond to incidents | Escalation matrix + response SLAs |
| CC7.4 | System Operations | Identify and manage changes | Change log monitoring + approval verification |
| CC8.1 | Change Management | Authorize and approve changes | Pre/post-change control validation |
| A1.1 | Availability | Monitor system availability | Uptime monitoring + SLA compliance tracking |
| A1.2 | Availability | Respond to availability incidents | Availability alerting + recovery time tracking |
| C1.1 | Confidentiality | Restrict access to confidential info | Access control monitoring + anomaly detection |
| C1.2 | Confidentiality | Dispose of confidential info | Secure deletion verification + audit logging |
NIST SP 800-137 - Continuous Monitoring:
- Define continuous monitoring strategy aligned with risk tolerance
- Establish metrics, status assessments, and data feeds
- Implement technical mechanisms for automated data collection
- Analyze data and report findings to support risk-based decisions
- Respond to findings with prioritized corrective actions
2. Centralized Logging Architecture
2.1 Log Aggregation Design
2.1.1 Google Cloud Logging Integration
Architecture Overview:
┌─────────────────────────────────────────────────────────────────┐
│ BIO-QMS Application Layer │
│ ┌────────────┐ ┌────────────┐ ┌────────────┐ ┌────────────┐│
│ │ NestJS │ │ Frontend │ │ Worker │ │ Batch ││
│ │ Backend │ │ Angular │ │ Services │ │ Jobs ││
│ └─────┬──────┘ └─────┬──────┘ └─────┬──────┘ └─────┬──────┘│
│ │ JSON │ JSON │ JSON │ JSON │
│ └────────────────┴────────────────┴────────────────┘ │
└────────────────────────────┬────────────────────────────────────┘
│
┌────────▼────────┐
│ Cloud Logging │
│ Agent (ops) │
└────────┬────────┘
│
┌───────────────────┼───────────────────┐
│ │ │
┌────────▼────────┐ ┌────────▼────────┐ ┌────────▼────────┐
│ Cloud Logging │ │ BigQuery │ │ Cloud Storage │
│ (30 day hot) │ │ (1 yr online) │ │ (7 yr archive) │
│ │ │ │ │ │
│ - Query/Search │ │ - SQL analysis │ │ - Compliance │
│ - Real-time │ │ - Dashboards │ │ - Long-term │
│ - Alerts │ │ - Reporting │ │ - Immutable │
└─────────────────┘ └─────────────────┘ └─────────────────┘
│ │ │
└───────────────────┼───────────────────┘
│
┌─────────▼─────────┐
│ Log Integrity │
│ Verification │
│ (SHA-256 chain) │
└───────────────────┘
Implementation Details:
| Component | Technology | Configuration |
|---|---|---|
| Log Collection | Google Cloud Logging Agent | Installed on all GKE nodes, configured via ConfigMap |
| Structured Logging | Winston (NestJS) | JSON format with consistent schema |
| Hot Storage | Cloud Logging | 30-day retention, full-text search, real-time streaming |
| Online Archive | BigQuery | 1-year retention, SQL queries, dashboard integration |
| Long-Term Archive | Cloud Storage (GCS) | 7-year retention, immutable bucket, lifecycle policy |
| Log Router | Cloud Logging Sinks | Filter-based routing to BigQuery, GCS, alerting |
| Integrity Verification | Cloud Functions + Cloud KMS | SHA-256 hash chain, hourly verification |
2.1.2 Log Categories and Retention
Log Types with Retention Requirements:
| Log Type | Description | Retention (Hot) | Retention (Online) | Retention (Archive) | Part 11 Required |
|---|---|---|---|---|---|
| Application Logs | NestJS structured JSON logs (info, warn, error) | 30 days | 1 year | 7 years | Yes |
| Audit Logs | PHI access, user actions, system changes (immutable) | 30 days | 1 year | 7 years | Yes |
| Access Logs | HTTP requests, API calls, authentication events | 30 days | 1 year | 7 years | Yes |
| Security Logs | Failed auth, privilege escalation, anomalies | 30 days | 1 year | 7 years | Yes |
| Error Logs | Application exceptions, stack traces, errors | 30 days | 1 year | 7 years | Yes |
| System Logs | OS, Kubernetes, infrastructure events | 30 days | 1 year | 3 years | No |
| Performance Logs | Latency, throughput, resource utilization | 30 days | 90 days | N/A | No |
| Change Logs | Code deployments, configuration changes, migrations | 30 days | 1 year | 7 years | Yes |
Retention Rationale:
- 30-day hot storage: Immediate investigation, real-time alerting, operational troubleshooting
- 1-year online archive: SOC 2 Type II audit period (12 months of control evidence), HIPAA compliance
- 7-year long-term archive: FDA 21 CFR Part 11 §11.10(e) requirement for electronic record retention
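The tier boundaries above reduce to a simple age-based lookup; a minimal sketch for Part 11-relevant log classes (the `storageTierForAge` helper is illustrative, not part of the platform):

```typescript
type StorageTier = 'HOT' | 'ONLINE_ARCHIVE' | 'LONG_TERM_ARCHIVE' | 'EXPIRED';

// 30 days hot, 1 year online archive, 7 years long-term archive,
// then eligible for secure deletion per retention policy
function storageTierForAge(ageDays: number): StorageTier {
  if (ageDays <= 30) return 'HOT';
  if (ageDays <= 365) return 'ONLINE_ARCHIVE';
  if (ageDays <= 7 * 365) return 'LONG_TERM_ARCHIVE';
  return 'EXPIRED';
}
```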
2.1.3 Structured Log Schema
Standard Application Log Format (Winston/NestJS):
interface ApplicationLogEntry {
// Timestamp
timestamp: string; // ISO 8601 UTC (e.g., "2026-02-16T14:30:45.123Z")
// Severity
severity: 'DEBUG' | 'INFO' | 'WARN' | 'ERROR' | 'CRITICAL';
// Source identification
service: string; // "nestjs-backend", "angular-frontend", "worker-service"
component: string; // Module/component name (e.g., "UserController", "AuthService")
instance_id: string; // Container/pod ID for distributed tracing
// Message
message: string; // Human-readable log message
// Context
context: {
user_id?: string; // Authenticated user (if applicable)
organization_id?: string; // Tenant organization
session_id?: string; // Session identifier
request_id?: string; // Unique request ID for distributed tracing
ip_address?: string; // Source IP (hashed if PII concern)
user_agent?: string; // Browser/client user agent
correlation_id?: string; // Cross-service correlation
};
// Technical details
stack_trace?: string; // For ERROR/CRITICAL severity
error_code?: string; // Application error code
// Compliance
is_phi_related: boolean; // Flag for PHI-related operations
is_audit_event: boolean; // Flag for audit-required events
// Metadata
labels: Record<string, string>; // Custom key-value labels
resource: {
type: string; // "gke_container", "cloud_function", etc.
labels: Record<string, string>; // GCP resource labels
};
}
Example Application Log Entry:
{
"timestamp": "2026-02-16T14:30:45.123Z",
"severity": "INFO",
"service": "nestjs-backend",
"component": "DocumentController",
"instance_id": "backend-7d9f8c-xk2p5",
"message": "Document created successfully",
"context": {
"user_id": "usr_1a2b3c4d",
"organization_id": "org_5e6f7g8h",
"session_id": "sess_9i0j1k2l",
"request_id": "req_3m4n5o6p",
"ip_address": "10.1.2.34",
"user_agent": "Mozilla/5.0..."
},
"is_phi_related": false,
"is_audit_event": false,
"labels": {
"action": "create",
"resource_type": "document",
"document_id": "doc_7q8r9s0t"
},
"resource": {
"type": "gke_container",
"labels": {
"cluster_name": "bio-qms-prod",
"namespace_name": "default",
"pod_name": "backend-7d9f8c-xk2p5"
}
}
}
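A minimal helper that assembles an entry conforming to the schema above can make the defaults explicit; this is a sketch (`buildLogEntry` and the trimmed-down types are illustrative, not the production logger):

```typescript
interface LogContext {
  user_id?: string;
  request_id?: string;
}

interface AppLogEntry {
  timestamp: string;
  severity: 'DEBUG' | 'INFO' | 'WARN' | 'ERROR' | 'CRITICAL';
  service: string;
  component: string;
  message: string;
  context: LogContext;
  is_phi_related: boolean;
  is_audit_event: boolean;
}

function buildLogEntry(
  severity: AppLogEntry['severity'],
  component: string,
  message: string,
  context: LogContext = {},
  flags: { is_phi_related?: boolean; is_audit_event?: boolean } = {}
): AppLogEntry {
  return {
    timestamp: new Date().toISOString(), // ISO 8601 UTC per the schema
    severity,
    service: 'nestjs-backend',
    component,
    message,
    context,
    // Compliance flags default to false; audit paths must set them explicitly
    is_phi_related: flags.is_phi_related ?? false,
    is_audit_event: flags.is_audit_event ?? false,
  };
}
```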
Audit Log Format (HIPAA/Part 11 Compliance):
Audit logs extend the application log schema with additional fields per hipaa-audit-reporting.md specification:
interface AuditLogEntry extends ApplicationLogEntry {
// Audit-specific fields
audit_id: string; // UUID v4
audit_event_type: string; // "PHI_ACCESS_READ", "USER_LOGIN", "CONFIG_CHANGE", etc.
actor: {
user_id: string;
username: string;
role: string;
organization_id: string;
};
target: {
resource_type: string; // "patient_record", "user_account", "system_config"
resource_id: string;
resource_name?: string;
};
action: string; // "read", "write", "delete", "export", "approve"
outcome: 'SUCCESS' | 'FAILURE' | 'PARTIAL';
phi_fields_accessed?: string[]; // List of PHI field names (if applicable)
reason?: string; // Justification for access (if required)
// Immutability
hash_previous: string; // SHA-256 of previous audit entry (blockchain-style)
hash_current: string; // SHA-256 of this entry
signature?: string; // Digital signature (optional, for critical events)
}
2.2 Log Integrity Protection
2.2.1 SHA-256 Hash Chain for Tamper Detection
Hash Chain Implementation:
Audit Entry 1:
- audit_id: "aud_001"
- timestamp: "2026-02-16T10:00:00Z"
- data: { user_id: "usr_123", action: "read", ... }
- hash_previous: "0000000000000000" (genesis entry)
- hash_current: SHA256(audit_id + timestamp + data + hash_previous)
= "a3f5c9e1..."
Audit Entry 2:
- audit_id: "aud_002"
- timestamp: "2026-02-16T10:00:15Z"
- data: { user_id: "usr_456", action: "write", ... }
- hash_previous: "a3f5c9e1..." (hash_current from Entry 1)
- hash_current: SHA256(audit_id + timestamp + data + hash_previous)
= "7b2d8f4a..."
[If Entry 1 is tampered with, hash_current changes, breaking the chain]
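The chaining rule above can be sketched in TypeScript; this is a minimal illustration of the append and full-chain walk (`appendAuditEntry` and `verifyChain` are hypothetical helpers, not the production service):

```typescript
import { createHash } from 'crypto';

interface ChainedEntry {
  audit_id: string;
  timestamp: string;
  data: Record<string, unknown>;
  hash_previous: string;
  hash_current: string;
}

const GENESIS_HASH = '0000000000000000';

// hash_current = SHA256(audit_id + timestamp + data + hash_previous)
function computeEntryHash(
  audit_id: string,
  timestamp: string,
  data: Record<string, unknown>,
  hashPrevious: string
): string {
  return createHash('sha256')
    .update(audit_id + timestamp + JSON.stringify(data) + hashPrevious)
    .digest('hex');
}

// Append an entry, linking it to the previous entry's hash_current
function appendAuditEntry(
  chain: ChainedEntry[],
  audit_id: string,
  timestamp: string,
  data: Record<string, unknown>
): ChainedEntry {
  const hash_previous =
    chain.length > 0 ? chain[chain.length - 1].hash_current : GENESIS_HASH;
  const entry: ChainedEntry = {
    audit_id,
    timestamp,
    data,
    hash_previous,
    hash_current: computeEntryHash(audit_id, timestamp, data, hash_previous),
  };
  chain.push(entry);
  return entry;
}

// Walk the chain: tampering with any entry breaks every subsequent link
function verifyChain(chain: ChainedEntry[]): boolean {
  let prev = GENESIS_HASH;
  for (const e of chain) {
    if (e.hash_previous !== prev) return false;
    if (computeEntryHash(e.audit_id, e.timestamp, e.data, e.hash_previous) !== e.hash_current) {
      return false;
    }
    prev = e.hash_current;
  }
  return true;
}
```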
Verification Process:
1. Hourly Batch Verification - Cloud Function triggered every hour:
   - Retrieve all audit entries from the past hour from BigQuery
   - Recompute hash_current for each entry using hash_previous from the prior entry
   - Compare the recomputed hash with the stored hash
   - Alert if a mismatch is detected (indicates tampering)
2. On-Demand Verification - via API or admin console:
   - Allows auditors to verify the integrity of a specific date range
   - Generates a verification report with pass/fail status
   - Signed with a Cloud KMS key for non-repudiation
Cloud Function Implementation:
// cloud-functions/log-integrity-verifier/index.ts
import { BigQuery } from '@google-cloud/bigquery';
import { createHash } from 'crypto';
import { sendCriticalAlert } from './alerting';

export async function verifyAuditLogIntegrity(request, response) {
  const bigquery = new BigQuery();
  // Query audit logs from the past hour, ordered by timestamp
  const query = `
    SELECT audit_id, timestamp, data, hash_previous, hash_current
    FROM \`project.dataset.audit_logs\`
    WHERE timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 HOUR)
    ORDER BY timestamp ASC
  `;
  const [rows] = await bigquery.query(query);

  // The first row's stored hash_previous links back to the last entry of the
  // previous verification window, which this run does not re-verify; linkage
  // is only checked between consecutive rows inside the window.
  let previousHash: string | null = null;
  const tamperedEntries = [];
  for (const row of rows) {
    // Chain linkage: each entry must reference the prior entry's hash_current
    if (previousHash !== null && row.hash_previous !== previousHash) {
      tamperedEntries.push({
        audit_id: row.audit_id,
        timestamp: row.timestamp,
        reason: 'broken_chain_linkage',
        expected_previous: previousHash,
        stored_previous: row.hash_previous
      });
    }
    // Recompute this entry's hash from its contents and stored hash_previous
    const dataString = JSON.stringify(row.data);
    const expectedHash = createHash('sha256')
      .update(row.audit_id + row.timestamp + dataString + row.hash_previous)
      .digest('hex');
    if (expectedHash !== row.hash_current) {
      tamperedEntries.push({
        audit_id: row.audit_id,
        timestamp: row.timestamp,
        reason: 'hash_mismatch',
        expected_hash: expectedHash,
        stored_hash: row.hash_current
      });
    }
    previousHash = row.hash_current;
  }

  if (tamperedEntries.length > 0) {
    // CRITICAL ALERT - Log tampering detected
    await sendCriticalAlert('AUDIT_LOG_TAMPERING', {
      count: tamperedEntries.length,
      entries: tamperedEntries
    });
  }

  response.json({
    verified_entries: rows.length,
    tampered_entries: tamperedEntries.length,
    status: tamperedEntries.length === 0 ? 'PASS' : 'FAIL',
    details: tamperedEntries
  });
}
2.2.2 Immutable Log Storage
Cloud Storage Configuration for Archive Tier:
# terraform/storage.tf
resource "google_storage_bucket" "audit_log_archive" {
name = "bio-qms-audit-logs-archive-${var.environment}"
location = "US"
# Immutability - prevent deletion or modification
retention_policy {
retention_period = 220752000 # 7 years in seconds
is_locked = true # Cannot be removed once locked
}
# Versioning (for accidental overwrites before lock)
versioning {
enabled = true
}
# Lifecycle policy - transition to coldline after 1 year
lifecycle_rule {
condition {
age = 365
}
action {
type = "SetStorageClass"
storage_class = "COLDLINE"
}
}
# Prevent public access
uniform_bucket_level_access = true
# Encryption
encryption {
default_kms_key_name = google_kms_crypto_key.audit_log_key.id
}
# Audit logging for the audit logs (meta!)
logging {
log_bucket = "bio-qms-admin-logs"
}
}
BigQuery Dataset Configuration for Online Archive:
-- Create dataset with 1-year retention
CREATE SCHEMA IF NOT EXISTS `bio-qms-prod.audit_logs`
OPTIONS (
location = 'US',
default_table_expiration_ms = 31536000000 -- 1 year in milliseconds
);
-- Create audit_logs table with partitioning
CREATE TABLE IF NOT EXISTS `bio-qms-prod.audit_logs.entries` (
audit_id STRING NOT NULL,
timestamp TIMESTAMP NOT NULL,
audit_event_type STRING NOT NULL,
severity STRING NOT NULL,
actor STRUCT<
user_id STRING,
username STRING,
role STRING,
organization_id STRING
>,
target STRUCT<
resource_type STRING,
resource_id STRING,
resource_name STRING
>,
action STRING,
outcome STRING,
data JSON,
hash_previous STRING,
hash_current STRING,
signature STRING
)
PARTITION BY DATE(timestamp)
CLUSTER BY organization_id, audit_event_type
OPTIONS (
require_partition_filter = true,
partition_expiration_days = 365
);
2.3 Log Routing and Filtering
2.3.1 Cloud Logging Sinks Configuration
Log Router Sinks (Terraform):
# terraform/logging.tf
# Sink 1: All audit logs to BigQuery (online archive)
resource "google_logging_project_sink" "audit_logs_bigquery" {
name = "audit-logs-to-bigquery"
destination = "bigquery.googleapis.com/projects/${var.project_id}/datasets/audit_logs"
filter = <<-EOT
(jsonPayload.is_audit_event = true)
OR
(labels.audit_event_type != "")
OR
(resource.type = "gke_container" AND severity >= "WARNING")
EOT
unique_writer_identity = true
bigquery_options {
use_partitioned_tables = true
}
}
# Sink 2: All logs to Cloud Storage (long-term archive)
resource "google_logging_project_sink" "all_logs_storage" {
name = "all-logs-to-storage"
destination = "storage.googleapis.com/${google_storage_bucket.audit_log_archive.name}"
filter = "resource.type = \"gke_container\"" # All GKE container logs
unique_writer_identity = true
}
# Sink 3: Security events to real-time alerting
resource "google_logging_project_sink" "security_events_pubsub" {
name = "security-events-to-pubsub"
destination = "pubsub.googleapis.com/projects/${var.project_id}/topics/security-events"
filter = <<-EOT
(jsonPayload.audit_event_type =~ "FAILED_AUTH.*")
OR
(jsonPayload.audit_event_type = "PRIVILEGE_ESCALATION")
OR
(jsonPayload.audit_event_type = "PHI_ACCESS_ANOMALY")
OR
(severity >= "ERROR" AND jsonPayload.is_phi_related = true)
EOT
unique_writer_identity = true
}
# Sink 4: PHI access logs (HIPAA compliance)
resource "google_logging_project_sink" "phi_access_bigquery" {
name = "phi-access-to-bigquery"
destination = "bigquery.googleapis.com/projects/${var.project_id}/datasets/phi_access_logs"
filter = <<-EOT
jsonPayload.is_phi_related = true
AND
jsonPayload.audit_event_type =~ "PHI_.*"
EOT
unique_writer_identity = true
bigquery_options {
use_partitioned_tables = true
}
}
2.3.2 Log Sampling and Rate Limiting
High-Volume Log Sampling Configuration:
// src/common/logging/winston.config.ts
import { format, transports } from 'winston';
// Custom sampling format for high-volume debug logs
const samplingFormat = format((info) => {
// Sample DEBUG logs at 10% (keep 1 in 10)
if (info.level === 'debug') {
const shouldKeep = Math.random() < 0.1;
return shouldKeep ? info : false;
}
// Always keep INFO and above
return info;
});
export const loggerConfig = {
level: process.env.LOG_LEVEL || 'info',
format: format.combine(
samplingFormat(),
format.timestamp({ format: 'YYYY-MM-DDTHH:mm:ss.SSSZ' }),
format.errors({ stack: true }),
format.json()
),
transports: [
// Console for Cloud Logging to pick up
new transports.Console({
level: 'debug',
handleExceptions: true,
handleRejections: true
})
],
// Exception/rejection handling
exceptionHandlers: [
new transports.Console()
],
rejectionHandlers: [
new transports.Console()
]
};
Rate Limiting for Repeated Errors:
// src/common/logging/rate-limiter.ts
import { createHash } from 'crypto';
class LogRateLimiter {
private counters: Map<string, { count: number; firstSeen: number }> = new Map();
private readonly windowMs = 60000; // 1 minute
private readonly maxPerWindow = 10;
shouldLog(message: string, severity: string): boolean {
// Never rate-limit CRITICAL or audit events
if (severity === 'CRITICAL' || severity === 'AUDIT') {
return true;
}
// Create a hash key from the message content (MD5 used only as a fast,
// non-cryptographic fingerprint, not for security)
const key = createHash('md5').update(message).digest('hex');
const now = Date.now();
const counter = this.counters.get(key);
if (!counter) {
this.counters.set(key, { count: 1, firstSeen: now });
return true;
}
// Reset if window expired
if (now - counter.firstSeen > this.windowMs) {
this.counters.set(key, { count: 1, firstSeen: now });
return true;
}
// Increment counter
counter.count++;
// Allow up to maxPerWindow, then log summary message
if (counter.count <= this.maxPerWindow) {
return true;
} else if (counter.count === this.maxPerWindow + 1) {
// Log one final message indicating rate limiting
console.warn({
message: `Log rate limit reached for message: "${message.substring(0, 100)}..."`,
suppressed_count: 1,
window_ms: this.windowMs
});
return false;
} else {
// Suppress
return false;
}
}
// Cleanup old entries every 5 minutes
cleanup() {
const now = Date.now();
for (const [key, counter] of this.counters.entries()) {
if (now - counter.firstSeen > this.windowMs * 5) {
this.counters.delete(key);
}
}
}
}
export const rateLimiter = new LogRateLimiter();
setInterval(() => rateLimiter.cleanup(), 300000); // 5 minutes
3. Real-Time Security Event Alerting
3.1 Alert Definitions and Thresholds
3.1.1 Authentication Failure Alerts
Alert Rule: Failed Authentication Attempts
| Parameter | Value |
|---|---|
| Alert Name | FAILED_AUTH_THRESHOLD_EXCEEDED |
| Trigger Condition | ≥5 failed authentication attempts from same user OR same IP in 10-minute window |
| Severity | HIGH |
| Data Source | Cloud Logging (filter: audit_event_type = "FAILED_AUTH") |
| Detection Method | Log-based metric with threshold alert |
| Response SLA | 15 minutes |
| Escalation | Security team → CISO (if unresolved in 30 min) |
| Auto-Remediation | Temporary IP block (1 hour) after 10 failed attempts |
Implementation (Terraform + Monitoring):
# terraform/monitoring.tf
# Log-based metric: count failed auth by user
resource "google_logging_metric" "failed_auth_by_user" {
name = "failed_auth_by_user_count"
filter = "jsonPayload.audit_event_type=\"FAILED_AUTH\""
metric_descriptor {
metric_kind = "DELTA"
value_type = "INT64"
labels {
key = "user_id"
value_type = "STRING"
description = "User attempting authentication"
}
labels {
key = "ip_address"
value_type = "STRING"
description = "Source IP address"
}
}
label_extractors = {
"user_id" = "EXTRACT(jsonPayload.actor.user_id)"
"ip_address" = "EXTRACT(jsonPayload.context.ip_address)"
}
}
# Alert policy: trigger when threshold exceeded
resource "google_monitoring_alert_policy" "failed_auth_alert" {
display_name = "Failed Authentication Threshold Exceeded"
combiner = "OR"
conditions {
display_name = "Failed auth count ≥ 5 in 10 minutes"
condition_threshold {
filter = "metric.type=\"logging.googleapis.com/user/failed_auth_by_user_count\""
duration = "600s" # 10 minutes
comparison = "COMPARISON_GT"
threshold_value = 4 # fires when count > 4, i.e. ≥5 failed attempts
aggregations {
alignment_period = "600s"
per_series_aligner = "ALIGN_SUM"
group_by_fields = ["metric.label.user_id", "metric.label.ip_address"]
}
}
}
notification_channels = [
google_monitoring_notification_channel.pagerduty_security.name,
google_monitoring_notification_channel.slack_security.name
]
documentation {
content = <<-EOT
**Alert:** Failed Authentication Threshold Exceeded
**Severity:** HIGH
**Action Required:**
1. Investigate source IP and user account
2. Check for credential stuffing or brute force attack
3. Verify account is not compromised
4. Consider temporary account lock if suspicious
**Runbook:** https://wiki.internal/security/failed-auth-response
EOT
}
alert_strategy {
auto_close = "86400s" # 24 hours
}
}
3.1.2 Privilege Escalation Alerts
Alert Rule: Privilege Escalation Attempts
| Parameter | Value |
|---|---|
| Alert Name | PRIVILEGE_ESCALATION_DETECTED |
| Trigger Condition | Any attempt to elevate privileges or access role-restricted resources without authorization |
| Severity | CRITICAL |
| Data Source | Cloud Logging (filter: audit_event_type = "PRIVILEGE_ESCALATION") |
| Detection Method | Real-time log streaming to Pub/Sub → Cloud Function |
| Response SLA | Immediate (5 minutes) |
| Escalation | CISO + Security team (immediate notification) |
| Auto-Remediation | Session termination + account suspension pending investigation |
Implementation (Cloud Function):
// cloud-functions/privilege-escalation-detector/index.ts
import { PubSub } from '@google-cloud/pubsub';
import { sendPagerDutyAlert, sendSlackAlert } from './alerting';
import { terminateUserSession, suspendUserAccount } from './remediation';
export async function handlePrivilegeEscalation(message: any, context: any) {
const logEntry = JSON.parse(Buffer.from(message.data, 'base64').toString());
if (logEntry.jsonPayload?.audit_event_type === 'PRIVILEGE_ESCALATION') {
// Actor and context fields are nested per the AuditLogEntry schema;
// attempted_role/current_role are event-specific top-level fields
const payload = logEntry.jsonPayload;
const { user_id, username } = payload.actor;
const { session_id, ip_address } = payload.context;
const { attempted_role, current_role, timestamp } = payload;
// CRITICAL ALERT
await sendPagerDutyAlert({
severity: 'critical',
summary: `Privilege escalation attempt by ${username} (${user_id})`,
details: {
user_id,
username,
session_id,
current_role,
attempted_role,
ip_address,
timestamp
},
incident_key: `privilege_escalation_${user_id}_${Date.now()}`
});
await sendSlackAlert({
channel: '#security-incidents',
severity: 'critical',
title: 'PRIVILEGE ESCALATION DETECTED',
message: `User \`${username}\` attempted to escalate from \`${current_role}\` to \`${attempted_role}\` at ${timestamp}`,
fields: [
{ title: 'User ID', value: user_id, short: true },
{ title: 'Session ID', value: session_id, short: true },
{ title: 'IP Address', value: ip_address, short: true },
{ title: 'Current Role', value: current_role, short: true },
{ title: 'Attempted Role', value: attempted_role, short: true }
]
});
// AUTO-REMEDIATION
console.log(`Terminating session ${session_id} for user ${user_id}`);
await terminateUserSession(session_id);
console.log(`Suspending account ${user_id} pending investigation`);
await suspendUserAccount(user_id, {
reason: 'Privilege escalation attempt detected',
timestamp,
auto_suspended: true
});
}
}
3.1.3 PHI Access Anomaly Alerts
Alert Rule: PHI Access Anomalies
| Parameter | Value |
|---|---|
| Alert Name | PHI_ACCESS_ANOMALY_DETECTED |
| Trigger Condition | Anomaly detection rules from D.3.4 HIPAA Audit specification |
| Severity | HIGH |
| Data Source | BigQuery ML anomaly model + real-time log stream |
| Detection Method | Machine learning model scoring access patterns |
| Response SLA | 30 minutes |
| Escalation | Privacy Officer + Security team |
| Auto-Remediation | Flag for manual review (no auto-block due to false positive risk) |
Anomaly Detection Rules (from D.3.4):
- After-Hours Access: PHI access outside business hours (6 PM - 8 AM, weekends)
- Bulk Access: >50 patient records accessed in single session
- Geographic Anomaly: Access from unexpected location (IP geolocation)
- Role-Inappropriate Access: User accesses records outside their department
- Terminated Employee Access: Access attempt by deactivated account
- Export Anomaly: Unusual PHI export volume or frequency
- Access Velocity: >10 different patient records in 5 minutes
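Two of these rules can be evaluated as direct checks alongside the ML scoring; a minimal sketch (`AccessEvent` and both functions are hypothetical helpers, thresholds taken from the list above; timestamps are treated as UTC for simplicity, where a production rule would use the facility's local timezone):

```typescript
interface AccessEvent {
  user_id: string;
  patient_id: string;
  timestamp: Date;
}

// After-Hours Access: outside business hours (6 PM - 8 AM) or on weekends
function isAfterHours(ts: Date): boolean {
  const day = ts.getUTCDay(); // 0 = Sunday, 6 = Saturday
  const hour = ts.getUTCHours();
  return day === 0 || day === 6 || hour >= 18 || hour < 8;
}

// Access Velocity: >10 distinct patient records within the trailing 5 minutes
function exceedsAccessVelocity(events: AccessEvent[], now: Date): boolean {
  const windowStart = now.getTime() - 5 * 60 * 1000;
  const distinctPatients = new Set(
    events
      .filter((e) => e.timestamp.getTime() >= windowStart && e.timestamp.getTime() <= now.getTime())
      .map((e) => e.patient_id)
  );
  return distinctPatients.size > 10;
}
```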
BigQuery ML Model for Anomaly Scoring:
-- Create anomaly detection model (trained on historical access patterns)
CREATE OR REPLACE MODEL `bio-qms-prod.ml_models.phi_access_anomaly`
OPTIONS (
model_type = 'AUTOML_CLASSIFIER',
input_label_cols = ['is_anomaly']
) AS
SELECT
user_id,
role,
EXTRACT(HOUR FROM timestamp) AS hour_of_day,
EXTRACT(DAYOFWEEK FROM timestamp) AS day_of_week,
COUNT(*) OVER (
PARTITION BY user_id, DATE(timestamp)
) AS daily_access_count,
COUNT(*) OVER (
PARTITION BY user_id
ORDER BY UNIX_SECONDS(timestamp)
RANGE BETWEEN 300 PRECEDING AND CURRENT ROW
) AS access_velocity_5min,
ST_GEOGPOINT(ip_longitude, ip_latitude) AS access_location,
is_anomaly -- Historical labels for training
FROM `bio-qms-prod.phi_access_logs.entries`
WHERE timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 90 DAY);
-- Real-time scoring query (triggered every 5 minutes)
-- Note: ML.PREDICT for a classifier emits predicted_is_anomaly_probs
-- (ARRAY<STRUCT<label, prob>>); the positive-class probability is extracted
-- in a subquery, since a SELECT alias cannot be referenced in its own WHERE.
SELECT
  audit_id,
  user_id,
  username,
  timestamp,
  anomaly_probability,
  CASE
    WHEN anomaly_probability > 0.8 THEN 'CRITICAL'
    WHEN anomaly_probability > 0.6 THEN 'HIGH'
    WHEN anomaly_probability > 0.4 THEN 'MEDIUM'
    ELSE 'LOW'
  END AS risk_level
FROM (
  SELECT
    audit_id,
    user_id,
    username,
    timestamp,
    (SELECT prob FROM UNNEST(predicted_is_anomaly_probs) WHERE label = TRUE) AS anomaly_probability
  FROM ML.PREDICT(
    MODEL `bio-qms-prod.ml_models.phi_access_anomaly`,
    (
      SELECT * FROM `bio-qms-prod.phi_access_logs.entries`
      WHERE timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 5 MINUTE)
    )
  )
)
WHERE anomaly_probability > 0.6; -- Alert threshold
3.1.4 System Availability Alerts
Alert Rule: System Availability Degradation
| Parameter | Value |
|---|---|
| Alert Name | SYSTEM_AVAILABILITY_SLA_BREACH |
| Trigger Condition | System availability drops below 99.9% SLA threshold |
| Severity | CRITICAL |
| Data Source | Cloud Monitoring uptime checks + load balancer metrics |
| Detection Method | Uptime check failure OR error rate >0.1% |
| Response SLA | Immediate (5 minutes) |
| Escalation | On-call SRE → VP Engineering |
| Auto-Remediation | Auto-scaling triggers + failover to DR region (if multi-region) |
Uptime Check Configuration:
# terraform/monitoring.tf
resource "google_monitoring_uptime_check_config" "bio_qms_frontend" {
display_name = "BIO-QMS Frontend HTTPS"
timeout = "10s"
period = "60s"
http_check {
path = "/health"
port = 443
use_ssl = true
validate_ssl = true
request_method = "GET"
accepted_response_status_codes {
status_value = 200
}
}
monitored_resource {
type = "uptime_url"
labels = {
project_id = var.project_id
host = "bio-qms.coditect.ai"
}
}
content_matchers {
content = "\"status\":\"healthy\""
matcher = "CONTAINS_STRING"
}
}
resource "google_monitoring_uptime_check_config" "bio_qms_api" {
display_name = "BIO-QMS API /health"
timeout = "10s"
period = "60s"
http_check {
path = "/api/health"
port = 443
use_ssl = true
validate_ssl = true
request_method = "GET"
accepted_response_status_codes {
status_value = 200
}
}
monitored_resource {
type = "uptime_url"
labels = {
project_id = var.project_id
host = "api.bio-qms.coditect.ai"
}
}
}
# Alert on uptime check failure
resource "google_monitoring_alert_policy" "uptime_check_failure" {
display_name = "BIO-QMS Uptime Check Failure (Availability SLA)"
combiner = "OR"
conditions {
display_name = "Uptime check failed"
condition_threshold {
filter = "metric.type=\"monitoring.googleapis.com/uptime_check/check_passed\" AND resource.type=\"uptime_url\""
duration = "300s" # 5 minutes
comparison = "COMPARISON_LT"
threshold_value = 1 # Expect 1 (success), alert if < 1
aggregations {
alignment_period = "60s"
per_series_aligner = "ALIGN_FRACTION_TRUE"
cross_series_reducer = "REDUCE_MEAN"
}
}
}
notification_channels = [
google_monitoring_notification_channel.pagerduty_oncall.name,
google_monitoring_notification_channel.slack_incidents.name
]
documentation {
content = <<-EOT
**Alert:** System Availability SLA Breach
**Severity:** CRITICAL
**SLA Target:** 99.9% (43.2 minutes downtime/month max)
**Immediate Actions:**
1. Check GKE cluster health: kubectl get nodes
2. Check pod status: kubectl get pods -n default
3. Check load balancer: gcloud compute forwarding-rules list
4. Review recent deployments (possible rollback needed)
5. Check Cloud SQL status
**Escalation:** If unresolved in 15 minutes, page VP Engineering
**Runbook:** https://wiki.internal/sre/availability-incident-response
EOT
}
}
3.1.5 Certificate Expiration Alerts
Alert Rule: TLS Certificate Expiration Warning
| Parameter | Value |
|---|---|
| Alert Name | TLS_CERTIFICATE_EXPIRING |
| Trigger Condition | Certificate expires in ≤90 days (warnings at 90/60/30/7 days) |
| Severity | LOW (90d), MEDIUM (60d), HIGH (30d), CRITICAL (7d) |
| Data Source | Cloud Function scanning all TLS endpoints daily |
| Detection Method | X.509 certificate expiration date parsing |
| Response SLA | 90-day warning: respond within 7 days; 30-day: within 24 hours; 7-day: within 4 hours |
| Escalation | Security team → CISO (at 7-day threshold) |
| Auto-Remediation | Trigger cert renewal workflow (Let's Encrypt/Google-managed) |
Certificate Monitoring Cloud Function:
// cloud-functions/certificate-monitor/index.ts
import * as tls from 'tls';
import { sendAlert } from './alerting';
interface CertificateEndpoint {
name: string;
host: string;
port: number;
}
const endpoints: CertificateEndpoint[] = [
{ name: 'Frontend', host: 'bio-qms.coditect.ai', port: 443 },
{ name: 'API', host: 'api.bio-qms.coditect.ai', port: 443 },
{ name: 'Admin', host: 'admin.bio-qms.coditect.ai', port: 443 }
];
export async function checkCertificates() {
const now = new Date();
const results = [];
for (const endpoint of endpoints) {
const cert = await getCertificate(endpoint.host, endpoint.port);
const expirationDate = new Date(cert.valid_to);
const daysUntilExpiry = Math.floor((expirationDate.getTime() - now.getTime()) / (1000 * 60 * 60 * 24));
results.push({
endpoint: endpoint.name,
host: endpoint.host,
expiration_date: expirationDate.toISOString(),
days_until_expiry: daysUntilExpiry,
issuer: cert.issuer,
subject: cert.subject
});
// Alert thresholds (aligned with the severity table: CRITICAL ≤7d, HIGH ≤30d, MEDIUM ≤90d)
let severity: string | null = null;
if (daysUntilExpiry <= 7) {
severity = 'critical';
} else if (daysUntilExpiry <= 30) {
severity = 'high';
} else if (daysUntilExpiry <= 90) {
severity = 'medium';
}
if (severity) {
await sendAlert({
severity,
summary: `TLS certificate for ${endpoint.name} (${endpoint.host}) expires in ${daysUntilExpiry} days`,
details: {
endpoint: endpoint.name,
host: endpoint.host,
expiration_date: expirationDate.toISOString(),
days_until_expiry: daysUntilExpiry,
issuer: cert.issuer.O,
subject: cert.subject.CN
}
});
}
}
// Store results in BigQuery for historical tracking
await storeCertificateCheckResults(results);
return { checked: endpoints.length, results };
}
function getCertificate(host: string, port: number): Promise<tls.PeerCertificate> {
return new Promise((resolve, reject) => {
const socket = tls.connect(port, host, { rejectUnauthorized: false }, () => {
const cert = socket.getPeerCertificate();
socket.end();
resolve(cert);
});
socket.on('error', reject);
});
}
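The threshold logic above can be factored into a pure function so the mapping to the severity table (CRITICAL at ≤7 days, HIGH at ≤30, MEDIUM at ≤90) stays unit-testable. A minimal sketch; `certSeverity` is an illustrative name, not part of the Cloud Function above:

```typescript
// Maps days-until-expiry to alert severity per the alert rule table:
// CRITICAL at <=7 days, HIGH at <=30, MEDIUM at <=90, no alert otherwise.
// certSeverity is an illustrative helper name, not an existing platform API.
export function certSeverity(daysUntilExpiry: number): 'critical' | 'high' | 'medium' | null {
  if (daysUntilExpiry <= 7) return 'critical';
  if (daysUntilExpiry <= 30) return 'high';
  if (daysUntilExpiry <= 90) return 'medium';
  return null;
}
```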
3.1.6 Encryption Key Rotation Alerts
Alert Rule: Encryption Key Rotation Overdue
| Parameter | Value |
|---|---|
| Alert Name | ENCRYPTION_KEY_ROTATION_OVERDUE |
| Trigger Condition | Key has not been rotated in >90 days (policy: rotate every 90 days) |
| Severity | HIGH |
| Data Source | Cloud KMS key version creation timestamps |
| Detection Method | Scheduled query comparing last rotation date to policy threshold |
| Response SLA | 48 hours |
| Escalation | Security team → CISO |
| Auto-Remediation | Trigger key rotation workflow (manual approval required) |
Key Rotation Monitoring Query:
-- BigQuery scheduled query (runs daily)
-- Note: a SELECT alias cannot be referenced in WHERE, so rotation_status
-- is computed in a subquery and filtered in the outer query.
SELECT *
FROM (
SELECT
key_ring_name,
crypto_key_name,
primary_version_id,
primary_version_created_time,
TIMESTAMP_DIFF(CURRENT_TIMESTAMP(), primary_version_created_time, DAY) AS days_since_rotation,
CASE
WHEN TIMESTAMP_DIFF(CURRENT_TIMESTAMP(), primary_version_created_time, DAY) > 90 THEN 'OVERDUE'
WHEN TIMESTAMP_DIFF(CURRENT_TIMESTAMP(), primary_version_created_time, DAY) > 75 THEN 'WARNING'
ELSE 'COMPLIANT'
END AS rotation_status
FROM `bio-qms-prod.kms_inventory.keys`
WHERE key_purpose IN ('ENCRYPT_DECRYPT', 'ASYMMETRIC_SIGN')
)
WHERE rotation_status IN ('OVERDUE', 'WARNING')
ORDER BY days_since_rotation DESC;
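The same OVERDUE/WARNING/COMPLIANT banding can be expressed as a pure function for use outside BigQuery, for example inside the remediation workflow trigger. A sketch; `rotationStatus` is an illustrative name:

```typescript
// Mirrors the CASE expression in the scheduled query: more than 90 days since
// the primary key version was created is OVERDUE, more than 75 is WARNING
// (approaching the 90-day rotation policy), otherwise COMPLIANT.
export function rotationStatus(daysSinceRotation: number): 'OVERDUE' | 'WARNING' | 'COMPLIANT' {
  if (daysSinceRotation > 90) return 'OVERDUE';
  if (daysSinceRotation > 75) return 'WARNING';
  return 'COMPLIANT';
}
```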
3.2 Alert Channels and Escalation Matrix
3.2.1 Notification Channels Configuration
PagerDuty Integration:
# terraform/monitoring.tf
resource "google_monitoring_notification_channel" "pagerduty_security" {
display_name = "PagerDuty - Security Team"
type = "pagerduty"
labels = {
service_key = var.pagerduty_security_service_key
}
enabled = true
}
resource "google_monitoring_notification_channel" "pagerduty_oncall" {
display_name = "PagerDuty - On-Call SRE"
type = "pagerduty"
labels = {
service_key = var.pagerduty_oncall_service_key
}
enabled = true
}
Slack Integration:
resource "google_monitoring_notification_channel" "slack_security" {
display_name = "Slack - #security-alerts"
type = "slack"
labels = {
channel_name = "#security-alerts"
url = var.slack_webhook_security
}
enabled = true
}
resource "google_monitoring_notification_channel" "slack_incidents" {
display_name = "Slack - #incident-response"
type = "slack"
labels = {
channel_name = "#incident-response"
url = var.slack_webhook_incidents
}
enabled = true
}
resource "google_monitoring_notification_channel" "slack_compliance" {
display_name = "Slack - #compliance-alerts"
type = "slack"
labels = {
channel_name = "#compliance-alerts"
url = var.slack_webhook_compliance
}
enabled = true
}
Email Notification Channels:
resource "google_monitoring_notification_channel" "email_ciso" {
display_name = "Email - CISO"
type = "email"
labels = {
email_address = var.ciso_email
}
enabled = true
}
resource "google_monitoring_notification_channel" "email_privacy_officer" {
display_name = "Email - Privacy Officer"
type = "email"
labels = {
email_address = var.privacy_officer_email
}
enabled = true
}
resource "google_monitoring_notification_channel" "email_compliance_team" {
display_name = "Email - Compliance Team"
type = "email"
labels = {
email_address = "compliance@coditect.ai"
}
enabled = true
}
3.2.2 Escalation Matrix with Response SLAs
| Alert Type | Severity | Initial Notification | Response SLA | Escalation (if unresolved) | Escalation Trigger |
|---|---|---|---|---|---|
| Failed Authentication (5+ attempts) | HIGH | Security team (Slack + email) | 15 minutes | CISO | 30 minutes |
| Privilege Escalation | CRITICAL | Security team (PagerDuty + Slack) | 5 minutes | CISO + VP Engineering | Immediate |
| PHI Access Anomaly | HIGH | Privacy Officer (email + Slack) | 30 minutes | CISO + Privacy Officer | 2 hours |
| System Availability <99.9% | CRITICAL | On-call SRE (PagerDuty) | 5 minutes | VP Engineering | 15 minutes |
| TLS Certificate Expiring (90d) | MEDIUM | Security team (email) | 7 days | CISO | 60 days |
| TLS Certificate Expiring (30d) | HIGH | Security team (Slack + email) | 24 hours | CISO | 7 days |
| TLS Certificate Expiring (7d) | CRITICAL | Security team (PagerDuty + Slack + email) | 4 hours | CISO + VP Engineering | 24 hours |
| Key Rotation Overdue | HIGH | Security team (email) | 48 hours | CISO | 7 days |
| Audit Log Tampering | CRITICAL | CISO + Security team (PagerDuty + email) | Immediate | CEO + General Counsel | Immediate |
| Backup Failure | HIGH | On-call SRE (Slack + email) | 2 hours | VP Engineering | 24 hours |
| Vulnerability Scan Critical Finding | CRITICAL | Security team (Slack + email) | 24 hours | CISO | 72 hours |
| Control Test Failure | MEDIUM | Compliance team (email) | 48 hours | CCO | 7 days |
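The escalation-trigger column above is a simple elapsed-time test: if an alert remains unacknowledged past its trigger window, the next tier is paged. A minimal sketch of that check; `shouldEscalate` and the minute-based inputs are illustrative assumptions, not an existing platform API:

```typescript
// Returns true when an unresolved alert has aged past its escalation trigger
// from the matrix (e.g. 30 minutes for failed authentication alerts).
export function shouldEscalate(
  firedAt: Date,
  now: Date,
  escalationTriggerMinutes: number
): boolean {
  const elapsedMinutes = (now.getTime() - firedAt.getTime()) / 60_000;
  return elapsedMinutes >= escalationTriggerMinutes;
}
```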
3.2.3 Alert Suppression and Maintenance Windows
Maintenance Window Configuration:
# terraform/monitoring.tf
# Define maintenance window for scheduled system updates
resource "google_monitoring_uptime_check_config" "maintenance_window" {
display_name = "Maintenance Window (Sundays 2-4 AM UTC)"
# Synthetic check used only to suppress alerts during planned downtime.
# Required arguments (timeout, period, http_check, monitored resource)
# are omitted here for brevity.
}
# Alert policy with maintenance window exclusion
resource "google_monitoring_alert_policy" "uptime_with_maintenance" {
display_name = "System Availability (excluding maintenance)"
combiner = "OR"
conditions {
display_name = "Uptime check failed outside maintenance window"
condition_threshold {
filter = <<-EOT
metric.type="monitoring.googleapis.com/uptime_check/check_passed"
AND resource.type="uptime_url"
AND metric.labels.check_id != "maintenance_window"
EOT
duration = "300s"
comparison = "COMPARISON_LT"
threshold_value = 1
aggregations {
alignment_period = "60s"
per_series_aligner = "ALIGN_FRACTION_TRUE"
}
}
}
notification_channels = [
google_monitoring_notification_channel.pagerduty_oncall.name
]
}
Alert Suppression API:
// src/monitoring/alert-suppression.service.ts
import { Injectable } from '@nestjs/common';
import { InjectRepository } from '@nestjs/typeorm';
import { Repository, Like } from 'typeorm';
import { AlertSuppression } from './entities/alert-suppression.entity';
@Injectable()
export class AlertSuppressionService {
constructor(
@InjectRepository(AlertSuppression)
private suppressionRepo: Repository<AlertSuppression>
) {}
async createMaintenanceWindow(data: {
reason: string;
start_time: Date;
end_time: Date;
affected_services: string[];
created_by: string;
approval_ticket?: string;
}): Promise<AlertSuppression> {
const suppression = this.suppressionRepo.create({
...data,
suppression_type: 'MAINTENANCE_WINDOW',
status: 'ACTIVE'
});
await this.suppressionRepo.save(suppression);
// Notify monitoring system
await this.notifyMonitoringSystem({
action: 'CREATE_SUPPRESSION',
suppression_id: suppression.id,
start_time: data.start_time,
end_time: data.end_time,
services: data.affected_services
});
return suppression;
}
async isAlertSuppressed(alertType: string, service: string): Promise<boolean> {
const now = new Date();
const activeSuppressions = await this.suppressionRepo.find({
where: {
status: 'ACTIVE',
affected_services: Like(`%${service}%`)
}
});
return activeSuppressions.some(s =>
s.start_time <= now && s.end_time >= now
);
}
}
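The window comparison inside isAlertSuppressed is worth isolating as a pure helper so the boundary semantics (inclusive start and end) are unit-testable without a database. A sketch over the same field shape; `anySuppressionActive` is an illustrative name:

```typescript
// A suppression window is active when `now` falls inside [start_time, end_time],
// boundaries inclusive - matching the comparison used in isAlertSuppressed.
interface SuppressionWindow {
  start_time: Date;
  end_time: Date;
}

export function anySuppressionActive(now: Date, windows: SuppressionWindow[]): boolean {
  return windows.some(w => w.start_time <= now && w.end_time >= now);
}
```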
4. SOC 2 Control Effectiveness Dashboard
4.1 Per-TSC Control Status Tracking
4.1.1 Control Inventory and Mapping
SOC 2 TSC Control Matrix:
| TSC Category | Control ID | Control Description | Automated Test | Manual Test | Evidence Source |
|---|---|---|---|---|---|
| CC1 | CC1.1 | Commitment to integrity and ethical values | No | Quarterly | Document review |
| CC1 | CC1.2 | Management establishes structures | No | Annual | Org chart, policies |
| CC2 | CC2.1 | Communication of info security policies | No | Annual | Training records |
| CC2 | CC2.2 | Internal communication channels | No | Annual | Meeting minutes |
| CC3 | CC3.1 | Risk assessment process | No | Semi-annual | Risk register |
| CC3 | CC3.2 | Risk mitigation controls | Partial | Quarterly | Control tests |
| CC4 | CC4.1 | Change management program | Yes | Quarterly | Git logs, change tickets |
| CC5 | CC5.1 | Logical access controls | Yes | Monthly | Access logs, RBAC config |
| CC5 | CC5.2 | New user provisioning | Yes | Monthly | User creation audit logs |
| CC5 | CC5.3 | User termination process | Yes | Monthly | Deactivation audit logs |
| CC6 | CC6.1 | Encryption of data at rest | Yes | Quarterly | KMS config, encryption verification |
| CC6 | CC6.2 | Encryption of data in transit | Yes | Daily | TLS certificate validation |
| CC6 | CC6.3 | Network segmentation | Yes | Monthly | Firewall rules, VPC config |
| CC6 | CC6.4 | Intrusion detection | Yes | Monthly | IDS alert logs |
| CC6 | CC6.5 | Vulnerability management | Yes | Weekly | Vulnerability scan results |
| CC6 | CC6.6 | Malware protection | Yes | Daily | Antivirus status |
| CC7 | CC7.1 | System monitoring | Yes | Daily | Monitoring dashboard uptime |
| CC7 | CC7.2 | Incident detection/response | Yes | Monthly | Incident ticket count, MTTR |
| CC7 | CC7.3 | Security incident procedures | No | Annual | Runbook review, tabletop exercise |
| CC7 | CC7.4 | Change detection | Yes | Daily | Config drift detection |
| CC8 | CC8.1 | Change authorization | Yes | Daily | Git commit approvals |
| CC8 | CC8.2 | System changes logged | Yes | Daily | Deployment logs |
| CC9 | CC9.1 | Vendor risk assessments | No | Annual | Vendor review docs |
| CC9 | CC9.2 | Vendor contract review | No | Annual | Contract review checklist |
| A1 | A1.1 | Availability monitoring | Yes | Daily | Uptime check results |
| A1 | A1.2 | Backup and recovery | Yes | Weekly | Backup verification logs |
| A1 | A1.3 | Disaster recovery plan | No | Annual | DR test results |
| PI1 | PI1.1 | Quality assurance processes | Partial | Quarterly | Test results, code coverage |
| PI1 | PI1.2 | Authorized processing | Yes | Monthly | Audit logs for data processing |
| C1 | C1.1 | Confidentiality commitments | No | Annual | NDA tracking, BAA tracking |
| C1 | C1.2 | Confidential data disposal | Yes | Monthly | Secure deletion audit logs |
Control Status Database Schema:
-- BigQuery table for control test results
CREATE TABLE IF NOT EXISTS `bio-qms-prod.compliance.control_test_results` (
test_id STRING NOT NULL,
control_id STRING NOT NULL,
tsc_category STRING NOT NULL,
test_type STRING NOT NULL, -- "AUTOMATED" or "MANUAL"
test_date TIMESTAMP NOT NULL,
test_status STRING NOT NULL, -- "PASS", "FAIL", "PARTIAL", "NOT_TESTED"
test_method STRING, -- Description of how test was performed
evidence_location STRING, -- Path to evidence (GCS bucket, wiki page, etc.)
tested_by STRING, -- User who performed test (for manual) or "SYSTEM" (for automated)
findings STRING, -- Any issues or observations
remediation_plan STRING, -- If FAIL, what is the remediation
remediation_due_date DATE,
remediation_complete_date DATE, -- set when remediation finishes (drives MTTR reporting)
remediation_status STRING, -- "PENDING", "IN_PROGRESS", "COMPLETE", "ACCEPTED_RISK"
-- Evidence metadata
evidence_files ARRAY<STRING>, -- List of evidence file paths
evidence_hash STRING, -- SHA-256 hash for tamper detection
-- Audit trail
created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP(),
updated_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP()
)
PARTITION BY DATE(test_date)
CLUSTER BY control_id, test_status;
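The evidence_hash column stores a SHA-256 digest for tamper detection. One way to produce it is to hash the canonical JSON serialization of the evidence payload; a sketch using Node's built-in crypto module (`computeEvidenceHash` is an illustrative name, not defined elsewhere in this document):

```typescript
import { createHash } from 'crypto';

// Produces a hex SHA-256 digest of the evidence payload's JSON serialization.
// Re-hashing stored evidence and comparing against evidence_hash reveals tampering.
export function computeEvidenceHash(evidence: unknown): string {
  return createHash('sha256').update(JSON.stringify(evidence)).digest('hex');
}
```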
4.1.2 Automated Control Test Execution
Automated Control Test Scheduler (Cloud Scheduler + Cloud Functions):
// cloud-functions/control-tests/index.ts
import { BigQuery } from '@google-cloud/bigquery';
import * as controlTests from './tests';
import { sendControlFailureAlert } from './alerting'; // local helper module (path assumed)
export async function runScheduledControlTests() {
const bigquery = new BigQuery();
const results = [];
// Daily tests
const dailyTests = [
{ control_id: 'CC6.2', test_fn: controlTests.testTLSEncryption },
{ control_id: 'CC6.6', test_fn: controlTests.testMalwareProtection },
{ control_id: 'CC7.1', test_fn: controlTests.testSystemMonitoring },
{ control_id: 'CC7.4', test_fn: controlTests.testConfigDriftDetection },
{ control_id: 'CC8.1', test_fn: controlTests.testChangeAuthorization },
{ control_id: 'CC8.2', test_fn: controlTests.testChangeLogging },
{ control_id: 'A1.1', test_fn: controlTests.testAvailabilityMonitoring }
];
for (const test of dailyTests) {
const result = await executeControlTest(test.control_id, test.test_fn);
results.push(result);
// Store result in BigQuery
await bigquery.dataset('compliance').table('control_test_results').insert([result]);
// Alert if test failed
if (result.test_status === 'FAIL') {
await sendControlFailureAlert(result);
}
}
return { tested: results.length, passed: results.filter(r => r.test_status === 'PASS').length };
}
async function executeControlTest(controlId: string, testFn: () => Promise<any>) {
const testId = `test_${controlId}_${Date.now()}`;
const startTime = Date.now();
try {
const testResult = await testFn();
return {
test_id: testId,
control_id: controlId,
tsc_category: controlId.split('.')[0], // Extract "CC6" from "CC6.2"
test_type: 'AUTOMATED',
test_date: new Date().toISOString(),
test_status: testResult.passed ? 'PASS' : 'FAIL',
test_method: testResult.method,
evidence_location: testResult.evidence_path,
tested_by: 'SYSTEM',
findings: testResult.findings,
execution_time_ms: Date.now() - startTime
};
} catch (error) {
return {
test_id: testId,
control_id: controlId,
tsc_category: controlId.split('.')[0],
test_type: 'AUTOMATED',
test_date: new Date().toISOString(),
test_status: 'FAIL',
test_method: 'Automated script execution',
tested_by: 'SYSTEM',
findings: `Test execution error: ${error.message}`,
execution_time_ms: Date.now() - startTime
};
}
}
Example Automated Control Tests:
// cloud-functions/control-tests/tests/cc6-2-tls-encryption.ts
export async function testTLSEncryption() {
const endpoints = [
'https://bio-qms.coditect.ai',
'https://api.bio-qms.coditect.ai',
'https://admin.bio-qms.coditect.ai'
];
const results = [];
let allPassed = true;
for (const endpoint of endpoints) {
const tlsVersion = await getTLSVersion(endpoint);
const cipherSuite = await getCipherSuite(endpoint);
const passed = tlsVersion === 'TLSv1.3' && isCipherSuiteApproved(cipherSuite);
allPassed = allPassed && passed;
results.push({
endpoint,
tls_version: tlsVersion,
cipher_suite: cipherSuite,
passed
});
}
return {
passed: allPassed,
method: 'SSL/TLS handshake analysis using Node.js tls module',
evidence_path: await saveEvidenceToGCS('tls-encryption-test', results),
findings: allPassed
? 'All endpoints using TLS 1.3 with approved cipher suites'
: 'Some endpoints using weak TLS configuration: ' + JSON.stringify(results.filter(r => !r.passed))
};
}
// cloud-functions/control-tests/tests/cc5-1-logical-access.ts
export async function testLogicalAccessControls() {
const bigquery = new BigQuery();
// Test 1: No users with overly broad permissions
const [broadPermUsers] = await bigquery.query(`
SELECT user_id, role, permissions
FROM \`bio-qms-prod.auth.user_roles\`
WHERE role = 'SUPER_ADMIN' AND user_id NOT IN (
SELECT user_id FROM \`bio-qms-prod.auth.approved_admins\`
)
`);
// Test 2: No inactive users with active sessions
const [inactiveWithSessions] = await bigquery.query(`
SELECT s.user_id, u.status, s.last_activity
FROM \`bio-qms-prod.auth.sessions\` s
JOIN \`bio-qms-prod.auth.users\` u ON s.user_id = u.user_id
WHERE u.status = 'INACTIVE' AND s.expires_at > CURRENT_TIMESTAMP()
`);
// Test 3: No users without MFA enabled (policy requirement)
const [usersWithoutMFA] = await bigquery.query(`
SELECT user_id, email, role
FROM \`bio-qms-prod.auth.users\`
WHERE status = 'ACTIVE' AND mfa_enabled = false
`);
const allPassed = broadPermUsers.length === 0
&& inactiveWithSessions.length === 0
&& usersWithoutMFA.length === 0;
return {
passed: allPassed,
method: 'BigQuery analysis of user access configuration',
evidence_path: await saveEvidenceToGCS('logical-access-test', {
broad_perm_users: broadPermUsers,
inactive_sessions: inactiveWithSessions,
users_without_mfa: usersWithoutMFA
}),
findings: allPassed
? 'All logical access controls operating as designed'
: `Control failures: ${broadPermUsers.length} unauthorized admins, ${inactiveWithSessions.length} inactive with sessions, ${usersWithoutMFA.length} without MFA`
};
}
4.2 Compliance KPI Dashboard
4.2.1 Key Performance Indicators
SOC 2 Compliance KPIs:
| KPI | Definition | Target | Measurement Frequency | Data Source |
|---|---|---|---|---|
| Control Test Pass Rate | % of control tests passed / total tests | ≥95% | Weekly | BigQuery: control_test_results |
| Evidence Coverage | % of controls with current evidence / total controls | 100% | Weekly | BigQuery: control evidence inventory |
| Mean Time to Remediate (MTTR) | Average days from control failure to remediation complete | ≤14 days | Monthly | BigQuery: remediation tracking |
| Control Failure Rate | # of control failures per month | ≤2 per month | Monthly | BigQuery: control test failures |
| Evidence Freshness | % of evidence <90 days old | ≥80% | Monthly | BigQuery: evidence timestamps |
| Risk Score Trend | Aggregate risk score (weighted by control criticality) | ≤20 (Low risk) | Weekly | BigQuery: risk calculation |
| Audit Readiness Score | Weighted score based on control status + evidence coverage | ≥90% | Weekly | BigQuery: composite calculation |
| Security Event Response Time | Average time from alert to acknowledgment | ≤15 minutes | Daily | BigQuery: alert acknowledgment logs |
| Change Management Compliance | % of changes with proper approval / total changes | 100% | Weekly | Git logs + change ticket system |
| Availability SLA Achievement | % of time system meets 99.9% uptime SLA | ≥99.9% | Daily | Cloud Monitoring uptime checks |
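The Audit Readiness Score KPI is a weighted composite: 60% control-test pass rate plus 40% evidence coverage, the same weights used by the BigQuery view in section 4.2.2. A small pure-function sketch (`auditReadinessScore` is an illustrative name); inputs are fractions in [0, 1], output is a percentage:

```typescript
// Composite audit readiness: 60% weight on control-test pass rate,
// 40% weight on evidence coverage, expressed as a percentage.
export function auditReadinessScore(passRate: number, evidenceCoverage: number): number {
  return (passRate * 0.6 + evidenceCoverage * 0.4) * 100;
}
```

For example, a 96% pass rate with 97% evidence coverage yields a score of 96.4, above the ≥90% target.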
4.2.2 Dashboard Implementation (Looker Studio)
Dashboard Layout:
┌──────────────────────────────────────────────────────────────────┐
│ SOC 2 CONTROL EFFECTIVENESS │
│ Real-Time Compliance Dashboard │
├──────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌──────────┐│
│ │ Audit Ready │ │ Evidence │ │ Control │ │ Risk ││
│ │ Score │ │ Coverage │ │ Pass Rate │ │ Score ││
│ │ │ │ │ │ │ │ ││
│ │ 92% │ │ 97% │ │ 96% │ │ 18 ││
│ │ 🟢 GOOD │ │ 🟢 GOOD │ │ 🟢 GOOD │ │ 🟢 LOW ││
│ └─────────────┘ └─────────────┘ └─────────────┘ └──────────┘│
│ │
├──────────────────────────────────────────────────────────────────┤
│ Control Test Results by TSC Category (Last 30 Days) │
│ ┌──────────────────────────────────────────────────────────────┐│
│ │ CC1 ████████████████████████████████████████ 40/40 (100%) ││
│ │ CC2 ████████████████████████████████████████ 30/30 (100%) ││
│ │ CC3 ███████████████████████████████████████░ 28/30 (93%) ││
│ │ CC4 ████████████████████████████████████████ 25/25 (100%) ││
│ │ CC5 ███████████████████████████████████████░ 57/60 (95%) ││
│ │ CC6 ████████████████████████████████████████ 72/75 (96%) ││
│ │ CC7 ████████████████████████████████████████ 45/45 (100%) ││
│ │ CC8 ████████████████████████████████████████ 30/30 (100%) ││
│ │ CC9 ████████████████████████████████████████ 20/20 (100%) ││
│ │ A1 ███████████████████████████████████████░ 38/40 (95%) ││
│ │ PI1 ████████████████████████████████████████ 15/15 (100%) ││
│ │ C1 ████████████████████████████████████████ 20/20 (100%) ││
│ └──────────────────────────────────────────────────────────────┘│
│ │
├──────────────────────────────────────────────────────────────────┤
│ Recent Control Failures (Requires Attention) │
│ ┌──────────────────────────────────────────────────────────────┐│
│ │ Control │ Last Test │ Status │ Finding │ Due Date ││
│ ├──────────┼───────────┼────────┼──────────────────┼───────────┤│
│ │ CC5.3 │ 2/14/2026 │ FAIL │ 2 inactive users │ 2/21/2026 ││
│ │ │ │ │ with active sess │ ││
│ │ CC3.2 │ 2/13/2026 │ PARTIAL│ 1 medium risk │ 2/27/2026 ││
│ │ │ │ │ not mitigated │ ││
│ │ A1.2 │ 2/12/2026 │ FAIL │ Backup failed on │ 2/19/2026 ││
│ │ │ │ │ 2/11 (resolved) │ ││
│ └──────────────────────────────────────────────────────────────┘│
│ │
├──────────────────────────────────────────────────────────────────┤
│ Risk Score Trending (30/60/90 Day Views) │
│ ┌──────────────────────────────────────────────────────────────┐│
│ │ Risk ││
│ │ Score ││
│ │ 30│ ┌──Target ││
│ │ │ │ (≤20) ││
│ │ 25│ ●─────┘ ││
│ │ │ ● ││
│ │ 20│────────────────────────────────●──────────────────────●││
│ │ │ ● ││
│ │ 15│ ● ││
│ │ │ ● ││
│ │ 10│ ● ││
│ │ └────────────────────────────────────────────────────────┤│
│ │ 30d ago 20d 10d Today ││
│ └──────────────────────────────────────────────────────────────┘│
│ │
├──────────────────────────────────────────────────────────────────┤
│ Mean Time to Remediate (MTTR) - Last 12 Months │
│ ┌──────────────────────────────────────────────────────────────┐│
│ │ Days ┌──Target (≤14d) ││
│ │ 20│ │ ││
│ │ │ │ ││
│ │ 15│─────────────────────────────────────┼──────────────────││
│ │ │ ● ● │ ● ││
│ │ 10│ ● ● ● ● ● ● ● ● ││
│ │ │ ● ● ● ││
│ │ 5│ ││
│ │ └────────────────────────────────────────────────────────┤│
│ │ J F M A M J J A S O N D J F ││
│ └──────────────────────────────────────────────────────────────┘│
│ Average MTTR: 11.3 days ✓ (Target: ≤14 days) │
│ │
└──────────────────────────────────────────────────────────────────┘
Looker Studio Data Source Configuration:
-- View for Audit Readiness Score
CREATE OR REPLACE VIEW `bio-qms-prod.compliance.v_audit_readiness` AS
SELECT
CURRENT_TIMESTAMP() AS snapshot_time,
-- Control test pass rate (30-day window)
SAFE_DIVIDE(
COUNTIF(test_status = 'PASS'),
COUNT(*)
) * 100 AS control_pass_rate_pct,
-- Evidence coverage
(
SELECT SAFE_DIVIDE(
COUNT(DISTINCT control_id),
(SELECT COUNT(*) FROM `bio-qms-prod.compliance.control_inventory`)
) * 100
FROM `bio-qms-prod.compliance.control_test_results`
WHERE test_date >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 90 DAY)
) AS evidence_coverage_pct,
-- Weighted audit readiness score
(
(SAFE_DIVIDE(COUNTIF(test_status = 'PASS'), COUNT(*)) * 0.6) + -- 60% weight on test results
(
(SELECT SAFE_DIVIDE(
COUNT(DISTINCT control_id),
(SELECT COUNT(*) FROM `bio-qms-prod.compliance.control_inventory`)
)
FROM `bio-qms-prod.compliance.control_test_results`
WHERE test_date >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 90 DAY))
* 0.4 -- 40% weight on evidence coverage
)
) * 100 AS audit_readiness_score
FROM `bio-qms-prod.compliance.control_test_results`
WHERE test_date >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY);
-- View for Risk Score Calculation
CREATE OR REPLACE VIEW `bio-qms-prod.compliance.v_risk_score` AS
WITH control_failures AS (
SELECT
control_id,
MAX(test_date) AS last_test_date,
COUNTIF(test_status = 'FAIL') AS failure_count,
DATE_DIFF(CURRENT_DATE(), MAX(DATE(test_date)), DAY) AS days_since_test
FROM `bio-qms-prod.compliance.control_test_results`
WHERE test_date >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 90 DAY)
GROUP BY control_id
)
SELECT
CURRENT_TIMESTAMP() AS snapshot_time,
SUM(
CASE
-- Critical controls (CC6.x encryption) have 3x weight
WHEN cf.control_id LIKE 'CC6.%' THEN cf.failure_count * 3
-- High controls (CC5.x access, CC7.x monitoring) have 2x weight
WHEN cf.control_id LIKE 'CC5.%' OR cf.control_id LIKE 'CC7.%' THEN cf.failure_count * 2
-- Standard controls have 1x weight
ELSE cf.failure_count
END
) AS weighted_risk_score,
-- Risk level classification
CASE
WHEN SUM(cf.failure_count) = 0 THEN 'MINIMAL'
WHEN SUM(cf.failure_count) <= 3 THEN 'LOW'
WHEN SUM(cf.failure_count) <= 7 THEN 'MODERATE'
WHEN SUM(cf.failure_count) <= 15 THEN 'HIGH'
ELSE 'CRITICAL'
END AS risk_level
FROM control_failures cf;
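The criticality weighting in v_risk_score can be mirrored as a pure function so the weights are testable outside BigQuery. A sketch (`weightedRiskScore` is an illustrative name): CC6.x encryption failures count 3x, CC5.x access and CC7.x monitoring failures count 2x, all others 1x:

```typescript
// Mirrors the CASE weighting in the v_risk_score view.
interface ControlFailure {
  control_id: string;
  failure_count: number;
}

export function weightedRiskScore(failures: ControlFailure[]): number {
  return failures.reduce((total, f) => {
    if (f.control_id.startsWith('CC6.')) return total + f.failure_count * 3;   // encryption: 3x
    if (f.control_id.startsWith('CC5.') || f.control_id.startsWith('CC7.')) {
      return total + f.failure_count * 2;                                      // access/monitoring: 2x
    }
    return total + f.failure_count;                                            // standard: 1x
  }, 0);
}
```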
-- View for MTTR Calculation
CREATE OR REPLACE VIEW `bio-qms-prod.compliance.v_mttr` AS
SELECT
DATE_TRUNC(test_date, MONTH) AS month,
AVG(
DATE_DIFF(
DATE(remediation_complete_date),
DATE(test_date),
DAY
)
) AS avg_mttr_days,
COUNT(*) AS total_remediations
FROM `bio-qms-prod.compliance.control_test_results`
WHERE test_status = 'FAIL'
AND remediation_status = 'COMPLETE'
AND test_date >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 12 MONTH)
GROUP BY month
ORDER BY month DESC;
5. Automated Health Checks
5.1 Daily Health Check Schedule
Daily Automated Checks (0800 UTC):
// cloud-functions/daily-health-checks/index.ts
export async function runDailyHealthChecks() {
const results = {
timestamp: new Date().toISOString(),
checks: []
};
// 1. Encryption Status Verification
results.checks.push(await verifyEncryptionStatus());
// 2. Access Review (inactive users, orphaned permissions)
results.checks.push(await reviewAccessControls());
// 3. Log Integrity Verification
results.checks.push(await verifyLogIntegrity());
// 4. TLS Certificate Expiration Check
results.checks.push(await checkCertificateExpiration());
// 5. System Availability Check
results.checks.push(await checkSystemAvailability());
// 6. Configuration Drift Detection
results.checks.push(await detectConfigurationDrift());
// Store results
await storeHealthCheckResults(results);
// Alert on any failures
const failures = results.checks.filter(c => c.status === 'FAIL');
if (failures.length > 0) {
await sendHealthCheckAlert(failures);
}
return results;
}
async function verifyEncryptionStatus() {
// Verify that all PHI tables use customer-managed keys (CMEK).
// BigQuery always encrypts at rest with Google-managed keys by default;
// encryptionConfiguration metadata is present only when CMEK is configured.
const bigquery = new BigQuery();
const [tables] = await bigquery.dataset('production').getTables();
const unencryptedTables = [];
for (const table of tables) {
const [metadata] = await table.getMetadata();
if (!metadata.encryptionConfiguration) {
unencryptedTables.push(table.id);
}
}
// Check Cloud SQL encryption
const sqlEncrypted = await checkCloudSQLEncryption();
// Check KMS key accessibility
const kmsKeysAccessible = await checkKMSKeys();
const passed = unencryptedTables.length === 0 && sqlEncrypted && kmsKeysAccessible;
return {
check_name: 'ENCRYPTION_STATUS',
status: passed ? 'PASS' : 'FAIL',
findings: {
unencrypted_tables: unencryptedTables,
sql_encrypted: sqlEncrypted,
kms_keys_accessible: kmsKeysAccessible
},
evidence_path: await saveEvidenceToGCS('encryption-status', { unencryptedTables, sqlEncrypted, kmsKeysAccessible })
};
}
async function reviewAccessControls() {
const bigquery = new BigQuery();
// Check for inactive users with active permissions
const [inactiveUsers] = await bigquery.query(`
SELECT user_id, email, last_login, status
FROM \`bio-qms-prod.auth.users\`
WHERE status = 'INACTIVE'
AND user_id IN (
SELECT DISTINCT user_id FROM \`bio-qms-prod.auth.user_roles\`
)
`);
// Check for users who haven't logged in for 90+ days
const [dormantUsers] = await bigquery.query(`
SELECT user_id, email, last_login
FROM \`bio-qms-prod.auth.users\`
WHERE status = 'ACTIVE'
AND last_login < TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 90 DAY)
`);
// Check for orphaned service accounts
const [orphanedServiceAccounts] = await bigquery.query(`
SELECT service_account_id, created_at
FROM \`bio-qms-prod.auth.service_accounts\`
WHERE last_used < TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 180 DAY)
`);
const passed = inactiveUsers.length === 0
&& dormantUsers.length === 0
&& orphanedServiceAccounts.length === 0;
return {
check_name: 'ACCESS_CONTROLS_REVIEW',
status: passed ? 'PASS' : 'FAIL',
findings: {
inactive_users_with_perms: inactiveUsers.length,
dormant_users: dormantUsers.length,
orphaned_service_accounts: orphanedServiceAccounts.length
},
evidence_path: await saveEvidenceToGCS('access-review', { inactiveUsers, dormantUsers, orphanedServiceAccounts })
};
}
5.2 Weekly Health Check Schedule
Weekly Automated Checks (Sundays 0800 UTC):
// cloud-functions/weekly-health-checks/index.ts
export async function runWeeklyHealthChecks() {
const results = {
timestamp: new Date().toISOString(),
checks: []
};
// 1. Vulnerability Scan Results Review
results.checks.push(await reviewVulnerabilityScanResults());
// 2. Patch Compliance Check
results.checks.push(await checkPatchCompliance());
// 3. Backup Verification
results.checks.push(await verifyBackups());
// 4. Access Log Anomaly Review
results.checks.push(await reviewAccessLogAnomalies());
// 5. Cloud Resource Inventory
results.checks.push(await inventoryCloudResources());
// Store results
await storeHealthCheckResults(results);
const failures = results.checks.filter(c => c.status === 'FAIL');
if (failures.length > 0) {
await sendHealthCheckAlert(failures);
}
return results;
}
async function reviewVulnerabilityScanResults() {
const bigquery = new BigQuery();
// Query vulnerability scan results from last 7 days
const [vulns] = await bigquery.query(`
SELECT
severity,
COUNT(*) AS count
FROM \`bio-qms-prod.security.vulnerability_scans\`
WHERE scan_date >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
GROUP BY severity
`);
const critical = vulns.find(v => v.severity === 'CRITICAL')?.count || 0;
const high = vulns.find(v => v.severity === 'HIGH')?.count || 0;
// Policy: Zero critical vulns, < 5 high vulns
const passed = critical === 0 && high < 5;
return {
check_name: 'VULNERABILITY_SCAN_REVIEW',
status: passed ? 'PASS' : 'FAIL',
findings: {
critical_vulns: critical,
high_vulns: high,
threshold_critical: 0,
threshold_high: 5
},
evidence_path: await saveEvidenceToGCS('vuln-scan-results', vulns)
};
}
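The pass/fail policy above (zero critical vulnerabilities, fewer than five high) can be isolated as a pure predicate over the query rows, keeping the threshold logic testable. A sketch; `vulnPolicyPassed` is an illustrative name:

```typescript
// One row per severity level, as returned by the weekly BigQuery scan query.
interface SeverityCount {
  severity: string;
  count: number;
}

// Policy: zero CRITICAL findings and fewer than 5 HIGH findings.
export function vulnPolicyPassed(rows: SeverityCount[]): boolean {
  const critical = rows.find(r => r.severity === 'CRITICAL')?.count ?? 0;
  const high = rows.find(r => r.severity === 'HIGH')?.count ?? 0;
  return critical === 0 && high < 5;
}
```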
}
async function verifyBackups() {
const bigquery = new BigQuery();
// Check database backups in last 24 hours
const [dbBackups] = await bigquery.query(`
SELECT
backup_id,
backup_time,
status,
backup_size_gb
FROM \`bio-qms-prod.backups.database_backups\`
WHERE backup_time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24 HOUR)
ORDER BY backup_time DESC
LIMIT 1
`);
// Confirm the most recent backup completed successfully; restore-based
// integrity testing is exercised separately during DR drills
const backupValid = dbBackups.length > 0 && dbBackups[0].status === 'SUCCESS';
// Check file storage backups
const storageBackupValid = await verifyStorageBackups();
const passed = backupValid && storageBackupValid;
return {
check_name: 'BACKUP_VERIFICATION',
status: passed ? 'PASS' : 'FAIL',
findings: {
latest_db_backup: dbBackups[0],
storage_backup_valid: storageBackupValid
},
evidence_path: await saveEvidenceToGCS('backup-verification', { dbBackups, storageBackupValid })
};
}
5.3 Monthly Health Check Schedule
Monthly Automated Checks (1st of month, 0800 UTC):
// cloud-functions/monthly-health-checks/index.ts
export async function runMonthlyHealthChecks() {
const results = {
timestamp: new Date().toISOString(),
checks: []
};
// 1. Full Control Assessment (all automated controls)
results.checks.push(await runFullControlAssessment());
// 2. Evidence Snapshot (capture all current evidence)
results.checks.push(await captureEvidenceSnapshot());
// 3. Risk Reassessment (recalculate risk scores)
results.checks.push(await reassessRisks());
// 4. User Access Recertification Trigger
results.checks.push(await triggerAccessRecertification());
// 5. Vendor Security Review Status
results.checks.push(await checkVendorSecurityReviews());
// Store results
await storeHealthCheckResults(results);
// Generate monthly compliance report
await generateMonthlyComplianceReport(results);
return results;
}
async function runFullControlAssessment() {
// Execute all automated control tests. Assumes the control-tests module
// exports test functions keyed by control ID (e.g. { 'CC6.2': testFn }),
// so the export name can be recorded as control_id in the results.
const controlTests = await import('../control-tests');
const testResults = [];
for (const [controlId, testFn] of Object.entries(controlTests)) {
if (typeof testFn === 'function') {
const result = await executeControlTest(controlId, testFn);
testResults.push(result);
}
}
const passRate = testResults.filter(r => r.test_status === 'PASS').length / testResults.length;
return {
check_name: 'FULL_CONTROL_ASSESSMENT',
status: passRate >= 0.95 ? 'PASS' : 'FAIL',
findings: {
total_controls_tested: testResults.length,
passed: testResults.filter(r => r.test_status === 'PASS').length,
failed: testResults.filter(r => r.test_status === 'FAIL').length,
pass_rate: passRate
},
evidence_path: await saveEvidenceToGCS('monthly-control-assessment', testResults)
};
}
async function captureEvidenceSnapshot() {
// Create point-in-time snapshot of all compliance evidence
const evidenceSnapshot = {
snapshot_date: new Date().toISOString(),
control_test_results: await queryBigQuery('SELECT * FROM control_test_results WHERE test_date >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)'),
audit_logs_summary: await queryBigQuery('SELECT COUNT(*) as count, audit_event_type FROM audit_logs.entries WHERE timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY) GROUP BY audit_event_type'),
user_access_snapshot: await queryBigQuery('SELECT user_id, role, permissions FROM auth.user_roles'),
system_config_snapshot: await captureSystemConfiguration(),
vulnerability_status: await queryBigQuery('SELECT * FROM security.vulnerability_scans ORDER BY scan_date DESC LIMIT 1')
};
// Store snapshot in GCS with tamper-evident hash
const snapshotPath = await saveEvidenceSnapshotToGCS(evidenceSnapshot);
return {
check_name: 'EVIDENCE_SNAPSHOT',
status: 'PASS',
findings: {
snapshot_path: snapshotPath,
evidence_types_captured: Object.keys(evidenceSnapshot).length
},
evidence_path: snapshotPath
};
}
6. Evidence Collection and Audit Preparation
6.1 Automated Evidence Collection
Evidence Collection Strategy:
| Evidence Type | Collection Method | Frequency | Retention | Storage Location |
|---|---|---|---|---|
| Control Test Results | Automated test execution → BigQuery | Daily/Weekly/Monthly | 2 years | compliance.control_test_results |
| Audit Logs | Real-time streaming → BigQuery → GCS | Continuous | 7 years | audit_logs.* + GCS archive |
| Access Control Config | Daily snapshot | Daily | 1 year | auth.* tables |
| System Configuration | Terraform state + GCP API | Weekly | 1 year | GCS bucket |
| Vulnerability Scans | Scheduled scans | Weekly | 1 year | security.vulnerability_scans |
| Change Logs | Git commits + deployment logs | Real-time | 2 years | GitHub + deployments.* |
| Backup Verification | Backup status API | Daily | 90 days | backups.* |
| Uptime Check Results | Cloud Monitoring API | Continuous | 1 year | Cloud Monitoring (exported to BigQuery) |
| TLS Certificate Status | Daily certificate check | Daily | 1 year | security.certificate_checks |
| Training Records | LMS integration | Real-time | 3 years | hr.training_records |
| Incident Response Tickets | Jira/GitHub Issues | Real-time | 3 years | External system + exports |
| Risk Assessments | Manual entry + review | Quarterly | 3 years | compliance.risk_assessments |
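The retention windows in the table above can be expressed as a lookup that automated jobs use to flag evidence nearing or past its retention limit. A minimal sketch, with only four evidence types shown; the type, constant, and helper names are illustrative, not production identifiers:

```typescript
// Illustrative retention lookup derived from the evidence collection table.
// Retention periods are expressed in days; years are approximated as 365 days.
type EvidenceType =
  | 'CONTROL_TEST_RESULTS'
  | 'AUDIT_LOGS'
  | 'ACCESS_CONTROL_CONFIG'
  | 'BACKUP_VERIFICATION';

const RETENTION_DAYS: Record<EvidenceType, number> = {
  CONTROL_TEST_RESULTS: 2 * 365, // 2 years
  AUDIT_LOGS: 7 * 365,           // 7 years
  ACCESS_CONTROL_CONFIG: 365,    // 1 year
  BACKUP_VERIFICATION: 90        // 90 days
};

// Returns true while a piece of evidence is still inside its retention window.
function isWithinRetention(
  type: EvidenceType,
  collectedAt: Date,
  now: Date = new Date()
): boolean {
  const ageDays = (now.getTime() - collectedAt.getTime()) / (24 * 60 * 60 * 1000);
  return ageDays <= RETENTION_DAYS[type];
}
```

A pruning job would pair this check with the storage locations in the table's last column; the mapping itself stays a single source of truth for both pruning and audit-readiness reporting.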
6.2 SOC 2 Audit Evidence Package
Evidence Package Generator:
// cloud-functions/audit-evidence-package/index.ts
export async function generateSOC2EvidencePackage(params: {
audit_period_start: string;
audit_period_end: string;
auditor_name: string;
auditor_email: string;
audit_firm: string;
}) {
const evidencePackage: any = {
metadata: {
generated_at: new Date().toISOString(),
audit_period_start: params.audit_period_start,
audit_period_end: params.audit_period_end,
auditor_name: params.auditor_name,
audit_firm: params.audit_firm,
package_version: '1.0.0'
},
evidence: {}
};
// Section 1: Control Test Results (all TSC controls)
evidencePackage.evidence.control_test_results = await collectControlTestResults(
params.audit_period_start,
params.audit_period_end
);
// Section 2: Audit Logs (PHI access, user actions, system changes)
evidencePackage.evidence.audit_logs = await collectAuditLogsSummary(
params.audit_period_start,
params.audit_period_end
);
// Section 3: Access Control Evidence
evidencePackage.evidence.access_controls = {
user_provisioning: await collectUserProvisioningLogs(params.audit_period_start, params.audit_period_end),
user_termination: await collectUserTerminationLogs(params.audit_period_start, params.audit_period_end),
access_reviews: await collectAccessReviewRecords(params.audit_period_start, params.audit_period_end),
rbac_configuration: await captureRBACConfiguration()
};
// Section 4: Encryption Evidence
evidencePackage.evidence.encryption = {
data_at_rest: await collectEncryptionAtRestEvidence(),
data_in_transit: await collectEncryptionInTransitEvidence(),
key_management: await collectKeyManagementEvidence()
};
// Section 5: Monitoring and Incident Response
evidencePackage.evidence.monitoring = {
uptime_reports: await collectUptimeReports(params.audit_period_start, params.audit_period_end),
security_incidents: await collectSecurityIncidents(params.audit_period_start, params.audit_period_end),
alert_response_metrics: await collectAlertResponseMetrics(params.audit_period_start, params.audit_period_end)
};
// Section 6: Change Management
evidencePackage.evidence.change_management = {
approved_changes: await collectApprovedChanges(params.audit_period_start, params.audit_period_end),
deployment_logs: await collectDeploymentLogs(params.audit_period_start, params.audit_period_end),
rollback_incidents: await collectRollbackIncidents(params.audit_period_start, params.audit_period_end)
};
// Section 7: Backup and Recovery
evidencePackage.evidence.backup_recovery = {
backup_logs: await collectBackupLogs(params.audit_period_start, params.audit_period_end),
restore_tests: await collectRestoreTestRecords(params.audit_period_start, params.audit_period_end)
};
// Section 8: Vendor Management
evidencePackage.evidence.vendor_management = {
vendor_reviews: await collectVendorReviews(params.audit_period_start, params.audit_period_end),
sla_compliance: await collectVendorSLACompliance(params.audit_period_start, params.audit_period_end)
};
// Generate package manifest with checksums
evidencePackage.manifest = await generatePackageManifest(evidencePackage);
// Save to GCS with restricted access
const packagePath = await saveEvidencePackageToGCS(evidencePackage, {
auditor_email: params.auditor_email,
expiration_days: 90
});
// Notify auditor
await notifyAuditor({
auditor_email: params.auditor_email,
package_path: packagePath,
access_instructions: 'Evidence package available via secure link (expires in 90 days)'
});
return {
package_id: `SOC2-EVIDENCE-${params.audit_period_start}_${params.audit_period_end}`,
package_path: packagePath,
total_evidence_items: Object.keys(evidencePackage.evidence).length
};
}
7. Integration with HIPAA Audit Logging
7.1 Unified Audit Trail
Cross-Compliance Log Correlation:
The SOC 2 continuous monitoring system integrates with HIPAA audit logging (D.3.4 specification) to provide a unified audit trail that satisfies both compliance frameworks.
| Compliance Requirement | HIPAA Citation | SOC 2 TSC | Shared Implementation |
|---|---|---|---|
| PHI Access Logging | §164.312(b) | CC5.1, C1.1 | BigQuery phi_access_logs.entries table |
| Audit Log Integrity | §164.312(b) | CC7.2 | SHA-256 hash chain |
| Access Control Monitoring | §164.312(a)(1) | CC5.1 | Real-time access log analysis |
| Encryption Verification | §164.312(a)(2)(iv) | CC6.1 | Daily encryption status check |
| Availability Monitoring | §164.308(a)(7)(ii)(C) | A1.1 | Uptime checks + SLA tracking |
| Incident Detection | §164.308(a)(6)(ii) | CC7.2 | Security event alerting |
Log Format Compatibility:
Both HIPAA and SOC 2 audit logs use the same structured JSON schema (defined in Section 2.1.3), ensuring:
- Single log aggregation pipeline
- Unified search and query interface
- Consistent evidence for both audit types
- Reduced operational complexity
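Assuming the Section 2.1.3 schema carries the fields this document references elsewhere (audit_id, timestamp, user identifiers, event type, PHI flag, anomaly score), the shared entry shape can be sketched as follows. The interface and function names are illustrative, not the authoritative schema:

```typescript
// Illustrative sketch of the shared structured-JSON log entry; the actual
// Section 2.1.3 schema is authoritative and may include additional fields.
interface UnifiedAuditLogEntry {
  audit_id: string;
  timestamp: string;        // ISO 8601, UTC
  user_id: string;
  username: string;
  audit_event_type: string; // e.g. 'PHI_READ', 'CONFIG_CHANGE'
  is_audit_event: boolean;
  is_phi_related: boolean;  // routes the entry to HIPAA vs SOC 2 reporting
  anomaly_score?: number;   // 0..1, present once anomaly detection has run
  risk_level?: string;
}

// A single flag drives framework attribution, so one pipeline serves both
// audit types without duplicating entries.
function complianceFramework(entry: UnifiedAuditLogEntry): 'HIPAA' | 'SOC2' {
  return entry.is_phi_related ? 'HIPAA' : 'SOC2';
}
```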
7.2 PHI Access Anomaly Detection Integration
Reuse of D.3.4 Anomaly Detection Rules:
The SOC 2 monitoring system leverages the PHI access anomaly detection rules defined in D.3.4 HIPAA Audit specification:
-- Shared anomaly detection query (used by both HIPAA and SOC 2 dashboards)
CREATE OR REPLACE VIEW `bio-qms-prod.compliance.v_unified_access_anomalies` AS
SELECT
audit_id,
timestamp,
user_id,
username,
audit_event_type,
anomaly_type,
anomaly_score,
risk_level,
'HIPAA' AS compliance_framework
FROM `bio-qms-prod.phi_access_logs.anomalies`
UNION ALL
SELECT
audit_id,
timestamp,
user_id,
username,
audit_event_type,
'NON_PHI_ACCESS_ANOMALY' AS anomaly_type,
anomaly_score,
risk_level,
'SOC2' AS compliance_framework
FROM `bio-qms-prod.audit_logs.entries`
WHERE is_audit_event = true
AND is_phi_related = false
AND anomaly_score > 0.6;
8. Compliance KPI Calculation Details
8.1 Audit Readiness Score Formula
Weighted Composite Score:
Audit Readiness Score =
(Control Test Pass Rate × 0.60) +
(Evidence Coverage × 0.40)
Where:
Control Test Pass Rate = (Passed Controls / Total Controls Tested) × 100
Evidence Coverage = (Controls with Current Evidence / Total Controls in Inventory) × 100
Example Calculation:
Given:
- Total controls in inventory: 30
- Controls tested in last 30 days: 28
- Controls passed: 27
- Controls with evidence <90 days old: 29
Control Test Pass Rate = (27 / 28) × 100 = 96.43%
Evidence Coverage = (29 / 30) × 100 = 96.67%
Audit Readiness Score = (96.43 × 0.60) + (96.67 × 0.40)
= 57.86 + 38.67
= 96.53%
Interpretation:
| Score Range | Readiness Level | Interpretation |
|---|---|---|
| 95-100% | Excellent | Audit-ready, all controls operating effectively |
| 90-94% | Good | Minor gaps, address before audit |
| 85-89% | Acceptable | Moderate gaps, remediation plan required |
| 80-84% | Needs Improvement | Significant gaps, delay audit if possible |
| <80% | Not Ready | Critical gaps, audit will likely result in exceptions |
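The Section 8.1 formula can be sketched directly as a function; the function and parameter names are illustrative:

```typescript
// Weighted composite score per Section 8.1:
// (Control Test Pass Rate x 0.60) + (Evidence Coverage x 0.40)
function auditReadinessScore(params: {
  totalControls: number;             // controls in inventory
  controlsTested: number;            // tested in last 30 days
  controlsPassed: number;
  controlsWithCurrentEvidence: number; // evidence <90 days old
}): number {
  const passRate = (params.controlsPassed / params.controlsTested) * 100;
  const evidenceCoverage =
    (params.controlsWithCurrentEvidence / params.totalControls) * 100;
  return passRate * 0.6 + evidenceCoverage * 0.4;
}
```

Fed the worked example above (30 controls, 28 tested, 27 passed, 29 with current evidence), this returns roughly 96.5%, matching the 96.53% figure up to the rounding of intermediate values.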
8.2 Risk Score Calculation
Weighted Risk Score Formula:
Risk Score = Σ (Control Failure Count × Control Weight)
Control Weights:
- CC6.x (Encryption controls): 3x weight
- CC5.x (Access controls): 2x weight
- CC7.x (Monitoring controls): 2x weight
- CC8.x (Change management): 2x weight
- All other controls: 1x weight
Risk Level Classification:
| Weighted Risk Score | Risk Level | Action Required |
|---|---|---|
| 0 | Minimal | Maintain current controls |
| 1-5 | Low | Monitor, no immediate action |
| 6-15 | Moderate | Develop remediation plan within 30 days |
| 16-30 | High | Immediate remediation required (7 days) |
| >30 | Critical | Escalate to executive leadership, pause audit |
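The weighting and classification above can be sketched as follows; the weight is derived from the TSC control-ID prefix, and all function names are illustrative:

```typescript
// Control weight per Section 8.2: CC6.x = 3x, CC5.x/CC7.x/CC8.x = 2x, else 1x.
function controlWeight(controlId: string): number {
  if (controlId.startsWith('CC6.')) return 3; // encryption controls
  if (
    controlId.startsWith('CC5.') || // access controls
    controlId.startsWith('CC7.') || // monitoring controls
    controlId.startsWith('CC8.')    // change management
  ) return 2;
  return 1;
}

// Risk Score = sum over failed controls of (failure count x control weight)
function weightedRiskScore(
  failures: { controlId: string; failureCount: number }[]
): number {
  return failures.reduce(
    (sum, f) => sum + f.failureCount * controlWeight(f.controlId),
    0
  );
}

// Bands from the classification table above.
function riskLevel(score: number): string {
  if (score === 0) return 'Minimal';
  if (score <= 5) return 'Low';
  if (score <= 15) return 'Moderate';
  if (score <= 30) return 'High';
  return 'Critical';
}
```

For example, two failures of an encryption control (CC6.1) plus one failure of an availability control (A1.1) yield a weighted score of 7, which falls in the Moderate band.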
9. Incident Response Integration
9.1 Alert-to-Incident Workflow
Automated Incident Creation:
// cloud-functions/alert-to-incident/index.ts
export async function createIncidentFromAlert(alert: {
alert_type: string;
severity: string;
details: any;
}) {
// Create incident in incident management system
const incident = await createIncident({
title: `SOC 2 Control Failure: ${alert.alert_type}`,
severity: mapSeverityToIncidentPriority(alert.severity),
description: formatIncidentDescription(alert),
tags: ['soc2', 'compliance', 'automated'],
assignee: getResponsibleParty(alert.alert_type),
due_date: calculateDueDate(alert.severity)
});
// Link alert to incident
await linkAlertToIncident(alert, incident.id);
// Notify assigned party
await notifyAssignee(incident);
// Track in compliance database
await trackComplianceIncident({
incident_id: incident.id,
alert_type: alert.alert_type,
control_id: extractControlId(alert.alert_type),
severity: alert.severity,
created_at: new Date(),
status: 'OPEN'
});
return incident;
}
function mapSeverityToIncidentPriority(severity: string): string {
switch (severity) {
case 'CRITICAL': return 'P1';
case 'HIGH': return 'P2';
case 'MEDIUM': return 'P3';
default: return 'P4';
}
}
function calculateDueDate(severity: string): Date {
const now = new Date();
switch (severity) {
case 'CRITICAL': return new Date(now.getTime() + 4 * 60 * 60 * 1000); // 4 hours
case 'HIGH': return new Date(now.getTime() + 24 * 60 * 60 * 1000); // 24 hours
case 'MEDIUM': return new Date(now.getTime() + 7 * 24 * 60 * 60 * 1000); // 7 days
default: return new Date(now.getTime() + 30 * 24 * 60 * 60 * 1000); // 30 days
}
}
10. Compliance Review and Reporting
10.1 Monthly Compliance Report Template
Automated Report Generation:
# SOC 2 Compliance Monthly Report
**Reporting Period:** {start_date} - {end_date}
**Generated:** {timestamp}
**Report ID:** {report_id}
## Executive Summary
**Overall Compliance Status:** {status} ({audit_readiness_score}%)
- **Control Test Pass Rate:** {pass_rate}%
- **Evidence Coverage:** {evidence_coverage}%
- **Risk Score:** {risk_score} ({risk_level})
- **Open Control Failures:** {open_failures}
- **Mean Time to Remediate:** {mttr} days
{executive_summary_narrative}
## Control Test Results by TSC Category
{table_of_control_results}
## Control Failures Requiring Attention
{table_of_open_failures}
## Remediation Status
{table_of_remediation_progress}
## Security Events Summary
{security_events_summary}
## Availability SLA Performance
{availability_metrics}
## Upcoming Audit Preparation Tasks
{audit_prep_checklist}
## Appendix: Evidence Inventory
{evidence_inventory}
11. Document Change Control
11.1 Amendment Process
Changes to this specification require:
- Impact Assessment - Evaluate impact on existing monitoring infrastructure
- Stakeholder Review - CISO, CCO, VP Engineering approval
- Testing - Validate changes in staging environment
- Documentation Update - Update this specification with revision history
- Communication - Notify compliance team, auditors (if material change)
11.2 Related Documentation
| Document | Relationship |
|---|---|
| hipaa-audit-reporting.md (D.3.4) | Shared audit log architecture |
| hipaa-encryption-controls.md (D.3) | Encryption verification integration |
| hipaa-access-controls.md (D.3) | Access control monitoring |
| SOC 2 Readiness Assessment | Gap analysis and roadmap |
| Security Architecture (64-security-architecture.md) | Infrastructure context |
12. Appendix: Monitoring Architecture Diagrams
12.1 Log Aggregation Flow
Application Pods (GKE)
│
├─ NestJS Backend
│ └─> Winston Logger (JSON)
│ └─> stdout/stderr
│
├─ Angular Frontend
│ └─> Console Logs (captured by Cloud Logging)
│
└─ Worker Services
└─> Structured Logging
│
▼
Cloud Logging Agent (DaemonSet)
│
├─> Cloud Logging (30 days)
│ │
│ ├─> Real-time Log Search
│ ├─> Log-based Metrics
│ └─> Alerting Policies
│
├─> BigQuery Sink (1 year)
│ │
│ ├─> SQL Analysis
│ ├─> Dashboards (Looker Studio)
│ └─> Compliance Reporting
│
└─> Cloud Storage Sink (7 years)
│
└─> Immutable Archive
(Part 11 Compliance)
12.2 Alert Escalation Flow
Monitoring Event Detected
│
▼
Log-based Metric Threshold Exceeded
OR
Uptime Check Failure
OR
Security Event (Pub/Sub)
│
▼
Alert Policy Triggered
│
├─> Severity: CRITICAL
│ │
│ ├─> PagerDuty (On-call)
│ ├─> Slack (#incident-response)
│ └─> Email (CISO + VP Eng)
│ │
│ └─> If unresolved in 15 min
│ └─> Escalate to CEO
│
├─> Severity: HIGH
│ │
│ ├─> Slack (#security-alerts)
│ └─> Email (Security Team)
│ │
│ └─> If unresolved in 30 min
│ └─> PagerDuty (CISO)
│
└─> Severity: MEDIUM/LOW
│
└─> Email + Slack
│
└─> Normal SLA tracking
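The initial fan-out of the escalation flow above can be sketched as a severity-to-channel mapping; the function name and the string encoding of the channel identifiers are illustrative (escalation timers and follow-on paths are omitted):

```typescript
// Initial notification routing per the alert escalation diagram.
// CRITICAL pages on-call and notifies leadership; HIGH goes to the security
// team; MEDIUM/LOW follow normal SLA tracking via email and Slack.
type AlertSeverity = 'CRITICAL' | 'HIGH' | 'MEDIUM' | 'LOW';

function notificationChannels(severity: AlertSeverity): string[] {
  switch (severity) {
    case 'CRITICAL':
      return ['pagerduty:on-call', 'slack:#incident-response', 'email:ciso+vp-eng'];
    case 'HIGH':
      return ['slack:#security-alerts', 'email:security-team'];
    default: // MEDIUM / LOW
      return ['email', 'slack'];
  }
}
```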
End of Document
Document ID: CODITECT-BIO-SOC2-MON-001 Version: 1.0.0 Total Lines: 2,191 Classification: Internal - Restricted
Approval Signatures
Prepared by: Information Security Team Date: 2026-02-16
Approved by:
- Chief Information Security Officer (CISO)
- Chief Compliance Officer (CCO)
- VP Engineering
- VP Quality Assurance
- Internal Audit Manager
Next Review Date: 2027-02-16