SOC 2 Continuous Monitoring Controls
Document ID: CODITECT-BIO-SOC2-MON-001
Version: 1.0.0
Effective Date: 2026-02-16
Classification: Internal - Restricted
Owner: Chief Information Security Officer (CISO) / Chief Compliance Officer (CCO)
Document Control
Approval History
| Role | Name | Signature | Date |
|---|---|---|---|
| Chief Information Security Officer | [Pending] | [Digital Signature] | YYYY-MM-DD |
| Chief Compliance Officer | [Pending] | [Digital Signature] | YYYY-MM-DD |
| VP Engineering | [Pending] | [Digital Signature] | YYYY-MM-DD |
| VP Quality Assurance | [Pending] | [Digital Signature] | YYYY-MM-DD |
| Internal Audit Manager | [Pending] | [Digital Signature] | YYYY-MM-DD |
Revision History
| Version | Date | Author | Changes | Approval Status |
|---|---|---|---|---|
| 1.0.0 | 2026-02-16 | CISO Office | Initial release | Draft |
Distribution List
- Executive Leadership Team
- Information Security Team
- Compliance Team
- Internal Audit
- External SOC 2 Auditor (during audit engagement)
- Quality Assurance Team
- Engineering Leadership
- DevOps/SRE Team
Review Schedule
| Review Type | Frequency | Next Review Date | Responsible Party |
|---|---|---|---|
| Annual Review | 12 months | 2027-02-16 | CISO |
| Quarterly Control Effectiveness Review | 3 months | 2026-05-16 | Compliance Team |
| Post-Audit Review | After each SOC 2 audit | N/A | CCO + CISO |
| Technology Stack Review | 12 months | 2027-02-16 | VP Engineering |
| KPI Threshold Review | 6 months | 2026-08-16 | Compliance Team |
1. Executive Summary
1.1 Purpose
This SOC 2 Continuous Monitoring Controls specification establishes comprehensive, automated monitoring and control validation capabilities for the CODITECT Biosciences Quality Management System (BIO-QMS) Platform to ensure:
- Real-Time Control Effectiveness - Continuous validation that SOC 2 controls are operating as designed
- Proactive Risk Detection - Immediate identification and alerting on control failures or security events
- Audit Readiness - Automated evidence collection and control test results always available for SOC 2 Type II audits
- Operational Excellence - System health, availability, and security metrics meet or exceed SLA commitments
- Regulatory Compliance - Full conformance with AICPA Trust Services Criteria (TSC) 2017 monitoring requirements
1.2 Scope
This specification applies to:
In Scope:
- All SOC 2 Trust Services Criteria controls (CC1-CC9, A1, PI1, C1)
- Centralized logging infrastructure (application, audit, access, error, system logs)
- Real-time security event alerting system
- Automated control health checks and validation
- SOC 2 control effectiveness dashboard and reporting
- Compliance KPI tracking and trending
- Integration with HIPAA audit logging (D.3.4) for PHI access monitoring
- Evidence collection and retention for SOC 2 Type II audits
- Incident response integration and escalation management
Out of Scope:
- Application performance monitoring (APM) for optimization (covered in observability architecture)
- Business intelligence and analytics (covered in reporting architecture)
- Cost optimization monitoring (covered in FinOps processes)
- Penetration testing and vulnerability scanning (covered in D.4.1 - separate specification)
1.3 Regulatory Context
SOC 2 Trust Services Criteria (TSC) 2017 - Monitoring Requirements:
| TSC Category | Criteria | Monitoring Requirement | Implementation |
|---|---|---|---|
| CC7.1 | System Operations | Monitor system components and quality of operations | Real-time infrastructure health checks |
| CC7.2 | System Operations | Detect and resolve incidents/anomalies | Automated alerting + incident response integration |
| CC7.3 | System Operations | Evaluate and respond to incidents | Escalation matrix + response SLAs |
| CC7.4 | System Operations | Identify and manage changes | Change log monitoring + approval verification |
| CC8.1 | Change Management | Authorize and approve changes | Pre/post-change control validation |
| A1.1 | Availability | Monitor system availability | Uptime monitoring + SLA compliance tracking |
| A1.2 | Availability | Respond to availability incidents | Availability alerting + recovery time tracking |
| C1.1 | Confidentiality | Restrict access to confidential info | Access control monitoring + anomaly detection |
| C1.2 | Confidentiality | Dispose of confidential info | Secure deletion verification + audit logging |
NIST SP 800-137 - Continuous Monitoring:
- Define continuous monitoring strategy aligned with risk tolerance
- Establish metrics, status assessments, and data feeds
- Implement technical mechanisms for automated data collection
- Analyze data and report findings to support risk-based decisions
- Respond to findings with prioritized corrective actions
2. Centralized Logging Architecture
2.1 Log Aggregation Design
2.1.1 Google Cloud Logging Integration
Architecture Overview:
┌─────────────────────────────────────────────────────────────────┐
│ BIO-QMS Application Layer │
│ ┌────────────┐ ┌────────────┐ ┌────────────┐ ┌────────────┐│
│ │ NestJS │ │ Frontend │ │ Worker │ │ Batch ││
│ │ Backend │ │ Angular │ │ Services │ │ Jobs ││
│ └─────┬──────┘ └─────┬──────┘ └─────┬──────┘ └─────┬──────┘│
│ │ JSON │ JSON │ JSON │ JSON │
│ └────────────────┴────────────────┴────────────────┘ │
└────────────────────────────┬────────────────────────────────────┘
│
┌────────▼────────┐
│ Cloud Logging │
│ Agent (ops) │
└────────┬────────┘
│
┌───────────────────┼───────────────────┐
│ │ │
┌────────▼────────┐ ┌────────▼────────┐ ┌────────▼────────┐
│ Cloud Logging │ │ BigQuery │ │ Cloud Storage │
│ (30 day hot) │ │ (1 yr online) │ │ (7 yr archive) │
│ │ │ │ │ │
│ - Query/Search │ │ - SQL analysis │ │ - Compliance │
│ - Real-time │ │ - Dashboards │ │ - Long-term │
│ - Alerts │ │ - Reporting │ │ - Immutable │
└─────────────────┘ └─────────────────┘ └─────────────────┘
│ │ │
└───────────────────┼───────────────────┘
│
┌─────────▼─────────┐
│ Log Integrity │
│ Verification │
│ (SHA-256 chain) │
└───────────────────┘
Implementation Details:
| Component | Technology | Configuration |
|---|---|---|
| Log Collection | Google Cloud Logging Agent | Installed on all GKE nodes, configured via ConfigMap |
| Structured Logging | Winston (NestJS) | JSON format with consistent schema |
| Hot Storage | Cloud Logging | 30-day retention, full-text search, real-time streaming |
| Online Archive | BigQuery | 1-year retention, SQL queries, dashboard integration |
| Long-Term Archive | Cloud Storage (GCS) | 7-year retention, immutable bucket, lifecycle policy |
| Log Router | Cloud Logging Sinks | Filter-based routing to BigQuery, GCS, alerting |
| Integrity Verification | Cloud Functions + Cloud KMS | SHA-256 hash chain, hourly verification |
2.1.2 Log Categories and Retention
Log Types with Retention Requirements:
| Log Type | Description | Retention (Hot) | Retention (Online) | Retention (Archive) | Part 11 Required |
|---|---|---|---|---|---|
| Application Logs | NestJS structured JSON logs (info, warn, error) | 30 days | 1 year | 7 years | Yes |
| Audit Logs | PHI access, user actions, system changes (immutable) | 30 days | 1 year | 7 years | Yes |
| Access Logs | HTTP requests, API calls, authentication events | 30 days | 1 year | 7 years | Yes |
| Security Logs | Failed auth, privilege escalation, anomalies | 30 days | 1 year | 7 years | Yes |
| Error Logs | Application exceptions, stack traces, errors | 30 days | 1 year | 7 years | Yes |
| System Logs | OS, Kubernetes, infrastructure events | 30 days | 1 year | 3 years | No |
| Performance Logs | Latency, throughput, resource utilization | 30 days | 90 days | N/A | No |
| Change Logs | Code deployments, configuration changes, migrations | 30 days | 1 year | 7 years | Yes |
Retention Rationale:
- 30-day hot storage: Immediate investigation, real-time alerting, operational troubleshooting
- 1-year online archive: SOC 2 Type II audit period (12 months of control evidence), HIPAA compliance
- 7-year long-term archive: FDA 21 CFR Part 11 §11.10(e) requirement for electronic record retention
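The tier boundaries above reduce to a simple age-based lookup; a minimal sketch for Part 11-relevant log classes (the `storageTierForAge` helper is illustrative, not part of the platform):

```typescript
type StorageTier = 'HOT' | 'ONLINE_ARCHIVE' | 'LONG_TERM_ARCHIVE' | 'EXPIRED';

// 30 days hot, 1 year online archive, 7 years long-term archive,
// then eligible for secure deletion per retention policy
function storageTierForAge(ageDays: number): StorageTier {
  if (ageDays <= 30) return 'HOT';
  if (ageDays <= 365) return 'ONLINE_ARCHIVE';
  if (ageDays <= 7 * 365) return 'LONG_TERM_ARCHIVE';
  return 'EXPIRED';
}
```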
2.1.3 Structured Log Schema
Standard Application Log Format (Winston/NestJS):
interface ApplicationLogEntry {
// Timestamp
timestamp: string; // ISO 8601 UTC (e.g., "2026-02-16T14:30:45.123Z")
// Severity
severity: 'DEBUG' | 'INFO' | 'WARN' | 'ERROR' | 'CRITICAL';
// Source identification
service: string; // "nestjs-backend", "angular-frontend", "worker-service"
component: string; // Module/component name (e.g., "UserController", "AuthService")
instance_id: string; // Container/pod ID for distributed tracing
// Message
message: string; // Human-readable log message
// Context
context: {
user_id?: string; // Authenticated user (if applicable)
organization_id?: string; // Tenant organization
session_id?: string; // Session identifier
request_id?: string; // Unique request ID for distributed tracing
ip_address?: string; // Source IP (hashed if PII concern)
user_agent?: string; // Browser/client user agent
correlation_id?: string; // Cross-service correlation
};
// Technical details
stack_trace?: string; // For ERROR/CRITICAL severity
error_code?: string; // Application error code
// Compliance
is_phi_related: boolean; // Flag for PHI-related operations
is_audit_event: boolean; // Flag for audit-required events
// Metadata
labels: Record<string, string>; // Custom key-value labels
resource: {
type: string; // "gke_container", "cloud_function", etc.
labels: Record<string, string>; // GCP resource labels
};
}
Example Application Log Entry:
{
"timestamp": "2026-02-16T14:30:45.123Z",
"severity": "INFO",
"service": "nestjs-backend",
"component": "DocumentController",
"instance_id": "backend-7d9f8c-xk2p5",
"message": "Document created successfully",
"context": {
"user_id": "usr_1a2b3c4d",
"organization_id": "org_5e6f7g8h",
"session_id": "sess_9i0j1k2l",
"request_id": "req_3m4n5o6p",
"ip_address": "10.1.2.34",
"user_agent": "Mozilla/5.0..."
},
"is_phi_related": false,
"is_audit_event": false,
"labels": {
"action": "create",
"resource_type": "document",
"document_id": "doc_7q8r9s0t"
},
"resource": {
"type": "gke_container",
"labels": {
"cluster_name": "bio-qms-prod",
"namespace_name": "default",
"pod_name": "backend-7d9f8c-xk2p5"
}
}
}
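A minimal helper that assembles an entry conforming to the schema above can make the defaults explicit; this is a sketch (`buildLogEntry` and the trimmed-down types are illustrative, not the production logger):

```typescript
interface LogContext {
  user_id?: string;
  request_id?: string;
}

interface AppLogEntry {
  timestamp: string;
  severity: 'DEBUG' | 'INFO' | 'WARN' | 'ERROR' | 'CRITICAL';
  service: string;
  component: string;
  message: string;
  context: LogContext;
  is_phi_related: boolean;
  is_audit_event: boolean;
}

function buildLogEntry(
  severity: AppLogEntry['severity'],
  component: string,
  message: string,
  context: LogContext = {},
  flags: { is_phi_related?: boolean; is_audit_event?: boolean } = {}
): AppLogEntry {
  return {
    timestamp: new Date().toISOString(), // ISO 8601 UTC per the schema
    severity,
    service: 'nestjs-backend',
    component,
    message,
    context,
    // Compliance flags default to false; audit paths must set them explicitly
    is_phi_related: flags.is_phi_related ?? false,
    is_audit_event: flags.is_audit_event ?? false,
  };
}
```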
Audit Log Format (HIPAA/Part 11 Compliance):
Audit logs extend the application log schema with additional fields per hipaa-audit-reporting.md specification:
interface AuditLogEntry extends ApplicationLogEntry {
// Audit-specific fields
audit_id: string; // UUID v4
audit_event_type: string; // "PHI_ACCESS_READ", "USER_LOGIN", "CONFIG_CHANGE", etc.
actor: {
user_id: string;
username: string;
role: string;
organization_id: string;
};
target: {
resource_type: string; // "patient_record", "user_account", "system_config"
resource_id: string;
resource_name?: string;
};
action: string; // "read", "write", "delete", "export", "approve"
outcome: 'SUCCESS' | 'FAILURE' | 'PARTIAL';
phi_fields_accessed?: string[]; // List of PHI field names (if applicable)
reason?: string; // Justification for access (if required)
// Immutability
hash_previous: string; // SHA-256 of previous audit entry (blockchain-style)
hash_current: string; // SHA-256 of this entry
signature?: string; // Digital signature (optional, for critical events)
}
2.2 Log Integrity Protection
2.2.1 SHA-256 Hash Chain for Tamper Detection
Hash Chain Implementation:
Audit Entry 1:
- audit_id: "aud_001"
- timestamp: "2026-02-16T10:00:00Z"
- data: { user_id: "usr_123", action: "read", ... }
- hash_previous: "0000000000000000" (genesis entry)
- hash_current: SHA256(audit_id + timestamp + data + hash_previous)
= "a3f5c9e1..."
Audit Entry 2:
- audit_id: "aud_002"
- timestamp: "2026-02-16T10:00:15Z"
- data: { user_id: "usr_456", action: "write", ... }
- hash_previous: "a3f5c9e1..." (hash_current from Entry 1)
- hash_current: SHA256(audit_id + timestamp + data + hash_previous)
= "7b2d8f4a..."
[If Entry 1 is tampered with, hash_current changes, breaking the chain]
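The chaining rule above can be sketched in TypeScript; this is a minimal illustration of the append and full-chain walk (`appendAuditEntry` and `verifyChain` are hypothetical helpers, not the production service):

```typescript
import { createHash } from 'crypto';

interface ChainedEntry {
  audit_id: string;
  timestamp: string;
  data: Record<string, unknown>;
  hash_previous: string;
  hash_current: string;
}

const GENESIS_HASH = '0000000000000000';

// hash_current = SHA256(audit_id + timestamp + data + hash_previous)
function computeEntryHash(
  audit_id: string,
  timestamp: string,
  data: Record<string, unknown>,
  hashPrevious: string
): string {
  return createHash('sha256')
    .update(audit_id + timestamp + JSON.stringify(data) + hashPrevious)
    .digest('hex');
}

// Append an entry, linking it to the previous entry's hash_current
function appendAuditEntry(
  chain: ChainedEntry[],
  audit_id: string,
  timestamp: string,
  data: Record<string, unknown>
): ChainedEntry {
  const hash_previous =
    chain.length > 0 ? chain[chain.length - 1].hash_current : GENESIS_HASH;
  const entry: ChainedEntry = {
    audit_id,
    timestamp,
    data,
    hash_previous,
    hash_current: computeEntryHash(audit_id, timestamp, data, hash_previous),
  };
  chain.push(entry);
  return entry;
}

// Walk the chain: tampering with any entry breaks every subsequent link
function verifyChain(chain: ChainedEntry[]): boolean {
  let prev = GENESIS_HASH;
  for (const e of chain) {
    if (e.hash_previous !== prev) return false;
    if (computeEntryHash(e.audit_id, e.timestamp, e.data, e.hash_previous) !== e.hash_current) {
      return false;
    }
    prev = e.hash_current;
  }
  return true;
}
```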
Verification Process:
1. Hourly Batch Verification - Cloud Function triggered every hour:
   - Retrieve all audit entries from the past hour from BigQuery
   - Recompute hash_current for each entry using hash_previous from the prior entry
   - Compare the recomputed hash with the stored hash
   - Alert if a mismatch is detected (indicates tampering)
2. On-Demand Verification - via API or admin console:
   - Allows auditors to verify the integrity of a specific date range
   - Generates a verification report with pass/fail status
   - Signed with a Cloud KMS key for non-repudiation
Cloud Function Implementation:
// cloud-functions/log-integrity-verifier/index.ts
import { BigQuery } from '@google-cloud/bigquery';
import { createHash } from 'crypto';
import { sendCriticalAlert } from './alerting';

export async function verifyAuditLogIntegrity(request, response) {
  const bigquery = new BigQuery();
  // Query audit logs from the past hour, ordered by timestamp
  const query = `
    SELECT audit_id, timestamp, data, hash_previous, hash_current
    FROM \`project.dataset.audit_logs\`
    WHERE timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 HOUR)
    ORDER BY timestamp ASC
  `;
  const [rows] = await bigquery.query(query);

  // The first row's stored hash_previous links back to the last entry of the
  // previous verification window, which this run does not re-verify; linkage
  // is only checked between consecutive rows inside the window.
  let previousHash: string | null = null;
  const tamperedEntries = [];
  for (const row of rows) {
    // Chain linkage: each entry must reference the prior entry's hash_current
    if (previousHash !== null && row.hash_previous !== previousHash) {
      tamperedEntries.push({
        audit_id: row.audit_id,
        timestamp: row.timestamp,
        reason: 'broken_chain_linkage',
        expected_previous: previousHash,
        stored_previous: row.hash_previous
      });
    }
    // Recompute this entry's hash from its contents and stored hash_previous
    const dataString = JSON.stringify(row.data);
    const expectedHash = createHash('sha256')
      .update(row.audit_id + row.timestamp + dataString + row.hash_previous)
      .digest('hex');
    if (expectedHash !== row.hash_current) {
      tamperedEntries.push({
        audit_id: row.audit_id,
        timestamp: row.timestamp,
        reason: 'hash_mismatch',
        expected_hash: expectedHash,
        stored_hash: row.hash_current
      });
    }
    previousHash = row.hash_current;
  }

  if (tamperedEntries.length > 0) {
    // CRITICAL ALERT - Log tampering detected
    await sendCriticalAlert('AUDIT_LOG_TAMPERING', {
      count: tamperedEntries.length,
      entries: tamperedEntries
    });
  }

  response.json({
    verified_entries: rows.length,
    tampered_entries: tamperedEntries.length,
    status: tamperedEntries.length === 0 ? 'PASS' : 'FAIL',
    details: tamperedEntries
  });
}
2.2.2 Immutable Log Storage
Cloud Storage Configuration for Archive Tier:
# terraform/storage.tf
resource "google_storage_bucket" "audit_log_archive" {
name = "bio-qms-audit-logs-archive-${var.environment}"
location = "US"
# Immutability - prevent deletion or modification
retention_policy {
retention_period = 220752000 # 7 years in seconds
is_locked = true # Cannot be removed once locked
}
# Versioning (for accidental overwrites before lock)
versioning {
enabled = true
}
# Lifecycle policy - transition to coldline after 1 year
lifecycle_rule {
condition {
age = 365
}
action {
type = "SetStorageClass"
storage_class = "COLDLINE"
}
}
# Prevent public access
uniform_bucket_level_access = true
# Encryption
encryption {
default_kms_key_name = google_kms_crypto_key.audit_log_key.id
}
# Audit logging for the audit logs (meta!)
logging {
log_bucket = "bio-qms-admin-logs"
}
}
BigQuery Dataset Configuration for Online Archive:
-- Create dataset with 1-year retention
CREATE SCHEMA IF NOT EXISTS `bio-qms-prod.audit_logs`
OPTIONS (
location = 'US',
default_table_expiration_ms = 31536000000 -- 1 year in milliseconds
);
-- Create audit_logs table with partitioning
CREATE TABLE IF NOT EXISTS `bio-qms-prod.audit_logs.entries` (
audit_id STRING NOT NULL,
timestamp TIMESTAMP NOT NULL,
audit_event_type STRING NOT NULL,
severity STRING NOT NULL,
actor STRUCT<
user_id STRING,
username STRING,
role STRING,
organization_id STRING
>,
target STRUCT<
resource_type STRING,
resource_id STRING,
resource_name STRING
>,
action STRING,
outcome STRING,
data JSON,
hash_previous STRING,
hash_current STRING,
signature STRING
)
PARTITION BY DATE(timestamp)
CLUSTER BY organization_id, audit_event_type
OPTIONS (
require_partition_filter = true,
partition_expiration_days = 365
);
2.3 Log Routing and Filtering
2.3.1 Cloud Logging Sinks Configuration
Log Router Sinks (Terraform):
# terraform/logging.tf
# Sink 1: All audit logs to BigQuery (online archive)
resource "google_logging_project_sink" "audit_logs_bigquery" {
name = "audit-logs-to-bigquery"
destination = "bigquery.googleapis.com/projects/${var.project_id}/datasets/audit_logs"
filter = <<-EOT
(jsonPayload.is_audit_event = true)
OR
(labels.audit_event_type != "")
OR
(resource.type = "gke_container" AND severity >= "WARNING")
EOT
unique_writer_identity = true
bigquery_options {
use_partitioned_tables = true
}
}
# Sink 2: All logs to Cloud Storage (long-term archive)
resource "google_logging_project_sink" "all_logs_storage" {
name = "all-logs-to-storage"
destination = "storage.googleapis.com/${google_storage_bucket.audit_log_archive.name}"
filter = "resource.type = \"gke_container\"" # All GKE container logs
unique_writer_identity = true
}
# Sink 3: Security events to real-time alerting
resource "google_logging_project_sink" "security_events_pubsub" {
name = "security-events-to-pubsub"
destination = "pubsub.googleapis.com/projects/${var.project_id}/topics/security-events"
filter = <<-EOT
(jsonPayload.audit_event_type =~ "FAILED_AUTH.*")
OR
(jsonPayload.audit_event_type = "PRIVILEGE_ESCALATION")
OR
(jsonPayload.audit_event_type = "PHI_ACCESS_ANOMALY")
OR
(severity >= "ERROR" AND jsonPayload.is_phi_related = true)
EOT
unique_writer_identity = true
}
# Sink 4: PHI access logs (HIPAA compliance)
resource "google_logging_project_sink" "phi_access_bigquery" {
name = "phi-access-to-bigquery"
destination = "bigquery.googleapis.com/projects/${var.project_id}/datasets/phi_access_logs"
filter = <<-EOT
jsonPayload.is_phi_related = true
AND
jsonPayload.audit_event_type =~ "PHI_.*"
EOT
unique_writer_identity = true
bigquery_options {
use_partitioned_tables = true
}
}
2.3.2 Log Sampling and Rate Limiting
High-Volume Log Sampling Configuration:
// src/common/logging/winston.config.ts
import { format, transports } from 'winston';
// Custom sampling format for high-volume debug logs
const samplingFormat = format((info) => {
// Sample DEBUG logs at 10% (keep 1 in 10)
if (info.level === 'debug') {
const shouldKeep = Math.random() < 0.1;
return shouldKeep ? info : false;
}
// Always keep INFO and above
return info;
});
export const loggerConfig = {
level: process.env.LOG_LEVEL || 'info',
format: format.combine(
samplingFormat(),
format.timestamp({ format: 'YYYY-MM-DDTHH:mm:ss.SSSZ' }),
format.errors({ stack: true }),
format.json()
),
transports: [
// Console for Cloud Logging to pick up
new transports.Console({
level: 'debug',
handleExceptions: true,
handleRejections: true
})
],
// Exception/rejection handling
exceptionHandlers: [
new transports.Console()
],
rejectionHandlers: [
new transports.Console()
]
};
Rate Limiting for Repeated Errors:
// src/common/logging/rate-limiter.ts
import { createHash } from 'crypto';
class LogRateLimiter {
private counters: Map<string, { count: number; firstSeen: number }> = new Map();
private readonly windowMs = 60000; // 1 minute
private readonly maxPerWindow = 10;
shouldLog(message: string, severity: string): boolean {
// Never rate-limit CRITICAL or audit events
if (severity === 'CRITICAL' || severity === 'AUDIT') {
return true;
}
// Create a hash key from the message content (MD5 used only as a fast,
// non-cryptographic fingerprint, not for security)
const key = createHash('md5').update(message).digest('hex');
const now = Date.now();
const counter = this.counters.get(key);
if (!counter) {
this.counters.set(key, { count: 1, firstSeen: now });
return true;
}
// Reset if window expired
if (now - counter.firstSeen > this.windowMs) {
this.counters.set(key, { count: 1, firstSeen: now });
return true;
}
// Increment counter
counter.count++;
// Allow up to maxPerWindow, then log summary message
if (counter.count <= this.maxPerWindow) {
return true;
} else if (counter.count === this.maxPerWindow + 1) {
// Log one final message indicating rate limiting
console.warn({
message: `Log rate limit reached for message: "${message.substring(0, 100)}..."`,
suppressed_count: 1,
window_ms: this.windowMs
});
return false;
} else {
// Suppress
return false;
}
}
// Cleanup old entries every 5 minutes
cleanup() {
const now = Date.now();
for (const [key, counter] of this.counters.entries()) {
if (now - counter.firstSeen > this.windowMs * 5) {
this.counters.delete(key);
}
}
}
}
export const rateLimiter = new LogRateLimiter();
setInterval(() => rateLimiter.cleanup(), 300000); // 5 minutes
3. Real-Time Security Event Alerting
3.1 Alert Definitions and Thresholds
3.1.1 Authentication Failure Alerts
Alert Rule: Failed Authentication Attempts
| Parameter | Value |
|---|---|
| Alert Name | FAILED_AUTH_THRESHOLD_EXCEEDED |
| Trigger Condition | ≥5 failed authentication attempts from same user OR same IP in 10-minute window |
| Severity | HIGH |
| Data Source | Cloud Logging (filter: audit_event_type = "FAILED_AUTH") |
| Detection Method | Log-based metric with threshold alert |
| Response SLA | 15 minutes |
| Escalation | Security team → CISO (if unresolved in 30 min) |
| Auto-Remediation | Temporary IP block (1 hour) after 10 failed attempts |
Implementation (Terraform + Monitoring):
# terraform/monitoring.tf
# Log-based metric: count failed auth by user
resource "google_logging_metric" "failed_auth_by_user" {
name = "failed_auth_by_user_count"
filter = "jsonPayload.audit_event_type=\"FAILED_AUTH\""
metric_descriptor {
metric_kind = "DELTA"
value_type = "INT64"
labels {
key = "user_id"
value_type = "STRING"
description = "User attempting authentication"
}
labels {
key = "ip_address"
value_type = "STRING"
description = "Source IP address"
}
}
label_extractors = {
"user_id" = "EXTRACT(jsonPayload.actor.user_id)"
"ip_address" = "EXTRACT(jsonPayload.context.ip_address)"
}
}
# Alert policy: trigger when threshold exceeded
resource "google_monitoring_alert_policy" "failed_auth_alert" {
display_name = "Failed Authentication Threshold Exceeded"
combiner = "OR"
conditions {
display_name = "Failed auth count ≥ 5 in 10 minutes"
condition_threshold {
filter = "metric.type=\"logging.googleapis.com/user/failed_auth_by_user_count\""
duration = "600s" # 10 minutes
comparison = "COMPARISON_GT"
threshold_value = 4 # fires when count > 4, i.e. ≥5 failed attempts
aggregations {
alignment_period = "600s"
per_series_aligner = "ALIGN_SUM"
group_by_fields = ["metric.label.user_id", "metric.label.ip_address"]
}
}
}
notification_channels = [
google_monitoring_notification_channel.pagerduty_security.name,
google_monitoring_notification_channel.slack_security.name
]
documentation {
content = <<-EOT
**Alert:** Failed Authentication Threshold Exceeded
**Severity:** HIGH
**Action Required:**
1. Investigate source IP and user account
2. Check for credential stuffing or brute force attack
3. Verify account is not compromised
4. Consider temporary account lock if suspicious
**Runbook:** https://wiki.internal/security/failed-auth-response
EOT
}
alert_strategy {
auto_close = "86400s" # 24 hours
}
}
3.1.2 Privilege Escalation Alerts
Alert Rule: Privilege Escalation Attempts
| Parameter | Value |
|---|---|
| Alert Name | PRIVILEGE_ESCALATION_DETECTED |
| Trigger Condition | Any attempt to elevate privileges or access role-restricted resources without authorization |
| Severity | CRITICAL |
| Data Source | Cloud Logging (filter: audit_event_type = "PRIVILEGE_ESCALATION") |
| Detection Method | Real-time log streaming to Pub/Sub → Cloud Function |
| Response SLA | Immediate (5 minutes) |
| Escalation | CISO + Security team (immediate notification) |
| Auto-Remediation | Session termination + account suspension pending investigation |
Implementation (Cloud Function):
// cloud-functions/privilege-escalation-detector/index.ts
import { PubSub } from '@google-cloud/pubsub';
import { sendPagerDutyAlert, sendSlackAlert } from './alerting';
import { terminateUserSession, suspendUserAccount } from './remediation';
export async function handlePrivilegeEscalation(message: any, context: any) {
const logEntry = JSON.parse(Buffer.from(message.data, 'base64').toString());
if (logEntry.jsonPayload?.audit_event_type === 'PRIVILEGE_ESCALATION') {
// Actor and context fields are nested per the AuditLogEntry schema;
// attempted_role/current_role are event-specific top-level fields
const payload = logEntry.jsonPayload;
const { user_id, username } = payload.actor;
const { session_id, ip_address } = payload.context;
const { attempted_role, current_role, timestamp } = payload;
// CRITICAL ALERT
await sendPagerDutyAlert({
severity: 'critical',
summary: `Privilege escalation attempt by ${username} (${user_id})`,
details: {
user_id,
username,
session_id,
current_role,
attempted_role,
ip_address,
timestamp
},
incident_key: `privilege_escalation_${user_id}_${Date.now()}`
});
await sendSlackAlert({
channel: '#security-incidents',
severity: 'critical',
title: 'PRIVILEGE ESCALATION DETECTED',
message: `User \`${username}\` attempted to escalate from \`${current_role}\` to \`${attempted_role}\` at ${timestamp}`,
fields: [
{ title: 'User ID', value: user_id, short: true },
{ title: 'Session ID', value: session_id, short: true },
{ title: 'IP Address', value: ip_address, short: true },
{ title: 'Current Role', value: current_role, short: true },
{ title: 'Attempted Role', value: attempted_role, short: true }
]
});
// AUTO-REMEDIATION
console.log(`Terminating session ${session_id} for user ${user_id}`);
await terminateUserSession(session_id);
console.log(`Suspending account ${user_id} pending investigation`);
await suspendUserAccount(user_id, {
reason: 'Privilege escalation attempt detected',
timestamp,
auto_suspended: true
});
}
}
3.1.3 PHI Access Anomaly Alerts
Alert Rule: PHI Access Anomalies
| Parameter | Value |
|---|---|
| Alert Name | PHI_ACCESS_ANOMALY_DETECTED |
| Trigger Condition | Anomaly detection rules from D.3.4 HIPAA Audit specification |
| Severity | HIGH |
| Data Source | BigQuery ML anomaly model + real-time log stream |
| Detection Method | Machine learning model scoring access patterns |
| Response SLA | 30 minutes |
| Escalation | Privacy Officer + Security team |
| Auto-Remediation | Flag for manual review (no auto-block due to false positive risk) |
Anomaly Detection Rules (from D.3.4):
- After-Hours Access: PHI access outside business hours (6 PM - 8 AM, weekends)
- Bulk Access: >50 patient records accessed in single session
- Geographic Anomaly: Access from unexpected location (IP geolocation)
- Role-Inappropriate Access: User accesses records outside their department
- Terminated Employee Access: Access attempt by deactivated account
- Export Anomaly: Unusual PHI export volume or frequency
- Access Velocity: >10 different patient records in 5 minutes
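Two of these rules can be evaluated as direct checks alongside the ML scoring; a minimal sketch (`AccessEvent` and both functions are hypothetical helpers, thresholds taken from the list above; timestamps are treated as UTC for simplicity, where a production rule would use the facility's local timezone):

```typescript
interface AccessEvent {
  user_id: string;
  patient_id: string;
  timestamp: Date;
}

// After-Hours Access: outside business hours (6 PM - 8 AM) or on weekends
function isAfterHours(ts: Date): boolean {
  const day = ts.getUTCDay(); // 0 = Sunday, 6 = Saturday
  const hour = ts.getUTCHours();
  return day === 0 || day === 6 || hour >= 18 || hour < 8;
}

// Access Velocity: >10 distinct patient records within the trailing 5 minutes
function exceedsAccessVelocity(events: AccessEvent[], now: Date): boolean {
  const windowStart = now.getTime() - 5 * 60 * 1000;
  const distinctPatients = new Set(
    events
      .filter((e) => e.timestamp.getTime() >= windowStart && e.timestamp.getTime() <= now.getTime())
      .map((e) => e.patient_id)
  );
  return distinctPatients.size > 10;
}
```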
BigQuery ML Model for Anomaly Scoring:
-- Create anomaly detection model (trained on historical access patterns)
CREATE OR REPLACE MODEL `bio-qms-prod.ml_models.phi_access_anomaly`
OPTIONS (
model_type = 'AUTOML_CLASSIFIER',
input_label_cols = ['is_anomaly']
) AS
SELECT
user_id,
role,
EXTRACT(HOUR FROM timestamp) AS hour_of_day,
EXTRACT(DAYOFWEEK FROM timestamp) AS day_of_week,
COUNT(*) OVER (
PARTITION BY user_id, DATE(timestamp)
) AS daily_access_count,
COUNT(*) OVER (
PARTITION BY user_id
ORDER BY UNIX_SECONDS(timestamp)
RANGE BETWEEN 300 PRECEDING AND CURRENT ROW
) AS access_velocity_5min,
ST_GEOGPOINT(ip_longitude, ip_latitude) AS access_location,
is_anomaly -- Historical labels for training
FROM `bio-qms-prod.phi_access_logs.entries`
WHERE timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 90 DAY);
-- Real-time scoring query (triggered every 5 minutes)
-- Note: ML.PREDICT for a classifier emits predicted_is_anomaly_probs
-- (ARRAY<STRUCT<label, prob>>); the positive-class probability is extracted
-- in a subquery, since a SELECT alias cannot be referenced in its own WHERE.
SELECT
  audit_id,
  user_id,
  username,
  timestamp,
  anomaly_probability,
  CASE
    WHEN anomaly_probability > 0.8 THEN 'CRITICAL'
    WHEN anomaly_probability > 0.6 THEN 'HIGH'
    WHEN anomaly_probability > 0.4 THEN 'MEDIUM'
    ELSE 'LOW'
  END AS risk_level
FROM (
  SELECT
    audit_id,
    user_id,
    username,
    timestamp,
    (SELECT prob FROM UNNEST(predicted_is_anomaly_probs) WHERE label = TRUE) AS anomaly_probability
  FROM ML.PREDICT(
    MODEL `bio-qms-prod.ml_models.phi_access_anomaly`,
    (
      SELECT * FROM `bio-qms-prod.phi_access_logs.entries`
      WHERE timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 5 MINUTE)
    )
  )
)
WHERE anomaly_probability > 0.6; -- Alert threshold
3.1.4 System Availability Alerts
Alert Rule: System Availability Degradation
| Parameter | Value |
|---|---|
| Alert Name | SYSTEM_AVAILABILITY_SLA_BREACH |
| Trigger Condition | System availability drops below 99.9% SLA threshold |
| Severity | CRITICAL |
| Data Source | Cloud Monitoring uptime checks + load balancer metrics |
| Detection Method | Uptime check failure OR error rate >0.1% |
| Response SLA | Immediate (5 minutes) |
| Escalation | On-call SRE → VP Engineering |
| Auto-Remediation | Auto-scaling triggers + failover to DR region (if multi-region) |
Uptime Check Configuration:
# terraform/monitoring.tf
resource "google_monitoring_uptime_check_config" "bio_qms_frontend" {
display_name = "BIO-QMS Frontend HTTPS"
timeout = "10s"
period = "60s"
http_check {
path = "/health"
port = 443
use_ssl = true
validate_ssl = true
request_method = "GET"
accepted_response_status_codes {
status_value = 200
}
}
monitored_resource {
type = "uptime_url"
labels = {
project_id = var.project_id
host = "bio-qms.coditect.ai"
}
}
content_matchers {
content = "\"status\":\"healthy\""
matcher = "CONTAINS_STRING"
}
}
resource "google_monitoring_uptime_check_config" "bio_qms_api" {
display_name = "BIO-QMS API /health"
timeout = "10s"
period = "60s"
http_check {
path = "/api/health"
port = 443
use_ssl = true
validate_ssl = true
request_method = "GET"
accepted_response_status_codes {
status_value = 200
}
}
monitored_resource {
type = "uptime_url"
labels = {
project_id = var.project_id
host = "api.bio-qms.coditect.ai"
}
}
}
# Alert on uptime check failure
resource "google_monitoring_alert_policy" "uptime_check_failure" {
display_name = "BIO-QMS Uptime Check Failure (Availability SLA)"
combiner = "OR"
conditions {
display_name = "Uptime check failed"
condition_threshold {
filter = "metric.type=\"monitoring.googleapis.com/uptime_check/check_passed\" AND resource.type=\"uptime_url\""
duration = "300s" # 5 minutes
comparison = "COMPARISON_LT"
threshold_value = 1 # Expect 1 (success), alert if < 1
aggregations {
alignment_period = "60s"
per_series_aligner = "ALIGN_FRACTION_TRUE"
cross_series_reducer = "REDUCE_MEAN"
}
}
}
notification_channels = [
google_monitoring_notification_channel.pagerduty_oncall.name,
google_monitoring_notification_channel.slack_incidents.name
]
documentation {
content = <<-EOT
**Alert:** System Availability SLA Breach
**Severity:** CRITICAL
**SLA Target:** 99.9% (43.2 minutes downtime/month max)
**Immediate Actions:**
1. Check GKE cluster health: kubectl get nodes
2. Check pod status: kubectl get pods -n default
3. Check load balancer: gcloud compute forwarding-rules list
4. Review recent deployments (possible rollback needed)
5. Check Cloud SQL status
**Escalation:** If unresolved in 15 minutes, page VP Engineering
**Runbook:** https://wiki.internal/sre/availability-incident-response
EOT
}
}
3.1.5 Certificate Expiration Alerts
Alert Rule: TLS Certificate Expiration Warning
| Parameter | Value |
|---|---|
| Alert Name | TLS_CERTIFICATE_EXPIRING |
| Trigger Condition | Certificate expires in ≤90 days (warnings at 90/60/30/7 days) |
| Severity | LOW (90d), MEDIUM (60d), HIGH (30d), CRITICAL (7d) |
| Data Source | Cloud Function scanning all TLS endpoints daily |
| Detection Method | X.509 certificate expiration date parsing |
| Response SLA | 90-day warning: respond within 7 days; 30-day: within 24 hours; 7-day: within 4 hours |
| Escalation | Security team → CISO (at 7-day threshold) |
| Auto-Remediation | Trigger cert renewal workflow (Let's Encrypt/Google-managed) |
Certificate Monitoring Cloud Function:
// cloud-functions/certificate-monitor/index.ts
import * as tls from 'tls';
import { sendAlert } from './alerting';
interface CertificateEndpoint {
name: string;
host: string;
port: number;
}
const endpoints: CertificateEndpoint[] = [
{ name: 'Frontend', host: 'bio-qms.coditect.ai', port: 443 },
{ name: 'API', host: 'api.bio-qms.coditect.ai', port: 443 },
{ name: 'Admin', host: 'admin.bio-qms.coditect.ai', port: 443 }
];
export async function checkCertificates() {
const now = new Date();
const results = [];
for (const endpoint of endpoints) {
const cert = await getCertificate(endpoint.host, endpoint.port);
const expirationDate = new Date(cert.valid_to);
const daysUntilExpiry = Math.floor((expirationDate.getTime() - now.getTime()) / (1000 * 60 * 60 * 24));
results.push({
endpoint: endpoint.name,
host: endpoint.host,
expiration_date: expirationDate.toISOString(),
days_until_expiry: daysUntilExpiry,
issuer: cert.issuer,
subject: cert.subject
});
// Alert thresholds (aligned with the severity table: CRITICAL ≤7d, HIGH ≤30d, MEDIUM ≤90d)
let severity: string | null = null;
if (daysUntilExpiry <= 7) {
severity = 'critical';
} else if (daysUntilExpiry <= 30) {
severity = 'high';
} else if (daysUntilExpiry <= 90) {
severity = 'medium';
}
if (severity) {
await sendAlert({
severity,
summary: `TLS certificate for ${endpoint.name} (${endpoint.host}) expires in ${daysUntilExpiry} days`,
details: {
endpoint: endpoint.name,
host: endpoint.host,
expiration_date: expirationDate.toISOString(),
days_until_expiry: daysUntilExpiry,
issuer: cert.issuer.O,
subject: cert.subject.CN
}
});
}
}
// Store results in BigQuery for historical tracking
await storeCertificateCheckResults(results);
return { checked: endpoints.length, results };
}
function getCertificate(host: string, port: number): Promise<tls.PeerCertificate> {
return new Promise((resolve, reject) => {
const socket = tls.connect(port, host, { rejectUnauthorized: false }, () => {
const cert = socket.getPeerCertificate();
socket.end();
resolve(cert);
});
socket.on('error', reject);
});
}
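The threshold logic above can be factored into a pure function so the mapping to the severity table (CRITICAL at ≤7 days, HIGH at ≤30, MEDIUM at ≤90) stays unit-testable. A minimal sketch; `certSeverity` is an illustrative name, not part of the Cloud Function above:

```typescript
// Maps days-until-expiry to alert severity per the alert rule table:
// CRITICAL at <=7 days, HIGH at <=30, MEDIUM at <=90, no alert otherwise.
// certSeverity is an illustrative helper name, not an existing platform API.
export function certSeverity(daysUntilExpiry: number): 'critical' | 'high' | 'medium' | null {
  if (daysUntilExpiry <= 7) return 'critical';
  if (daysUntilExpiry <= 30) return 'high';
  if (daysUntilExpiry <= 90) return 'medium';
  return null;
}
```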
3.1.6 Encryption Key Rotation Alerts
Alert Rule: Encryption Key Rotation Overdue
| Parameter | Value |
|---|---|
| Alert Name | ENCRYPTION_KEY_ROTATION_OVERDUE |
| Trigger Condition | Key has not been rotated in >90 days (policy: rotate every 90 days) |
| Severity | HIGH |
| Data Source | Cloud KMS key version creation timestamps |
| Detection Method | Scheduled query comparing last rotation date to policy threshold |
| Response SLA | 48 hours |
| Escalation | Security team → CISO |
| Auto-Remediation | Trigger key rotation workflow (manual approval required) |
Key Rotation Monitoring Query:
-- BigQuery scheduled query (runs daily)
-- Note: a SELECT alias cannot be referenced in WHERE, so rotation_status
-- is computed in a subquery and filtered in the outer query.
SELECT *
FROM (
SELECT
key_ring_name,
crypto_key_name,
primary_version_id,
primary_version_created_time,
TIMESTAMP_DIFF(CURRENT_TIMESTAMP(), primary_version_created_time, DAY) AS days_since_rotation,
CASE
WHEN TIMESTAMP_DIFF(CURRENT_TIMESTAMP(), primary_version_created_time, DAY) > 90 THEN 'OVERDUE'
WHEN TIMESTAMP_DIFF(CURRENT_TIMESTAMP(), primary_version_created_time, DAY) > 75 THEN 'WARNING'
ELSE 'COMPLIANT'
END AS rotation_status
FROM `bio-qms-prod.kms_inventory.keys`
WHERE key_purpose IN ('ENCRYPT_DECRYPT', 'ASYMMETRIC_SIGN')
)
WHERE rotation_status IN ('OVERDUE', 'WARNING')
ORDER BY days_since_rotation DESC;
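The same OVERDUE/WARNING/COMPLIANT banding can be expressed as a pure function for use outside BigQuery, for example inside the remediation workflow trigger. A sketch; `rotationStatus` is an illustrative name:

```typescript
// Mirrors the CASE expression in the scheduled query: more than 90 days since
// the primary key version was created is OVERDUE, more than 75 is WARNING
// (approaching the 90-day rotation policy), otherwise COMPLIANT.
export function rotationStatus(daysSinceRotation: number): 'OVERDUE' | 'WARNING' | 'COMPLIANT' {
  if (daysSinceRotation > 90) return 'OVERDUE';
  if (daysSinceRotation > 75) return 'WARNING';
  return 'COMPLIANT';
}
```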
3.2 Alert Channels and Escalation Matrix
3.2.1 Notification Channels Configuration
PagerDuty Integration:
# terraform/monitoring.tf
resource "google_monitoring_notification_channel" "pagerduty_security" {
display_name = "PagerDuty - Security Team"
type = "pagerduty"
labels = {
service_key = var.pagerduty_security_service_key
}
enabled = true
}
resource "google_monitoring_notification_channel" "pagerduty_oncall" {
display_name = "PagerDuty - On-Call SRE"
type = "pagerduty"
labels = {
service_key = var.pagerduty_oncall_service_key
}
enabled = true
}
Slack Integration:
resource "google_monitoring_notification_channel" "slack_security" {
display_name = "Slack - #security-alerts"
type = "slack"
labels = {
channel_name = "#security-alerts"
url = var.slack_webhook_security
}
enabled = true
}
resource "google_monitoring_notification_channel" "slack_incidents" {
display_name = "Slack - #incident-response"
type = "slack"
labels = {
channel_name = "#incident-response"
url = var.slack_webhook_incidents
}
enabled = true
}
resource "google_monitoring_notification_channel" "slack_compliance" {
display_name = "Slack - #compliance-alerts"
type = "slack"
labels = {
channel_name = "#compliance-alerts"
url = var.slack_webhook_compliance
}
enabled = true
}
Email Notification Channels:
resource "google_monitoring_notification_channel" "email_ciso" {
display_name = "Email - CISO"
type = "email"
labels = {
email_address = var.ciso_email
}
enabled = true
}
resource "google_monitoring_notification_channel" "email_privacy_officer" {
display_name = "Email - Privacy Officer"
type = "email"
labels = {
email_address = var.privacy_officer_email
}
enabled = true
}
resource "google_monitoring_notification_channel" "email_compliance_team" {
display_name = "Email - Compliance Team"
type = "email"
labels = {
email_address = "compliance@coditect.ai"
}
enabled = true
}
3.2.2 Escalation Matrix with Response SLAs
| Alert Type | Severity | Initial Notification | Response SLA | Escalation (if unresolved) | Escalation Trigger |
|---|---|---|---|---|---|
| Failed Authentication (5+ attempts) | HIGH | Security team (Slack + email) | 15 minutes | CISO | 30 minutes |
| Privilege Escalation | CRITICAL | Security team (PagerDuty + Slack) | 5 minutes | CISO + VP Engineering | Immediate |
| PHI Access Anomaly | HIGH | Privacy Officer (email + Slack) | 30 minutes | CISO + Privacy Officer | 2 hours |
| System Availability <99.9% | CRITICAL | On-call SRE (PagerDuty) | 5 minutes | VP Engineering | 15 minutes |
| TLS Certificate Expiring (90d) | MEDIUM | Security team (email) | 7 days | CISO | 60 days |
| TLS Certificate Expiring (30d) | HIGH | Security team (Slack + email) | 24 hours | CISO | 7 days |
| TLS Certificate Expiring (7d) | CRITICAL | Security team (PagerDuty + Slack + email) | 4 hours | CISO + VP Engineering | 24 hours |
| Key Rotation Overdue | HIGH | Security team (email) | 48 hours | CISO | 7 days |
| Audit Log Tampering | CRITICAL | CISO + Security team (PagerDuty + email) | Immediate | CEO + General Counsel | Immediate |
| Backup Failure | HIGH | On-call SRE (Slack + email) | 2 hours | VP Engineering | 24 hours |
| Vulnerability Scan Critical Finding | CRITICAL | Security team (Slack + email) | 24 hours | CISO | 72 hours |
| Control Test Failure | MEDIUM | Compliance team (email) | 48 hours | CCO | 7 days |
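The escalation-trigger column above is a simple elapsed-time test: if an alert remains unacknowledged past its trigger window, the next tier is paged. A minimal sketch of that check; `shouldEscalate` and the minute-based inputs are illustrative assumptions, not an existing platform API:

```typescript
// Returns true when an unresolved alert has aged past its escalation trigger
// from the matrix (e.g. 30 minutes for failed authentication alerts).
export function shouldEscalate(
  firedAt: Date,
  now: Date,
  escalationTriggerMinutes: number
): boolean {
  const elapsedMinutes = (now.getTime() - firedAt.getTime()) / 60_000;
  return elapsedMinutes >= escalationTriggerMinutes;
}
```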
3.2.3 Alert Suppression and Maintenance Windows
Maintenance Window Configuration:
# terraform/monitoring.tf
# Define maintenance window for scheduled system updates
resource "google_monitoring_uptime_check_config" "maintenance_window" {
display_name = "Maintenance Window (Sundays 2-4 AM UTC)"
# Synthetic check used only to suppress alerts during planned downtime.
# Required arguments (timeout, period, http_check, monitored resource)
# are omitted here for brevity.
}
# Alert policy with maintenance window exclusion
resource "google_monitoring_alert_policy" "uptime_with_maintenance" {
display_name = "System Availability (excluding maintenance)"
combiner = "OR"
conditions {
display_name = "Uptime check failed outside maintenance window"
condition_threshold {
filter = <<-EOT
metric.type="monitoring.googleapis.com/uptime_check/check_passed"
AND resource.type="uptime_url"
AND metric.labels.check_id != "maintenance_window"
EOT
duration = "300s"
comparison = "COMPARISON_LT"
threshold_value = 1
aggregations {
alignment_period = "60s"
per_series_aligner = "ALIGN_FRACTION_TRUE"
}
}
}
notification_channels = [
google_monitoring_notification_channel.pagerduty_oncall.name
]
}
Alert Suppression API:
// src/monitoring/alert-suppression.service.ts
import { Injectable } from '@nestjs/common';
import { InjectRepository } from '@nestjs/typeorm';
import { Repository, Like } from 'typeorm';
import { AlertSuppression } from './entities/alert-suppression.entity';
@Injectable()
export class AlertSuppressionService {
constructor(
@InjectRepository(AlertSuppression)
private suppressionRepo: Repository<AlertSuppression>
) {}
async createMaintenanceWindow(data: {
reason: string;
start_time: Date;
end_time: Date;
affected_services: string[];
created_by: string;
approval_ticket?: string;
}): Promise<AlertSuppression> {
const suppression = this.suppressionRepo.create({
...data,
suppression_type: 'MAINTENANCE_WINDOW',
status: 'ACTIVE'
});
await this.suppressionRepo.save(suppression);
// Notify monitoring system
await this.notifyMonitoringSystem({
action: 'CREATE_SUPPRESSION',
suppression_id: suppression.id,
start_time: data.start_time,
end_time: data.end_time,
services: data.affected_services
});
return suppression;
}
async isAlertSuppressed(alertType: string, service: string): Promise<boolean> {
const now = new Date();
const activeSuppressions = await this.suppressionRepo.find({
where: {
status: 'ACTIVE',
affected_services: Like(`%${service}%`)
}
});
return activeSuppressions.some(s =>
s.start_time <= now && s.end_time >= now
);
}
}
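The window comparison inside isAlertSuppressed is worth isolating as a pure helper so the boundary semantics (inclusive start and end) are unit-testable without a database. A sketch over the same field shape; `anySuppressionActive` is an illustrative name:

```typescript
// A suppression window is active when `now` falls inside [start_time, end_time],
// boundaries inclusive - matching the comparison used in isAlertSuppressed.
interface SuppressionWindow {
  start_time: Date;
  end_time: Date;
}

export function anySuppressionActive(now: Date, windows: SuppressionWindow[]): boolean {
  return windows.some(w => w.start_time <= now && w.end_time >= now);
}
```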
4. SOC 2 Control Effectiveness Dashboard
4.1 Per-TSC Control Status Tracking
4.1.1 Control Inventory and Mapping
SOC 2 TSC Control Matrix:
| TSC Category | Control ID | Control Description | Automated Test | Manual Test | Evidence Source |
|---|---|---|---|---|---|
| CC1 | CC1.1 | Commitment to integrity and ethical values | No | Quarterly | Document review |
| CC1 | CC1.2 | Management establishes structures | No | Annual | Org chart, policies |
| CC2 | CC2.1 | Communication of info security policies | No | Annual | Training records |
| CC2 | CC2.2 | Internal communication channels | No | Annual | Meeting minutes |
| CC3 | CC3.1 | Risk assessment process | No | Semi-annual | Risk register |
| CC3 | CC3.2 | Risk mitigation controls | Partial | Quarterly | Control tests |
| CC4 | CC4.1 | Change management program | Yes | Quarterly | Git logs, change tickets |
| CC5 | CC5.1 | Logical access controls | Yes | Monthly | Access logs, RBAC config |
| CC5 | CC5.2 | New user provisioning | Yes | Monthly | User creation audit logs |
| CC5 | CC5.3 | User termination process | Yes | Monthly | Deactivation audit logs |
| CC6 | CC6.1 | Encryption of data at rest | Yes | Quarterly | KMS config, encryption verification |
| CC6 | CC6.2 | Encryption of data in transit | Yes | Daily | TLS certificate validation |
| CC6 | CC6.3 | Network segmentation | Yes | Monthly | Firewall rules, VPC config |
| CC6 | CC6.4 | Intrusion detection | Yes | Monthly | IDS alert logs |
| CC6 | CC6.5 | Vulnerability management | Yes | Weekly | Vulnerability scan results |
| CC6 | CC6.6 | Malware protection | Yes | Daily | Antivirus status |
| CC7 | CC7.1 | System monitoring | Yes | Daily | Monitoring dashboard uptime |
| CC7 | CC7.2 | Incident detection/response | Yes | Monthly | Incident ticket count, MTTR |
| CC7 | CC7.3 | Security incident procedures | No | Annual | Runbook review, tabletop exercise |
| CC7 | CC7.4 | Change detection | Yes | Daily | Config drift detection |
| CC8 | CC8.1 | Change authorization | Yes | Daily | Git commit approvals |
| CC8 | CC8.2 | System changes logged | Yes | Daily | Deployment logs |
| CC9 | CC9.1 | Vendor risk assessments | No | Annual | Vendor review docs |
| CC9 | CC9.2 | Vendor contract review | No | Annual | Contract review checklist |
| A1 | A1.1 | Availability monitoring | Yes | Daily | Uptime check results |
| A1 | A1.2 | Backup and recovery | Yes | Weekly | Backup verification logs |
| A1 | A1.3 | Disaster recovery plan | No | Annual | DR test results |
| PI1 | PI1.1 | Quality assurance processes | Partial | Quarterly | Test results, code coverage |
| PI1 | PI1.2 | Authorized processing | Yes | Monthly | Audit logs for data processing |
| C1 | C1.1 | Confidentiality commitments | No | Annual | NDA tracking, BAA tracking |
| C1 | C1.2 | Confidential data disposal | Yes | Monthly | Secure deletion audit logs |
Control Status Database Schema:
-- BigQuery table for control test results
CREATE TABLE IF NOT EXISTS `bio-qms-prod.compliance.control_test_results` (
test_id STRING NOT NULL,
control_id STRING NOT NULL,
tsc_category STRING NOT NULL,
test_type STRING NOT NULL, -- "AUTOMATED" or "MANUAL"
test_date TIMESTAMP NOT NULL,
test_status STRING NOT NULL, -- "PASS", "FAIL", "PARTIAL", "NOT_TESTED"
test_method STRING, -- Description of how test was performed
evidence_location STRING, -- Path to evidence (GCS bucket, wiki page, etc.)
tested_by STRING, -- User who performed test (for manual) or "SYSTEM" (for automated)
findings STRING, -- Any issues or observations
remediation_plan STRING, -- If FAIL, what is the remediation
remediation_due_date DATE,
remediation_complete_date DATE, -- set when remediation finishes (drives MTTR reporting)
remediation_status STRING, -- "PENDING", "IN_PROGRESS", "COMPLETE", "ACCEPTED_RISK"
-- Evidence metadata
evidence_files ARRAY<STRING>, -- List of evidence file paths
evidence_hash STRING, -- SHA-256 hash for tamper detection
-- Audit trail
created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP(),
updated_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP()
)
PARTITION BY DATE(test_date)
CLUSTER BY control_id, test_status;
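The evidence_hash column stores a SHA-256 digest for tamper detection. One way to produce it is to hash the canonical JSON serialization of the evidence payload; a sketch using Node's built-in crypto module (`computeEvidenceHash` is an illustrative name, not defined elsewhere in this document):

```typescript
import { createHash } from 'crypto';

// Produces a hex SHA-256 digest of the evidence payload's JSON serialization.
// Re-hashing stored evidence and comparing against evidence_hash reveals tampering.
export function computeEvidenceHash(evidence: unknown): string {
  return createHash('sha256').update(JSON.stringify(evidence)).digest('hex');
}
```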
4.1.2 Automated Control Test Execution
Automated Control Test Scheduler (Cloud Scheduler + Cloud Functions):
// cloud-functions/control-tests/index.ts
import { BigQuery } from '@google-cloud/bigquery';
import * as controlTests from './tests';
import { sendControlFailureAlert } from './alerting'; // local helper module (path assumed)
export async function runScheduledControlTests() {
const bigquery = new BigQuery();
const results = [];
// Daily tests
const dailyTests = [
{ control_id: 'CC6.2', test_fn: controlTests.testTLSEncryption },
{ control_id: 'CC6.6', test_fn: controlTests.testMalwareProtection },
{ control_id: 'CC7.1', test_fn: controlTests.testSystemMonitoring },
{ control_id: 'CC7.4', test_fn: controlTests.testConfigDriftDetection },
{ control_id: 'CC8.1', test_fn: controlTests.testChangeAuthorization },
{ control_id: 'CC8.2', test_fn: controlTests.testChangeLogging },
{ control_id: 'A1.1', test_fn: controlTests.testAvailabilityMonitoring }
];
for (const test of dailyTests) {
const result = await executeControlTest(test.control_id, test.test_fn);
results.push(result);
// Store result in BigQuery
await bigquery.dataset('compliance').table('control_test_results').insert([result]);
// Alert if test failed
if (result.test_status === 'FAIL') {
await sendControlFailureAlert(result);
}
}
return { tested: results.length, passed: results.filter(r => r.test_status === 'PASS').length };
}
async function executeControlTest(controlId: string, testFn: () => Promise<any>) {
const testId = `test_${controlId}_${Date.now()}`;
const startTime = Date.now();
try {
const testResult = await testFn();
return {
test_id: testId,
control_id: controlId,
tsc_category: controlId.split('.')[0], // Extract "CC6" from "CC6.2"
test_type: 'AUTOMATED',
test_date: new Date().toISOString(),
test_status: testResult.passed ? 'PASS' : 'FAIL',
test_method: testResult.method,
evidence_location: testResult.evidence_path,
tested_by: 'SYSTEM',
findings: testResult.findings,
execution_time_ms: Date.now() - startTime
};
} catch (error) {
return {
test_id: testId,
control_id: controlId,
tsc_category: controlId.split('.')[0],
test_type: 'AUTOMATED',
test_date: new Date().toISOString(),
test_status: 'FAIL',
test_method: 'Automated script execution',
tested_by: 'SYSTEM',
findings: `Test execution error: ${error.message}`,
execution_time_ms: Date.now() - startTime
};
}
}
Example Automated Control Tests:
// cloud-functions/control-tests/tests/cc6-2-tls-encryption.ts
export async function testTLSEncryption() {
const endpoints = [
'https://bio-qms.coditect.ai',
'https://api.bio-qms.coditect.ai',
'https://admin.bio-qms.coditect.ai'
];
const results = [];
let allPassed = true;
for (const endpoint of endpoints) {
const tlsVersion = await getTLSVersion(endpoint);
const cipherSuite = await getCipherSuite(endpoint);
const passed = tlsVersion === 'TLSv1.3' && isCipherSuiteApproved(cipherSuite);
allPassed = allPassed && passed;
results.push({
endpoint,
tls_version: tlsVersion,
cipher_suite: cipherSuite,
passed
});
}
return {
passed: allPassed,
method: 'SSL/TLS handshake analysis using Node.js tls module',
evidence_path: await saveEvidenceToGCS('tls-encryption-test', results),
findings: allPassed
? 'All endpoints using TLS 1.3 with approved cipher suites'
: 'Some endpoints using weak TLS configuration: ' + JSON.stringify(results.filter(r => !r.passed))
};
}
// cloud-functions/control-tests/tests/cc5-1-logical-access.ts
export async function testLogicalAccessControls() {
const bigquery = new BigQuery();
// Test 1: No users with overly broad permissions
const [broadPermUsers] = await bigquery.query(`
SELECT user_id, role, permissions
FROM \`bio-qms-prod.auth.user_roles\`
WHERE role = 'SUPER_ADMIN' AND user_id NOT IN (
SELECT user_id FROM \`bio-qms-prod.auth.approved_admins\`
)
`);
// Test 2: No inactive users with active sessions
const [inactiveWithSessions] = await bigquery.query(`
SELECT s.user_id, u.status, s.last_activity
FROM \`bio-qms-prod.auth.sessions\` s
JOIN \`bio-qms-prod.auth.users\` u ON s.user_id = u.user_id
WHERE u.status = 'INACTIVE' AND s.expires_at > CURRENT_TIMESTAMP()
`);
// Test 3: No users without MFA enabled (policy requirement)
const [usersWithoutMFA] = await bigquery.query(`
SELECT user_id, email, role
FROM \`bio-qms-prod.auth.users\`
WHERE status = 'ACTIVE' AND mfa_enabled = false
`);
const allPassed = broadPermUsers.length === 0
&& inactiveWithSessions.length === 0
&& usersWithoutMFA.length === 0;
return {
passed: allPassed,
method: 'BigQuery analysis of user access configuration',
evidence_path: await saveEvidenceToGCS('logical-access-test', {
broad_perm_users: broadPermUsers,
inactive_sessions: inactiveWithSessions,
users_without_mfa: usersWithoutMFA
}),
findings: allPassed
? 'All logical access controls operating as designed'
: `Control failures: ${broadPermUsers.length} unauthorized admins, ${inactiveWithSessions.length} inactive with sessions, ${usersWithoutMFA.length} without MFA`
};
}
4.2 Compliance KPI Dashboard
4.2.1 Key Performance Indicators
SOC 2 Compliance KPIs:
| KPI | Definition | Target | Measurement Frequency | Data Source |
|---|---|---|---|---|
| Control Test Pass Rate | % of control tests passed / total tests | ≥95% | Weekly | BigQuery: control_test_results |
| Evidence Coverage | % of controls with current evidence / total controls | 100% | Weekly | BigQuery: control evidence inventory |
| Mean Time to Remediate (MTTR) | Average days from control failure to remediation complete | ≤14 days | Monthly | BigQuery: remediation tracking |
| Control Failure Rate | # of control failures per month | ≤2 per month | Monthly | BigQuery: control test failures |
| Evidence Freshness | % of evidence <90 days old | ≥80% | Monthly | BigQuery: evidence timestamps |
| Risk Score Trend | Aggregate risk score (weighted by control criticality) | ≤20 (Low risk) | Weekly | BigQuery: risk calculation |
| Audit Readiness Score | Weighted score based on control status + evidence coverage | ≥90% | Weekly | BigQuery: composite calculation |
| Security Event Response Time | Average time from alert to acknowledgment | ≤15 minutes | Daily | BigQuery: alert acknowledgment logs |
| Change Management Compliance | % of changes with proper approval / total changes | 100% | Weekly | Git logs + change ticket system |
| Availability SLA Achievement | % of time system meets 99.9% uptime SLA | ≥99.9% | Daily | Cloud Monitoring uptime checks |
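The Audit Readiness Score KPI is a weighted composite: 60% control-test pass rate plus 40% evidence coverage, the same weights used by the BigQuery view in section 4.2.2. A small pure-function sketch (`auditReadinessScore` is an illustrative name); inputs are fractions in [0, 1], output is a percentage:

```typescript
// Composite audit readiness: 60% weight on control-test pass rate,
// 40% weight on evidence coverage, expressed as a percentage.
export function auditReadinessScore(passRate: number, evidenceCoverage: number): number {
  return (passRate * 0.6 + evidenceCoverage * 0.4) * 100;
}
```

For example, a 96% pass rate with 97% evidence coverage yields a score of 96.4, above the ≥90% target.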
4.2.2 Dashboard Implementation (Looker Studio)
Dashboard Layout:
┌──────────────────────────────────────────────────────────────────┐
│ SOC 2 CONTROL EFFECTIVENESS │
│ Real-Time Compliance Dashboard │
├──────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌──────────┐│
│ │ Audit Ready │ │ Evidence │ │ Control │ │ Risk ││
│ │ Score │ │ Coverage │ │ Pass Rate │ │ Score ││
│ │ │ │ │ │ │ │ ││
│ │ 92% │ │ 97% │ │ 96% │ │ 18 ││
│ │ 🟢 GOOD │ │ 🟢 GOOD │ │ 🟢 GOOD │ │ 🟢 LOW ││
│ └─────────────┘ └─────────────┘ └─────────────┘ └──────────┘│
│ │
├──────────────────────────────────────────────────────────────────┤
│ Control Test Results by TSC Category (Last 30 Days) │
│ ┌──────────────────────────────────────────────────────────────┐│
│ │ CC1 ████████████████████████████████████████ 40/40 (100%) ││
│ │ CC2 ████████████████████████████████████████ 30/30 (100%) ││
│ │ CC3 ███████████████████████████████████████░ 28/30 (93%) ││
│ │ CC4 ████████████████████████████████████████ 25/25 (100%) ││
│ │ CC5 ███████████████████████████████████████░ 57/60 (95%) ││
│ │ CC6 ████████████████████████████████████████ 72/75 (96%) ││
│ │ CC7 ████████████████████████████████████████ 45/45 (100%) ││
│ │ CC8 ████████████████████████████████████████ 30/30 (100%) ││
│ │ CC9 ████████████████████████████████████████ 20/20 (100%) ││
│ │ A1 ███████████████████████████████████████░ 38/40 (95%) ││
│ │ PI1 ████████████████████████████████████████ 15/15 (100%) ││
│ │ C1 ████████████████████████████████████████ 20/20 (100%) ││
│ └──────────────────────────────────────────────────────────────┘│
│ │
├──────────────────────────────────────────────────────────────────┤
│ Recent Control Failures (Requires Attention) │
│ ┌──────────────────────────────────────────────────────────────┐│
│ │ Control │ Last Test │ Status │ Finding │ Due Date ││
│ ├──────────┼───────────┼────────┼──────────────────┼───────────┤│
│ │ CC5.3 │ 2/14/2026 │ FAIL │ 2 inactive users │ 2/21/2026 ││
│ │ │ │ │ with active sess │ ││
│ │ CC3.2 │ 2/13/2026 │ PARTIAL│ 1 medium risk │ 2/27/2026 ││
│ │ │ │ │ not mitigated │ ││
│ │ A1.2 │ 2/12/2026 │ FAIL │ Backup failed on │ 2/19/2026 ││
│ │ │ │ │ 2/11 (resolved) │ ││
│ └──────────────────────────────────────────────────────────────┘│
│ │
├──────────────────────────────────────────────────────────────────┤
│ Risk Score Trending (30/60/90 Day Views) │
│ ┌──────────────────────────────────────────────────────────────┐│
│ │ Risk ││
│ │ Score ││
│ │ 30│ ┌──Target ││
│ │ │ │ (≤20) ││
│ │ 25│ ●─────┘ ││
│ │ │ ● ││
│ │ 20│────────────────────────────────●──────────────────────●││
│ │ │ ● ││
│ │ 15│ ● ││
│ │ │ ● ││
│ │ 10│ ● ││
│ │ └────────────────────────────────────────────────────────┤│
│ │ 30d ago 20d 10d Today ││
│ └──────────────────────────────────────────────────────────────┘│
│ │
├──────────────────────────────────────────────────────────────────┤
│ Mean Time to Remediate (MTTR) - Last 12 Months │
│ ┌──────────────────────────────────────────────────────────────┐│
│ │ Days ┌──Target (≤14d) ││
│ │ 20│ │ ││
│ │ │ │ ││
│ │ 15│─────────────────────────────────────┼──────────────────││
│ │ │ ● ● │ ● ││
│ │ 10│ ● ● ● ● ● ● ● ● ││
│ │ │ ● ● ● ││
│ │ 5│ ││
│ │ └────────────────────────────────────────────────────────┤│
│ │ J F M A M J J A S O N D J F ││
│ └──────────────────────────────────────────────────────────────┘│
│ Average MTTR: 11.3 days ✓ (Target: ≤14 days) │
│ │
└──────────────────────────────────────────────────────────────────┘
Looker Studio Data Source Configuration:
-- View for Audit Readiness Score
CREATE OR REPLACE VIEW `bio-qms-prod.compliance.v_audit_readiness` AS
SELECT
CURRENT_TIMESTAMP() AS snapshot_time,
-- Control test pass rate (30-day window)
SAFE_DIVIDE(
COUNTIF(test_status = 'PASS'),
COUNT(*)
) * 100 AS control_pass_rate_pct,
-- Evidence coverage
(
SELECT SAFE_DIVIDE(
COUNT(DISTINCT control_id),
(SELECT COUNT(*) FROM `bio-qms-prod.compliance.control_inventory`)
) * 100
FROM `bio-qms-prod.compliance.control_test_results`
WHERE test_date >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 90 DAY)
) AS evidence_coverage_pct,
-- Weighted audit readiness score
(
(SAFE_DIVIDE(COUNTIF(test_status = 'PASS'), COUNT(*)) * 0.6) + -- 60% weight on test results
(
(SELECT SAFE_DIVIDE(
COUNT(DISTINCT control_id),
(SELECT COUNT(*) FROM `bio-qms-prod.compliance.control_inventory`)
)
FROM `bio-qms-prod.compliance.control_test_results`
WHERE test_date >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 90 DAY))
* 0.4 -- 40% weight on evidence coverage
)
) * 100 AS audit_readiness_score
FROM `bio-qms-prod.compliance.control_test_results`
WHERE test_date >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY);
-- View for Risk Score Calculation
CREATE OR REPLACE VIEW `bio-qms-prod.compliance.v_risk_score` AS
WITH control_failures AS (
SELECT
control_id,
MAX(test_date) AS last_test_date,
COUNTIF(test_status = 'FAIL') AS failure_count,
DATE_DIFF(CURRENT_DATE(), MAX(DATE(test_date)), DAY) AS days_since_test
FROM `bio-qms-prod.compliance.control_test_results`
WHERE test_date >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 90 DAY)
GROUP BY control_id
)
SELECT
CURRENT_TIMESTAMP() AS snapshot_time,
SUM(
CASE
-- Critical controls (CC6.x encryption) have 3x weight
WHEN cf.control_id LIKE 'CC6.%' THEN cf.failure_count * 3
-- High controls (CC5.x access, CC7.x monitoring) have 2x weight
WHEN cf.control_id LIKE 'CC5.%' OR cf.control_id LIKE 'CC7.%' THEN cf.failure_count * 2
-- Standard controls have 1x weight
ELSE cf.failure_count
END
) AS weighted_risk_score,
-- Risk level classification
CASE
WHEN SUM(cf.failure_count) = 0 THEN 'MINIMAL'
WHEN SUM(cf.failure_count) <= 3 THEN 'LOW'
WHEN SUM(cf.failure_count) <= 7 THEN 'MODERATE'
WHEN SUM(cf.failure_count) <= 15 THEN 'HIGH'
ELSE 'CRITICAL'
END AS risk_level
FROM control_failures cf;
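The criticality weighting in v_risk_score can be mirrored as a pure function so the weights are testable outside BigQuery. A sketch (`weightedRiskScore` is an illustrative name): CC6.x encryption failures count 3x, CC5.x access and CC7.x monitoring failures count 2x, all others 1x:

```typescript
// Mirrors the CASE weighting in the v_risk_score view.
interface ControlFailure {
  control_id: string;
  failure_count: number;
}

export function weightedRiskScore(failures: ControlFailure[]): number {
  return failures.reduce((total, f) => {
    if (f.control_id.startsWith('CC6.')) return total + f.failure_count * 3;   // encryption: 3x
    if (f.control_id.startsWith('CC5.') || f.control_id.startsWith('CC7.')) {
      return total + f.failure_count * 2;                                      // access/monitoring: 2x
    }
    return total + f.failure_count;                                            // standard: 1x
  }, 0);
}
```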
-- View for MTTR Calculation
CREATE OR REPLACE VIEW `bio-qms-prod.compliance.v_mttr` AS
SELECT
DATE_TRUNC(test_date, MONTH) AS month,
AVG(
DATE_DIFF(
DATE(remediation_complete_date),
DATE(test_date),
DAY
)
) AS avg_mttr_days,
COUNT(*) AS total_remediations
FROM `bio-qms-prod.compliance.control_test_results`
WHERE test_status = 'FAIL'
AND remediation_status = 'COMPLETE'
AND test_date >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 12 MONTH)
GROUP BY month
ORDER BY month DESC;
5. Automated Health Checks
5.1 Daily Health Check Schedule
Daily Automated Checks (0800 UTC):
// cloud-functions/daily-health-checks/index.ts
export async function runDailyHealthChecks() {
const results = {
timestamp: new Date().toISOString(),
checks: []
};
// 1. Encryption Status Verification
results.checks.push(await verifyEncryptionStatus());
// 2. Access Review (inactive users, orphaned permissions)
results.checks.push(await reviewAccessControls());
// 3. Log Integrity Verification
results.checks.push(await verifyLogIntegrity());
// 4. TLS Certificate Expiration Check
results.checks.push(await checkCertificateExpiration());
// 5. System Availability Check
results.checks.push(await checkSystemAvailability());
// 6. Configuration Drift Detection
results.checks.push(await detectConfigurationDrift());
// Store results
await storeHealthCheckResults(results);
// Alert on any failures
const failures = results.checks.filter(c => c.status === 'FAIL');
if (failures.length > 0) {
await sendHealthCheckAlert(failures);
}
return results;
}
async function verifyEncryptionStatus() {
// Verify that all PHI tables use customer-managed keys (CMEK).
// BigQuery always encrypts at rest with Google-managed keys by default;
// encryptionConfiguration metadata is present only when CMEK is configured.
const bigquery = new BigQuery();
const [tables] = await bigquery.dataset('production').getTables();
const unencryptedTables = [];
for (const table of tables) {
const [metadata] = await table.getMetadata();
if (!metadata.encryptionConfiguration) {
unencryptedTables.push(table.id);
}
}
// Check Cloud SQL encryption
const sqlEncrypted = await checkCloudSQLEncryption();
// Check KMS key accessibility
const kmsKeysAccessible = await checkKMSKeys();
const passed = unencryptedTables.length === 0 && sqlEncrypted && kmsKeysAccessible;
return {
check_name: 'ENCRYPTION_STATUS',
status: passed ? 'PASS' : 'FAIL',
findings: {
unencrypted_tables: unencryptedTables,
sql_encrypted: sqlEncrypted,
kms_keys_accessible: kmsKeysAccessible
},
evidence_path: await saveEvidenceToGCS('encryption-status', { unencryptedTables, sqlEncrypted, kmsKeysAccessible })
};
}
async function reviewAccessControls() {
const bigquery = new BigQuery();
// Check for inactive users with active permissions
const [inactiveUsers] = await bigquery.query(`
SELECT user_id, email, last_login, status
FROM \`bio-qms-prod.auth.users\`
WHERE status = 'INACTIVE'
AND user_id IN (
SELECT DISTINCT user_id FROM \`bio-qms-prod.auth.user_roles\`
)
`);
// Check for users who haven't logged in for 90+ days
const [dormantUsers] = await bigquery.query(`
SELECT user_id, email, last_login
FROM \`bio-qms-prod.auth.users\`
WHERE status = 'ACTIVE'
AND last_login < TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 90 DAY)
`);
// Check for orphaned service accounts
const [orphanedServiceAccounts] = await bigquery.query(`
SELECT service_account_id, created_at
FROM \`bio-qms-prod.auth.service_accounts\`
WHERE last_used < TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 180 DAY)
`);
const passed = inactiveUsers.length === 0
&& dormantUsers.length === 0
&& orphanedServiceAccounts.length === 0;
return {
check_name: 'ACCESS_CONTROLS_REVIEW',
status: passed ? 'PASS' : 'FAIL',
findings: {
inactive_users_with_perms: inactiveUsers.length,
dormant_users: dormantUsers.length,
orphaned_service_accounts: orphanedServiceAccounts.length
},
evidence_path: await saveEvidenceToGCS('access-review', { inactiveUsers, dormantUsers, orphanedServiceAccounts })
};
}
5.2 Weekly Health Check Schedule
Weekly Automated Checks (Sundays 0800 UTC):
// cloud-functions/weekly-health-checks/index.ts
export async function runWeeklyHealthChecks() {
const results = {
timestamp: new Date().toISOString(),
checks: []
};
// 1. Vulnerability Scan Results Review
results.checks.push(await reviewVulnerabilityScanResults());
// 2. Patch Compliance Check
results.checks.push(await checkPatchCompliance());
// 3. Backup Verification
results.checks.push(await verifyBackups());
// 4. Access Log Anomaly Review
results.checks.push(await reviewAccessLogAnomalies());
// 5. Cloud Resource Inventory
results.checks.push(await inventoryCloudResources());
// Store results
await storeHealthCheckResults(results);
const failures = results.checks.filter(c => c.status === 'FAIL');
if (failures.length > 0) {
await sendHealthCheckAlert(failures);
}
return results;
}
async function reviewVulnerabilityScanResults() {
const bigquery = new BigQuery();
// Query vulnerability scan results from last 7 days
const [vulns] = await bigquery.query(`
SELECT
severity,
COUNT(*) AS count
FROM \`bio-qms-prod.security.vulnerability_scans\`
WHERE scan_date >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
GROUP BY severity
`);
const critical = vulns.find(v => v.severity === 'CRITICAL')?.count || 0;
const high = vulns.find(v => v.severity === 'HIGH')?.count || 0;
// Policy: Zero critical vulns, < 5 high vulns
const passed = critical === 0 && high < 5;
return {
check_name: 'VULNERABILITY_SCAN_REVIEW',
status: passed ? 'PASS' : 'FAIL',
findings: {
critical_vulns: critical,
high_vulns: high,
threshold_critical: 0,
threshold_high: 5
},
evidence_path: await saveEvidenceToGCS('vuln-scan-results', vulns)
};
}
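The pass/fail policy above (zero critical vulnerabilities, fewer than five high) can be isolated as a pure predicate over the query rows, keeping the threshold logic testable. A sketch; `vulnPolicyPassed` is an illustrative name:

```typescript
// One row per severity level, as returned by the weekly BigQuery scan query.
interface SeverityCount {
  severity: string;
  count: number;
}

// Policy: zero CRITICAL findings and fewer than 5 HIGH findings.
export function vulnPolicyPassed(rows: SeverityCount[]): boolean {
  const critical = rows.find(r => r.severity === 'CRITICAL')?.count ?? 0;
  const high = rows.find(r => r.severity === 'HIGH')?.count ?? 0;
  return critical === 0 && high < 5;
}
```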
}
async function verifyBackups() {
const bigquery = new BigQuery();
// Check database backups in last 24 hours
const [dbBackups] = await bigquery.query(`
SELECT
backup_id,
backup_time,
status,
backup_size_gb
FROM \`bio-qms-prod.backups.database_backups\`
WHERE backup_time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24 HOUR)
ORDER BY backup_time DESC
LIMIT 1
`);
// Confirm the most recent backup completed successfully; restore-based
// integrity testing is exercised separately during DR drills
const backupValid = dbBackups.length > 0 && dbBackups[0].status === 'SUCCESS';
// Check file storage backups
const storageBackupValid = await verifyStorageBackups();
const passed = backupValid && storageBackupValid;
return {
check_name: 'BACKUP_VERIFICATION',
status: passed ? 'PASS' : 'FAIL',
findings: {
latest_db_backup: dbBackups[0],
storage_backup_valid: storageBackupValid
},
evidence_path: await saveEvidenceToGCS('backup-verification', { dbBackups, storageBackupValid })
};
}
5.3 Monthly Health Check Schedule
Monthly Automated Checks (1st of month, 0800 UTC):
// cloud-functions/monthly-health-checks/index.ts
export async function runMonthlyHealthChecks() {
const results = {
timestamp: new Date().toISOString(),
checks: []
};
// 1. Full Control Assessment (all automated controls)
results.checks.push(await runFullControlAssessment());
// 2. Evidence Snapshot (capture all current evidence)
results.checks.push(await captureEvidenceSnapshot());
// 3. Risk Reassessment (recalculate risk scores)
results.checks.push(await reassessRisks());
// 4. User Access Recertification Trigger
results.checks.push(await triggerAccessRecertification());
// 5. Vendor Security Review Status
results.checks.push(await checkVendorSecurityReviews());
// Store results
await storeHealthCheckResults(results);
// Generate monthly compliance report
await generateMonthlyComplianceReport(results);
return results;
}
async function runFullControlAssessment() {
// Execute all automated control tests. Assumes the control-tests module
// exports test functions keyed by control ID (e.g. { 'CC6.2': testFn }),
// so the export name can be recorded as control_id in the results.
const controlTests = await import('../control-tests');
const testResults = [];
for (const [controlId, testFn] of Object.entries(controlTests)) {
if (typeof testFn === 'function') {
const result = await executeControlTest(controlId, testFn);
testResults.push(result);
}
}
const passRate = testResults.filter(r => r.test_status === 'PASS').length / testResults.length;
return {
check_name: 'FULL_CONTROL_ASSESSMENT',
status: passRate >= 0.95 ? 'PASS' : 'FAIL',
findings: {
total_controls_tested: testResults.length,
passed: testResults.filter(r => r.test_status === 'PASS').length,
failed: testResults.filter(r => r.test_status === 'FAIL').length,
pass_rate: passRate
},
evidence_path: await saveEvidenceToGCS('monthly-control-assessment', testResults)
};
}
async function captureEvidenceSnapshot() {
// Create point-in-time snapshot of all compliance evidence
const evidenceSnapshot = {
snapshot_date: new Date().toISOString(),
control_test_results: await queryBigQuery('SELECT * FROM control_test_results WHERE test_date >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)'),
audit_logs_summary: await queryBigQuery('SELECT COUNT(*) as count, audit_event_type FROM audit_logs.entries WHERE timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY) GROUP BY audit_event_type'),
user_access_snapshot: await queryBigQuery('SELECT user_id, role, permissions FROM auth.user_roles'),
system_config_snapshot: await captureSystemConfiguration(),
vulnerability_status: await queryBigQuery('SELECT * FROM security.vulnerability_scans ORDER BY scan_date DESC LIMIT 1')
};
// Store snapshot in GCS with tamper-evident hash
const snapshotPath = await saveEvidenceSnapshotToGCS(evidenceSnapshot);
return {
check_name: 'EVIDENCE_SNAPSHOT',
status: 'PASS',
findings: {
snapshot_path: snapshotPath,
evidence_types_captured: Object.keys(evidenceSnapshot).length
},
evidence_path: snapshotPath
};
}
6. Evidence Collection and Audit Preparation
6.1 Automated Evidence Collection
Evidence Collection Strategy:
| Evidence Type | Collection Method | Frequency | Retention | Storage Location |
|---|---|---|---|---|
| Control Test Results | Automated test execution → BigQuery | Daily/Weekly/Monthly | 2 years | compliance.control_test_results |
| Audit Logs | Real-time streaming → BigQuery → GCS | Continuous | 7 years | audit_logs.* + GCS archive |
| Access Control Config | Daily snapshot | Daily | 1 year | auth.* tables |
| System Configuration | Terraform state + GCP API | Weekly | 1 year | GCS bucket |
| Vulnerability Scans | Scheduled scans | Weekly | 1 year | security.vulnerability_scans |
| Change Logs | Git commits + deployment logs | Real-time | 2 years | GitHub + deployments.* |
| Backup Verification | Backup status API | Daily | 90 days | backups.* |
| Uptime Check Results | Cloud Monitoring API | Continuous | 1 year | Cloud Monitoring (exported to BigQuery) |
| TLS Certificate Status | Daily certificate check | Daily | 1 year | security.certificate_checks |
| Training Records | LMS integration | Real-time | 3 years | hr.training_records |
| Incident Response Tickets | Jira/GitHub Issues | Real-time | 3 years | External system + exports |
| Risk Assessments | Manual entry + review | Quarterly | 3 years | compliance.risk_assessments |
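The retention windows in the table above can be expressed as a lookup that automated jobs use to flag evidence nearing or past its retention limit. A minimal sketch, with only four evidence types shown; the type, constant, and helper names are illustrative, not production identifiers:

```typescript
// Illustrative retention lookup derived from the evidence collection table.
// Retention periods are expressed in days; years are approximated as 365 days.
type EvidenceType =
  | 'CONTROL_TEST_RESULTS'
  | 'AUDIT_LOGS'
  | 'ACCESS_CONTROL_CONFIG'
  | 'BACKUP_VERIFICATION';

const RETENTION_DAYS: Record<EvidenceType, number> = {
  CONTROL_TEST_RESULTS: 2 * 365, // 2 years
  AUDIT_LOGS: 7 * 365,           // 7 years
  ACCESS_CONTROL_CONFIG: 365,    // 1 year
  BACKUP_VERIFICATION: 90        // 90 days
};

// Returns true while a piece of evidence is still inside its retention window.
function isWithinRetention(
  type: EvidenceType,
  collectedAt: Date,
  now: Date = new Date()
): boolean {
  const ageDays = (now.getTime() - collectedAt.getTime()) / (24 * 60 * 60 * 1000);
  return ageDays <= RETENTION_DAYS[type];
}
```

A pruning job would pair this check with the storage locations in the table's last column; the mapping itself stays a single source of truth for both pruning and audit-readiness reporting.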
6.2 SOC 2 Audit Evidence Package
Evidence Package Generator:
// cloud-functions/audit-evidence-package/index.ts
export async function generateSOC2EvidencePackage(params: {
audit_period_start: string;
audit_period_end: string;
auditor_name: string;
auditor_email: string;
audit_firm: string;
}) {
const evidencePackage: any = {
metadata: {
generated_at: new Date().toISOString(),
audit_period_start: params.audit_period_start,
audit_period_end: params.audit_period_end,
auditor_name: params.auditor_name,
audit_firm: params.audit_firm,
package_version: '1.0.0'
},
evidence: {}
};
// Section 1: Control Test Results (all TSC controls)
evidencePackage.evidence.control_test_results = await collectControlTestResults(
params.audit_period_start,
params.audit_period_end
);
// Section 2: Audit Logs (PHI access, user actions, system changes)
evidencePackage.evidence.audit_logs = await collectAuditLogsSummary(
params.audit_period_start,
params.audit_period_end
);
// Section 3: Access Control Evidence
evidencePackage.evidence.access_controls = {
user_provisioning: await collectUserProvisioningLogs(params.audit_period_start, params.audit_period_end),
user_termination: await collectUserTerminationLogs(params.audit_period_start, params.audit_period_end),
access_reviews: await collectAccessReviewRecords(params.audit_period_start, params.audit_period_end),
rbac_configuration: await captureRBACConfiguration()
};
// Section 4: Encryption Evidence
evidencePackage.evidence.encryption = {
data_at_rest: await collectEncryptionAtRestEvidence(),
data_in_transit: await collectEncryptionInTransitEvidence(),
key_management: await collectKeyManagementEvidence()
};
// Section 5: Monitoring and Incident Response
evidencePackage.evidence.monitoring = {
uptime_reports: await collectUptimeReports(params.audit_period_start, params.audit_period_end),
security_incidents: await collectSecurityIncidents(params.audit_period_start, params.audit_period_end),
alert_response_metrics: await collectAlertResponseMetrics(params.audit_period_start, params.audit_period_end)
};
// Section 6: Change Management
evidencePackage.evidence.change_management = {
approved_changes: await collectApprovedChanges(params.audit_period_start, params.audit_period_end),
deployment_logs: await collectDeploymentLogs(params.audit_period_start, params.audit_period_end),
rollback_incidents: await collectRollbackIncidents(params.audit_period_start, params.audit_period_end)
};
// Section 7: Backup and Recovery
evidencePackage.evidence.backup_recovery = {
backup_logs: await collectBackupLogs(params.audit_period_start, params.audit_period_end),
restore_tests: await collectRestoreTestRecords(params.audit_period_start, params.audit_period_end)
};
// Section 8: Vendor Management
evidencePackage.evidence.vendor_management = {
vendor_reviews: await collectVendorReviews(params.audit_period_start, params.audit_period_end),
sla_compliance: await collectVendorSLACompliance(params.audit_period_start, params.audit_period_end)
};
// Generate package manifest with checksums
evidencePackage.manifest = await generatePackageManifest(evidencePackage);
// Save to GCS with restricted access
const packagePath = await saveEvidencePackageToGCS(evidencePackage, {
auditor_email: params.auditor_email,
expiration_days: 90
});
// Notify auditor
await notifyAuditor({
auditor_email: params.auditor_email,
package_path: packagePath,
access_instructions: 'Evidence package available via secure link (expires in 90 days)'
});
return {
package_id: `SOC2-EVIDENCE-${params.audit_period_start}_${params.audit_period_end}`,
package_path: packagePath,
total_evidence_items: Object.keys(evidencePackage.evidence).length
};
}
7. Integration with HIPAA Audit Logging
7.1 Unified Audit Trail
Cross-Compliance Log Correlation:
The SOC 2 continuous monitoring system integrates with HIPAA audit logging (D.3.4 specification) to provide a unified audit trail that satisfies both compliance frameworks.
| Compliance Requirement | HIPAA Citation | SOC 2 TSC | Shared Implementation |
|---|---|---|---|
| PHI Access Logging | §164.312(b) | CC5.1, C1.1 | BigQuery phi_access_logs.entries table |
| Audit Log Integrity | §164.312(b) | CC7.2 | SHA-256 hash chain |
| Access Control Monitoring | §164.312(a)(1) | CC5.1 | Real-time access log analysis |
| Encryption Verification | §164.312(a)(2)(iv) | CC6.1 | Daily encryption status check |
| Availability Monitoring | §164.308(a)(7)(ii)(C) | A1.1 | Uptime checks + SLA tracking |
| Incident Detection | §164.308(a)(6)(ii) | CC7.2 | Security event alerting |
Log Format Compatibility:
Both HIPAA and SOC 2 audit logs use the same structured JSON schema (defined in Section 2.1.3), ensuring:
- Single log aggregation pipeline
- Unified search and query interface
- Consistent evidence for both audit types
- Reduced operational complexity
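Assuming the Section 2.1.3 schema carries the fields this document references elsewhere (audit_id, timestamp, user identifiers, event type, PHI flag, anomaly score), the shared entry shape can be sketched as follows. The interface and function names are illustrative, not the authoritative schema:

```typescript
// Illustrative sketch of the shared structured-JSON log entry; the actual
// Section 2.1.3 schema is authoritative and may include additional fields.
interface UnifiedAuditLogEntry {
  audit_id: string;
  timestamp: string;        // ISO 8601, UTC
  user_id: string;
  username: string;
  audit_event_type: string; // e.g. 'PHI_READ', 'CONFIG_CHANGE'
  is_audit_event: boolean;
  is_phi_related: boolean;  // routes the entry to HIPAA vs SOC 2 reporting
  anomaly_score?: number;   // 0..1, present once anomaly detection has run
  risk_level?: string;
}

// A single flag drives framework attribution, so one pipeline serves both
// audit types without duplicating entries.
function complianceFramework(entry: UnifiedAuditLogEntry): 'HIPAA' | 'SOC2' {
  return entry.is_phi_related ? 'HIPAA' : 'SOC2';
}
```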
7.2 PHI Access Anomaly Detection Integration
Reuse of D.3.4 Anomaly Detection Rules:
The SOC 2 monitoring system leverages the PHI access anomaly detection rules defined in D.3.4 HIPAA Audit specification:
-- Shared anomaly detection query (used by both HIPAA and SOC 2 dashboards)
CREATE OR REPLACE VIEW `bio-qms-prod.compliance.v_unified_access_anomalies` AS
SELECT
audit_id,
timestamp,
user_id,
username,
audit_event_type,
anomaly_type,
anomaly_score,
risk_level,
'HIPAA' AS compliance_framework
FROM `bio-qms-prod.phi_access_logs.anomalies`
UNION ALL
SELECT
audit_id,
timestamp,
user_id,
username,
audit_event_type,
'NON_PHI_ACCESS_ANOMALY' AS anomaly_type,
anomaly_score,
risk_level,
'SOC2' AS compliance_framework
FROM `bio-qms-prod.audit_logs.entries`
WHERE is_audit_event = true
AND is_phi_related = false
AND anomaly_score > 0.6;
8. Compliance KPI Calculation Details
8.1 Audit Readiness Score Formula
Weighted Composite Score:
Audit Readiness Score =
(Control Test Pass Rate × 0.60) +
(Evidence Coverage × 0.40)
Where:
Control Test Pass Rate = (Passed Controls / Total Controls Tested) × 100
Evidence Coverage = (Controls with Current Evidence / Total Controls in Inventory) × 100
Example Calculation:
Given:
- Total controls in inventory: 30
- Controls tested in last 30 days: 28
- Controls passed: 27
- Controls with evidence <90 days old: 29
Control Test Pass Rate = (27 / 28) × 100 = 96.43%
Evidence Coverage = (29 / 30) × 100 = 96.67%
Audit Readiness Score = (96.43 × 0.60) + (96.67 × 0.40)
= 57.86 + 38.67
= 96.53%
Interpretation:
| Score Range | Readiness Level | Interpretation |
|---|---|---|
| 95-100% | Excellent | Audit-ready, all controls operating effectively |
| 90-94% | Good | Minor gaps, address before audit |
| 85-89% | Acceptable | Moderate gaps, remediation plan required |
| 80-84% | Needs Improvement | Significant gaps, delay audit if possible |
| <80% | Not Ready | Critical gaps, audit will likely result in exceptions |
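The Section 8.1 formula can be sketched directly as a function; the function and parameter names are illustrative:

```typescript
// Weighted composite score per Section 8.1:
// (Control Test Pass Rate x 0.60) + (Evidence Coverage x 0.40)
function auditReadinessScore(params: {
  totalControls: number;             // controls in inventory
  controlsTested: number;            // tested in last 30 days
  controlsPassed: number;
  controlsWithCurrentEvidence: number; // evidence <90 days old
}): number {
  const passRate = (params.controlsPassed / params.controlsTested) * 100;
  const evidenceCoverage =
    (params.controlsWithCurrentEvidence / params.totalControls) * 100;
  return passRate * 0.6 + evidenceCoverage * 0.4;
}
```

Fed the worked example above (30 controls, 28 tested, 27 passed, 29 with current evidence), this returns roughly 96.5%, matching the 96.53% figure up to the rounding of intermediate values.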
8.2 Risk Score Calculation
Weighted Risk Score Formula:
Risk Score = Σ (Control Failure Count × Control Weight)
Control Weights:
- CC6.x (Encryption controls): 3x weight
- CC5.x (Access controls): 2x weight
- CC7.x (Monitoring controls): 2x weight
- CC8.x (Change management): 2x weight
- All other controls: 1x weight
Risk Level Classification:
| Weighted Risk Score | Risk Level | Action Required |
|---|---|---|
| 0 | Minimal | Maintain current controls |
| 1-5 | Low | Monitor, no immediate action |
| 6-15 | Moderate | Develop remediation plan within 30 days |
| 16-30 | High | Immediate remediation required (7 days) |
| >30 | Critical | Escalate to executive leadership, pause audit |
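The weighting and classification above can be sketched as follows; the weight is derived from the TSC control-ID prefix, and all function names are illustrative:

```typescript
// Control weight per Section 8.2: CC6.x = 3x, CC5.x/CC7.x/CC8.x = 2x, else 1x.
function controlWeight(controlId: string): number {
  if (controlId.startsWith('CC6.')) return 3; // encryption controls
  if (
    controlId.startsWith('CC5.') || // access controls
    controlId.startsWith('CC7.') || // monitoring controls
    controlId.startsWith('CC8.')    // change management
  ) return 2;
  return 1;
}

// Risk Score = sum over failed controls of (failure count x control weight)
function weightedRiskScore(
  failures: { controlId: string; failureCount: number }[]
): number {
  return failures.reduce(
    (sum, f) => sum + f.failureCount * controlWeight(f.controlId),
    0
  );
}

// Bands from the classification table above.
function riskLevel(score: number): string {
  if (score === 0) return 'Minimal';
  if (score <= 5) return 'Low';
  if (score <= 15) return 'Moderate';
  if (score <= 30) return 'High';
  return 'Critical';
}
```

For example, two failures of an encryption control (CC6.1) plus one failure of an availability control (A1.1) yield a weighted score of 7, which falls in the Moderate band.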
9. Incident Response Integration
9.1 Alert-to-Incident Workflow
Automated Incident Creation:
// cloud-functions/alert-to-incident/index.ts
export async function createIncidentFromAlert(alert: {
alert_type: string;
severity: string;
details: any;
}) {
// Create incident in incident management system
const incident = await createIncident({
title: `SOC 2 Control Failure: ${alert.alert_type}`,
severity: mapSeverityToIncidentPriority(alert.severity),
description: formatIncidentDescription(alert),
tags: ['soc2', 'compliance', 'automated'],
assignee: getResponsibleParty(alert.alert_type),
due_date: calculateDueDate(alert.severity)
});
// Link alert to incident
await linkAlertToIncident(alert, incident.id);
// Notify assigned party
await notifyAssignee(incident);
// Track in compliance database
await trackComplianceIncident({
incident_id: incident.id,
alert_type: alert.alert_type,
control_id: extractControlId(alert.alert_type),
severity: alert.severity,
created_at: new Date(),
status: 'OPEN'
});
return incident;
}
function mapSeverityToIncidentPriority(severity: string): string {
switch (severity) {
case 'CRITICAL': return 'P1';
case 'HIGH': return 'P2';
case 'MEDIUM': return 'P3';
default: return 'P4';
}
}
function calculateDueDate(severity: string): Date {
const now = new Date();
switch (severity) {
case 'CRITICAL': return new Date(now.getTime() + 4 * 60 * 60 * 1000); // 4 hours
case 'HIGH': return new Date(now.getTime() + 24 * 60 * 60 * 1000); // 24 hours
case 'MEDIUM': return new Date(now.getTime() + 7 * 24 * 60 * 60 * 1000); // 7 days
default: return new Date(now.getTime() + 30 * 24 * 60 * 60 * 1000); // 30 days
}
}
10. Compliance Review and Reporting
10.1 Monthly Compliance Report Template
Automated Report Generation:
# SOC 2 Compliance Monthly Report
**Reporting Period:** {start_date} - {end_date}
**Generated:** {timestamp}
**Report ID:** {report_id}
## Executive Summary
**Overall Compliance Status:** {status} ({audit_readiness_score}%)
- **Control Test Pass Rate:** {pass_rate}%
- **Evidence Coverage:** {evidence_coverage}%
- **Risk Score:** {risk_score} ({risk_level})
- **Open Control Failures:** {open_failures}
- **Mean Time to Remediate:** {mttr} days
{executive_summary_narrative}
## Control Test Results by TSC Category
{table_of_control_results}
## Control Failures Requiring Attention
{table_of_open_failures}
## Remediation Status
{table_of_remediation_progress}
## Security Events Summary
{security_events_summary}
## Availability SLA Performance
{availability_metrics}
## Upcoming Audit Preparation Tasks
{audit_prep_checklist}
## Appendix: Evidence Inventory
{evidence_inventory}
11. Document Change Control
11.1 Amendment Process
Changes to this specification require:
- Impact Assessment - Evaluate impact on existing monitoring infrastructure
- Stakeholder Review - CISO, CCO, VP Engineering approval
- Testing - Validate changes in staging environment
- Documentation Update - Update this specification with revision history
- Communication - Notify compliance team, auditors (if material change)
11.2 Related Documentation
| Document | Relationship |
|---|---|
| hipaa-audit-reporting.md (D.3.4) | Shared audit log architecture |
| hipaa-encryption-controls.md (D.3) | Encryption verification integration |
| hipaa-access-controls.md (D.3) | Access control monitoring |
| SOC 2 Readiness Assessment | Gap analysis and roadmap |
| Security Architecture (64-security-architecture.md) | Infrastructure context |
12. Appendix: Monitoring Architecture Diagrams
12.1 Log Aggregation Flow
Application Pods (GKE)
│
├─ NestJS Backend
│ └─> Winston Logger (JSON)
│ └─> stdout/stderr
│
├─ Angular Frontend
│ └─> Console Logs (captured by Cloud Logging)
│
└─ Worker Services
└─> Structured Logging
│
▼
Cloud Logging Agent (DaemonSet)
│
├─> Cloud Logging (30 days)
│ │
│ ├─> Real-time Log Search
│ ├─> Log-based Metrics
│ └─> Alerting Policies
│
├─> BigQuery Sink (1 year)
│ │
│ ├─> SQL Analysis
│ ├─> Dashboards (Looker Studio)
│ └─> Compliance Reporting
│
└─> Cloud Storage Sink (7 years)
│
└─> Immutable Archive
(Part 11 Compliance)
12.2 Alert Escalation Flow
Monitoring Event Detected
│
▼
Log-based Metric Threshold Exceeded
OR
Uptime Check Failure
OR
Security Event (Pub/Sub)
│
▼
Alert Policy Triggered
│
├─> Severity: CRITICAL
│ │
│ ├─> PagerDuty (On-call)
│ ├─> Slack (#incident-response)
│ └─> Email (CISO + VP Eng)
│ │
│ └─> If unresolved in 15 min
│ └─> Escalate to CEO
│
├─> Severity: HIGH
│ │
│ ├─> Slack (#security-alerts)
│ └─> Email (Security Team)
│ │
│ └─> If unresolved in 30 min
│ └─> PagerDuty (CISO)
│
└─> Severity: MEDIUM/LOW
│
└─> Email + Slack
│
└─> Normal SLA tracking
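The initial fan-out of the escalation flow above can be sketched as a severity-to-channel mapping; the function name and the string encoding of the channel identifiers are illustrative (escalation timers and follow-on paths are omitted):

```typescript
// Initial notification routing per the alert escalation diagram.
// CRITICAL pages on-call and notifies leadership; HIGH goes to the security
// team; MEDIUM/LOW follow normal SLA tracking via email and Slack.
type AlertSeverity = 'CRITICAL' | 'HIGH' | 'MEDIUM' | 'LOW';

function notificationChannels(severity: AlertSeverity): string[] {
  switch (severity) {
    case 'CRITICAL':
      return ['pagerduty:on-call', 'slack:#incident-response', 'email:ciso+vp-eng'];
    case 'HIGH':
      return ['slack:#security-alerts', 'email:security-team'];
    default: // MEDIUM / LOW
      return ['email', 'slack'];
  }
}
```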
End of Document
Document ID: CODITECT-BIO-SOC2-MON-001 Version: 1.0.0 Total Lines: 2,191 Classification: Internal - Restricted
Approval Signatures
Prepared by: Information Security Team Date: 2026-02-16
Approved by:
- Chief Information Security Officer (CISO)
- Chief Compliance Officer (CCO)
- VP Engineering
- VP Quality Assurance
- Internal Audit Manager
Next Review Date: 2027-02-16