Track M: Security Operations - Evidence Document
Executive Summary
This document provides evidence of the security operations implementation for the BIO-QMS platform, a regulated SaaS Quality Management System designed to meet FDA 21 CFR Part 11 and HIPAA requirements and audited against SOC 2 Type II.
Platform Context:
- Regulatory Scope: FDA 21 CFR Part 11, HIPAA Privacy & Security Rules, SOC 2 Type II
- Deployment: Google Cloud Platform (GKE, Cloud Run, Cloud SQL)
- Architecture: Microservices with Django backend, React frontend
- Data Classification: Protected Health Information (PHI), Electronic Records (ER)
Security Operations Coverage:
- M.1: Application Security Pipeline (SAST, DAST, SCA, Security Gates)
- M.2: Vulnerability Management (Scanning, CVSS Prioritization, Penetration Testing)
- M.3: Security Incident Response (SIEM, IR Workflows, Forensics, Breach Notification)
- M.4: Secrets & Key Management (GCP Secret Manager, Key Rotation, Compliance)
- M.5: Security Monitoring & Detection (Threat Detection, SOAR, Operations Dashboard)
M.1: Application Security Pipeline
M.1.1: SAST Integration
Implementation Overview
Static Application Security Testing (SAST) is integrated into the CI/CD pipeline using Semgrep as the primary engine with CodeQL for deep semantic analysis.
Technology Stack:
- Primary Engine: Semgrep (open-source, fast, customizable)
- Secondary Engine: GitHub CodeQL (deep dataflow analysis)
- CI/CD Integration: GitHub Actions, Cloud Build
- Custom Rules: Healthcare compliance, FDA 21 CFR Part 11 controls
CI/CD Integration Architecture
# .github/workflows/sast-scan.yml
name: SAST Security Scan
on:
pull_request:
branches: [main, develop, release/*]
push:
branches: [main, develop]
schedule:
- cron: '0 2 * * 1' # Weekly Monday 2 AM UTC
jobs:
semgrep-scan:
name: Semgrep SAST Analysis
runs-on: ubuntu-latest
permissions:
contents: read
security-events: write
pull-requests: write
steps:
- name: Checkout code
uses: actions/checkout@v4
with:
fetch-depth: 0 # Full history for diff analysis
- name: Run Semgrep
uses: semgrep/semgrep-action@v1
with:
config: >-
p/security-audit
p/secrets
p/owasp-top-ten
p/python
p/django
p/react
p/typescript
.semgrep/healthcare-compliance.yml
.semgrep/fda-21-cfr-part-11.yml
.semgrep/hipaa-security.yml
env:
SEMGREP_APP_TOKEN: ${{ secrets.SEMGREP_APP_TOKEN }}
SEMGREP_BASELINE_REF: ${{ github.base_ref }}
- name: Upload SARIF results
uses: github/codeql-action/upload-sarif@v3
with:
sarif_file: semgrep.sarif
if: always()
- name: Generate HTML report
if: always()
run: |
semgrep --config=auto --sarif -o semgrep-report.sarif .
npx @microsoft/sarif-to-html semgrep-report.sarif \
--output semgrep-report.html
- name: Upload artifacts
uses: actions/upload-artifact@v4
if: always()
with:
name: sast-reports
path: |
semgrep-report.sarif
semgrep-report.html
semgrep.json
retention-days: 90
codeql-analysis:
name: CodeQL Deep Analysis
runs-on: ubuntu-latest
permissions:
contents: read
security-events: write
strategy:
fail-fast: false
matrix:
language: [python, javascript, typescript]
steps:
- name: Checkout code
uses: actions/checkout@v4
- name: Initialize CodeQL
uses: github/codeql-action/init@v3
with:
languages: ${{ matrix.language }}
queries: +security-extended,security-and-quality
config-file: .github/codeql-config.yml
- name: Autobuild
uses: github/codeql-action/autobuild@v3
- name: Perform CodeQL Analysis
uses: github/codeql-action/analyze@v3
with:
category: "/language:${{ matrix.language }}"
upload: true
output: codeql-results
- name: Filter healthcare-specific findings
run: |
python3 .github/scripts/filter-codeql-findings.py \
--input codeql-results \
--output codeql-healthcare-findings.json \
--rules healthcare,phi-handling,electronic-signatures
dependency-check:
name: OWASP Dependency Check
runs-on: ubuntu-latest
steps:
- name: Checkout code
uses: actions/checkout@v4
- name: Run OWASP Dependency Check
uses: dependency-check/Dependency-Check_Action@main
with:
project: 'bio-qms-platform'
path: '.'
format: 'ALL'
args: >
--enableRetired
--enableExperimental
--failOnCVSS 7
--suppression dependency-check-suppressions.xml
- name: Upload results
uses: actions/upload-artifact@v4
with:
name: dependency-check-report
path: reports/
retention-days: 90
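The `--failOnCVSS 7` flag fails the build when any finding meets that score, but release engineering also needs a summary of which dependencies tripped the gate. A minimal sketch of post-processing the JSON report, assuming Dependency-Check's `dependencies[].vulnerabilities[].cvssv3.baseScore` layout (field names should be verified against the report version in use; the report fragment below is illustrative, not real scan output):

```python
def cvss_gate(report: dict, threshold: float = 7.0) -> list:
    """Return (dependency, CVE, score) tuples at or above the threshold."""
    offenders = []
    for dep in report.get('dependencies', []):
        for vuln in dep.get('vulnerabilities', []):
            # Prefer CVSSv3; fall back to CVSSv2 when absent
            score = (vuln.get('cvssv3') or {}).get('baseScore') \
                or (vuln.get('cvssv2') or {}).get('score', 0.0)
            if score >= threshold:
                offenders.append((dep.get('fileName'), vuln.get('name'), score))
    return offenders

# Illustrative report fragment
report = {'dependencies': [
    {'fileName': 'django-3.2.0.tar.gz',
     'vulnerabilities': [{'name': 'CVE-0000-0001', 'cvssv3': {'baseScore': 9.8}}]},
    {'fileName': 'requests-2.31.0.tar.gz',
     'vulnerabilities': [{'name': 'CVE-0000-0002', 'cvssv3': {'baseScore': 5.3}}]},
]}
print(cvss_gate(report))
```

Keeping the gate logic in a script rather than only in the action flag also lets the same threshold be reused in scheduled reporting.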
Custom Healthcare Compliance Rules
# .semgrep/healthcare-compliance.yml
rules:
  - id: phi-logging-detected
    patterns:
      - pattern-either:
          - pattern: logging.$METHOD(..., $PHI, ...)
          - pattern: logger.$METHOD(f"... {$PHI} ...")
          - pattern: print(..., $PHI, ...)
      - metavariable-pattern:
          metavariable: $PHI
          patterns:
            - pattern-either:
                - pattern: $X.ssn
                - pattern: $X.social_security_number
                - pattern: $X.medical_record_number
                - pattern: $X.patient_id
                - pattern: $X.diagnosis
                - pattern: $X.prescription
message: |
Potential PHI disclosure in logs. Protected Health Information must not
be logged in plaintext. Use de-identification or redaction.
Compliance: HIPAA Security Rule § 164.312(a)(2)(i)
Remediation: Implement log sanitization via RedactedFormatter
severity: ERROR
languages: [python]
metadata:
category: security
subcategory: [hipaa, phi-protection]
cwe: "CWE-532: Insertion of Sensitive Information into Log File"
owasp: "A09:2021 - Security Logging and Monitoring Failures"
references:
- "https://www.hhs.gov/hipaa/for-professionals/security/laws-regulations/index.html"
likelihood: HIGH
impact: CRITICAL
confidence: HIGH
  - id: electronic-signature-missing-validation
    patterns:
      - pattern: |
          class $CLASS(...):
              ...
              def sign_document(self, ...):
                  ...
      - pattern-not: |
          class $CLASS(...):
              ...
              def sign_document(self, ...):
                  ...
                  self.validate_signature_authority(...)
                  ...
message: |
Electronic signature implementation missing authority validation.
FDA 21 CFR Part 11 § 11.50(a) requires signature validation including:
- Signer identity verification
- Signing authority verification
- Intent verification (meaning of signature)
Implement validate_signature_authority() check before signature creation.
severity: ERROR
languages: [python]
metadata:
category: security
subcategory: [fda-21-cfr-part-11, electronic-signatures]
regulatory: "FDA 21 CFR Part 11.50(a)"
cwe: "CWE-287: Improper Authentication"
  - id: audit-trail-missing-timestamp
    patterns:
      - pattern: |
          class AuditLog(...):
              ...
      - pattern-not: |
          class AuditLog(...):
              ...
              timestamp = models.DateTimeField(auto_now_add=True)
              ...
message: |
Audit trail model missing computer-generated timestamp.
FDA 21 CFR Part 11 § 11.10(e) requires computer-generated, time-stamped
audit trails for all operator actions.
Add: timestamp = models.DateTimeField(auto_now_add=True, db_index=True)
severity: ERROR
languages: [python]
metadata:
category: security
subcategory: [fda-21-cfr-part-11, audit-trail]
regulatory: "FDA 21 CFR Part 11.10(e)"
  - id: password-in-source-code
    patterns:
      - pattern-either:
          - pattern: password = "..."
          - pattern: PASSWORD = "..."
          - pattern: secret_key = "..."
          - pattern: api_key = "..."
      - pattern-not-regex: 'password = "(TODO|CHANGEME|PLACEHOLDER)"'
message: |
Hardcoded credential detected. Use GCP Secret Manager.
SOC 2 CC6.1 requires logical and physical access controls.
Remediation:
from google.cloud import secretmanager
client = secretmanager.SecretManagerServiceClient()
secret = client.access_secret_version(name="projects/.../secrets/...")
severity: ERROR
languages: [python, javascript, typescript]
metadata:
category: security
subcategory: [secrets-management, soc2]
cwe: "CWE-798: Use of Hard-coded Credentials"
owasp: "A07:2021 - Identification and Authentication Failures"
  - id: unvalidated-redirect
    patterns:
      - pattern-either:
          - pattern: redirect($USER_INPUT)
          - pattern: HttpResponseRedirect($USER_INPUT)
      - pattern-not: redirect(reverse(...))
message: |
Unvalidated redirect from user input. Implement URL whitelist.
OWASP A01:2021 - Broken Access Control
CWE-601: URL Redirection to Untrusted Site
Use: validate_redirect_url(url, allowed_domains=['app.bioqms.com'])
severity: WARNING
languages: [python]
metadata:
category: security
subcategory: [input-validation]
cwe: "CWE-601"
owasp: "A01:2021 - Broken Access Control"
- id: sql-injection-risk
pattern-either:
- pattern: $CONN.execute("... " + $INPUT + " ...")
- pattern: $CURSOR.execute(f"... {$INPUT} ...")
- pattern: raw("... " + $INPUT + " ...")
message: |
SQL injection risk detected. Use parameterized queries.
Django ORM automatically parameterizes queries. Use:
Model.objects.filter(field=user_input)
Not:
Model.objects.raw(f"SELECT * FROM table WHERE field = {user_input}")
HIPAA Security Rule § 164.312(a)(1) - Access Control
SOC 2 CC6.1 - Logical Access Controls
severity: ERROR
languages: [python]
metadata:
category: security
subcategory: [injection, hipaa, soc2]
cwe: "CWE-89: SQL Injection"
owasp: "A03:2021 - Injection"
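The phi-logging-detected rule's remediation points developers at a RedactedFormatter, which is referenced but not shown. A minimal sketch of what such a formatter could look like (the pattern list and SSN/MRN regexes are illustrative, not the production configuration):

```python
import logging
import re

class RedactedFormatter(logging.Formatter):
    """Formatter that masks PHI-looking values before emission."""
    # Illustrative patterns; the production list is maintained separately
    PATTERNS = [
        re.compile(r'\b\d{3}-\d{2}-\d{4}\b'),          # SSN-like
        re.compile(r'\bMRN[:#]?\s*\d{6,10}\b', re.I),  # medical record number
    ]

    def format(self, record: logging.LogRecord) -> str:
        message = super().format(record)
        for pattern in self.PATTERNS:
            message = pattern.sub('[REDACTED]', message)
        return message

handler = logging.StreamHandler()
handler.setFormatter(RedactedFormatter('%(levelname)s %(message)s'))
logger = logging.getLogger('phi-safe')
logger.addHandler(handler)
logger.warning('lookup failed for 123-45-6789')  # SSN is masked on emission
```

Redacting at the formatter level catches every handler that uses it, including third-party log shippers, which is why the rule message recommends it over per-call-site scrubbing.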
False Positive Management
# .semgrep/semgrep-triage.py
"""
SAST False Positive Management System
Implements ML-based false positive detection and developer feedback loop.
Complies with SOC 2 CC7.2 (System Monitoring).
"""
import json
import sqlite3
from datetime import datetime
from typing import Dict, List, Optional
from dataclasses import dataclass
from pathlib import Path
@dataclass
class Finding:
"""SAST finding model"""
id: str
rule_id: str
file_path: str
line_number: int
severity: str
message: str
cwe: Optional[str]
owasp: Optional[str]
confidence: str
def to_dict(self) -> Dict:
return {
'id': self.id,
'rule_id': self.rule_id,
'file_path': self.file_path,
'line_number': self.line_number,
'severity': self.severity,
'message': self.message,
'cwe': self.cwe,
'owasp': self.owasp,
'confidence': self.confidence
}
class FindingTriageSystem:
"""
ML-assisted finding triage system with developer feedback loop.
Features:
- Historical false positive tracking
- Pattern-based auto-suppression
- Developer feedback collection
- Metrics dashboard integration
"""
def __init__(self, db_path: str = '.semgrep/triage.db'):
self.db_path = db_path
self._init_database()
def _init_database(self):
"""Initialize triage tracking database"""
conn = sqlite3.connect(self.db_path)
cursor = conn.cursor()
cursor.execute('''
CREATE TABLE IF NOT EXISTS findings (
id TEXT PRIMARY KEY,
rule_id TEXT NOT NULL,
file_path TEXT NOT NULL,
line_number INTEGER NOT NULL,
severity TEXT NOT NULL,
message TEXT,
cwe TEXT,
owasp TEXT,
confidence TEXT,
first_seen TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
last_seen TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
occurrence_count INTEGER DEFAULT 1,
status TEXT DEFAULT 'new',
assigned_to TEXT,
resolution TEXT,
resolution_reason TEXT,
false_positive BOOLEAN DEFAULT 0,
suppression_pattern TEXT,
developer_feedback TEXT,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
)
''')
cursor.execute('''
CREATE TABLE IF NOT EXISTS triage_decisions (
id INTEGER PRIMARY KEY AUTOINCREMENT,
finding_id TEXT NOT NULL,
decision TEXT NOT NULL,
reason TEXT,
decided_by TEXT NOT NULL,
decided_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
FOREIGN KEY (finding_id) REFERENCES findings(id)
)
''')
cursor.execute('''
CREATE TABLE IF NOT EXISTS false_positive_patterns (
id INTEGER PRIMARY KEY AUTOINCREMENT,
rule_id TEXT NOT NULL,
pattern_type TEXT NOT NULL,
pattern_value TEXT NOT NULL,
confidence REAL DEFAULT 0.5,
match_count INTEGER DEFAULT 0,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
)
''')
# Indexes for performance
cursor.execute('CREATE INDEX IF NOT EXISTS idx_findings_rule ON findings(rule_id)')
cursor.execute('CREATE INDEX IF NOT EXISTS idx_findings_status ON findings(status)')
cursor.execute('CREATE INDEX IF NOT EXISTS idx_findings_file ON findings(file_path)')
conn.commit()
conn.close()
def process_findings(self, findings: List[Finding]) -> Dict:
"""
Process SAST findings through triage system.
Returns classification: new, recurring, likely_false_positive, suppressed
"""
conn = sqlite3.connect(self.db_path)
cursor = conn.cursor()
results = {
'new': [],
'recurring': [],
'likely_false_positive': [],
'suppressed': []
}
for finding in findings:
# Check if finding exists
cursor.execute(
'SELECT id, occurrence_count, false_positive FROM findings WHERE id = ?',
(finding.id,)
)
existing = cursor.fetchone()
if existing:
# Recurring finding
cursor.execute(
'UPDATE findings SET occurrence_count = occurrence_count + 1, '
'last_seen = CURRENT_TIMESTAMP WHERE id = ?',
(finding.id,)
)
if existing[2]: # Previously marked false positive
results['likely_false_positive'].append(finding)
else:
results['recurring'].append(finding)
else:
# New finding - check against false positive patterns
fp_score = self._check_false_positive_patterns(finding, cursor)
cursor.execute('''
INSERT INTO findings (
id, rule_id, file_path, line_number, severity,
message, cwe, owasp, confidence
) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)
''', (
finding.id, finding.rule_id, finding.file_path,
finding.line_number, finding.severity, finding.message,
finding.cwe, finding.owasp, finding.confidence
))
if fp_score > 0.7:
results['likely_false_positive'].append(finding)
else:
results['new'].append(finding)
conn.commit()
conn.close()
return results
def _check_false_positive_patterns(self, finding: Finding, cursor) -> float:
"""
Check finding against learned false positive patterns.
Returns confidence score (0.0 - 1.0) that finding is false positive.
"""
cursor.execute(
'SELECT pattern_type, pattern_value, confidence FROM false_positive_patterns '
'WHERE rule_id = ?',
(finding.rule_id,)
)
        patterns = cursor.fetchall()
        max_score = 0.0
        import re  # hoisted above the loop instead of importing per iteration
        for pattern_type, pattern_value, confidence in patterns:
            if pattern_type == 'file_path_regex':
                if re.search(pattern_value, finding.file_path):
                    max_score = max(max_score, confidence)
            elif pattern_type == 'line_context':
                # Would need to read file content - not yet implemented
                pass
        return max_score
def mark_false_positive(self, finding_id: str, reason: str, decided_by: str):
"""Mark finding as false positive and learn pattern"""
conn = sqlite3.connect(self.db_path)
cursor = conn.cursor()
cursor.execute(
'UPDATE findings SET false_positive = 1, resolution = ?, '
'resolution_reason = ?, updated_at = CURRENT_TIMESTAMP WHERE id = ?',
('false_positive', reason, finding_id)
)
cursor.execute(
'INSERT INTO triage_decisions (finding_id, decision, reason, decided_by) '
'VALUES (?, ?, ?, ?)',
(finding_id, 'false_positive', reason, decided_by)
)
# Learn pattern (simple file path pattern for now)
cursor.execute('SELECT rule_id, file_path FROM findings WHERE id = ?', (finding_id,))
rule_id, file_path = cursor.fetchone()
# Extract pattern (e.g., test files, migrations)
if '/tests/' in file_path:
pattern = '/tests/'
elif '/migrations/' in file_path:
pattern = '/migrations/'
else:
pattern = None
        if pattern:
            # SQLite upserts require an explicit conflict target backed by a
            # unique index; create it idempotently before the upsert.
            cursor.execute(
                'CREATE UNIQUE INDEX IF NOT EXISTS idx_fp_unique '
                'ON false_positive_patterns(rule_id, pattern_type, pattern_value)'
            )
            cursor.execute('''
                INSERT INTO false_positive_patterns (rule_id, pattern_type, pattern_value, confidence)
                VALUES (?, ?, ?, ?)
                ON CONFLICT(rule_id, pattern_type, pattern_value) DO UPDATE SET
                    match_count = match_count + 1,
                    confidence = MIN(0.95, confidence + 0.05),
                    updated_at = CURRENT_TIMESTAMP
            ''', (rule_id, 'file_path_regex', pattern, 0.6))
conn.commit()
conn.close()
def generate_metrics(self) -> Dict:
"""Generate SAST metrics for dashboard"""
conn = sqlite3.connect(self.db_path)
cursor = conn.cursor()
# Total findings by severity
cursor.execute('''
SELECT severity, COUNT(*), SUM(CASE WHEN status = 'resolved' THEN 1 ELSE 0 END)
FROM findings
GROUP BY severity
''')
severity_stats = {
row[0]: {'total': row[1], 'resolved': row[2]}
for row in cursor.fetchall()
}
# False positive rate by rule
cursor.execute('''
SELECT rule_id,
COUNT(*) as total,
SUM(CASE WHEN false_positive = 1 THEN 1 ELSE 0 END) as fp_count
FROM findings
GROUP BY rule_id
HAVING total > 5
ORDER BY (fp_count * 1.0 / total) DESC
LIMIT 10
''')
high_fp_rules = [
{
'rule_id': row[0],
'total': row[1],
'false_positives': row[2],
'fp_rate': row[2] / row[1] if row[1] > 0 else 0
}
for row in cursor.fetchall()
]
# Mean time to triage (MTTT)
cursor.execute('''
SELECT AVG(
JULIANDAY(updated_at) - JULIANDAY(first_seen)
) * 24 as mttt_hours
FROM findings
WHERE status != 'new'
''')
mttt = cursor.fetchone()[0] or 0
conn.close()
return {
'severity_distribution': severity_stats,
'high_fp_rules': high_fp_rules,
'mean_time_to_triage_hours': round(mttt, 2),
'timestamp': datetime.utcnow().isoformat()
}
# CLI interface for developer feedback
if __name__ == '__main__':
import argparse
parser = argparse.ArgumentParser(description='SAST Finding Triage System')
parser.add_argument('--process', type=str, help='Process findings from JSON file')
parser.add_argument('--false-positive', type=str, help='Mark finding as false positive')
parser.add_argument('--reason', type=str, help='Reason for false positive')
parser.add_argument('--user', type=str, default='unknown', help='User making decision')
parser.add_argument('--metrics', action='store_true', help='Generate metrics')
args = parser.parse_args()
triage = FindingTriageSystem()
if args.process:
with open(args.process) as f:
data = json.load(f)
findings = [Finding(**item) for item in data.get('results', [])]
results = triage.process_findings(findings)
print(json.dumps(results, indent=2, default=lambda o: o.to_dict()))
elif args.false_positive:
triage.mark_false_positive(args.false_positive, args.reason or 'N/A', args.user)
print(f"Marked {args.false_positive} as false positive")
elif args.metrics:
metrics = triage.generate_metrics()
print(json.dumps(metrics, indent=2))
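The triage database keys on `finding.id`, which must remain stable across scans for the deduplication in `process_findings` to work. One way to derive such an ID is a content fingerprint over the fields that identify a finding; this is a sketch of that idea, not the fingerprinting scheme Semgrep itself uses:

```python
import hashlib

def finding_fingerprint(rule_id: str, file_path: str, line_number: int) -> str:
    """Stable ID for a finding: same rule + location -> same fingerprint."""
    key = f'{rule_id}|{file_path}|{line_number}'
    return hashlib.sha256(key.encode('utf-8')).hexdigest()[:16]

a = finding_fingerprint('phi-logging-detected', 'app/views.py', 42)
b = finding_fingerprint('phi-logging-detected', 'app/views.py', 42)
assert a == b  # deterministic across scans
```

Line-number-based fingerprints churn whenever code above the finding moves; hashing a normalized code snippet instead of the line number is more robust if that becomes a problem in practice.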
Developer Feedback Integration
# .github/workflows/sast-feedback.yml
name: SAST Developer Feedback
on:
issue_comment:
types: [created]
jobs:
process-feedback:
if: |
contains(github.event.comment.body, '/sast') &&
github.event.issue.pull_request
runs-on: ubuntu-latest
steps:
- name: Checkout code
uses: actions/checkout@v4
      - name: Parse feedback command
        id: parse
        env:
          # Pass the untrusted comment body via env to avoid shell injection
          COMMENT: ${{ github.event.comment.body }}
        run: |
          if echo "$COMMENT" | grep -q "/sast false-positive"; then
            FINDING_ID=$(echo "$COMMENT" | grep -oP 'finding:\s*\K\S+')
            REASON=$(echo "$COMMENT" | grep -oP 'reason:\s*\K.+' || echo "N/A")
            echo "action=false-positive" >> "$GITHUB_OUTPUT"
            echo "finding_id=$FINDING_ID" >> "$GITHUB_OUTPUT"
            echo "reason=$REASON" >> "$GITHUB_OUTPUT"
          fi
- name: Update triage database
if: steps.parse.outputs.action == 'false-positive'
run: |
python3 .semgrep/semgrep-triage.py \
--false-positive "${{ steps.parse.outputs.finding_id }}" \
--reason "${{ steps.parse.outputs.reason }}" \
--user "${{ github.event.comment.user.login }}"
- name: Comment confirmation
uses: actions/github-script@v7
with:
script: |
github.rest.issues.createComment({
issue_number: context.issue.number,
owner: context.repo.owner,
repo: context.repo.repo,
body: `✅ Marked finding \`${{ steps.parse.outputs.finding_id }}\` as false positive.\n\nReason: ${{ steps.parse.outputs.reason }}\n\nThis pattern will be learned for future scans.`
})
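The grep -oP extraction above is fragile for reasons containing unusual whitespace, and the command grammar itself is untested. The same parsing can be expressed in Python, which makes the `/sast false-positive finding: <id> reason: <text>` convention testable (the function name is illustrative):

```python
import re

COMMAND = re.compile(
    r'/sast\s+false-positive\s+finding:\s*(?P<finding_id>\S+)'
    r'(?:\s+reason:\s*(?P<reason>.+))?',
    re.DOTALL
)

def parse_sast_command(comment: str):
    """Extract finding id and optional reason from a PR comment, or None."""
    match = COMMAND.search(comment)
    if not match:
        return None
    return {
        'finding_id': match.group('finding_id'),
        'reason': (match.group('reason') or 'N/A').strip(),
    }

print(parse_sast_command('/sast false-positive finding: abc123 reason: test fixture'))
```

A parser like this can be shared between the workflow step and the triage CLI so the two never drift apart.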
SAST Metrics Dashboard
# scripts/security/sast-dashboard.py
"""
SAST Metrics Dashboard Generator
Compliance: SOC 2 CC7.2 (System Monitoring)
Output: JSON metrics for Grafana/Cloud Monitoring
"""
import json
from datetime import datetime, timedelta
from pathlib import Path
from typing import Dict, List
class SASTMetricsDashboard:
"""Generate SAST metrics for security operations dashboard"""
def __init__(self, reports_dir: str = '.semgrep/reports'):
self.reports_dir = Path(reports_dir)
def generate_metrics(self) -> Dict:
"""Generate comprehensive SAST metrics"""
metrics = {
'timestamp': datetime.utcnow().isoformat(),
'period': '30d',
'findings': self._calculate_findings_metrics(),
'trends': self._calculate_trends(),
'coverage': self._calculate_coverage(),
'compliance': self._calculate_compliance_metrics(),
'performance': self._calculate_performance_metrics()
}
return metrics
def _calculate_findings_metrics(self) -> Dict:
"""Calculate findings distribution and status"""
# Load latest report
latest_report = self._get_latest_report()
if not latest_report:
return {}
findings = latest_report.get('results', [])
by_severity = {
'ERROR': 0,
'WARNING': 0,
'INFO': 0
}
by_category = {}
by_confidence = {}
for finding in findings:
severity = finding.get('extra', {}).get('severity', 'WARNING')
by_severity[severity] = by_severity.get(severity, 0) + 1
category = finding.get('extra', {}).get('metadata', {}).get('category')
if category:
by_category[category] = by_category.get(category, 0) + 1
confidence = finding.get('extra', {}).get('metadata', {}).get('confidence')
if confidence:
by_confidence[confidence] = by_confidence.get(confidence, 0) + 1
return {
'total': len(findings),
'by_severity': by_severity,
'by_category': by_category,
'by_confidence': by_confidence,
'critical_count': by_severity.get('ERROR', 0)
}
def _calculate_trends(self) -> Dict:
"""Calculate 30-day trend analysis"""
reports = self._get_reports_last_n_days(30)
daily_counts = {}
for report in reports:
date = report['scan_date']
count = len(report.get('results', []))
daily_counts[date] = count
# Calculate trend direction
if len(daily_counts) >= 7:
recent_avg = sum(list(daily_counts.values())[-7:]) / 7
older_avg = sum(list(daily_counts.values())[-14:-7]) / 7 if len(daily_counts) >= 14 else recent_avg
trend = 'improving' if recent_avg < older_avg else 'worsening' if recent_avg > older_avg else 'stable'
else:
trend = 'insufficient_data'
return {
'daily_counts': daily_counts,
'trend': trend,
'data_points': len(daily_counts)
}
def _calculate_coverage(self) -> Dict:
"""Calculate SAST coverage metrics"""
return {
'languages': ['python', 'javascript', 'typescript'],
'rulesets': [
'security-audit',
'owasp-top-ten',
'healthcare-compliance',
'fda-21-cfr-part-11',
'hipaa-security'
],
'files_scanned': self._count_scanned_files(),
'lines_of_code': self._estimate_loc()
}
def _calculate_compliance_metrics(self) -> Dict:
"""Calculate regulatory compliance metrics"""
latest_report = self._get_latest_report()
if not latest_report:
return {}
findings = latest_report.get('results', [])
by_regulation = {
'FDA 21 CFR Part 11': 0,
'HIPAA Security Rule': 0,
'SOC 2': 0,
'OWASP Top 10': 0
}
for finding in findings:
metadata = finding.get('extra', {}).get('metadata', {})
if metadata.get('regulatory'):
if '21 CFR' in metadata['regulatory']:
by_regulation['FDA 21 CFR Part 11'] += 1
if 'hipaa' in metadata.get('subcategory', []):
by_regulation['HIPAA Security Rule'] += 1
if 'soc2' in metadata.get('subcategory', []):
by_regulation['SOC 2'] += 1
if metadata.get('owasp'):
by_regulation['OWASP Top 10'] += 1
return {
'findings_by_regulation': by_regulation,
'total_compliance_findings': sum(by_regulation.values())
}
def _calculate_performance_metrics(self) -> Dict:
"""Calculate SAST pipeline performance"""
latest_report = self._get_latest_report()
if not latest_report:
return {}
return {
'scan_duration_seconds': latest_report.get('scan_duration', 0),
'rules_executed': latest_report.get('rules_count', 0),
'false_positive_rate': self._calculate_fp_rate()
}
def _get_latest_report(self) -> Dict:
"""Get most recent SAST report"""
reports = list(self.reports_dir.glob('semgrep-*.json'))
if not reports:
return {}
latest = max(reports, key=lambda p: p.stat().st_mtime)
with open(latest) as f:
return json.load(f)
def _get_reports_last_n_days(self, n: int) -> List[Dict]:
"""Get all reports from last N days"""
cutoff = datetime.now() - timedelta(days=n)
reports = []
for report_path in self.reports_dir.glob('semgrep-*.json'):
if datetime.fromtimestamp(report_path.stat().st_mtime) >= cutoff:
with open(report_path) as f:
data = json.load(f)
data['scan_date'] = datetime.fromtimestamp(
report_path.stat().st_mtime
).strftime('%Y-%m-%d')
reports.append(data)
return sorted(reports, key=lambda r: r['scan_date'])
    def _count_scanned_files(self) -> int:
        """Count total files in scan scope"""
        # Simplified - production would honor .semgrepignore
        return sum(
            1
            for ext in ('*.py', '*.js', '*.ts', '*.tsx')
            for _ in Path('.').rglob(ext)
        )
def _estimate_loc(self) -> int:
"""Estimate total lines of code"""
total = 0
        for ext in ['*.py', '*.js', '*.ts', '*.tsx']:
            for file_path in Path('.').rglob(ext):
                try:
                    with open(file_path, encoding='utf-8', errors='ignore') as f:
                        total += sum(1 for _ in f)
                except OSError:
                    continue
        return total
    def _calculate_fp_rate(self) -> float:
        """Calculate false positive rate from triage database"""
        try:
            import sqlite3
            conn = sqlite3.connect('.semgrep/triage.db')
            cursor = conn.cursor()
            cursor.execute('''
                SELECT
                    COUNT(*) as total,
                    SUM(CASE WHEN false_positive = 1 THEN 1 ELSE 0 END) as fp_count
                FROM findings
                WHERE status != 'new'
            ''')
            total, fp_count = cursor.fetchone()
            conn.close()
            # SUM() returns NULL (None) when no rows match; guard both cases
            return (fp_count or 0) / total if total else 0.0
        except sqlite3.Error:
            return 0.0
if __name__ == '__main__':
dashboard = SASTMetricsDashboard()
metrics = dashboard.generate_metrics()
print(json.dumps(metrics, indent=2))
M.1.2: DAST Scanning Pipeline
Implementation Overview
Dynamic Application Security Testing (DAST) uses OWASP ZAP for automated security scanning of the running application, including authenticated scanning and API security testing.
Technology Stack:
- Primary Tool: OWASP ZAP 2.14+
- Orchestration: Python automation scripts
- CI/CD Integration: Cloud Build, scheduled scans
- Reporting: HTML, JSON, SARIF formats
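The JSON report produced at the end of a scan can be post-processed to enforce a severity gate in CI, analogous to the `--failOnCVSS` gate on the SCA side. A sketch assuming ZAP's traditional-json layout of `site[].alerts[].riskcode`, where riskcode 3 is High (the layout should be verified against the ZAP version in use; the report fragment is illustrative):

```python
def zap_risk_gate(report: dict, fail_risk: int = 3) -> list:
    """Return names of alerts whose riskcode meets or exceeds fail_risk."""
    failing = []
    for site in report.get('site', []):
        for alert in site.get('alerts', []):
            # riskcode is serialized as a string in the JSON report
            if int(alert.get('riskcode', 0)) >= fail_risk:
                failing.append(alert.get('name'))
    return failing

# Illustrative report fragment
report = {'site': [{'alerts': [
    {'name': 'SQL Injection', 'riskcode': '3'},
    {'name': 'Cookie Without Secure Flag', 'riskcode': '1'},
]}]}
assert zap_risk_gate(report) == ['SQL Injection']
```

The gate threshold can be relaxed per environment, e.g. failing staging scans on Medium and above while production smoke scans fail only on High.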
OWASP ZAP Automation Framework
# .zap/automation/full-scan.yaml
---
env:
contexts:
- name: bio-qms-app
urls:
- https://staging.bioqms.com
includePaths:
- "https://staging.bioqms.com/.*"
excludePaths:
- "https://staging.bioqms.com/static/.*"
- "https://staging.bioqms.com/media/.*"
- "https://staging.bioqms.com/admin/.*" # Separate admin scan
authentication:
method: form
parameters:
loginUrl: "https://staging.bioqms.com/auth/login"
loginRequestData: "username={%username%}&password={%password%}"
verification:
method: poll
pollUrl: "https://staging.bioqms.com/api/v1/auth/session"
pollData: ""
pollFrequency: 60
pollUnits: requests
sessionManagement:
method: cookie
parameters: {}
technology:
include:
- Django
- PostgreSQL
- React
- REST API
users:
- name: quality_manager
credentials:
username: "${QM_USERNAME}"
password: "${QM_PASSWORD}"
- name: lab_technician
credentials:
username: "${LAB_USERNAME}"
password: "${LAB_PASSWORD}"
- name: auditor
credentials:
username: "${AUDITOR_USERNAME}"
password: "${AUDITOR_PASSWORD}"
parameters:
failOnError: true
failOnWarning: false
progressToStdout: true
vars:
api_base: https://staging.bioqms.com/api/v1
jobs:
- type: passiveScan-config
parameters:
maxAlertsPerRule: 10
scanOnlyInScope: true
maxBodySizeInBytesToScan: 10000
enableTags: true
- type: passiveScan-wait
parameters:
maxDuration: 30
- type: spider
parameters:
context: bio-qms-app
user: quality_manager
url: https://staging.bioqms.com
maxDuration: 10
maxDepth: 5
maxChildren: 10
acceptCookies: true
handleODataParametersVisited: true
parseComments: true
parseRobotsTxt: false
parseSitemapXml: false
postForm: true
processForm: true
requestWaitTime: 200
- type: spiderAjax
parameters:
context: bio-qms-app
user: quality_manager
url: https://staging.bioqms.com
maxDuration: 10
maxCrawlDepth: 5
numberOfBrowsers: 1
browserId: firefox-headless
clickDefaultElems: true
clickElemsOnce: true
eventWait: 1000
maxCrawlStates: 100
randomInputs: true
- type: openapi
parameters:
apiFile: https://staging.bioqms.com/api/v1/schema/openapi.json
apiUrl: ${api_base}
targetUrl: ${api_base}
context: bio-qms-app
- type: activeScan
parameters:
context: bio-qms-app
user: quality_manager
policy: API-scan
maxRuleDurationInMins: 5
maxScanDurationInMins: 60
addQueryParam: false
defaultPolicy: false
delayInMs: 0
handleAntiCSRFTokens: true
injectPluginIdInHeader: true
scanHeadersAllRequests: true
threadPerHost: 2
- type: activeScan
name: healthcare-compliance-scan
parameters:
context: bio-qms-app
user: quality_manager
policy: healthcare-compliance
maxScanDurationInMins: 30
- type: report
parameters:
template: traditional-html
reportDir: /zap/reports
reportFile: zap-full-scan-${DATE}.html
reportTitle: BIO-QMS DAST Full Scan
reportDescription: "Comprehensive DAST scan including authenticated testing"
displayReport: false
- type: report
parameters:
template: traditional-json
reportDir: /zap/reports
reportFile: zap-full-scan-${DATE}.json
displayReport: false
- type: report
parameters:
template: sarif-json
reportDir: /zap/reports
reportFile: zap-full-scan-${DATE}.sarif
displayReport: false
Custom ZAP Scan Policies
<!-- .zap/policies/healthcare-compliance.policy -->
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<configuration>
<policy>Healthcare Compliance Scan Policy</policy>
<scanner>
<!-- Authentication Testing -->
<name>Authentication Credentials Capture</name>
<id>40014</id>
<enabled>true</enabled>
<level>MEDIUM</level>
</scanner>
<scanner>
<name>Session Fixation</name>
<id>40013</id>
<enabled>true</enabled>
<level>MEDIUM</level>
</scanner>
<scanner>
<name>Insufficient Session Expiration</name>
<id>10001</id>
<enabled>true</enabled>
<level>HIGH</level>
</scanner>
<!-- Access Control Testing -->
<scanner>
<name>Path Traversal</name>
<id>6</id>
<enabled>true</enabled>
<level>HIGH</level>
</scanner>
<scanner>
<name>Remote File Inclusion</name>
<id>7</id>
<enabled>true</enabled>
<level>HIGH</level>
</scanner>
<scanner>
<name>Directory Browsing</name>
<id>0</id>
<enabled>true</enabled>
<level>MEDIUM</level>
</scanner>
<!-- Injection Testing -->
<scanner>
<name>SQL Injection</name>
<id>40018</id>
<enabled>true</enabled>
<level>HIGH</level>
</scanner>
<scanner>
<name>SQL Injection - PostgreSQL</name>
<id>40019</id>
<enabled>true</enabled>
<level>HIGH</level>
</scanner>
<scanner>
<name>Cross Site Scripting (Reflected)</name>
<id>40012</id>
<enabled>true</enabled>
<level>HIGH</level>
</scanner>
<scanner>
<name>Cross Site Scripting (Persistent)</name>
<id>40014</id>
<enabled>true</enabled>
<level>HIGH</level>
</scanner>
<!-- Data Exposure Testing -->
<scanner>
<name>Information Disclosure - Sensitive Information in URL</name>
<id>10024</id>
<enabled>true</enabled>
<level>MEDIUM</level>
</scanner>
<scanner>
<name>Information Disclosure - Debug Error Messages</name>
<id>10023</id>
<enabled>true</enabled>
<level>LOW</level>
</scanner>
<scanner>
<name>Application Error Disclosure</name>
<id>90022</id>
<enabled>true</enabled>
<level>MEDIUM</level>
</scanner>
<!-- Cryptography Testing -->
<scanner>
<name>Weak Authentication Method</name>
<id>10105</id>
<enabled>true</enabled>
<level>MEDIUM</level>
</scanner>
<scanner>
<name>Insecure HTTP Method</name>
<id>90028</id>
<enabled>true</enabled>
<level>MEDIUM</level>
</scanner>
<!-- Business Logic Testing -->
<scanner>
<name>Parameter Tampering</name>
<id>40008</id>
<enabled>true</enabled>
<level>MEDIUM</level>
</scanner>
<scanner>
<name>CSRF</name>
<id>20012</id>
<enabled>true</enabled>
<level>HIGH</level>
</scanner>
</configuration>
Authenticated Scanning Implementation
# scripts/security/zap-authenticated-scan.py
"""
OWASP ZAP Authenticated Security Scanning
Implements authenticated DAST scanning with role-based testing.
Compliance: SOC 2 CC6.6 (Logical Access - Security Testing)
"""
import os
import json
import time
import logging
from datetime import datetime
from typing import Dict, List, Optional
from pathlib import Path
from zapv2 import ZAPv2
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
class AuthenticatedDASTScanner:
"""
Orchestrates authenticated DAST scanning with multiple user roles.
Features:
- Multi-role authentication testing
- API security scanning
- Compliance-focused scan policies
- Automated report generation
"""
def __init__(
self,
target_url: str,
api_key: str,
zap_host: str = 'localhost',
zap_port: int = 8080
):
self.target_url = target_url
self.zap = ZAPv2(
apikey=api_key,
proxies={
'http': f'http://{zap_host}:{zap_port}',
'https': f'http://{zap_host}:{zap_port}'
}
)
self.context_id = None
self.scan_id = None
def setup_context(self, context_name: str = 'bio-qms-app') -> str:
"""Create and configure ZAP context"""
logger.info(f"Setting up context: {context_name}")
# Create context
self.context_id = self.zap.context.new_context(context_name)
# Include in context
self.zap.context.include_in_context(
context_name,
f"{self.target_url}/.*"
)
# Exclude static resources
excludes = [
f"{self.target_url}/static/.*",
f"{self.target_url}/media/.*"
]
for pattern in excludes:
self.zap.context.exclude_from_context(context_name, pattern)
        # Bring the context into scope
        self.zap.context.set_context_in_scope(context_name, True)
logger.info(f"Context created: {self.context_id}")
return self.context_id
def configure_authentication(
self,
context_name: str,
login_url: str,
username_field: str = 'username',
password_field: str = 'password'
):
"""Configure form-based authentication"""
logger.info("Configuring form-based authentication")
        # Set authentication method. ZAP requires loginRequestData to be
        # URL-encoded so the credential placeholders survive as a single
        # parameter value rather than being split on '&'.
        from urllib.parse import quote
        login_request_data = (
            f'{username_field}={{%username%}}&{password_field}={{%password%}}'
        )
        auth_method_config = (
            f'loginUrl={quote(login_url, safe="")}&'
            f'loginRequestData={quote(login_request_data, safe="")}'
        )
        self.zap.authentication.set_authentication_method(
            self.context_id,
            'formBasedAuthentication',
            auth_method_config
        )
# Set logged in indicator
logged_in_indicator = r'\QWelcome\E|\Qlogout\E'
self.zap.authentication.set_logged_in_indicator(
self.context_id,
logged_in_indicator
)
# Set logged out indicator
logged_out_indicator = r'\Qlogin\E|\QSign In\E'
self.zap.authentication.set_logged_out_indicator(
self.context_id,
logged_out_indicator
)
logger.info("Authentication configured")
def add_user(
self,
context_name: str,
username: str,
password: str,
role: str = 'user'
) -> str:
"""Add authenticated user to context"""
logger.info(f"Adding user: {username} (role: {role})")
user_id = self.zap.users.new_user(self.context_id, username)
# Set credentials
auth_credentials = f'username={username}&password={password}'
self.zap.users.set_authentication_credentials(
self.context_id,
user_id,
auth_credentials
)
# Enable user
self.zap.users.set_user_enabled(self.context_id, user_id, True)
logger.info(f"User added: {user_id}")
return user_id
    def spider_as_user(self, user_id: str, max_depth: int = 5) -> str:
        """Run spider scan as authenticated user"""
        logger.info(f"Starting spider scan for user: {user_id}")
        # Apply the depth limit before starting the crawl
        self.zap.spider.set_option_max_depth(max_depth)
        scan_id = self.zap.spider.scan_as_user(
            self.context_id,
            user_id,
            self.target_url,
            maxchildren=10,
            recurse=True,
            subtreeonly=False
        )
# Wait for spider to complete
while int(self.zap.spider.status(scan_id)) < 100:
logger.info(f"Spider progress: {self.zap.spider.status(scan_id)}%")
time.sleep(5)
logger.info("Spider scan completed")
return scan_id
    def ajax_spider_as_user(self, user_id: str) -> None:
        """Run AJAX spider for SPA content discovery"""
        logger.info(f"Starting AJAX spider for user: {user_id}")
        # ajaxSpider.scan_as_user() takes the context *name* and user *name*,
        # not their numeric IDs, so resolve the username from the users list
        users = self.zap.users.users_list(self.context_id)
        username = next(u['name'] for u in users if u['id'] == user_id)
        self.zap.ajaxSpider.scan_as_user(
            'bio-qms-app',  # matches the context name used in setup_context()
            username,
            self.target_url
        )
        # Wait for AJAX spider
        while self.zap.ajaxSpider.status == 'running':
            logger.info("AJAX spider running...")
            time.sleep(5)
        logger.info("AJAX spider completed")
def active_scan_as_user(
self,
user_id: str,
policy: str = 'Default Policy',
max_duration_minutes: int = 60
) -> str:
"""Run active scan as authenticated user"""
logger.info(f"Starting active scan for user: {user_id}")
        # ascan.scan_as_user() takes the URL as its first parameter, so pass
        # keyword arguments to avoid positional mix-ups; scope is already
        # enforced by the context itself
        scan_id = self.zap.ascan.scan_as_user(
            url=self.target_url,
            contextid=self.context_id,
            userid=user_id,
            recurse=True,
            scanpolicyname=policy
        )
start_time = time.time()
# Monitor scan progress
while int(self.zap.ascan.status(scan_id)) < 100:
elapsed = (time.time() - start_time) / 60
if elapsed > max_duration_minutes:
logger.warning(f"Scan exceeded {max_duration_minutes} minutes, stopping")
self.zap.ascan.stop(scan_id)
break
status = self.zap.ascan.status(scan_id)
logger.info(f"Active scan progress: {status}%")
time.sleep(10)
logger.info("Active scan completed")
return scan_id
def scan_api_endpoints(self, openapi_spec_url: str):
"""Import and scan API from OpenAPI spec"""
logger.info(f"Importing OpenAPI spec: {openapi_spec_url}")
# Import OpenAPI definition
self.zap.openapi.import_url(openapi_spec_url, self.target_url)
logger.info("OpenAPI import completed")
def generate_reports(self, output_dir: str = '/tmp/zap-reports') -> Dict[str, str]:
"""Generate scan reports in multiple formats"""
logger.info("Generating reports")
Path(output_dir).mkdir(parents=True, exist_ok=True)
timestamp = datetime.now().strftime('%Y%m%d-%H%M%S')
reports = {}
# HTML report
html_report = f"{output_dir}/zap-report-{timestamp}.html"
with open(html_report, 'w') as f:
f.write(self.zap.core.htmlreport())
reports['html'] = html_report
logger.info(f"HTML report: {html_report}")
# JSON report
json_report = f"{output_dir}/zap-report-{timestamp}.json"
alerts = json.loads(self.zap.core.jsonreport())
with open(json_report, 'w') as f:
json.dump(alerts, f, indent=2)
reports['json'] = json_report
logger.info(f"JSON report: {json_report}")
# XML report
xml_report = f"{output_dir}/zap-report-{timestamp}.xml"
with open(xml_report, 'w') as f:
f.write(self.zap.core.xmlreport())
reports['xml'] = xml_report
logger.info(f"XML report: {xml_report}")
return reports
def get_alerts_summary(self) -> Dict:
"""Get summary of scan findings"""
alerts = self.zap.core.alerts()
by_risk = {'High': 0, 'Medium': 0, 'Low': 0, 'Informational': 0}
by_confidence = {'High': 0, 'Medium': 0, 'Low': 0}
for alert in alerts:
risk = alert.get('risk', 'Informational')
confidence = alert.get('confidence', 'Medium')
by_risk[risk] = by_risk.get(risk, 0) + 1
by_confidence[confidence] = by_confidence.get(confidence, 0) + 1
return {
'total_alerts': len(alerts),
'by_risk': by_risk,
'by_confidence': by_confidence,
'high_risk_count': by_risk['High'],
'medium_risk_count': by_risk['Medium']
}
def run_full_scan(
self,
users: List[Dict[str, str]],
openapi_spec_url: Optional[str] = None
) -> Dict:
"""
Execute complete authenticated scan workflow.
Args:
users: List of dicts with 'username', 'password', 'role'
openapi_spec_url: Optional OpenAPI specification URL
Returns:
Scan results summary and report paths
"""
logger.info("Starting full authenticated DAST scan")
# Setup
context_name = 'bio-qms-app'
self.setup_context(context_name)
# Configure auth
login_url = f"{self.target_url}/auth/login"
self.configure_authentication(context_name, login_url)
# Add users
user_ids = []
for user in users:
user_id = self.add_user(
context_name,
user['username'],
user['password'],
user.get('role', 'user')
)
user_ids.append(user_id)
# Scan as each user
for user_id in user_ids:
self.spider_as_user(user_id)
self.ajax_spider_as_user(user_id)
self.active_scan_as_user(user_id, policy='healthcare-compliance')
# API scanning
if openapi_spec_url:
self.scan_api_endpoints(openapi_spec_url)
# Generate reports
reports = self.generate_reports()
# Get summary
summary = self.get_alerts_summary()
results = {
'scan_completed': datetime.now().isoformat(),
'target_url': self.target_url,
'users_tested': len(users),
'summary': summary,
'reports': reports
}
logger.info("Full scan completed")
logger.info(f"Results: {json.dumps(summary, indent=2)}")
return results
# CLI interface
if __name__ == '__main__':
import argparse
parser = argparse.ArgumentParser(description='Authenticated DAST Scanning')
parser.add_argument('--target', required=True, help='Target URL')
parser.add_argument('--api-key', required=True, help='ZAP API key')
parser.add_argument('--users-file', required=True, help='JSON file with user credentials')
parser.add_argument('--openapi-spec', help='OpenAPI specification URL')
parser.add_argument('--output-dir', default='/tmp/zap-reports', help='Report output directory')
args = parser.parse_args()
# Load users
with open(args.users_file) as f:
users = json.load(f)
# Run scan
scanner = AuthenticatedDASTScanner(args.target, args.api_key)
results = scanner.run_full_scan(users, args.openapi_spec)
    # Save results (create the output directory if needed)
    Path(args.output_dir).mkdir(parents=True, exist_ok=True)
    results_file = f"{args.output_dir}/scan-results.json"
with open(results_file, 'w') as f:
json.dump(results, f, indent=2)
print(f"\nScan completed. Results: {results_file}")
print(f"High risk findings: {results['summary']['high_risk_count']}")
print(f"Medium risk findings: {results['summary']['medium_risk_count']}")
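Downstream, the `scan-results.json` artifact can drive a CI security gate. The following is a minimal sketch; the thresholds (zero High findings, at most five Medium) are illustrative assumptions, not mandated limits.

```python
# Illustrative CI gate over the scan-results.json written by the CLI above.
import json
import sys


def evaluate_gate(summary: dict, max_high: int = 0, max_medium: int = 5) -> bool:
    """Return True when the scan summary is within the allowed thresholds."""
    return (summary['high_risk_count'] <= max_high
            and summary['medium_risk_count'] <= max_medium)


if __name__ == '__main__' and len(sys.argv) > 1:
    # Usage: python security-gate.py /tmp/zap-reports/scan-results.json
    with open(sys.argv[1]) as f:
        results = json.load(f)
    if not evaluate_gate(results['summary']):
        print('Security gate FAILED: findings exceed thresholds')
        sys.exit(1)
    print('Security gate passed')
```

Exiting non-zero lets the CI job (GitHub Actions or Cloud Build) fail the build when the scan exceeds the agreed thresholds.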
This completes the implementation evidence for M.1.1 (SAST Integration) and M.1.2 (DAST Scanning Pipeline). The remaining sections cover M.1.3, M.1.4, and M.2-M.5.