Security Hardening Plan - Coditect V5
Date: 2025-10-07
Status: 📋 Planning Phase
Target: Production-Ready Security Posture
Timeline: 3 Phases (Immediate, Short-term, Long-term)
🎯 Executive Summary
This document outlines a comprehensive security hardening plan for the Coditect V5 platform deployed on GKE. The plan addresses current security gaps and establishes a roadmap to achieve production-grade security compliance (SOC2, GDPR, HIPAA-ready).
Current Security Posture: 🟡 Development (Not production-ready)
Target Security Posture: 🟢 Production (SOC2 Type II ready)
Estimated Effort: approximately 47-70 hours across 3 phases (see Implementation Checklist for the per-phase breakdown)
📊 Current Security Assessment
✅ What's Already Secure
| Category | Status | Evidence |
|---|---|---|
| Network Isolation | ✅ Good | VPC-native GKE, private pod networking |
| TLS in Transit | ✅ Good | Google-managed SSL cert for coditect.ai |
| RBAC | ✅ Basic | GKE default RBAC enabled |
| Audit Logging | ✅ Enabled | Cloud Audit Logs active |
| JWT Authentication | ✅ Implemented | Token-based API auth with HS256 |
| Multi-tenancy | ✅ Architected | Self-tenant pattern in FDB |
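The JWT layer noted above uses HS256 (HMAC-SHA256). As a reference point for reviewers, here is a minimal stdlib sketch of HS256 signing and verification; it is illustrative only, and the production API is assumed to use a vetted JWT library rather than this code:

```python
import base64
import hashlib
import hmac
import json

def b64url(data: bytes) -> str:
    # JWT uses URL-safe base64 without padding
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def sign_hs256(payload: dict, secret: bytes) -> str:
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = b64url(json.dumps(payload).encode())
    signing_input = f"{header}.{body}".encode()
    sig = b64url(hmac.new(secret, signing_input, hashlib.sha256).digest())
    return f"{header}.{body}.{sig}"

def verify_hs256(token: str, secret: bytes) -> bool:
    header, body, sig = token.split(".")
    expected = b64url(hmac.new(secret, f"{header}.{body}".encode(), hashlib.sha256).digest())
    # Constant-time comparison avoids timing side channels
    return hmac.compare_digest(sig, expected)

secret = b"example-256-bit-secret-keep-out-of-git!!"
token = sign_hs256({"sub": "user-1", "tenant": "acme"}, secret)
assert verify_hs256(token, secret)
assert not verify_hs256(token, b"wrong-secret")
```

The key takeaway for this plan: with HS256, whoever holds the shared secret can mint valid tokens, which is why the secret's storage (Section 1.1) and rotation (Section 2.4) matter so much.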
🟡 What Needs Improvement
| Category | Risk Level | Issue | Impact |
|---|---|---|---|
| Secret Management | 🔴 High | JWT secret in K8s Secret (base64) | Secret exposure if cluster compromised |
| Static IPs | 🟡 Medium | Ephemeral LoadBalancer IPs | Service disruption on recreate |
| Network Policies | 🔴 High | No NetworkPolicies defined | Unrestricted pod-to-pod traffic |
| Pod Security | 🟡 Medium | No PodSecurityPolicy/Standards | Pods can run as root |
| Image Security | 🟡 Medium | No image scanning | Vulnerable dependencies |
| Firewall Rules | 🟡 Medium | SSH open to 0.0.0.0/0 | Brute-force risk |
| Private Cluster | 🟡 Medium | Public control plane | Increased attack surface |
| Secrets Rotation | 🔴 High | No automatic rotation | Stale credentials |
| Binary Authorization | 🟠 Low | Not enabled | Unsigned images can deploy |
| WAF | 🟠 Low | No Cloud Armor | No DDoS/injection protection |
🛡️ Hardening Roadmap
Phase 1: Critical Security (Week 1) - Immediate
Goal: Address high-risk vulnerabilities that could lead to immediate compromise.
Estimated Time: 11-16 hours
1.1 Secret Management Migration
Current: JWT secret stored in a Kubernetes Secret (base64 encoded, which is encoding rather than encryption; no application-layer encryption unless CMEK is configured)
Target: Google Secret Manager with Workload Identity
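To make the risk concrete: base64 is reversible by anyone with read access to the Secret object, so it provides no confidentiality at all:

```python
import base64

# What kubectl sees in a Secret manifest: plain base64, trivially reversible
encoded = base64.b64encode(b"super-secret-jwt-key").decode()
print(encoded)                             # c3VwZXItc2VjcmV0LWp3dC1rZXk=
print(base64.b64decode(encoded).decode())  # super-secret-jwt-key
```

Any principal with `get` on Secrets in the namespace recovers the plaintext in one command, which is why migration to Secret Manager with Workload Identity is prioritized.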
Implementation:
# 1. Create secret in Secret Manager
gcloud secrets create jwt-secret \
--replication-policy="automatic" \
--project=serene-voltage-464305-n2
# 2. Add secret value
echo -n "YOUR_SECURE_JWT_SECRET_256_BITS" | \
gcloud secrets versions add jwt-secret --data-file=-
# 3. Grant Workload Identity access
gcloud secrets add-iam-policy-binding jwt-secret \
--member="serviceAccount:coditect-api-v5@serene-voltage-464305-n2.iam.gserviceaccount.com" \
--role="roles/secretmanager.secretAccessor"
# 4. Update Terraform to use Secret Manager
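The placeholder value above should be a high-entropy 256-bit secret. A quick way to generate one (a stdlib sketch; `openssl rand -base64 32` works equally well):

```python
import secrets

# 32 random bytes = 256 bits of entropy, URL-safe base64-encoded
jwt_secret = secrets.token_urlsafe(32)
print(len(jwt_secret))  # 43 characters
```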
Terraform Example:
# modules/api-deployment/main.tf
data "google_secret_manager_secret_version" "jwt_secret" {
  secret = "jwt-secret"
}

# Note: mirroring into a K8s Secret is an interim step; for full benefit,
# fetch at runtime via Workload Identity or the Secret Manager CSI driver.
resource "kubernetes_secret" "jwt_secret" {
  metadata {
    name      = "jwt-secret-k8s"
    namespace = var.namespace
  }
  data = {
    secret = data.google_secret_manager_secret_version.jwt_secret.secret_data
  }
}
Validation:
# Verify the secret is present without printing its value to the terminal
kubectl exec -n coditect-app <pod-name> -- sh -c 'test -n "$JWT_SECRET" && echo "JWT_SECRET is set"'
Timeline: 2-3 hours
1.2 Network Policies
Current: All pods can communicate with all other pods (default allow-all)
Target: Zero-trust network segmentation
Implementation:
Policy 1: Deny All by Default
# network-policy-deny-all.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-all-ingress
  namespace: coditect-app
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress
Policy 2: Allow API → FoundationDB
# network-policy-api-to-fdb.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: api-to-fdb
  namespace: coditect-app
spec:
  podSelector:
    matchLabels:
      app: foundationdb
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: coditect-api-v5
    ports:
    - protocol: TCP
      port: 4500
Policy 3: Allow LoadBalancer → API
# network-policy-lb-to-api.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-lb-to-api
  namespace: coditect-app
spec:
  podSelector:
    matchLabels:
      app: coditect-api-v5
  policyTypes:
  - Ingress
  ingress:
  - from:
    # External LoadBalancer traffic originates outside the cluster, so an
    # ipBlock is required; tighten to the LB/health-check ranges if known
    - ipBlock:
        cidr: 0.0.0.0/0
    ports:
    - protocol: TCP
      port: 8080
Policy 4: Allow API Egress (DNS + FDB + External)
# network-policy-api-egress.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: api-egress
  namespace: coditect-app
spec:
  podSelector:
    matchLabels:
      app: coditect-api-v5
  policyTypes:
  - Egress
  egress:
  # DNS (namespace and pod selectors in one entry so they AND together)
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system
      podSelector:
        matchLabels:
          k8s-app: kube-dns
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53
  # FoundationDB
  - to:
    - podSelector:
        matchLabels:
          app: foundationdb
    ports:
    - protocol: TCP
      port: 4500
  # External HTTPS (for future integrations)
  - to:
    - ipBlock:
        cidr: 0.0.0.0/0
    ports:
    - protocol: TCP
      port: 443
Apply Policies:
kubectl apply -f network-policy-deny-all.yaml
kubectl apply -f network-policy-api-to-fdb.yaml
kubectl apply -f network-policy-lb-to-api.yaml
kubectl apply -f network-policy-api-egress.yaml
Validation:
# Test API can reach FDB
kubectl exec -n coditect-app <api-pod> -- nc -zv fdb-cluster.coditect-app.svc.cluster.local 4500
# Test API cannot reach unrelated pods (should fail)
kubectl exec -n coditect-app <api-pod> -- nc -zv <other-pod-ip> 80
Timeline: 4-6 hours (includes testing)
1.3 Firewall Rule Hardening
Current: SSH allowed from 0.0.0.0/0
Target: Restrict SSH to known IPs only
Implementation:
# 1. Identify current SSH firewall rule
gcloud compute firewall-rules list --filter="name~ssh" --project=serene-voltage-464305-n2
# 2. Update to restrict source IPs
gcloud compute firewall-rules update allow-ssh \
--source-ranges="YOUR_OFFICE_IP/32,YOUR_VPN_IP/32" \
--project=serene-voltage-464305-n2
# 3. Create bastion host for emergency access (optional)
# See Section 2.5 for bastion setup
Terraform Example:
# modules/networking/main.tf
resource "google_compute_firewall" "allow_ssh" {
  name    = "allow-ssh-restricted"
  network = google_compute_network.vpc.name
  project = var.project_id

  allow {
    protocol = "tcp"
    ports    = ["22"]
  }

  source_ranges = var.allowed_ssh_ips # ["1.2.3.4/32", "5.6.7.8/32"]
  target_tags   = ["allow-ssh"]
}
Timeline: 1 hour
1.4 Pod Security Standards
Current: Pods can run as root, with privileged escalation
Target: Enforce restricted Pod Security Standards
Implementation:
Enable Pod Security Admission (K8s 1.25+):
# namespace-pod-security.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: coditect-app
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted
Update API Deployment to Comply:
# api-deployment.yaml (rendered by modules/api-deployment/main.tf)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: coditect-api-v5
spec:
  template:
    spec:
      securityContext:
        runAsNonRoot: true
        runAsUser: 1000
        fsGroup: 1000
        seccompProfile:
          type: RuntimeDefault
      containers:
      - name: api
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop:
            - ALL
          readOnlyRootFilesystem: true # May require volumeMounts for tmp
Apply:
kubectl label namespace coditect-app pod-security.kubernetes.io/enforce=restricted
kubectl apply -f updated-deployment.yaml
Validation:
# Verify pod is not running as root
kubectl exec -n coditect-app <pod-name> -- id
# Output: uid=1000 gid=1000 (NOT uid=0)
Timeline: 3-4 hours (includes fixing deployment spec)
1.5 Static IP Reservation
Current: Ephemeral LoadBalancer IPs (change on recreate)
Target: Static reserved IPs for all LoadBalancers
Implementation:
# 1. Reserve IPs
gcloud compute addresses create api-lb-static-ip \
--region=us-central1 \
--project=serene-voltage-464305-n2
gcloud compute addresses create workspace-lb-static-ip \
--region=us-central1 \
--project=serene-voltage-464305-n2
# 2. Get IP addresses
API_STATIC_IP=$(gcloud compute addresses describe api-lb-static-ip --region=us-central1 --format="get(address)")
WORKSPACE_STATIC_IP=$(gcloud compute addresses describe workspace-lb-static-ip --region=us-central1 --format="get(address)")
# 3. Update LoadBalancer services
kubectl patch svc api-loadbalancer -n coditect-app -p "{\"spec\":{\"loadBalancerIP\":\"$API_STATIC_IP\"}}"
kubectl patch svc codi-workspace-lb -n codi-workspaces -p "{\"spec\":{\"loadBalancerIP\":\"$WORKSPACE_STATIC_IP\"}}"
# 4. Update DNS
# Point coditect.ai A record to $API_STATIC_IP
Terraform Example:
# modules/api-deployment/main.tf
resource "google_compute_address" "api_lb_ip" {
  name   = "api-lb-static-ip"
  region = var.region
}

resource "kubernetes_service" "api_loadbalancer" {
  metadata {
    name      = "api-loadbalancer"
    namespace = var.namespace
  }
  spec {
    type             = "LoadBalancer"
    load_balancer_ip = google_compute_address.api_lb_ip.address
    # ...
  }
}
Timeline: 1-2 hours
Phase 2: Enhanced Security (Week 2-3) - Short-term
Goal: Implement defense-in-depth and compliance requirements.
Estimated Time: 18-30 hours
2.1 Image Vulnerability Scanning
Current: No automated scanning of container images
Target: Continuous scanning with blocking on critical CVEs
Implementation:
Enable Container Scanning:
# 1. Enable Artifact Registry vulnerability scanning
gcloud services enable containerscanning.googleapis.com \
--project=serene-voltage-464305-n2
# 2. Scanning runs automatically on each push once the API is enabled.
# Review findings for an image:
gcloud artifacts docker images describe <IMAGE_PATH> --show-package-vulnerability
Cloud Build Integration:
# cloudbuild.yaml
steps:
- name: 'gcr.io/cloud-builders/docker'
  args: ['build', '-t', 'gcr.io/$PROJECT_ID/coditect-v5-api:$SHORT_SHA', '.']

# Scan the image on demand and fail the build on critical findings
- name: 'gcr.io/cloud-builders/gcloud'
  entrypoint: 'bash'
  args:
  - '-c'
  - |
    SCAN=$(gcloud artifacts docker images scan \
      gcr.io/$PROJECT_ID/coditect-v5-api:$SHORT_SHA \
      --format='value(response.scan)')
    CRITICAL=$(gcloud artifacts docker images list-vulnerabilities "$SCAN" \
      --format='value(vulnerability.effectiveSeverity)' | grep -c CRITICAL || true)
    if [[ "$CRITICAL" -gt 0 ]]; then
      echo "❌ Found $CRITICAL critical vulnerabilities"
      exit 1
    fi
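The severity-gating rule used in the build step can be reasoned about in isolation. A hedged sketch of the counting logic, using an illustrative JSON shape (the field names here are assumptions; the real output format depends on the scanner):

```python
import json
from collections import Counter

# Hypothetical scanner output; real field names depend on the scan tooling.
scan_output = json.loads("""
[
  {"cve": "CVE-2024-0001", "severity": "CRITICAL"},
  {"cve": "CVE-2024-0002", "severity": "HIGH"},
  {"cve": "CVE-2023-9999", "severity": "LOW"}
]
""")

counts = Counter(v["severity"] for v in scan_output)
build_ok = counts["CRITICAL"] == 0  # gate: any critical finding fails the build
print(dict(counts), "build_ok:", build_ok)
```

The same gate can later be extended to block on HIGH findings once the backlog of existing CVEs is cleared.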
Timeline: 3-4 hours
2.2 Binary Authorization
Current: Any image can be deployed to cluster
Target: Only signed/verified images can deploy
Implementation:
Enable Binary Authorization:
# 1. Enable API
gcloud services enable binaryauthorization.googleapis.com \
--project=serene-voltage-464305-n2
# 2. Create attestor
gcloud container binauthz attestors create prod-attestor \
--attestation-authority-note=prod-note \
--attestation-authority-note-project=serene-voltage-464305-n2
# 3. Create policy
cat > policy.yaml << EOF
admissionWhitelistPatterns:
- namePattern: gcr.io/serene-voltage-464305-n2/*
defaultAdmissionRule:
  evaluationMode: REQUIRE_ATTESTATION
  enforcementMode: ENFORCED_BLOCK_AND_AUDIT_LOG
  requireAttestationsBy:
  - projects/serene-voltage-464305-n2/attestors/prod-attestor
EOF
gcloud container binauthz policy import policy.yaml
# 4. Enable on GKE cluster (--enable-binauthz is deprecated in newer gcloud)
gcloud container clusters update codi-poc-e2-cluster \
--binauthz-evaluation-mode=PROJECT_SINGLETON_POLICY_ENFORCE \
--zone=us-central1-a
Cloud Build Signing:
# cloudbuild.yaml (after successful build)
- name: 'gcr.io/cloud-builders/gcloud'
  args:
  - 'beta'
  - 'container'
  - 'binauthz'
  - 'attestations'
  - 'sign-and-create'
  - '--artifact-url=gcr.io/$PROJECT_ID/coditect-v5-api:$SHORT_SHA'
  - '--attestor=projects/$PROJECT_ID/attestors/prod-attestor'
  - '--attestor-project=$PROJECT_ID'
Timeline: 4-5 hours
2.3 Private GKE Cluster
Current: Public control plane endpoint
Target: Private control plane with authorized networks
Implementation:
⚠️ WARNING: This requires cluster recreation. Plan downtime or blue-green migration.
Option 1: Terraform Module Update (Recommended):
# modules/gke-cluster/main.tf
resource "google_container_cluster" "primary" {
  name     = var.cluster_name
  location = var.zone

  # Enable private cluster
  private_cluster_config {
    enable_private_nodes    = true
    enable_private_endpoint = false # Set true for full private (requires VPN/bastion)
    master_ipv4_cidr_block  = "172.16.0.0/28"
  }

  # Authorized networks (who can access control plane)
  master_authorized_networks_config {
    cidr_blocks {
      cidr_block   = "YOUR_OFFICE_IP/32"
      display_name = "Office"
    }
    cidr_blocks {
      cidr_block   = "YOUR_VPN_IP/32"
      display_name = "VPN"
    }
  }

  # IP allocation for VPC-native cluster
  ip_allocation_policy {
    cluster_secondary_range_name  = "pod-range"
    services_secondary_range_name = "service-range"
  }
}
Option 2: Manual Migration (if not using Terraform):
# 1. Create new private cluster (private nodes, public endpoint restricted by
#    authorized networks; add --enable-private-endpoint only once VPN/bastion
#    access is in place, otherwise kubectl access is lost)
gcloud container clusters create codi-poc-e2-cluster-private \
--enable-private-nodes \
--master-ipv4-cidr 172.16.0.0/28 \
--enable-master-authorized-networks \
--master-authorized-networks=YOUR_OFFICE_IP/32 \
--enable-ip-alias \
--zone=us-central1-a
# 2. Migrate workloads (see Section 3.5 for blue-green)
# 3. Delete old cluster
gcloud container clusters delete codi-poc-e2-cluster
Timeline: 6-8 hours (includes migration)
2.4 Secrets Rotation Policy
Current: No automated rotation
Target: 90-day rotation for all secrets
Implementation:
Automated Rotation with Secret Manager:
# 1. Configure a rotation schedule (Secret Manager publishes rotation
#    reminders to a Pub/Sub topic; a subscriber performs the actual rotation)
gcloud secrets update jwt-secret \
--topics=projects/serene-voltage-464305-n2/topics/secret-rotation \
--rotation-period=7776000s \
--next-rotation-time=$(date -u -d '+90 days' --iso-8601=seconds)
# 2. Create a Cloud Function subscribed to the topic
# (Generates new JWT secret, adds a new version, triggers pod restart)
Manual Rotation Runbook (until automation ready):
#!/bin/bash
# rotate-jwt-secret.sh
# 1. Generate new secret
NEW_SECRET=$(openssl rand -base64 32)
# 2. Add to Secret Manager
echo -n "$NEW_SECRET" | gcloud secrets versions add jwt-secret --data-file=-
# 3. Restart API pods to pick up new secret
kubectl rollout restart deployment/coditect-api-v5 -n coditect-app
# 4. Verify health
kubectl rollout status deployment/coditect-api-v5 -n coditect-app
curl http://34.46.212.40/api/v5/health
# 5. Document rotation in audit log
echo "$(date): JWT secret rotated" >> /var/log/secret-rotation.log
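One subtlety the runbook above glosses over: rotating the signing secret immediately invalidates every outstanding token. A common mitigation is to verify against both the new and the previous secret version for a grace period. An illustrative stdlib sketch (not the production verifier):

```python
import hashlib
import hmac

# During rotation, accept signatures from either the new or the previous
# secret so tokens issued just before the rollout stay valid for a grace period.
ACTIVE_SECRETS = [b"new-secret-version-2", b"old-secret-version-1"]

def verify(message: bytes, signature: bytes) -> bool:
    return any(
        hmac.compare_digest(signature, hmac.new(s, message, hashlib.sha256).digest())
        for s in ACTIVE_SECRETS
    )

old_sig = hmac.new(b"old-secret-version-1", b"payload", hashlib.sha256).digest()
print(verify(b"payload", old_sig))  # True: old tokens still accepted during grace
```

Once the grace window (e.g. the token TTL) has elapsed, the old version is dropped from the list and disabled in Secret Manager.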
Timeline: 3-4 hours (manual); 8-10 hours (automation)
2.5 Bastion Host for Emergency Access
Current: Direct SSH to nodes (disabled in prod)
Target: Bastion host with audit logging
Implementation:
Create Bastion VM:
gcloud compute instances create coditect-bastion \
--zone=us-central1-a \
--machine-type=e2-micro \
--subnet=default \
--no-address \
--metadata=enable-oslogin=TRUE \
--scopes=cloud-platform \
--tags=bastion
Firewall Rules:
# Allow SSH to bastion via IAP only (the bastion has no external IP, so
# traffic arrives from Google's IAP range 35.235.240.0/20)
gcloud compute firewall-rules create allow-ssh-to-bastion \
--network=default \
--allow=tcp:22 \
--source-ranges=35.235.240.0/20 \
--target-tags=bastion
# Allow bastion to reach cluster nodes
gcloud compute firewall-rules create allow-bastion-to-nodes \
--network=default \
--allow=tcp:22 \
--source-tags=bastion \
--target-tags=gke-node
Access via IAP (Identity-Aware Proxy):
# Open an IAP tunnel to the bastion
gcloud compute start-iap-tunnel coditect-bastion 22 \
--local-host-port=localhost:2222 \
--zone=us-central1-a
# SSH through tunnel
ssh -p 2222 localhost
Timeline: 2-3 hours
Phase 3: Compliance & Monitoring (Week 4+) - Long-term
Goal: Achieve SOC2 Type II readiness and continuous security monitoring.
Estimated Time: 18-24 hours
3.1 Cloud Armor (WAF)
Current: No Web Application Firewall
Target: DDoS protection, rate limiting, OWASP Top 10 protection
Implementation:
Create Cloud Armor Security Policy:
# 1. Create policy
gcloud compute security-policies create coditect-api-policy \
--description="Coditect API WAF Policy"
# 2. Add OWASP rules
gcloud compute security-policies rules create 1000 \
--security-policy=coditect-api-policy \
--expression="evaluatePreconfiguredExpr('sqli-stable')" \
--action=deny-403
gcloud compute security-policies rules create 1001 \
--security-policy=coditect-api-policy \
--expression="evaluatePreconfiguredExpr('xss-stable')" \
--action=deny-403
# 3. Rate limiting
gcloud compute security-policies rules create 2000 \
--security-policy=coditect-api-policy \
--expression="true" \
--action=rate-based-ban \
--rate-limit-threshold-count=1000 \
--rate-limit-threshold-interval-sec=60 \
--ban-duration-sec=600
# 4. Attach to backend service
gcloud compute backend-services update <backend-service-name> \
--security-policy=coditect-api-policy \
--global
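To sanity-check the threshold and ban duration before applying them, the rate-based-ban rule can be modeled offline. A toy sliding-window sketch (illustrative only, not Cloud Armor's exact algorithm; small numbers substituted for the demo):

```python
from collections import deque

class RateBasedBan:
    """Ban a client once it exceeds `threshold` requests in any
    `interval_s`-second window; the ban lasts `ban_s` seconds.
    Defaults mirror the gcloud flags above."""

    def __init__(self, threshold=1000, interval_s=60, ban_s=600):
        self.threshold, self.interval_s, self.ban_s = threshold, interval_s, ban_s
        self.hits = deque()        # timestamps of recent requests
        self.banned_until = 0.0

    def allow(self, now: float) -> bool:
        if now < self.banned_until:
            return False
        self.hits.append(now)
        # Drop hits that fell out of the sliding window
        while self.hits and self.hits[0] <= now - self.interval_s:
            self.hits.popleft()
        if len(self.hits) > self.threshold:
            self.banned_until = now + self.ban_s
            return False
        return True

limiter = RateBasedBan(threshold=5, interval_s=60, ban_s=600)  # demo-sized numbers
results = [limiter.allow(t) for t in range(10)]
print(results)  # first 5 allowed, then banned
```

Replaying expected production traffic through such a model helps confirm that 1000 requests/60s will not ban legitimate bursty clients.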
Timeline: 4-5 hours
3.2 Continuous Compliance Monitoring
Tools: Forseti Security, Config Connector, Security Command Center
Implementation:
Enable Security Command Center:
gcloud services enable securitycenter.googleapis.com
# Enable the premium tier and the Container Threat Detection module at the
# organization level (via Security Command Center > Settings in the console,
# or the gcloud scc settings command group)
Configure Compliance Scanning:
# Scan for CIS GKE Benchmark violations
gcloud scc findings list \
--organization=YOUR_ORG_ID \
--filter="category='CIS_GKE_BENCHMARK'"
Timeline: 3-4 hours (setup); ongoing monitoring
3.3 Intrusion Detection (IDS)
Current: No network-level IDS
Target: Detect malicious traffic patterns
Implementation:
Enable Cloud IDS:
gcloud services enable ids.googleapis.com
# Create IDS endpoint
gcloud ids endpoints create coditect-ids \
--network=default \
--zone=us-central1-a \
--severity=INFORMATIONAL
Configure Alerts:
# Create alert policy for IDS findings
gcloud alpha monitoring policies create \
--notification-channels=EMAIL_CHANNEL_ID \
--display-name="IDS High Severity Alert" \
--condition-display-name="IDS Threat Detected" \
--condition-filter='resource.type="ids.googleapis.com/Endpoint" severity="HIGH"'
Timeline: 4-5 hours
3.4 Backup & Disaster Recovery
Current: No automated backups (StatefulSet PVCs only)
Target: Daily backups with 30-day retention
Implementation:
GKE Backup for Applications:
gcloud services enable gkebackup.googleapis.com
# Create a Backup for GKE plan (daily at 03:00, 30-day retention)
gcloud beta container backup-restore backup-plans create coditect-backup-plan \
--project=serene-voltage-464305-n2 \
--location=us-central1 \
--cluster=projects/serene-voltage-464305-n2/locations/us-central1-a/clusters/codi-poc-e2-cluster \
--all-namespaces \
--include-secrets \
--include-volume-data \
--cron-schedule="0 3 * * *" \
--backup-retain-days=30
FoundationDB Backup:
# Use FDB backup tool
kubectl exec -n coditect-app foundationdb-0 -- fdbbackup start \
-d "blobstore://backup@gs://coditect-fdb-backups?bucket=coditect-fdb-backups"
Timeline: 4-6 hours
3.5 Blue-Green Deployment Strategy
Current: Rolling updates (downtime risk)
Target: Zero-downtime deployments with instant rollback
Implementation:
Blue-Green with Separate Deployments:
# deployment-green.yaml (new version)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: coditect-api-v5-green
  namespace: coditect-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: coditect-api-v5
      version: green
  template:
    metadata:
      labels:
        app: coditect-api-v5
        version: green
    spec:
      containers:
      - name: api
        image: gcr.io/serene-voltage-464305-n2/coditect-v5-api:NEW_VERSION
Traffic Switching:
# 1. Deploy green
kubectl apply -f deployment-green.yaml
# 2. Wait for ready
kubectl wait --for=condition=available deployment/coditect-api-v5-green -n coditect-app
# 3. Test green
kubectl port-forward -n coditect-app svc/coditect-api-v5-green 9090:80
curl http://localhost:9090/api/v5/health
# 4. Switch traffic (update service selector)
kubectl patch svc api-loadbalancer -n coditect-app -p '{"spec":{"selector":{"version":"green"}}}'
# 5. Monitor for issues
# If issues: kubectl patch svc api-loadbalancer -n coditect-app -p '{"spec":{"selector":{"version":"blue"}}}'
# 6. Delete blue after verification
kubectl delete deployment coditect-api-v5-blue -n coditect-app
Timeline: 3-4 hours
📋 Implementation Checklist
Phase 1: Critical Security (Week 1)
- 1.1 Migrate JWT secret to Google Secret Manager (2-3h)
- 1.2 Implement NetworkPolicies (4-6h)
- 1.3 Harden firewall rules (restrict SSH) (1h)
- 1.4 Enforce Pod Security Standards (3-4h)
- 1.5 Reserve static IPs for LoadBalancers (1-2h)
Total Phase 1: 11-16 hours
Phase 2: Enhanced Security (Week 2-3)
- 2.1 Enable image vulnerability scanning (3-4h)
- 2.2 Implement Binary Authorization (4-5h)
- 2.3 Migrate to private GKE cluster (6-8h)
- 2.4 Set up secrets rotation (3-10h)
- 2.5 Deploy bastion host (2-3h)
Total Phase 2: 18-30 hours
Phase 3: Compliance & Monitoring (Week 4+)
- 3.1 Deploy Cloud Armor WAF (4-5h)
- 3.2 Enable continuous compliance monitoring (3-4h)
- 3.3 Configure Intrusion Detection System (4-5h)
- 3.4 Implement backup & DR (4-6h)
- 3.5 Set up blue-green deployments (3-4h)
Total Phase 3: 18-24 hours
🎯 Success Criteria
Security Metrics
| Metric | Current | Target | Measurement |
|---|---|---|---|
| Secrets in K8s | 1 (JWT) | 0 | All in Secret Manager |
| Network Policies | 0 | 5+ | kubectl get networkpolicies |
| Critical CVEs | Unknown | 0 | Artifact Registry scan |
| Pod Security | Permissive | Restricted | kubectl get ns coditect-app --show-labels |
| SSH Exposure | 0.0.0.0/0 | Specific IPs | gcloud compute firewall-rules list |
| Control Plane | Public | Private | gcloud container clusters describe |
| Static IPs | 0/3 | 3/3 | gcloud compute addresses list |
| WAF Rules | 0 | 10+ | gcloud compute security-policies describe |
| Backup Frequency | Manual | Daily | GKE Backup schedule |
Compliance Benchmarks
- CIS GKE Benchmark: 90%+ compliance
- OWASP Top 10: Mitigations in place for all 10
- SOC2 Type II: Controls documented and auditable
- GDPR: Data encryption at rest/transit, access logs, retention policies
🚨 Rollback Plans
If NetworkPolicies Break Connectivity
# Immediate: Delete deny-all policy
kubectl delete networkpolicy deny-all-ingress -n coditect-app
# Fix: Review logs to identify blocked traffic
kubectl logs -n kube-system -l k8s-app=calico-node --tail=100 | grep DENY
# Update: Add necessary egress rules, reapply
If Binary Authorization Blocks Deployment
# Immediate: Disable enforcement
gcloud container binauthz policy export > policy-backup.yaml
# Edit policy.yaml, set enforcementMode: DRYRUN_AUDIT_LOG_ONLY
gcloud container binauthz policy import policy.yaml
# Fix: Attest the image
# Reapply: Re-enable enforcement after verification
If Private Cluster Causes Access Issues
# Immediate: Add temporary authorized network
gcloud container clusters update codi-poc-e2-cluster \
--enable-master-authorized-networks \
--master-authorized-networks=YOUR_CURRENT_IP/32 \
--zone=us-central1-a
# Long-term: Set up VPN or bastion (see 2.5)
💰 Cost Impact
| Item | Monthly Cost | Notes |
|---|---|---|
| Secret Manager | ~$0.06/secret | 10,000 operations/month free |
| Static IPs | ~$7/IP | 3 IPs = $21/month |
| Cloud Armor | ~$5 + $1/rule | ~$15/month |
| Binary Authorization | Free | Included with GKE |
| Network Policies | Free | Included with GKE |
| Bastion (e2-micro) | ~$7/month | 730 hours |
| Cloud IDS | ~$450/month | $0.60/GB processed |
| GKE Backup | ~$0.10/GB/month | 150GB = $15/month |
Total Estimated Increase: ~$500-550/month (with IDS); ~$50-100/month (without IDS)
Cost Optimization: Skip Cloud IDS for initial phases, rely on Cloud Armor + Security Command Center.
📚 References
- GKE Security Best Practices: https://cloud.google.com/kubernetes-engine/docs/how-to/hardening-your-cluster
- CIS GKE Benchmark: https://www.cisecurity.org/benchmark/kubernetes
- OWASP Kubernetes Security: https://cheatsheetseries.owasp.org/cheatsheets/Kubernetes_Security_Cheat_Sheet.html
- Google Cloud Security: https://cloud.google.com/security/best-practices
- Binary Authorization: https://cloud.google.com/binary-authorization/docs
- Network Policies: https://kubernetes.io/docs/concepts/services-networking/network-policies/
📞 Escalation
Security Incidents:
- Isolate affected pods (relabel rather than delete, so the pod is removed from Services but preserved for forensics):
kubectl label pod <pod-name> app=quarantine --overwrite - Review Cloud Audit Logs
- Contact security team
- Follow incident response runbook
Critical Issues:
- P0 (service down): Rollback immediately
- P1 (security breach): Isolate, investigate, patch
- P2 (compliance violation): Document, remediate within 7 days
Last Updated: 2025-10-07
Next Review: 2025-10-14 (after Phase 1 completion)
Owner: Platform Security Team