tasklist-v2.md - GCP Deployment Tasks

CODITECT DMS - OpenTofu Deployment Checklist

Version: 2.0.0 | Created: 2025-12-28 | Total Tasks: 147 | Status: Ready for Execution


Progress Summary

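| Phase | Tasks | Completed | Progress |
|-------|-------|-----------|----------|
| Phase 1: Prerequisites | 28 | 0 | 0% |
| Phase 2: Infrastructure | 35 | 0 | 0% |
| Phase 3: Application | 25 | 0 | 0% |
| Phase 4: Database | 15 | 0 | 0% |
| Phase 5: DNS & SSL | 12 | 0 | 0% |
| Phase 6: Monitoring | 15 | 0 | 0% |
| Phase 7: Validation | 17 | 0 | 0% |
| Total | 147 | 0 | 0% |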

Phase 1: Prerequisites & Setup (28 Tasks)

1.1 Tool Installation

  • Install OpenTofu >= 1.6.0
  • Install Google Cloud SDK >= 450.0.0
  • Install kubectl >= 1.28.0
  • Install Helm >= 3.12.0
  • Install Docker >= 24.0.0
  • Install k6 for load testing
  • Verify all tool versions
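
The version check can be scripted. The sketch below compares an installed version against the minimums listed above; the helper names `version_ge` and `check_min` are illustrative, and it assumes a `sort` that supports `-V` (GNU coreutils).

```shell
# Succeed when version $1 is greater than or equal to version $2.
# Assumes `sort -V` (version sort) is available.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Report whether an installed version meets the checklist minimum.
# A leading "v" (as printed by some tools) is stripped before comparing.
check_min() {
  name="$1"
  have="${2#v}"
  want="$3"
  if version_ge "$have" "$want"; then
    echo "OK   $name $have (>= $want)"
  else
    echo "FAIL $name $have (< $want)"
    return 1
  fi
}
```

For example, `check_min docker "$(docker version --format '{{.Client.Version}}')" 24.0.0` checks the Docker client. How each tool reports its version differs, so the extraction step has to be adapted per tool.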

1.2 GCP Authentication

  • Run gcloud auth login
  • Run gcloud auth application-default login
  • Verify authentication with gcloud auth list
  • Configure default project

1.3 GCP Project Creation

  • Create coditect-dev project
  • Create coditect-staging project
  • Create coditect-prod project
  • Link billing account to coditect-dev
  • Link billing account to coditect-staging
  • Link billing account to coditect-prod
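
A possible way to script the six project tasks above. This is a sketch: the billing account ID is a placeholder you must supply, and the `run` helper only prints each command unless `DRY_RUN=0` is set.

```shell
# Print commands by default; set DRY_RUN=0 to execute them.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# $1 is the billing account ID (placeholder; not given in this checklist).
create_projects() {
  billing_account="$1"
  for env in dev staging prod; do
    run gcloud projects create "coditect-$env" || return 1
    run gcloud billing projects link "coditect-$env" \
      --billing-account="$billing_account" || return 1
  done
}
```

Project IDs are globally unique in GCP, so creation fails if one of these IDs is already taken.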

1.4 Enable GCP APIs (Per Project x 3)

  • Enable Container API (coditect-dev)
  • Enable Container API (coditect-staging)
  • Enable Container API (coditect-prod)
  • Enable SQL Admin API (all projects)
  • Enable Redis API (all projects)
  • Enable Secret Manager API (all projects)
  • Enable Compute Engine API, which provides Cloud Armor (all projects)
  • Enable Monitoring API (all projects)
  • Enable AI Platform API (all projects)
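
The API tasks above can be batched per project. The service names below are an assumed mapping from the checklist labels to `*.googleapis.com` identifiers (Cloud Armor policies are managed through the Compute Engine API). Confirm them against `gcloud services list --available` before running.

```shell
# Print commands by default; set DRY_RUN=0 to execute them.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Assumed service names for: Container, SQL Admin, Redis, Secret Manager,
# Cloud Armor (via Compute Engine), Monitoring, AI Platform.
APIS="container.googleapis.com sqladmin.googleapis.com redis.googleapis.com secretmanager.googleapis.com compute.googleapis.com monitoring.googleapis.com aiplatform.googleapis.com"

enable_apis() {
  for env in dev staging prod; do
    # $APIS is intentionally unquoted so each service is a separate argument.
    run gcloud services enable $APIS --project="coditect-$env" || return 1
  done
}
```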

1.5 Terraform State Buckets

  • Create gs://coditect-dev-terraform-state
  • Create gs://coditect-staging-terraform-state
  • Create gs://coditect-prod-terraform-state
  • Enable versioning on all state buckets
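
A sketch of the bucket tasks above. The `us-central1` location is taken from the region used elsewhere in this document, and uniform bucket-level access is an added assumption, not a stated requirement.

```shell
# Print commands by default; set DRY_RUN=0 to execute them.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

create_state_buckets() {
  for env in dev staging prod; do
    bucket="gs://coditect-$env-terraform-state"
    # Location and access mode are assumptions; adjust to your policy.
    run gcloud storage buckets create "$bucket" \
      --project="coditect-$env" --location=us-central1 \
      --uniform-bucket-level-access || return 1
    # Versioning keeps earlier state files recoverable.
    run gcloud storage buckets update "$bucket" --versioning || return 1
  done
}
```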

Phase 2: Infrastructure Deployment (35 Tasks)

2.1 Development Environment

  • Initialize OpenTofu for dev
  • Run tofu plan for dev environment
  • Review dev plan output
  • Apply dev infrastructure
  • Verify VPC created
  • Verify GKE cluster running
  • Verify Cloud SQL instance running
  • Verify Redis instance running
  • Verify all secrets created
  • Save dev outputs
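
The init, plan, apply, and save-outputs steps can be parameterized by environment, mirroring the prod commands in the Quick Commands Reference. This sketch assumes all environments share one working directory, hence `-reconfigure` when switching backends, and assumes `environments/<env>/terraform.tfvars` exists for each environment. Planning and applying are separate functions so the plan can be reviewed in between.

```shell
# Print commands by default; set DRY_RUN=0 to execute them.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Initialize the backend and write a plan file for review.
plan_env() {
  env="$1"
  run tofu init -reconfigure \
    -backend-config="bucket=coditect-$env-terraform-state" || return 1
  run tofu plan -var-file="environments/$env/terraform.tfvars" \
    -out="$env.tfplan" || return 1
  run tofu show "$env.tfplan"
}

# Apply the reviewed plan file and save the outputs as JSON.
apply_env() {
  env="$1"
  run tofu apply "$env.tfplan" || return 1
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ tofu output -json > $env-outputs.json"
  else
    tofu output -json > "$env-outputs.json"
  fi
}
```

Usage: `plan_env dev`, review the output, then `apply_env dev`. The same functions accept `staging` and `prod`.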

2.2 Staging Environment

  • Initialize OpenTofu for staging
  • Run tofu plan for staging environment
  • Review staging plan output
  • Apply staging infrastructure
  • Verify VPC created
  • Verify GKE cluster running
  • Verify Cloud SQL instance running
  • Verify Redis instance running
  • Verify all secrets created
  • Save staging outputs

2.3 Production Environment

  • Initialize OpenTofu for prod
  • Run tofu plan for prod environment
  • Review prod plan with team
  • Get approval from Platform Lead
  • Get approval from Security Lead
  • Apply prod infrastructure
  • Verify VPC created with private subnets
  • Verify GKE cluster running (private nodes)
  • Verify Cloud SQL instance running (HA)
  • Verify Redis instance running (HA)
  • Verify Cloud Armor policy active
  • Verify all secrets created
  • Save prod outputs
  • Document infrastructure endpoints
  • Update runbooks with actual resource names
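
The "Verify ..." tasks in this phase can be backed by list commands. A minimal sketch, assuming resources live in `us-central1` as in the Quick Commands Reference. It lists resources for inspection and does not assert on their state.

```shell
# Print commands by default; set DRY_RUN=0 to execute them.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# $1 is the environment: dev, staging, or prod.
verify_infra() {
  project="coditect-$1"
  region="us-central1"   # assumed region
  run gcloud compute networks list --project="$project"
  run gcloud compute networks subnets list --project="$project"
  run gcloud container clusters list --project="$project"
  run gcloud sql instances list --project="$project"
  run gcloud redis instances list --region="$region" --project="$project"
  run gcloud compute security-policies list --project="$project"
  run gcloud secrets list --project="$project"
}
```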

Phase 3: Application Deployment (25 Tasks)

3.1 Kubernetes Configuration

  • Get GKE credentials for dev
  • Get GKE credentials for staging
  • Get GKE credentials for prod
  • Verify kubectl connectivity to all clusters
  • Create coditect-dms namespace (dev)
  • Create coditect-dms namespace (staging)
  • Create coditect-dms namespace (prod)
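
A sketch of the per-cluster setup above. Only the prod cluster name (`coditect-dms-prod-cluster`) appears in this document. The dev and staging names are assumed to follow the same pattern.

```shell
# Print commands by default; set DRY_RUN=0 to execute them.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# $1 is the environment: dev, staging, or prod.
setup_cluster() {
  env="$1"
  # Cluster name pattern is an assumption for dev and staging.
  run gcloud container clusters get-credentials "coditect-dms-$env-cluster" \
    --region us-central1 --project "coditect-$env" || return 1
  run kubectl cluster-info || return 1
  # Fails if the namespace already exists.
  run kubectl create namespace coditect-dms
}
```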

3.2 Secrets Configuration

  • Create Kubernetes secrets from Secret Manager (dev)
  • Create Kubernetes secrets from Secret Manager (staging)
  • Create Kubernetes secrets from Secret Manager (prod)
  • Set OpenAI API key in Secret Manager
  • Set Stripe API keys in Secret Manager
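
One way to handle the secret tasks above without putting values on the command line. This is a sketch under several assumptions: secret names such as `openai-api-key` are hypothetical, the secret resource already exists in Secret Manager, and the current kubectl context points at the matching cluster. Teams using an operator such as External Secrets would not need the sync step.

```shell
# Add a new version to an existing Secret Manager secret.
# In dry-run mode (the default) only the command shape is printed,
# never the value.
set_secret() {
  env="$1"; name="$2"; value="$3"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ gcloud secrets versions add $name --data-file=- --project=coditect-$env"
    return 0
  fi
  printf '%s' "$value" |
    gcloud secrets versions add "$name" --data-file=- --project="coditect-$env"
}

# Copy the latest version into a Kubernetes secret via stdin.
sync_secret() {
  env="$1"; name="$2"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ gcloud secrets versions access latest --secret=$name --project=coditect-$env | kubectl create secret generic $name --from-file=value=/dev/stdin -n coditect-dms"
    return 0
  fi
  gcloud secrets versions access latest --secret="$name" \
    --project="coditect-$env" |
    kubectl create secret generic "$name" \
      --from-file=value=/dev/stdin -n coditect-dms
}
```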

3.3 Docker Images

  • Configure Docker for GCR authentication
  • Build API Docker image
  • Tag image with version
  • Push image to gcr.io/coditect-prod
  • Verify image in Container Registry
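
The image tasks above as a sketch. The image name `coditect-dms-api` is an assumption borrowed from the deployment name in the Quick Commands Reference, and the build context is assumed to be the current directory.

```shell
# Print commands by default; set DRY_RUN=0 to execute them.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Full image reference for a version tag (image name is an assumption).
image_ref() {
  echo "gcr.io/coditect-prod/coditect-dms-api:$1"
}

publish_image() {
  version="$1"
  run gcloud auth configure-docker --quiet || return 1
  run docker build -t coditect-dms-api . || return 1
  run docker tag coditect-dms-api "$(image_ref "$version")" || return 1
  run docker push "$(image_ref "$version")" || return 1
  run gcloud container images list-tags gcr.io/coditect-prod/coditect-dms-api
}
```

Usage: `publish_image 2.0.0` prints the five commands, and `DRY_RUN=0 publish_image 2.0.0` runs them.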

3.4 Kubernetes Deployment

  • Deploy to dev environment
  • Verify dev pods running
  • Deploy to staging environment
  • Verify staging pods running
  • Deploy to prod environment
  • Verify prod pods running
  • Verify HPA configured correctly
  • Verify pod resource limits applied

Phase 4: Database Setup (15 Tasks)

4.1 Cloud SQL Proxy

  • Download Cloud SQL Proxy
  • Configure proxy for dev
  • Configure proxy for staging
  • Configure proxy for prod

4.2 PostgreSQL Extensions

  • Enable pgvector extension (dev)
  • Enable pgvector extension (staging)
  • Enable pgvector extension (prod)
  • Verify vector extension installed
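
With the Cloud SQL Proxy running on `localhost:5432`, the extension tasks above might look like the following sketch. It reuses the `dms_admin` user and `coditect_dms` database from the Quick Commands Reference and assumes that user is allowed to create extensions.

```shell
# Print commands by default; set DRY_RUN=0 to execute them.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

enable_pgvector() {
  # pgvector registers itself under the extension name "vector".
  run psql -h localhost -U dms_admin -d coditect_dms \
    -c "CREATE EXTENSION IF NOT EXISTS vector;" || return 1
  # Verify: one row is returned when the extension is installed.
  run psql -h localhost -U dms_admin -d coditect_dms \
    -c "SELECT extname, extversion FROM pg_extension WHERE extname = 'vector';"
}
```

Run it once per environment, with the proxy pointed at that environment's instance.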

4.3 Database Migrations

  • Test migrations on dev
  • Run migrations on dev
  • Test migrations on staging
  • Run migrations on staging
  • Run migrations on prod
  • Verify all tables created
  • Verify indexes created
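
A sketch of the migration tasks above. Here "test" is interpreted as rendering the migration SQL with Alembic's offline mode (`--sql`), which requires offline support in the project's `env.py`. The checklist does not say how migrations are tested, so treat this as one option.

```shell
# Print commands by default; set DRY_RUN=0 to execute them.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Render the SQL that would be executed, without touching the database.
preview_migrations() {
  run alembic upgrade head --sql
}

# Apply migrations, then list tables and indexes for inspection.
apply_migrations() {
  run alembic upgrade head || return 1
  run alembic current || return 1
  run psql -h localhost -U dms_admin -d coditect_dms -c '\dt' || return 1
  run psql -h localhost -U dms_admin -d coditect_dms -c '\di'
}
```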

Phase 5: DNS & SSL (12 Tasks)

5.1 Static IP Verification

  • Verify static IP for dev
  • Verify static IP for staging
  • Verify static IP for prod

5.2 DNS Configuration

  • Create/verify DNS zone
  • Add A record for dms-api-dev.coditect.ai
  • Add A record for dms-api-staging.coditect.ai
  • Add A record for dms-api.coditect.ai
  • Verify DNS propagation
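
The A-record tasks above, sketched for Cloud DNS. The managed zone name and the IP address are inputs you supply, and neither is specified in this checklist. If the zone is hosted elsewhere, only the `dig` check applies.

```shell
# Print commands by default; set DRY_RUN=0 to execute them.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Map an environment to its hostname as listed in this section.
dns_name() {
  case "$1" in
    prod) echo "dms-api.coditect.ai" ;;
    dev|staging) echo "dms-api-$1.coditect.ai" ;;
    *) echo "unknown environment: $1" >&2; return 1 ;;
  esac
}

# $1 = environment, $2 = static IP, $3 = Cloud DNS managed zone name.
add_a_record() {
  name="$(dns_name "$1")" || return 1
  run gcloud dns record-sets create "$name." --zone="$3" \
    --type=A --ttl=300 --rrdatas="$2" || return 1
  # Propagation check: prints the resolved address once the record is live.
  run dig +short "$name"
}
```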

5.3 SSL Certificates

  • Verify managed certificate for dev
  • Verify managed certificate for staging
  • Verify managed certificate for prod
  • Test HTTPS connectivity

Phase 6: Monitoring & Alerting (15 Tasks)

6.1 Alert Verification

  • Verify high error rate alert created
  • Verify high latency alert created
  • Verify high CPU alert created
  • Verify high memory alert created
  • Verify Cloud SQL connection alert created
  • Verify Redis memory alert created

6.2 Notification Channels

  • Configure email notifications
  • Configure Slack webhook
  • Configure PagerDuty integration
  • Test alert notifications

6.3 Dashboards

  • Verify Cloud Monitoring dashboard created
  • Import Grafana dashboards (if using kube-prometheus)
  • Configure SLO dashboards
  • Set up log-based metrics

6.4 Uptime Checks

  • Verify health check uptime monitor
  • Verify readiness check uptime monitor

Phase 7: Validation & Go-Live (17 Tasks)

7.1 Smoke Tests

  • Test /health endpoint
  • Test /health/ready endpoint
  • Test /docs (Swagger UI)
  • Test /redoc (ReDoc)
  • Test API authentication
  • Test document upload
  • Test semantic search
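
The unauthenticated endpoints above can be checked in a loop. This sketch assumes each endpoint returns HTTP 200 when healthy. Authentication, upload, and search tests need API-specific requests and are not covered here.

```shell
# Endpoints taken from the smoke-test list above.
smoke_paths="/health /health/ready /docs /redoc"

# $1 is the base URL, e.g. https://dms-api-dev.coditect.ai
# Prints the curl commands by default; set DRY_RUN=0 to send requests.
smoke_test() {
  base="$1"
  failed=0
  for path in $smoke_paths; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "+ curl -sS -o /dev/null -w %{http_code} $base$path"
      continue
    fi
    code="$(curl -sS -o /dev/null -w '%{http_code}' "$base$path")"
    if [ "$code" = "200" ]; then
      echo "OK   $path"
    else
      echo "FAIL $path ($code)"
      failed=1
    fi
  done
  return "$failed"
}
```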

7.2 Load Testing

  • Run k6 load test (100 users)
  • Run k6 load test (500 users)
  • Analyze load test results
  • Verify autoscaling triggers

7.3 Security Validation

  • Run OWASP ZAP scan
  • Verify Cloud Armor blocking attacks
  • Verify rate limiting working
  • Review security audit logs

7.4 Go-Live

  • Get final sign-off from Platform Lead
  • Get final sign-off from Security Lead
  • Update status page to "Operational"
  • Announce launch to stakeholders

Post-Deployment Tasks

Documentation

  • Update runbooks with actual values
  • Document all endpoints
  • Create troubleshooting guide
  • Record deployment video/walkthrough

Training

  • Train on-call team
  • Train support team
  • Create FAQ document

Handoff

  • Transfer ownership to operations
  • Schedule post-mortem meeting
  • Document lessons learned

Quick Commands Reference

OpenTofu Commands

# Initialize
tofu init -backend-config="bucket=coditect-prod-terraform-state"

# Plan
tofu plan -var-file=environments/prod/terraform.tfvars -out=prod.tfplan

# Apply
tofu apply prod.tfplan

# Destroy (DANGEROUS)
tofu destroy -var-file=environments/prod/terraform.tfvars

Kubernetes Commands

# Get credentials
gcloud container clusters get-credentials coditect-dms-prod-cluster --region us-central1 --project coditect-prod

# Deploy
kubectl apply -f deploy/kubernetes/

# Check status
kubectl get pods -n coditect-dms
kubectl get svc -n coditect-dms
kubectl get ingress -n coditect-dms

# Logs
kubectl logs -f deployment/coditect-dms-api -n coditect-dms

# Scale
kubectl scale deployment coditect-dms-api --replicas=10 -n coditect-dms

Database Commands

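# Start proxy (v1 syntax; with Cloud SQL Auth Proxy v2 the equivalent is:
#   cloud-sql-proxy --port 5432 coditect-prod:us-central1:coditect-dms-db-xxxx)
cloud_sql_proxy -instances=coditect-prod:us-central1:coditect-dms-db-xxxx=tcp:5432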

# Connect
psql -h localhost -U dms_admin -d coditect_dms

# Run migrations
alembic upgrade head

Rollback Checklist

Application Rollback

  • Identify problematic deployment
  • Run kubectl rollout undo
  • Verify rollback successful
  • Notify stakeholders
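
The application rollback steps above, sketched with the deployment and namespace names from the Quick Commands Reference. The 300-second timeout is an arbitrary choice.

```shell
# Print commands by default; set DRY_RUN=0 to execute them.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

rollback_app() {
  # Inspect revision history to identify the problematic deployment.
  run kubectl rollout history deployment/coditect-dms-api -n coditect-dms || return 1
  # Revert to the previous revision.
  run kubectl rollout undo deployment/coditect-dms-api -n coditect-dms || return 1
  # Wait until the rollback has finished rolling out.
  run kubectl rollout status deployment/coditect-dms-api -n coditect-dms --timeout=300s
}
```

To return to a specific revision instead of the previous one, `kubectl rollout undo` also accepts `--to-revision=<n>`.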

Infrastructure Rollback

  • Identify problematic change in state
  • Run targeted tofu apply to revert
  • Verify infrastructure restored
  • Update documentation

Database Rollback

  • Identify migration to revert
  • Run alembic downgrade
  • Or restore from backup
  • Verify data integrity
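
A sketch of the two database rollback paths above. The instance name and backup ID are inputs; the instance suffix is elided in this document (`coditect-dms-db-xxxx`) and must be looked up. Restoring a backup overwrites the current data on the target instance.

```shell
# Print commands by default; set DRY_RUN=0 to execute them.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Revert migrations. $1 is an Alembic revision identifier;
# the default -1 steps back one revision.
rollback_migration() {
  run alembic downgrade "${1:--1}"
}

# $1 = Cloud SQL instance name, $2 = backup ID from the list output.
restore_backup() {
  run gcloud sql backups list --instance="$1" --project=coditect-prod || return 1
  run gcloud sql backups restore "$2" --restore-instance="$1" \
    --project=coditect-prod
}
```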

Sign-Off

| Role | Name | Date | Signature |
|------|------|------|-----------|
| Platform Lead | | | |
| Security Lead | | | |
| Engineering Manager | | | |
| CTO | Hal Casteel | | |

Document Version: 2.0.0 | Last Updated: 2025-12-28 | Next Review: Weekly during deployment