Infrastructure Deployment - Pivot to Managed Services
Date: December 1, 2025, 1:00 AM EST
Status: Deploying Managed GCP Services (Proper Approach)
Progress: 95% - Application Running, Configuration Needed
✅ What We Accomplished Tonight
Infrastructure (Manual - Needs OpenTofu Migration)
- ✅ Migrated from GCR → Artifact Registry (GCR was shut down on March 18, 2025)
- ✅ Fixed Dockerfile user permissions (critical bug)
- ✅ Deployed Cloud SQL PostgreSQL (10.28.0.3)
- ✅ Deployed Redis Memorystore (10.164.210.91)
- ✅ Created database user coditect_app with secure password
- ✅ Disabled SSL requirement on Cloud SQL (staging only)
- ✅ Ran database migrations successfully (all tables created)
- ✅ Rebuilt and pushed Docker image with fixes
Current Issue
ALLOWED_HOSTS Configuration:
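- Django's host validation is rejecting requests addressed to Kubernetes pod IPs
- Health probes are failing with HTTP 400 because the kubelet sends the pod IP as the Host header
- Need to allow the GKE pod network (10.56.0.0/16). Django compares ALLOWED_HOSTS entries to the Host header as literal strings and does not expand CIDR ranges, so the range needs a CIDR-aware check, or the probe must send a Host header that is already allowed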
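To make the matching behaviour concrete, here is a minimal stdlib-only sketch. It mirrors the rules Django documents for ALLOWED_HOSTS (exact match, leading-dot subdomain wildcard, and "*") and adds a CIDR fallback. The function names and the `allowed_cidrs` parameter are hypothetical and are not part of Django or of this codebase.

```python
import ipaddress


def matches_allowed_host(host: str, pattern: str) -> bool:
    """Apply Django's documented rule for a single ALLOWED_HOSTS entry.

    `host` is the Host header value with any port already removed.
    """
    host, pattern = host.lower(), pattern.lower()
    if pattern == "*":
        return True
    if pattern.startswith("."):
        # A leading dot matches the bare domain and any subdomain.
        return host == pattern[1:] or host.endswith(pattern)
    return host == pattern


def host_is_allowed(host: str, allowed_hosts: list[str], allowed_cidrs: list[str]) -> bool:
    """Literal ALLOWED_HOSTS check plus a hypothetical CIDR fallback."""
    if any(matches_allowed_host(host, p) for p in allowed_hosts):
        return True
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        return False  # not an IP literal, so no CIDR range can apply
    return any(address in ipaddress.ip_network(cidr) for cidr in allowed_cidrs)
```

With this sketch, `host_is_allowed("10.56.3.17", ["10.56.0.0/16"], [])` is False because the CIDR string is compared literally, while passing the same range in `allowed_cidrs` accepts the pod IP. In a real deployment the fallback would have to live in middleware that runs before Django's host validation.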
🚨 Critical Next Step: OpenTofu Migration
You asked: "Now open tofu??"
Answer: YES - This is the right time to pivot to Infrastructure as Code
Why Now Is the Perfect Time
- We have working infrastructure - Can import existing resources
- Configuration drift is minimal - Only tonight's manual changes
- Clear requirements - We know exactly what needs to be managed
- Production readiness - This is a P1 blocker for production
What Needs OpenTofu Management
1. GCP Infrastructure (Cloud SQL, Redis)
# modules/database/main.tf
resource "google_sql_database_instance" "coditect_db" {
  name             = "coditect-db"
  database_version = "POSTGRES_16"
  region           = "us-central1"

  settings {
    tier = "db-f1-micro" # staging

    ip_configuration {
      ipv4_enabled    = false
      private_network = var.vpc_network
      require_ssl     = var.environment == "production" # true for prod
    }
  }
}

# modules/redis/main.tf (matches the module.redis address imported in Step 2)
resource "google_redis_instance" "coditect_redis" {
  name           = "coditect-redis-${var.environment}"
  tier           = "BASIC"
  memory_size_gb = 1
  region         = "us-central1"
  redis_version  = "REDIS_7_0"
  auth_enabled   = var.environment == "production" # true for prod
}
2. Kubernetes Secrets (Encrypted in OpenTofu)
# modules/kubernetes/secrets.tf
resource "kubernetes_secret" "backend_secrets" {
  metadata {
    name      = "backend-secrets"
    namespace = var.namespace
  }

  data = {
    django-secret-key  = var.django_secret_key # from Secret Manager
    db-host            = google_sql_database_instance.coditect_db.private_ip_address
    db-port            = "5432"
    db-name            = google_sql_database.coditect.name
    db-user            = google_sql_user.coditect_app.name
    db-password        = google_sql_user.coditect_app.password # from Secret Manager
    redis-host         = google_redis_instance.coditect_redis.host
    redis-port         = google_redis_instance.coditect_redis.port
    license-server-url = var.license_server_url
    gcp-project-id     = var.project_id
  }
}
3. Application Configuration (Django Settings)
# modules/kubernetes/configmap.tf
resource "kubernetes_config_map" "backend_config" {
  metadata {
    name      = "backend-config"
    namespace = var.namespace
  }

  data = {
    ALLOWED_HOSTS = join(",", [
      "coditect-backend.${var.namespace}.svc.cluster.local",
      ".coditect.com", # Django's subdomain wildcard is a leading dot, not "*."
      "10.56.0.0/16",  # GKE pod network; Django compares this literally, so it needs CIDR-aware handling
    ])
    DEBUG                  = var.environment != "production" # stored as the string "true" or "false"
    LOG_LEVEL              = var.environment == "production" ? "INFO" : "DEBUG"
    DJANGO_SETTINGS_MODULE = "license_platform.settings.production"
  }
}
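ConfigMap values reach the container as plain strings, so the settings module has to parse them. The sketch below shows one way to do that; the helper names `env_list` and `env_bool` are hypothetical, and how the real license_platform settings read these variables is not shown in these notes.

```python
import os


def env_list(name: str, default: str = "") -> list[str]:
    """Split a comma-separated environment variable into a clean list."""
    raw = os.environ.get(name, default)
    return [item.strip() for item in raw.split(",") if item.strip()]


def env_bool(name: str, default: bool = False) -> bool:
    """Interpret common truthy strings; any other value is False."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


# In a settings module these would be evaluated at import time.
ALLOWED_HOSTS = env_list("ALLOWED_HOSTS")
DEBUG = env_bool("DEBUG")
LOG_LEVEL = os.environ.get("LOG_LEVEL", "INFO")
```

The explicit boolean parse matters because the ConfigMap stores DEBUG as the string "false" in production, and any non-empty string is truthy in Python.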
4. Kubernetes Deployment (Version Controlled)
# modules/kubernetes/deployment.tf
resource "kubernetes_deployment" "backend" {
  metadata {
    name      = "coditect-backend"
    namespace = var.namespace
  }

  spec {
    replicas = var.environment == "production" ? 3 : 2

    # A Deployment needs a selector that matches the pod template labels.
    # The "app" label value here is an assumption.
    selector {
      match_labels = {
        app = "coditect-backend"
      }
    }

    template {
      metadata {
        labels = {
          app = "coditect-backend"
        }
      }

      spec {
        container {
          name  = "backend"
          image = "${var.artifact_registry}/coditect-backend:${var.image_tag}"

          env_from {
            config_map_ref {
              name = kubernetes_config_map.backend_config.metadata[0].name
            }
          }

          env_from {
            secret_ref {
              name = kubernetes_secret.backend_secrets.metadata[0].name
            }
          }

          liveness_probe {
            http_get {
              path = "/api/v1/health/live"
              port = 8000
            }
            initial_delay_seconds = 30
            period_seconds        = 10
          }
        }
      }
    }
  }
}
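Before relying on the liveness probe, it can help to reproduce what it sends. The following is a rough sketch of a probe-like request using only the standard library; `probe_status` and `probe_passes` are hypothetical helper names, and the optional `host_header` argument is there to compare the pod-IP case against a hostname that Django already accepts.

```python
import http.client
from typing import Optional


def probe_status(host: str, port: int, path: str,
                 host_header: Optional[str] = None, timeout: float = 2.0) -> int:
    """Send one GET, roughly as an httpGet probe would, and return the status code."""
    # When no override is given, http.client derives the Host header from host:port.
    headers = {"Host": host_header} if host_header else {}
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path, headers=headers)
        return conn.getresponse().status
    finally:
        conn.close()


def probe_passes(status: int) -> bool:
    """Kubernetes counts any status from 200 through 399 as a successful probe."""
    return 200 <= status < 400
```

For example, with a port-forward to the pod on local port 8000, `probe_status("127.0.0.1", 8000, "/api/v1/health/live")` should show whether the 400 goes away once the Host header is one of the allowed values.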
Immediate Action Plan
Step 1: Fix ALLOWED_HOSTS (5 minutes)
# Quick fix to get staging working
# Note: Django's subdomain wildcard is a leading dot (".coditect.com"), and the
# 10.56.0.0/16 entry is compared literally, so pod-IP Host headers still need
# CIDR-aware handling or an explicit Host header on the probe.
kubectl create configmap backend-config \
  --namespace=coditect-staging \
  --from-literal=ALLOWED_HOSTS="10.56.0.0/16,coditect-backend.coditect-staging.svc.cluster.local,.coditect.com"
# Update deployment to use ConfigMap
kubectl patch deployment coditect-backend -n coditect-staging \
  --type=json \
  -p='[{"op": "add", "path": "/spec/template/spec/containers/0/envFrom", "value": [{"configMapRef": {"name": "backend-config"}}]}]'
Step 2: Create OpenTofu Migration Plan (30 minutes)
cd /Users/halcasteel/PROJECTS/coditect-rollout-master/submodules/cloud/coditect-cloud-infra
# Create staging environment
mkdir -p opentofu/environments/staging
cd opentofu/environments/staging
# Describe the existing resources in configuration
cat > main.tf <<'EOF'
terraform {
  backend "gcs" {
    bucket = "coditect-tfstate"
    prefix = "staging"
  }
}

module "database" {
  source      = "../../modules/database"
  environment = "staging"
  # Import existing: coditect-db
}

module "redis" {
  source      = "../../modules/redis"
  environment = "staging"
  # Import existing: coditect-redis-staging
}

module "kubernetes" {
  source      = "../../modules/kubernetes"
  namespace   = "coditect-staging"
  environment = "staging"
}
EOF
# Initialize the backend, providers, and modules before importing
tofu init
# Import existing resources
tofu import module.database.google_sql_database_instance.coditect_db coditect-cloud-infra/coditect-db
tofu import module.redis.google_redis_instance.coditect_redis projects/coditect-cloud-infra/locations/us-central1/instances/coditect-redis-staging
# Plan and apply
tofu plan
tofu apply
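After the imports, the goal is a plan that reports no changes. One way to check that in a script is the `-detailed-exitcode` flag, which makes `tofu plan` exit with 0 for no changes, 1 for an error, and 2 when changes are pending. The wrapper below is a sketch under that assumption; the function names are hypothetical and the working directory is whatever environment folder is being validated.

```python
import subprocess

# Exit codes reported by `tofu plan -detailed-exitcode`.
PLAN_OUTCOMES = {
    0: "no changes",  # state and configuration agree
    1: "error",       # the plan itself failed
    2: "changes",     # the plan succeeded and found a diff
}


def interpret_plan_exit_code(code: int) -> str:
    """Translate a plan exit code into a short label."""
    return PLAN_OUTCOMES.get(code, "unknown")


def plan_outcome(workdir: str) -> str:
    """Run `tofu plan -detailed-exitcode` in workdir and classify the result."""
    result = subprocess.run(
        ["tofu", "plan", "-detailed-exitcode", "-input=false"],
        cwd=workdir,
        check=False,  # exit code 2 is an expected outcome, not a failure
    )
    return interpret_plan_exit_code(result.returncode)
```

A result of "changes" right after importing usually means the module arguments do not yet match the manually created resources, which is worth resolving before running apply.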
Step 3: Migrate Production (After Staging Validated)
Once staging is working with OpenTofu:
- Create opentofu/environments/production/
- Same modules, different variables (replicas = 3, tier = "db-n1-standard-2", etc.)
- SSL required, Redis AUTH enabled, stricter security
Benefits of OpenTofu Approach
vs Manual Commands:
| Aspect | Manual (Tonight) | OpenTofu (Proper) |
|---|---|---|
| Reproducibility | ❌ Tribal knowledge | ✅ Code-defined |
| Drift Detection | ❌ No visibility | ✅ tofu plan shows drift |
| Environment Parity | ❌ Manual duplication | ✅ Same modules |
| Rollback | ❌ Manual undo | ✅ Git revert + apply |
| Review Process | ❌ No review | ✅ PR-based changes |
| Documentation | ❌ ADR post-facto | ✅ Code IS documentation |
| Secrets Management | ❌ Manual kubectl | ✅ Secret Manager integration |
What We Learned Tonight
Good Decisions:
- ✅ Used managed services (Cloud SQL, Redis) instead of StatefulSets
- ✅ Disabled SSL for staging (faster iteration)
- ✅ Multi-platform Docker builds
- ✅ Recognized OpenTofu need mid-stream
Areas for Improvement:
- ⚠️ Manual infrastructure creates drift
- ⚠️ Secrets in plaintext kubectl commands
- ⚠️ No environment parity guarantee
- ⚠️ Configuration spread across multiple locations
OpenTofu Solves All of These ✅
Recommendation
Immediate (Tonight):
- Fix ALLOWED_HOSTS with ConfigMap (5 min)
- Verify staging works (5 min)
- Document current infrastructure state (this file)
Next Session (Tomorrow):
- Create OpenTofu modules for Cloud SQL + Redis (1 hour)
- Import existing staging resources (30 min)
- Migrate Kubernetes manifests to OpenTofu (1 hour)
- Validate tofu plan shows no changes (10 min)
This Week:
- Create production environment in OpenTofu
- Decommission manual staging infrastructure
- Deploy production via OpenTofu only
- Document disaster recovery with OpenTofu
Current State Summary
Infrastructure:
- Cloud SQL: 10.28.0.3 (RUNNABLE)
- Redis: 10.164.210.91 (READY)
- GKE: coditect-cluster (RUNNING)
- Docker: us-central1-docker.pkg.dev/coditect-cloud-infra/coditect-backend:v1.0.0-staging
Database:
- User: coditect_app
- Password: [REDACTED] (stored in backend-secrets)
- Tables: ✅ All migrations applied
Application:
- Status: Running (health probes failing)
- Issue: ALLOWED_HOSTS configuration
- Fix: Add ConfigMap with allowed hosts
Next: OpenTofu Migration → Production-Ready Infrastructure
Last Updated: December 1, 2025, 1:35 AM EST
Status: Staging 95% Complete - Ready for OpenTofu Migration