
Infrastructure Deployment - Pivot to Managed Services

Date: December 1, 2025, 1:00 AM EST
Status: Deploying Managed GCP Services (Proper Approach)
Progress: 95% - Application Running, Configuration Needed


✅ What We Accomplished Tonight

Infrastructure (Manual - Needs OpenTofu Migration)

  • ✅ Migrated from GCR → Artifact Registry (GCR was shut down on March 18, 2025)
  • ✅ Fixed Dockerfile user permissions (critical bug)
  • ✅ Deployed Cloud SQL PostgreSQL (10.28.0.3)
  • ✅ Deployed Redis Memorystore (10.164.210.91)
  • ✅ Created database user coditect_app with secure password
  • ✅ Disabled SSL requirement on Cloud SQL (staging only)
  • ✅ Ran database migrations successfully (all tables created)
  • ✅ Rebuilt and pushed Docker image with fixes

Current Issue

ALLOWED_HOSTS Configuration:

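  • Django is rejecting requests whose Host header is a Kubernetes pod IP
  • Health probes fail with HTTP 400 as a result
  • Need to allow hosts in 10.56.0.0/16 (GKE pod network). Note: Django matches ALLOWED_HOSTS entries literally and does not expand CIDR ranges, so listing the range is not enough; the probes must send an allowed Host header, or a CIDR-aware middleware must be added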

🚨 Critical Next Step: OpenTofu Migration

You asked: "Now open tofu??"

Answer: YES - This is the right time to pivot to Infrastructure as Code

Why Now Is the Perfect Time

  1. We have working infrastructure - Can import existing resources
  2. Configuration drift is minimal - Only tonight's manual changes
  3. Clear requirements - We know exactly what needs to be managed
  4. Production readiness - This is a P1 blocker for production

What Needs OpenTofu Management

1. GCP Infrastructure (Cloud SQL, Redis)

# modules/database/main.tf
resource "google_sql_database_instance" "coditect_db" {
  name             = "coditect-db"
  database_version = "POSTGRES_16"
  region           = "us-central1"

  settings {
    tier = "db-f1-micro" # staging
    ip_configuration {
      ipv4_enabled    = false
      private_network = var.vpc_network
      require_ssl     = var.environment == "production" # true for prod
    }
  }
}

resource "google_redis_instance" "coditect_redis" {
  name           = "coditect-redis-${var.environment}"
  tier           = "BASIC"
  memory_size_gb = 1
  region         = "us-central1"

  redis_version = "REDIS_7_0"
  auth_enabled  = var.environment == "production" # true for prod
}
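The resources above read `var.environment` and `var.vpc_network`, and the Kubernetes module later needs the database address. A minimal sketch of the matching declarations follows; the file names, descriptions, and the output name `db_host` are assumptions for illustration, not taken from the existing repository.

```hcl
# modules/database/variables.tf (hypothetical file name)
variable "environment" {
  description = "Deployment environment; drives the SSL and AUTH toggles."
  type        = string

  # Reject typos early instead of silently provisioning a non-production config.
  validation {
    condition     = contains(["staging", "production"], var.environment)
    error_message = "The environment value must be \"staging\" or \"production\"."
  }
}

variable "vpc_network" {
  description = "Self link of the VPC used for private IP connectivity."
  type        = string
}

# modules/database/outputs.tf (hypothetical file name)
# Exposes the private IP so other modules can consume it as an input.
output "db_host" {
  description = "Private IP address of the Cloud SQL instance."
  value       = google_sql_database_instance.coditect_db.private_ip_address
}
```

With the validation in place, a misspelled environment fails at plan time rather than quietly disabling SSL or AUTH.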

2. Kubernetes Secrets (Encrypted in OpenTofu)

# modules/kubernetes/secrets.tf
# Note: the google_* resources referenced here are defined in other modules.
# A module cannot read another module's resources directly, so in practice
# these values are passed in as input variables fed from module outputs.
resource "kubernetes_secret" "backend_secrets" {
  metadata {
    name      = "backend-secrets"
    namespace = var.namespace
  }

  data = {
    django-secret-key  = var.django_secret_key # from Secret Manager
    db-host            = google_sql_database_instance.coditect_db.private_ip_address
    db-port            = "5432"
    db-name            = google_sql_database.coditect.name
    db-user            = google_sql_user.coditect_app.name
    db-password        = google_sql_user.coditect_app.password # from Secret Manager
    redis-host         = google_redis_instance.coditect_redis.host
    redis-port         = google_redis_instance.coditect_redis.port
    license-server-url = var.license_server_url
    gcp-project-id     = var.project_id
  }
}
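The "from Secret Manager" comments above imply that the password is read from Secret Manager rather than typed into a command. One way this could look is sketched below; the secret ID `coditect-db-password` is a hypothetical name, and the sketch assumes the secret already exists in the project.

```hcl
# Read the latest version of an existing secret.
# "coditect-db-password" is an assumed secret ID, not one from this deployment.
data "google_secret_manager_secret_version" "db_password" {
  secret = "coditect-db-password"
}

# Manage the application database user with the password from Secret Manager.
resource "google_sql_user" "coditect_app" {
  name     = "coditect_app"
  instance = google_sql_database_instance.coditect_db.name
  password = data.google_secret_manager_secret_version.db_password.secret_data
}
```

The secret value still ends up in the state file, so the state bucket needs restricted access (and, ideally, state encryption) even with this approach.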

3. Application Configuration (Django Settings)

# modules/kubernetes/configmap.tf
resource "kubernetes_config_map" "backend_config" {
  metadata {
    name      = "backend-config"
    namespace = var.namespace
  }

  data = {
    ALLOWED_HOSTS = join(",", [
      "coditect-backend.${var.namespace}.svc.cluster.local",
      ".coditect.com", # Django's subdomain wildcard is a leading dot, not "*."
      # Django does not expand CIDR ranges such as 10.56.0.0/16 (GKE pod network);
      # pod-IP Host headers need a Host header on the probe or a CIDR-aware middleware.
    ])

    DEBUG                  = var.environment != "production"
    LOG_LEVEL              = var.environment == "production" ? "INFO" : "DEBUG"
    DJANGO_SETTINGS_MODULE = "license_platform.settings.production"
  }
}

4. Kubernetes Deployment (Version Controlled)

# modules/kubernetes/deployment.tf
resource "kubernetes_deployment" "backend" {
  metadata {
    name      = "coditect-backend"
    namespace = var.namespace
  }

  spec {
    replicas = var.environment == "production" ? 3 : 2

    # A Deployment requires a selector that matches the pod template labels.
    # The label value used here is an assumption.
    selector {
      match_labels = {
        app = "coditect-backend"
      }
    }

    template {
      metadata {
        labels = {
          app = "coditect-backend"
        }
      }

      spec {
        container {
          name  = "backend"
          image = "${var.artifact_registry}/coditect-backend:${var.image_tag}"

          # Non-secret settings from the ConfigMap
          env_from {
            config_map_ref {
              name = kubernetes_config_map.backend_config.metadata[0].name
            }
          }

          # Credentials from the Secret
          env_from {
            secret_ref {
              name = kubernetes_secret.backend_secrets.metadata[0].name
            }
          }

          liveness_probe {
            http_get {
              path = "/api/v1/health/live"
              port = 8000
            }
            initial_delay_seconds = 30
            period_seconds        = 10
          }
        }
      }
    }
  }
}
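The HTTP 400 on health probes comes from the kubelet sending the pod IP as the Host header, which Django then rejects. One possible fix, sketched here as a variation of the probe above, is to have the probe send a Host header that is already in ALLOWED_HOSTS. This is an assumption about how the team may choose to solve it, not the deployed configuration; the block goes inside the `container` block.

```hcl
# Fragment: replaces the liveness_probe inside the container block.
liveness_probe {
  http_get {
    path = "/api/v1/health/live"
    port = 8000

    # Send the in-cluster service name instead of the pod IP so that
    # Django's ALLOWED_HOSTS check accepts the probe request.
    http_header {
      name  = "Host"
      value = "coditect-backend.${var.namespace}.svc.cluster.local"
    }
  }
  initial_delay_seconds = 30
  period_seconds        = 10
}
```

This keeps ALLOWED_HOSTS limited to real host names and avoids depending on the pod network range at all.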

Immediate Action Plan

Step 1: Fix ALLOWED_HOSTS (5 minutes)

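# Quick fix to get staging working.
# Django matches ALLOWED_HOSTS entries literally: subdomain wildcards use a
# leading dot (".coditect.com"), and a CIDR range such as 10.56.0.0/16 is not
# expanded, so pod-IP Host headers still need separate handling (for example
# a Host header on the probe, or a CIDR-aware middleware).
kubectl create configmap backend-config \
  --namespace=coditect-staging \
  --from-literal=ALLOWED_HOSTS="coditect-backend.coditect-staging.svc.cluster.local,.coditect.com"

# Update deployment to use ConfigMap.
# Caution: "add" replaces any envFrom list already set on the container.
kubectl patch deployment coditect-backend -n coditect-staging \
  --type=json \
  -p='[{"op": "add", "path": "/spec/template/spec/containers/0/envFrom", "value": [{"configMapRef": {"name": "backend-config"}}]}]'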

Step 2: Create OpenTofu Migration Plan (30 minutes)

cd /Users/halcasteel/PROJECTS/coditect-rollout-master/submodules/cloud/coditect-cloud-infra

# Create staging environment
mkdir -p opentofu/environments/staging
cd opentofu/environments/staging

# Initialize with existing resources
cat > main.tf <<'EOF'
terraform {
  backend "gcs" {
    bucket = "coditect-tfstate"
    prefix = "staging"
  }
}

module "database" {
  source      = "../../modules/database"
  environment = "staging"
  # Import existing: coditect-db
}

module "redis" {
  source      = "../../modules/redis"
  environment = "staging"
  # Import existing: coditect-redis-staging
}

module "kubernetes" {
  source      = "../../modules/kubernetes"
  namespace   = "coditect-staging"
  environment = "staging"
}
EOF

# Initialize the backend and providers (required before import)
tofu init

# Import existing resources
tofu import module.database.google_sql_database_instance.coditect_db coditect-cloud-infra/coditect-db
tofu import module.redis.google_redis_instance.coditect_redis projects/coditect-cloud-infra/locations/us-central1/instances/coditect-redis-staging

# Plan (should show no changes to the imported resources), then apply
tofu plan
tofu apply
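As an alternative to running `tofu import` by hand, OpenTofu 1.6 and later also accept declarative `import` blocks in the root module, which lets the imports go through the same PR review as the rest of the code. The sketch below reuses the resource addresses and IDs from the commands above; the file name is an assumption.

```hcl
# opentofu/environments/staging/imports.tf (hypothetical file name)

# Adopt the existing Cloud SQL instance into state on the next apply.
import {
  to = module.database.google_sql_database_instance.coditect_db
  id = "coditect-cloud-infra/coditect-db"
}

# Adopt the existing Memorystore instance into state on the next apply.
import {
  to = module.redis.google_redis_instance.coditect_redis
  id = "projects/coditect-cloud-infra/locations/us-central1/instances/coditect-redis-staging"
}
```

With these blocks, `tofu plan` previews each import alongside any attribute differences, and the blocks can be deleted once the apply has succeeded.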

Step 3: Migrate Production (After Staging Validated)

Once staging is working with OpenTofu:

  1. Create opentofu/environments/production/
  2. Same modules, different variables (replicas = 3, tier = "db-n1-standard-2", etc.)
  3. SSL required, Redis AUTH enabled, stricter security
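A production root module could mirror the staging one with only the environment-specific values changed. The following is a sketch under that assumption; the namespace name `coditect-production` is hypothetical.

```hcl
# opentofu/environments/production/main.tf (sketch mirroring staging)
terraform {
  backend "gcs" {
    bucket = "coditect-tfstate"
    prefix = "production" # keeps production state separate from staging
  }
}

module "database" {
  source      = "../../modules/database"
  environment = "production" # turns on require_ssl inside the module
}

module "redis" {
  source      = "../../modules/redis"
  environment = "production" # turns on auth_enabled inside the module
}

module "kubernetes" {
  source      = "../../modules/kubernetes"
  namespace   = "coditect-production" # assumed namespace name
  environment = "production"          # selects 3 replicas in the deployment
}
```

One thing to watch: the database module shown earlier hard-codes `name = "coditect-db"`, so it would need an environment suffix (as the Redis instance name already has) if both environments live in the same project.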

Benefits of OpenTofu Approach

vs Manual Commands:

| Aspect | Manual (Tonight) | OpenTofu (Proper) |
| --- | --- | --- |
| Reproducibility | ❌ Tribal knowledge | ✅ Code-defined |
| Drift Detection | ❌ No visibility | ✅ tofu plan shows drift |
| Environment Parity | ❌ Manual duplication | ✅ Same modules |
| Rollback | ❌ Manual undo | ✅ Git revert + apply |
| Review Process | ❌ No review | ✅ PR-based changes |
| Documentation | ❌ ADR post-facto | ✅ Code IS documentation |
| Secrets Management | ❌ Manual kubectl | ✅ Secret Manager integration |

What We Learned Tonight

Good Decisions:

  1. ✅ Used managed services (Cloud SQL, Redis) instead of StatefulSets
  2. ✅ Disabled SSL for staging (faster iteration)
  3. ✅ Multi-platform Docker builds
  4. ✅ Recognized OpenTofu need mid-stream

Areas for Improvement:

  1. ⚠️ Manual infrastructure creates drift
  2. ⚠️ Secrets in plaintext kubectl commands
  3. ⚠️ No environment parity guarantee
  4. ⚠️ Configuration spread across multiple locations

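OpenTofu addresses each of these: drift surfaces in tofu plan, secrets come from Secret Manager instead of kubectl arguments, environments share the same modules, and configuration lives in version-controlled code.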


Recommendation

Immediate (Tonight):

  1. Fix ALLOWED_HOSTS with ConfigMap (5 min)
  2. Verify staging works (5 min)
  3. Document current infrastructure state (this file)

Next Session (Tomorrow):

  1. Create OpenTofu modules for Cloud SQL + Redis (1 hour)
  2. Import existing staging resources (30 min)
  3. Migrate Kubernetes manifests to OpenTofu (1 hour)
  4. Validate tofu plan shows no changes (10 min)

This Week:

  1. Create production environment in OpenTofu
  2. Decommission manual staging infrastructure
  3. Deploy production via OpenTofu only
  4. Document disaster recovery with OpenTofu

Current State Summary

Infrastructure:

  • Cloud SQL: 10.28.0.3 (RUNNABLE)
  • Redis: 10.164.210.91 (READY)
  • GKE: coditect-cluster (RUNNING)
  • Docker: us-central1-docker.pkg.dev/coditect-cloud-infra/coditect-backend:v1.0.0-staging

Database:

  • User: coditect_app
  • Password: [redacted] (stored in the backend-secrets Kubernetes Secret; rotate it if the plaintext value was shared)
  • Tables: ✅ All migrations applied

Application:

  • Status: Running (health probes failing)
  • Issue: ALLOWED_HOSTS configuration
  • Fix: Add ConfigMap with allowed hosts

Next: OpenTofu Migration → Production-Ready Infrastructure


Last Updated: December 1, 2025, 1:35 AM EST
Status: Staging 95% Complete - Ready for OpenTofu Migration