CODITECT Cloud Backend - Staging Deployment Guide

Document Status: Production
Last Updated: December 1, 2025
Environment: Google Kubernetes Engine (GKE) Staging
Purpose: Complete step-by-step deployment from zero to a working staging environment


Table of Contents

  1. Deployment Overview
  2. Prerequisites
  3. Infrastructure Setup
  4. Application Deployment
  5. Validation Checklist
  6. Key Configuration Files
  7. Environment-Specific Notes
  8. Troubleshooting
  9. Next Steps

Deployment Overview

This guide walks through deploying the CODITECT Cloud Backend to a GKE staging environment from scratch. The process takes approximately 30-45 minutes for a new environment.

Architecture

┌─────────────────────────────────────────┐
│     Google Kubernetes Engine (GKE)      │
│                                         │
│  ┌───────────────────────────────────┐  │
│  │    coditect-staging namespace     │  │
│  │                                   │  │
│  │  ┌─────────────────────────────┐  │  │
│  │  │ Backend Deployment (2 pods) │  │  │
│  │  │  - Django 5.2.8             │  │  │
│  │  │  - Python 3.12.12           │  │  │
│  │  │  - Gunicorn WSGI            │  │  │
│  │  └─────────────────────────────┘  │  │
│  │                                   │  │
│  │  ┌─────────────────────────────┐  │  │
│  │  │ LoadBalancer Service        │  │  │
│  │  │  - External IP              │  │  │
│  │  │  - HTTP/HTTPS endpoints     │  │  │
│  │  └─────────────────────────────┘  │  │
│  └───────────────────────────────────┘  │
└─────────────────────────────────────────┘
          │                     │
          │                     │
          v                     v
┌──────────────────┐  ┌──────────────────┐
│ Cloud SQL        │  │ Redis            │
│ PostgreSQL 16    │  │ Memorystore 7.0  │
│ (Private IP)     │  │ (Private IP)     │
└──────────────────┘  └──────────────────┘

Deployment Timeline

Phase           Duration    Description
Prerequisites   5 min       Tool installation and authentication
Infrastructure  15-20 min   GCP resources (GKE, Cloud SQL, Redis)
Application     10-15 min   Docker build, K8s deployment, migrations
Validation      5 min       Health checks and smoke tests

Prerequisites

Required Tools

# Verify tool versions
gcloud version # Google Cloud SDK 450.0.0+
kubectl version --client # Kubernetes CLI 1.28+
docker version # Docker 24.0+
openssl version # OpenSSL 3.0+
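
The minimum versions noted in the comments above can also be checked mechanically. The following is a minimal sketch, assuming sort -V (version sort, as shipped with GNU coreutils) is available; version_ge is a hypothetical helper name and not part of any of these tools.

```shell
# version_ge A B: succeeds when version A is at least version B.
# Hypothetical helper; relies on `sort -V` for version-aware ordering.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example: compare a version string read from `docker version`
# against the Docker 24.0 minimum listed above.
# version_ge "24.0.7" "24.0" && echo "Docker OK"
```

The helper only compares strings; extracting the installed version from each tool's output is left to the caller, since the output formats differ.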

Install missing tools:

# Google Cloud SDK
curl https://sdk.cloud.google.com | bash
exec -l $SHELL

# kubectl (via gcloud)
gcloud components install kubectl

# Docker (macOS)
brew install --cask docker

# OpenSSL (usually pre-installed)
brew install openssl

GCP Project Setup

# Set your GCP project
export PROJECT_ID="coditect-cloud-infra"
gcloud config set project $PROJECT_ID

# Authenticate
gcloud auth login
gcloud auth application-default login

# Enable required APIs
gcloud services enable \
container.googleapis.com \
sqladmin.googleapis.com \
redis.googleapis.com \
artifactregistry.googleapis.com \
secretmanager.googleapis.com \
compute.googleapis.com

IAM Permissions Required

Your GCP user/service account needs these roles:

  • roles/container.admin - GKE cluster creation
  • roles/cloudsql.admin - Cloud SQL instance management
  • roles/redis.admin - Redis instance management
  • roles/artifactregistry.writer - Docker image push
  • roles/secretmanager.admin - Secret management
  • roles/iam.serviceAccountUser - Service account usage

Verify permissions:

gcloud projects get-iam-policy $PROJECT_ID \
--flatten="bindings[].members" \
--filter="bindings.members:user:$(gcloud config get-value account)"
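
To compare the granted roles with the required list, the output of the command above can be piped through a small filter. This is a sketch under two assumptions: the roles are bound directly to your user (not inherited through a group or a broader role such as Owner), and missing_roles is a hypothetical helper name.

```shell
# Roles this guide lists as required.
REQUIRED_ROLES="roles/container.admin
roles/cloudsql.admin
roles/redis.admin
roles/artifactregistry.writer
roles/secretmanager.admin
roles/iam.serviceAccountUser"

# missing_roles: reads granted role names on stdin (one per line) and
# prints every required role that is absent. Hypothetical helper.
missing_roles() {
  local granted
  granted=$(cat)
  printf '%s\n' "$REQUIRED_ROLES" | while IFS= read -r role; do
    printf '%s\n' "$granted" | grep -Fxq "$role" || printf '%s\n' "$role"
  done
}

# Example: print only role names from the IAM policy, then list the gaps.
# gcloud projects get-iam-policy "$PROJECT_ID" \
#   --flatten="bindings[].members" \
#   --filter="bindings.members:user:$(gcloud config get-value account)" \
#   --format="value(bindings.role)" | missing_roles
```

Empty output means every listed role is bound directly to the account.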

Network Configuration

Default VPC requirements:

  • Default VPC network must exist
  • Firewall rules allow GKE control plane communication
  • Private Google Access enabled for Cloud SQL/Redis

Verify network:

# Check default VPC exists
gcloud compute networks describe default

# Enable Private Google Access
gcloud compute networks subnets update default \
--region=us-central1 \
--enable-private-ip-google-access
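
Cloud SQL private IP additionally depends on private services access, a VPC peering between your network and Google's service producer network, which is a separate setting from Private Google Access. If the instance creation in Step 3 fails with a network-related error, the sketch below shows the usual gcloud sequence. The function name and the range name used in the example are assumptions chosen for illustration; the /16 prefix length is likewise only an example size.

```shell
# enable_private_services_access NETWORK RANGE_NAME
# Reserves an internal range and peers it with Google's service network.
# RANGE_NAME is an arbitrary label chosen here, not something this guide defines.
enable_private_services_access() {
  local network="$1" range_name="$2"
  gcloud services enable servicenetworking.googleapis.com &&
  gcloud compute addresses create "$range_name" \
    --global --purpose=VPC_PEERING --prefix-length=16 \
    --network="$network" &&
  gcloud services vpc-peerings connect \
    --service=servicenetworking.googleapis.com \
    --ranges="$range_name" --network="$network"
}

# Example:
# enable_private_services_access default google-managed-services-default
```

The peering only needs to be created once per VPC network; later Cloud SQL instances on the same network reuse it.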

Infrastructure Setup

Note: This section uses manual gcloud commands. OpenTofu (Terraform) migration is planned but not yet implemented. Follow these steps exactly as written.

Step 1: Create GKE Cluster

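If the cluster does not exist, create it:

# Create GKE Standard cluster
# Note: in a regional cluster, --num-nodes is the node count per zone.
# Pinning two zones with one node each yields the 2 nodes this guide expects.
gcloud container clusters create coditect-staging-cluster \
--region us-central1 \
--node-locations us-central1-a,us-central1-b \
--num-nodes 1 \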
--machine-type n1-standard-2 \
--disk-size 50 \
--disk-type pd-standard \
--enable-autorepair \
--enable-autoupgrade \
--enable-ip-alias \
--network default \
--subnetwork default \
--addons HorizontalPodAutoscaling,HttpLoadBalancing,GcePersistentDiskCsiDriver \
--workload-pool=${PROJECT_ID}.svc.id.goog \
--enable-shielded-nodes \
--labels environment=staging,team=backend,project=coditect-cloud-backend

If the cluster already exists, get credentials:

gcloud container clusters get-credentials coditect-staging-cluster \
--region us-central1

Verify cluster access:

kubectl cluster-info
kubectl get nodes

Expected output:

NAME                                                  STATUS   ROLES    AGE   VERSION
gke-coditect-staging-cluster-default-pool-xxxxx-yyyy  Ready    <none>   1m    v1.28.x
gke-coditect-staging-cluster-default-pool-xxxxx-zzzz  Ready    <none>   1m    v1.28.x

Time required: 8-10 minutes
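
The choice between creating the cluster and fetching credentials can be scripted. The sketch below is a hypothetical helper (ensure_cluster is not a gcloud command) that wraps the real describe and get-credentials subcommands; it deliberately does not create the cluster itself, so the create command above stays the single source of truth for cluster settings.

```shell
# ensure_cluster NAME REGION: fetch credentials when the cluster exists,
# otherwise report that it still has to be created with the Step 1 command.
ensure_cluster() {
  local name="$1" region="$2"
  if gcloud container clusters describe "$name" --region "$region" \
       --format="value(name)" >/dev/null 2>&1; then
    gcloud container clusters get-credentials "$name" --region "$region"
  else
    echo "Cluster $name not found in $region; run the create command first." >&2
    return 1
  fi
}

# Example:
# ensure_cluster coditect-staging-cluster us-central1
```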


Step 2: Create Artifact Registry Repository

# Create Docker repository
gcloud artifacts repositories create coditect-backend \
--repository-format=docker \
--location=us-central1 \
--description="CODITECT Cloud Backend Docker images"

# Verify creation
gcloud artifacts repositories list --location=us-central1

Expected output:

REPOSITORY        FORMAT  MODE                 LOCATION      ...
coditect-backend  DOCKER  STANDARD_REPOSITORY  us-central1   ...

Time required: 30 seconds


Step 3: Create Cloud SQL PostgreSQL Instance

# Create PostgreSQL 16 instance with private IP
gcloud sql instances create coditect-db \
--database-version=POSTGRES_16 \
--tier=db-f1-micro \
--region=us-central1 \
--network=projects/${PROJECT_ID}/global/networks/default \
--no-assign-ip \
--database-flags=max_connections=100 \
--backup-start-time=03:00 \
--labels environment=staging,team=backend,project=coditect-cloud-backend

# Wait for creation to complete (5-8 minutes)
gcloud sql operations list --instance=coditect-db --limit=1

Verify instance:

gcloud sql instances describe coditect-db --format="value(state)"
# Should output: RUNNABLE

Time required: 5-8 minutes
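
Rather than re-running the describe command by hand, the state check can be polled until it reports the expected value. The following is a minimal sketch; wait_until is a hypothetical helper, and the timeout and interval in the example are assumptions based on the 5-8 minute estimate above. The same helper applies to the Redis state check in Step 4 with READY as the expected value.

```shell
# wait_until EXPECTED TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND...
# Re-runs COMMAND until its output equals EXPECTED or the timeout elapses.
wait_until() {
  local expected="$1" timeout="$2" interval="$3" elapsed=0 actual
  shift 3
  while :; do
    actual=$("$@" 2>/dev/null || true)
    if [ "$actual" = "$expected" ]; then
      return 0
    fi
    if [ "$elapsed" -ge "$timeout" ]; then
      echo "Timed out; last state: ${actual:-unknown}" >&2
      return 1
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
}

# Example: wait up to 10 minutes for Cloud SQL to report RUNNABLE.
# wait_until RUNNABLE 600 15 \
#   gcloud sql instances describe coditect-db --format="value(state)"
```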


Step 4: Create Redis Memorystore Instance

# Create Redis 7.0 instance
gcloud redis instances create coditect-redis-staging \
--size=1 \
--region=us-central1 \
--redis-version=redis_7_0 \
--network=projects/${PROJECT_ID}/global/networks/default \
--labels environment=staging,team=backend,project=coditect-cloud-backend

# Wait for creation to complete (3-5 minutes)
gcloud redis operations list --region=us-central1 --limit=1

Verify instance:

gcloud redis instances describe coditect-redis-staging \
--region=us-central1 \
--format="value(state)"
# Should output: READY

Time required: 3-5 minutes


Step 5: Configure Cloud SQL Database

# Create database
gcloud sql databases create coditect \
--instance=coditect-db

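# Generate a secure password and store it in a file only you can read.
# The file is created with restrictive permissions before the password is written;
# keep .db-password out of version control.
DB_PASSWORD=$(openssl rand -base64 32 | tr -d "=+/" | cut -c1-25)
(umask 077 && echo "$DB_PASSWORD" > .db-password)
chmod 600 .db-password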

# Create database user
gcloud sql users create coditect_app \
--instance=coditect-db \
--password="$DB_PASSWORD"

# Verify user creation
gcloud sql users list --instance=coditect-db

Expected output:

NAME          HOST  TYPE
coditect_app  %     BUILT_IN
postgres      %     BUILT_IN

IMPORTANT: Save the password from .db-password - you'll need it in Step 7.

Time required: 1 minute
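
The openssl pipeline in this step strips the =, + and / characters and truncates, so the resulting password is alphanumeric. An alternative sketch that draws only alphanumeric characters from the start is shown below. It assumes /dev/urandom is available, and gen_password is a hypothetical helper name; it reads 1024 random bytes, which leaves far more than 25 usable characters in practice.

```shell
# gen_password [LENGTH]: print a random alphanumeric string (default 25 chars).
# Hypothetical alternative to the openssl pipeline; reads from /dev/urandom.
gen_password() {
  head -c 1024 /dev/urandom | LC_ALL=C tr -dc 'A-Za-z0-9' | cut -c "1-${1:-25}"
}

# Example:
# DB_PASSWORD=$(gen_password)
```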


Application Deployment

Step 6: Build and Push Docker Image

# Navigate to the repository root (example path; adjust to your local checkout)
cd /Users/halcasteel/PROJECTS/coditect-rollout-master/submodules/cloud/coditect-cloud-backend

# Authenticate Docker to Artifact Registry
gcloud auth configure-docker us-central1-docker.pkg.dev

# Build the image for the GKE node architecture (linux/amd64)
# Note: on Apple Silicon, buildx cross-builds for amd64 so the image runs on GKE nodes
docker buildx create --use --name multiarch || docker buildx use multiarch

docker buildx build \
--platform linux/amd64 \
-t us-central1-docker.pkg.dev/${PROJECT_ID}/coditect-backend/coditect-cloud-backend:v1.0.0-staging \
--push \
.

# Verify image uploaded
gcloud artifacts docker images list \
us-central1-docker.pkg.dev/${PROJECT_ID}/coditect-backend/coditect-cloud-backend

Expected output:

IMAGE                                                                 DIGEST      CREATE_TIME     UPDATE_TIME
us-central1-docker.pkg.dev/.../coditect-cloud-backend:v1.0.0-staging  sha256:...  2025-12-01T...  2025-12-01T...

Time required: 3-5 minutes (depending on network speed)

Troubleshooting: See staging-troubleshooting-guide.md Issue #2 (Multi-Platform Build) and Issue #3 (User Permissions)


Step 7: Create Kubernetes Namespace and Secrets

# Create namespace
kubectl apply -f deployment/kubernetes/staging/namespace.yaml

# Verify namespace
kubectl get namespace coditect-staging

Get infrastructure endpoints:

# Get Cloud SQL private IP
DB_HOST=$(gcloud sql instances describe coditect-db \
--format="value(ipAddresses[0].ipAddress)")
echo "Database host: $DB_HOST"

# Get Redis private IP
REDIS_HOST=$(gcloud redis instances describe coditect-redis-staging \
--region=us-central1 \
--format="value(host)")
echo "Redis host: $REDIS_HOST"

# Load database password
DB_PASSWORD=$(cat .db-password)

Create Kubernetes secrets:

# Generate Django secret key (strip the line break openssl inserts after 64 characters)
DJANGO_SECRET=$(openssl rand -base64 50 | tr -d '\n')

# Create secret with all credentials
kubectl create secret generic backend-secrets \
--namespace=coditect-staging \
--from-literal=django-secret-key="$DJANGO_SECRET" \
--from-literal=db-host="$DB_HOST" \
--from-literal=db-port="5432" \
--from-literal=db-name="coditect" \
--from-literal=db-user="coditect_app" \
--from-literal=db-password="$DB_PASSWORD" \
--from-literal=redis-host="$REDIS_HOST" \
--from-literal=redis-port="6379" \
--from-literal=license-server-url="https://license.coditect.com" \
--from-literal=gcp-project-id="$PROJECT_ID"

# Verify secret created
kubectl get secret backend-secrets -n coditect-staging

Expected output:

NAME              TYPE     DATA   AGE
backend-secrets   Opaque   10     5s

Time required: 1 minute
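
Note that kubectl create secret fails if backend-secrets already exists, which matters when this step is repeated. One common workaround is to render the Secret locally and pipe it into kubectl apply. The sketch below assumes the variables from this step (DJANGO_SECRET, DB_HOST, DB_PASSWORD, REDIS_HOST, PROJECT_ID) are already set; apply_backend_secrets is a hypothetical wrapper name.

```shell
# apply_backend_secrets: create or update backend-secrets in one command.
# --dry-run=client renders the manifest without contacting the cluster;
# `kubectl apply` then creates the Secret or updates it in place.
apply_backend_secrets() {
  : "${DJANGO_SECRET:?DJANGO_SECRET is not set}"
  : "${DB_HOST:?DB_HOST is not set}"
  : "${DB_PASSWORD:?DB_PASSWORD is not set}"
  : "${REDIS_HOST:?REDIS_HOST is not set}"
  : "${PROJECT_ID:?PROJECT_ID is not set}"
  kubectl create secret generic backend-secrets \
    --namespace=coditect-staging \
    --from-literal=django-secret-key="$DJANGO_SECRET" \
    --from-literal=db-host="$DB_HOST" \
    --from-literal=db-port="5432" \
    --from-literal=db-name="coditect" \
    --from-literal=db-user="coditect_app" \
    --from-literal=db-password="$DB_PASSWORD" \
    --from-literal=redis-host="$REDIS_HOST" \
    --from-literal=redis-port="6379" \
    --from-literal=license-server-url="https://license.coditect.com" \
    --from-literal=gcp-project-id="$PROJECT_ID" \
    --dry-run=client -o yaml | kubectl apply -f -
}
```

Pods read Secret values at start-up, so after changing a value the Deployment typically needs a restart (for example with kubectl rollout restart) before the new value is picked up.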


Step 8: Deploy Application

# Deploy all Kubernetes resources
kubectl apply -f deployment/kubernetes/staging/backend-config.yaml
kubectl apply -f deployment/kubernetes/staging/backend-deployment.yaml
kubectl apply -f deployment/kubernetes/staging/backend-service.yaml

# Watch deployment progress
kubectl get pods -n coditect-staging --watch

Expected progression:

NAME                                READY   STATUS              RESTARTS   AGE
coditect-backend-xxxxxxxxxx-xxxxx   0/2     ContainerCreating   0          5s
coditect-backend-xxxxxxxxxx-xxxxx   0/2     Running             0          15s
coditect-backend-xxxxxxxxxx-xxxxx   1/2     Running             0          25s
coditect-backend-xxxxxxxxxx-xxxxx   2/2     Running             0          35s
coditect-backend-xxxxxxxxxx-yyyyy   0/2     ContainerCreating   0          5s
coditect-backend-xxxxxxxxxx-yyyyy   0/2     Running             0          20s
coditect-backend-xxxxxxxxxx-yyyyy   1/2     Running             0          30s
coditect-backend-xxxxxxxxxx-yyyyy   2/2     Running             0          40s

Press Ctrl+C once both pods show 2/2 Running

Verify deployment:

# Check deployment status
kubectl get deployment coditect-backend -n coditect-staging

# Check pod details
kubectl get pods -n coditect-staging -l app=coditect-backend

# Check recent logs
kubectl logs -n coditect-staging -l app=coditect-backend --tail=50

Time required: 2-3 minutes
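
For scripted runs, where watching pods interactively is not practical, kubectl rollout status blocks until the Deployment is fully rolled out or a timeout expires. The following is a minimal sketch; wait_for_rollout is a hypothetical wrapper and the 180s default is an assumption derived from the 2-3 minute estimate above.

```shell
# wait_for_rollout [TIMEOUT]: block until the backend Deployment has rolled out.
# On failure, print pod events to help locate the cause, then return non-zero.
wait_for_rollout() {
  local ns="coditect-staging"
  if kubectl rollout status deployment/coditect-backend \
       --namespace="$ns" --timeout="${1:-180s}"; then
    return 0
  fi
  # Rollout did not finish in time: show pod details and recent events.
  kubectl describe pods --namespace="$ns" -l app=coditect-backend
  return 1
}

# Example:
# wait_for_rollout 300s
```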

Troubleshooting: See staging-troubleshooting-guide.md:

  • Issue #1 (GCR Deprecation) if image pull fails
  • Issue #4 (Cloud SQL SSL) if database connection fails
  • Issue #5 (Database Auth) if authentication fails
  • Issue #6 (ALLOWED_HOSTS) if health probes fail
  • Issue #7 (HTTP/HTTPS) if liveness/readiness probes fail

Step 9: Run Database Migrations

# Apply migration job
kubectl apply -f deployment/kubernetes/staging/migrate-job.yaml

# Watch migration progress
kubectl logs -n coditect-staging job/django-migrate --follow

Expected output:

Operations to perform:
Apply all migrations: admin, auth, contenttypes, sessions, users, orgs, licenses, projects
Running migrations:
Applying contenttypes.0001_initial... OK
Applying auth.0001_initial... OK
Applying users.0001_initial... OK
Applying orgs.0001_initial... OK
Applying licenses.0001_initial... OK
Applying projects.0001_initial... OK
... (more migrations)

Migration completed successfully.

Verify migrations:

# Check job completion
kubectl get job django-migrate -n coditect-staging

# Expected: COMPLETIONS = 1/1

Time required: 30-60 seconds

Note: Migration job auto-deletes after 5 minutes (ttlSecondsAfterFinished: 300)
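
Because a finished Job lingers for up to five minutes, re-applying the manifest inside that window can collide with the previous run. The sketch below removes any earlier Job, applies the manifest, and waits for completion with kubectl wait. The function name and the 120s default timeout are assumptions for illustration.

```shell
# run_migrations [TIMEOUT]: run the migration Job and wait for it to complete.
run_migrations() {
  local ns="coditect-staging" job="django-migrate" status
  # Remove a previous run first; --ignore-not-found makes this a no-op
  # when the Job has already been cleaned up by its TTL.
  if kubectl delete job "$job" -n "$ns" --ignore-not-found &&
     kubectl apply -f deployment/kubernetes/staging/migrate-job.yaml &&
     kubectl wait --for=condition=complete "job/$job" -n "$ns" \
       --timeout="${1:-120s}"; then
    status=0
  else
    status=1
  fi
  # Show the migration output whether or not the Job succeeded.
  kubectl logs -n "$ns" "job/$job" --tail=50
  return "$status"
}

# Example:
# run_migrations 180s
```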


Step 10: Verify Deployment

Check all resources:

# Deployment status
kubectl get deployment coditect-backend -n coditect-staging

# Pod status
kubectl get pods -n coditect-staging -l app=coditect-backend

# Service status
kubectl get service coditect-backend -n coditect-staging

Expected outputs:

# Deployment
NAME               READY   UP-TO-DATE   AVAILABLE   AGE
coditect-backend   2/2     2            2           5m

# Pods
NAME                                READY   STATUS    RESTARTS   AGE
coditect-backend-xxxxxxxxxx-xxxxx   2/2     Running   0          5m
coditect-backend-xxxxxxxxxx-yyyyy   2/2     Running   0          5m

# Service
NAME               TYPE           CLUSTER-IP      EXTERNAL-IP      PORT(S)                      AGE
coditect-backend   LoadBalancer   10.xx.xxx.xxx   34.xxx.xxx.xxx   80:xxxxx/TCP,443:xxxxx/TCP   5m

Get external IP:

# Wait for external IP assignment (1-2 minutes)
kubectl get service coditect-backend -n coditect-staging --watch

Test health endpoint:

# Get external IP
EXTERNAL_IP=$(kubectl get service coditect-backend -n coditect-staging \
-o jsonpath='{.status.loadBalancer.ingress[0].ip}')

echo "Backend URL: http://$EXTERNAL_IP"

# Test health endpoint
curl -v http://$EXTERNAL_IP/api/v1/health/

# Expected response: HTTP 200 with JSON body
# {"status": "healthy", "timestamp": "2025-12-01T...", ...}

Check logs for errors:

# Recent application logs
kubectl logs -n coditect-staging -l app=coditect-backend --tail=100

# Follow logs in real-time
kubectl logs -n coditect-staging -l app=coditect-backend --follow

Time required: 2-3 minutes
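
The external IP can take a minute or two to appear, and until then the jsonpath query returns an empty string. The sketch below polls for it instead of using --watch; wait_for_external_ip is a hypothetical helper, and the default of 30 attempts at 5-second intervals is an assumption.

```shell
# wait_for_external_ip [ATTEMPTS] [INTERVAL_SECONDS]: poll the Service until
# the LoadBalancer reports an ingress IP, then print it on stdout.
wait_for_external_ip() {
  local attempts="${1:-30}" interval="${2:-5}" ip i=0
  while [ "$i" -lt "$attempts" ]; do
    ip=$(kubectl get service coditect-backend -n coditect-staging \
      -o jsonpath='{.status.loadBalancer.ingress[0].ip}' 2>/dev/null || true)
    if [ -n "$ip" ]; then
      printf '%s\n' "$ip"
      return 0
    fi
    i=$((i + 1))
    sleep "$interval"
  done
  echo "No external IP after $attempts attempts" >&2
  return 1
}

# Example:
# EXTERNAL_IP=$(wait_for_external_ip) && curl -v "http://$EXTERNAL_IP/api/v1/health/"
```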


Validation Checklist

Use this checklist to verify successful deployment:

Infrastructure Validation

  • GKE cluster running with 2 nodes
  • Cloud SQL PostgreSQL instance state = RUNNABLE
  • Redis Memorystore instance state = READY
  • Artifact Registry repository exists
  • All GCP APIs enabled

# Quick validation script
echo "Cluster nodes:"
kubectl get nodes

echo -e "\nCloud SQL status:"
gcloud sql instances describe coditect-db --format="value(state)"

echo -e "\nRedis status:"
gcloud redis instances describe coditect-redis-staging --region=us-central1 --format="value(state)"

echo -e "\nArtifact Registry:"
gcloud artifacts repositories list --location=us-central1 --filter="name:coditect-backend"

Application Validation

  • Namespace coditect-staging exists
  • Secret backend-secrets created with 10 keys
  • Deployment coditect-backend shows READY 2/2
  • All pods show STATUS = Running and READY = 2/2
  • Service has external IP assigned
  • Migration job completed successfully (1/1)

# Quick validation script
echo "Namespace:"
kubectl get namespace coditect-staging

echo -e "\nSecret:"
kubectl get secret backend-secrets -n coditect-staging

echo -e "\nDeployment:"
kubectl get deployment coditect-backend -n coditect-staging

echo -e "\nPods:"
kubectl get pods -n coditect-staging -l app=coditect-backend

echo -e "\nService:"
kubectl get service coditect-backend -n coditect-staging

echo -e "\nMigration job:"
kubectl get job django-migrate -n coditect-staging 2>/dev/null || echo "Job already cleaned up (expected)"

Health Validation

  • Liveness probe passing (/api/v1/health/live)
  • Readiness probe passing (/api/v1/health/ready)
  • Health endpoint responds: curl http://<EXTERNAL-IP>/api/v1/health/
  • No CrashLoopBackOff or Error pod states
  • Logs show successful gunicorn startup

# Quick validation script
EXTERNAL_IP=$(kubectl get service coditect-backend -n coditect-staging -o jsonpath='{.status.loadBalancer.ingress[0].ip}')

echo "External IP: $EXTERNAL_IP"
echo -e "\nHealth check:"
curl -s http://$EXTERNAL_IP/api/v1/health/ | python3 -m json.tool

echo -e "\nRecent logs:"
kubectl logs -n coditect-staging -l app=coditect-backend --tail=20

Complete Validation Script

#!/bin/bash
# Save as: validate-staging-deployment.sh
# Runs every check, counts failures, and exits non-zero if any check failed.

set -e

NS="coditect-staging"
FAILURES=0
pass() { echo "✓ $1"; }
fail() { echo "✗ $1"; FAILURES=$((FAILURES + 1)); }

echo "=== CODITECT Cloud Backend Staging Validation ==="
echo

echo "[1/5] Checking GCP Infrastructure..."
SQL_STATE=$(gcloud sql instances describe coditect-db --format="value(state)" 2>/dev/null || true)
if [ "$SQL_STATE" = "RUNNABLE" ]; then pass "Cloud SQL: RUNNABLE"; else fail "Cloud SQL: ${SQL_STATE:-unknown}"; fi
REDIS_STATE=$(gcloud redis instances describe coditect-redis-staging --region=us-central1 --format="value(state)" 2>/dev/null || true)
if [ "$REDIS_STATE" = "READY" ]; then pass "Redis: READY"; else fail "Redis: ${REDIS_STATE:-unknown}"; fi
# Count nodes whose STATUS column is exactly "Ready" (a plain grep would also match "NotReady")
READY_NODES=$(kubectl get nodes --no-headers 2>/dev/null | awk '$2 == "Ready"' | wc -l | tr -d ' ')
if [ "$READY_NODES" -ge 2 ]; then pass "GKE: $READY_NODES nodes ready"; else fail "GKE: $READY_NODES nodes ready (expected 2)"; fi
echo

echo "[2/5] Checking Kubernetes Resources..."
kubectl get namespace "$NS" &>/dev/null && pass "Namespace: $NS" || fail "Namespace: missing"
kubectl get secret backend-secrets -n "$NS" &>/dev/null && pass "Secret: backend-secrets" || fail "Secret: missing"
kubectl get deployment coditect-backend -n "$NS" &>/dev/null && pass "Deployment: coditect-backend" || fail "Deployment: missing"
echo

echo "[3/5] Checking Pod Status..."
READY=$(kubectl get deployment coditect-backend -n "$NS" -o jsonpath='{.status.readyReplicas}' 2>/dev/null || true)
DESIRED=$(kubectl get deployment coditect-backend -n "$NS" -o jsonpath='{.spec.replicas}' 2>/dev/null || true)
if [ -n "$DESIRED" ] && [ "${READY:-0}" = "$DESIRED" ]; then
  pass "Pods: $READY/$DESIRED ready"
else
  fail "Pods: ${READY:-0}/${DESIRED:-?} ready"
fi
echo

echo "[4/5] Checking Service..."
EXTERNAL_IP=$(kubectl get service coditect-backend -n "$NS" -o jsonpath='{.status.loadBalancer.ingress[0].ip}' 2>/dev/null || true)
if [ -n "$EXTERNAL_IP" ]; then pass "LoadBalancer IP: $EXTERNAL_IP"; else fail "LoadBalancer: No external IP assigned"; fi
echo

echo "[5/5] Testing Health Endpoint..."
if [ -n "$EXTERNAL_IP" ]; then
  # "|| true" keeps set -e from aborting when curl cannot connect; curl then reports 000
  HTTP_CODE=$(curl -s -o /dev/null -w "%{http_code}" --max-time 10 "http://$EXTERNAL_IP/api/v1/health/" || true)
  if [ "$HTTP_CODE" = "200" ]; then
    pass "Health endpoint: HTTP $HTTP_CODE"
    curl -s --max-time 10 "http://$EXTERNAL_IP/api/v1/health/" | python3 -m json.tool || true
  else
    fail "Health endpoint: HTTP $HTTP_CODE (expected 200)"
  fi
else
  fail "Health endpoint: skipped (no external IP)"
fi
echo

if [ "$FAILURES" -eq 0 ]; then
  echo "=== Validation Complete: All Checks Passed ==="
else
  echo "=== Validation Failed: $FAILURES check(s) failed ==="
  exit 1
fi

Run validation:

chmod +x validate-staging-deployment.sh
./validate-staging-deployment.sh

Key Configuration Files

1. Dockerfile (Multi-Stage Build)

Location: /Dockerfile

Purpose: Production-ready Docker image with security hardening

Key Features:

  • Multi-stage build (builder + runtime)
  • Non-root user (UID 1000)
  • Python 3.12.12 for protobuf compatibility
  • Minimal runtime dependencies
  • Gunicorn WSGI server (4 workers)
  • Health check endpoint

Security Hardening:

  • Runs as non-root user django
  • Read-only root filesystem capability
  • Drops all Linux capabilities
  • No privilege escalation

Build Command:

docker buildx build --platform linux/amd64 \
-t us-central1-docker.pkg.dev/$PROJECT_ID/coditect-backend/coditect-cloud-backend:v1.0.0-staging \
--push .

2. backend-deployment.yaml

Location: /deployment/kubernetes/staging/backend-deployment.yaml

Purpose: Kubernetes Deployment manifest for Django backend

Key Configuration:

Setting            Value          Purpose
Replicas           2              High availability
Strategy           RollingUpdate  Zero-downtime deployments
Max Surge          1              One extra pod during updates
Max Unavailable    0              Always maintain 2 pods
Image Pull Policy  Always         Always pull latest staging tag

Resource Limits:

Resource  Request  Limit  Notes
Memory    256Mi    512Mi  Staging minimal sizing
CPU       250m     500m   0.25-0.5 vCPU per pod

Health Probes:

# Liveness: Is the container alive?
livenessProbe:
  httpGet:
    path: /api/v1/health/live
    port: 8000
    scheme: HTTP
  initialDelaySeconds: 30
  periodSeconds: 10
  failureThreshold: 3

# Readiness: Can the container serve traffic?
readinessProbe:
  httpGet:
    path: /api/v1/health/ready
    port: 8000
    scheme: HTTP
  initialDelaySeconds: 20
  periodSeconds: 5
  failureThreshold: 3

Environment Variables:

  • All sensitive data loaded from backend-secrets Secret
  • DJANGO_SETTINGS_MODULE=license_platform.settings.production
  • ENVIRONMENT=staging

Security Context:

  • Runs as non-root (UID 1000)
  • Drops all capabilities
  • No privilege escalation

Anti-Affinity:

  • Prefers scheduling pods on different nodes
  • Improves availability during node failures

3. backend-config.yaml

Location: /deployment/kubernetes/staging/backend-config.yaml

Purpose: ConfigMap for non-sensitive application configuration

Key Configuration:

data:
  DJANGO_ALLOWED_HOSTS: "10.56.0.0/16,coditect-backend.coditect-staging.svc.cluster.local,*.coditect.com,localhost"

ALLOWED_HOSTS Breakdown:

  • 10.56.0.0/16 - GKE pod network (health probes from kubelet)
  • coditect-backend.coditect-staging.svc.cluster.local - Internal service DNS
  • *.coditect.com - Production domains
  • localhost - Local development

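Why This Matters: Django rejects requests whose Host header does not match an entry in ALLOWED_HOSTS and answers them with HTTP 400. Stock Django matches entries literally, apart from a leading-dot subdomain wildcard (.coditect.com) and the catch-all "*", so the CIDR range 10.56.0.0/16 and the *.coditect.com pattern only take effect if the application's settings expand them (for example through CIDR-aware middleware). This configuration is intended to enable: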

  • Kubernetes health probes to succeed
  • Internal service-to-service calls
  • External LoadBalancer traffic
  • Local development/testing
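
Host header handling can be checked from outside the cluster by sending requests with different Host values and comparing status codes. The sketch below is a hypothetical helper (probe_host) built on curl; it treats the health endpoint path from this guide as the probe target. A 400 response for one host alongside a 200 for another suggests ALLOWED_HOSTS rejection, although a 400 can have other causes.

```shell
# probe_host BASE_URL HOST: request the health endpoint with the given Host
# header and print "HOST STATUS". Status 000 means curl could not connect.
probe_host() {
  local code
  code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 \
    -H "Host: $2" "$1/api/v1/health/" || true)
  printf '%s %s\n' "$2" "$code"
}

# Example: compare an expected host with one that should be rejected.
# probe_host "http://$EXTERNAL_IP" staging-api.coditect.com
# probe_host "http://$EXTERNAL_IP" not-allowed.example
```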

4. backend-service.yaml

Location: /deployment/kubernetes/staging/backend-service.yaml

Purpose: Exposes backend pods via LoadBalancer

Services Created:

1. External LoadBalancer (coditect-backend):

spec:
  type: LoadBalancer
  ports:
    - name: http
      port: 80
      targetPort: 8000
    - name: https
      port: 443
      targetPort: 8000

Purpose: Public internet access to backend API

2. Internal ClusterIP (coditect-backend-internal):

spec:
  type: ClusterIP
  ports:
    - name: http
      port: 8000
      targetPort: 8000

Purpose: Internal cluster communication (frontend → backend)

Annotations:

  • cloud.google.com/neg: '{"ingress": true}' - Network Endpoint Group (for future Ingress)
  • cloud.google.com/backend-config - Backend configuration (for future health checks)

5. migrate-job.yaml

Location: /deployment/kubernetes/staging/migrate-job.yaml

Purpose: Run Django database migrations as Kubernetes Job

Key Configuration:

spec:
  ttlSecondsAfterFinished: 300 # Auto-delete after 5 minutes
  template:
    spec:
      restartPolicy: OnFailure
      containers:
        - name: migrate
          command:
            - python
            - manage.py
            - migrate
            - --noinput

Job Characteristics:

  • Runs once per apply
  • Restarts on failure (up to 6 retries default)
  • Auto-deletes 5 minutes after completion
  • Uses same Docker image as deployment
  • Accesses same database via secrets

Usage:

# Apply migration job
kubectl apply -f deployment/kubernetes/staging/migrate-job.yaml

# Watch progress
kubectl logs -n coditect-staging job/django-migrate --follow

# Check completion
kubectl get job django-migrate -n coditect-staging

Important: Always run migrations before deploying updated backend code that changes models.


6. namespace.yaml

Location: /deployment/kubernetes/staging/namespace.yaml

Purpose: Create isolated namespace for staging environment

Configuration:

apiVersion: v1
kind: Namespace
metadata:
  name: coditect-staging
  labels:
    environment: staging
    team: backend
    project: coditect-cloud-backend
    managed-by: kubectl

Labels Enable:

  • Resource organization and filtering
  • Network policies (future)
  • Cost allocation and tracking
  • RBAC policies (future)

Environment-Specific Notes

Staging vs. Production Differences

Aspect         Staging                          Production
SSL/TLS        Disabled (HTTP only)             Required (HTTPS enforced)
ALLOWED_HOSTS  Wildcard *.coditect.com          Specific domains only
Replicas       2 pods                           3+ pods (auto-scaling)
Resources      Minimal (256Mi/250m)             Production-sized (1Gi/1000m)
Database       db-f1-micro                      db-n1-standard-2+
Redis          1GB                              5GB+
Backups        Daily                            Hourly with point-in-time recovery
Monitoring     Basic logs                       Full observability stack
Secrets        Manual kubectl                   GCP Secret Manager automation
Domain         IP address or staging subdomain  Custom domain with SSL

Staging Environment Characteristics

Purpose:

  • Integration testing
  • Pre-production validation
  • Customer demos
  • Development team testing

Not Suitable For:

  • Customer production workloads
  • Performance/load testing (under-resourced)
  • Compliance requirements (no SSL)
  • Long-term data storage (db-f1-micro limitations)

Cost Optimization:

  • Minimal instance sizes
  • No Cloud Armor (DDoS protection)
  • No CDN
  • Basic monitoring only

Estimated Monthly Cost:

  • GKE: ~$150 (2 n1-standard-2 nodes)
  • Cloud SQL: ~$25 (db-f1-micro)
  • Redis: ~$30 (1GB)
  • Networking: ~$20
  • Total: ~$225/month

Troubleshooting

For detailed troubleshooting of common deployment issues, see:

staging-troubleshooting-guide.md

This guide covers:

  1. GCR Deprecation (403 Forbidden) - Artifact Registry migration
  2. Multi-Platform Docker Build - Apple Silicon compatibility
  3. Dockerfile User Permissions - Non-root user setup
  4. Cloud SQL SSL Certificate - Private IP networking
  5. Database User Authentication - User creation and permissions
  6. Django ALLOWED_HOSTS Rejection - Host header validation
  7. Health Probe HTTPS/HTTP Mismatch - Probe configuration

Quick Diagnostics

Pod not starting:

# Describe pod for events
kubectl describe pod -n coditect-staging -l app=coditect-backend

# Check logs
kubectl logs -n coditect-staging -l app=coditect-backend --tail=100

Database connection issues:

# Verify database IP
gcloud sql instances describe coditect-db --format="value(ipAddresses[0].ipAddress)"

# Verify secret contains correct IP
kubectl get secret backend-secrets -n coditect-staging -o jsonpath='{.data.db-host}' | base64 -d

Health probe failures:

# Check probe configuration
kubectl get deployment coditect-backend -n coditect-staging -o yaml | grep -A 10 "livenessProbe"

# Test health endpoint manually
kubectl exec -n coditect-staging -it deploy/coditect-backend -- curl -v http://localhost:8000/api/v1/health/live

Image pull failures:

# Check image path
kubectl get deployment coditect-backend -n coditect-staging -o jsonpath='{.spec.template.spec.containers[0].image}'

# Verify image exists
gcloud artifacts docker images list us-central1-docker.pkg.dev/$PROJECT_ID/coditect-backend/coditect-cloud-backend
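
When several pods are involved, it helps to list only the ones that are not healthy before describing them. The sketch below filters the plain output of kubectl get pods; unhealthy_pods is a hypothetical helper that assumes the default column order (NAME, READY, STATUS).

```shell
# unhealthy_pods: reads `kubectl get pods --no-headers` output on stdin and
# prints "NAME STATUS READY" for every pod that is not Running with all
# of its containers ready.
unhealthy_pods() {
  awk '{
    split($2, r, "/")
    if ($3 != "Running" || r[1] != r[2]) print $1, $3, $2
  }'
}

# Example:
# kubectl get pods -n coditect-staging -l app=coditect-backend --no-headers \
#   | unhealthy_pods
```

Empty output means every backend pod is Running and fully ready; any pod printed is a candidate for the describe and logs commands above.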

Next Steps

Post-Deployment Tasks

  1. Configure DNS (if using custom domain):

    # Get LoadBalancer IP
    EXTERNAL_IP=$(kubectl get service coditect-backend -n coditect-staging -o jsonpath='{.status.loadBalancer.ingress[0].ip}')

    # Create DNS A record
    # staging-api.coditect.com -> $EXTERNAL_IP
  2. Enable SSL/TLS (for production):

    • Create SSL certificate (Let's Encrypt or managed certificate)
    • Configure Ingress with TLS termination
    • Restrict ALLOWED_HOSTS to the production domains and enforce HTTPS in the Django settings (for example SECURE_SSL_REDIRECT)
  3. Set up monitoring:

    • Enable GKE monitoring and logging
    • Configure alerting for pod failures
    • Set up uptime checks for health endpoint
  4. Create superuser (for Django admin):

    kubectl exec -n coditect-staging -it deploy/coditect-backend -- python manage.py createsuperuser
  5. Test API endpoints:

    # Get external IP
    EXTERNAL_IP=$(kubectl get service coditect-backend -n coditect-staging -o jsonpath='{.status.loadBalancer.ingress[0].ip}')

    # Test endpoints
    curl http://$EXTERNAL_IP/api/v1/health/
    curl http://$EXTERNAL_IP/api/v1/docs/ # Swagger UI

Production Deployment

When ready for production:

  1. Review production differences in Environment-Specific Notes
  2. Update resource limits to production values
  3. Enable SSL/TLS with managed certificates
  4. Configure auto-scaling (HPA and cluster autoscaler)
  5. Set up backup automation (Cloud SQL and Redis)
  6. Implement secrets automation (GCP Secret Manager)
  7. Enable monitoring stack (Prometheus, Grafana, Loki)
  8. Configure CDN for static assets
  9. Enable Cloud Armor for DDoS protection
  10. Perform load testing before launch

OpenTofu Migration (Planned)

Manual infrastructure steps will be replaced with Infrastructure as Code:

  • Repository: coditect-cloud-infra
  • Tool: OpenTofu (Terraform fork)
  • Scope: GKE, Cloud SQL, Redis, networking, IAM
  • Status: Planned (see project-plan.md)

Once implemented, deployment will become:

cd ../coditect-cloud-infra
tofu init
tofu plan -var-file=staging.tfvars
tofu apply -var-file=staging.tfvars

Additional Resources

Documentation

  • coditect-cloud-infra - Infrastructure as Code (OpenTofu)
  • coditect-cloud-frontend - Admin dashboard (consumes this API)
  • coditect-cloud-ide - Browser IDE (uses this for projects)
  • coditect-ops-license - License validation server


Document Status: Production
Last Updated: December 1, 2025
Maintainer: DevOps Team
Version: 1.0.0