Kubernetes + Terraform + Helm Quick Start Guide

⏱️ Time to Deploy: 15 minutes | 💰 Cost: ~$200/month | 🎯 Difficulty: Intermediate

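This guide takes you from zero to production-ready Coditect V5 infrastructure in two working phases (Terraform provisioning, then Kubernetes verification), followed by a preview of the planned Helm phase.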


📋 Prerequisites Checklist

Before starting, ensure you have:

  • Google Cloud Account with billing enabled
  • Project Created (e.g., serene-voltage-464305-n2)
  • gcloud CLI installed and authenticated
  • Terraform >= 1.5.0 installed
  • kubectl installed
  • Terminal access (bash/zsh)
  • 15 minutes of focused time

🚀 Phase 1: Infrastructure Setup (Terraform)

Duration: 12 minutes | What you'll deploy: VPC, GKE cluster, FoundationDB, API v5

Step 1.1: Enable GCP APIs (1 minute)

# Set your project ID
export PROJECT_ID="serene-voltage-464305-n2"
gcloud config set project $PROJECT_ID

# Enable required APIs
gcloud services enable \
compute.googleapis.com \
container.googleapis.com \
artifactregistry.googleapis.com \
cloudbuild.googleapis.com \
logging.googleapis.com \
monitoring.googleapis.com

# Verify APIs are enabled
gcloud services list --enabled | grep -E 'compute|container|artifact'

Expected Output:

compute.googleapis.com
container.googleapis.com
artifactregistry.googleapis.com
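
The grep above only confirms three of the six APIs, and its pattern can also match unrelated services. As a stricter check, here is a minimal sketch that compares the enabled list against all six names from this step. The function name `missing_apis` is hypothetical, and the `--format='value(config.name)'` projection in the usage comment is assumed to print one service name per line.

```shell
# missing_apis: reads enabled service names on stdin (one per line) and
# prints each API from Step 1.1 that is not in that list.
missing_apis() {
  enabled=$(cat)
  for api in \
    compute.googleapis.com \
    container.googleapis.com \
    artifactregistry.googleapis.com \
    cloudbuild.googleapis.com \
    logging.googleapis.com \
    monitoring.googleapis.com
  do
    # -x matches whole lines, -F treats the dots literally
    printf '%s\n' "$enabled" | grep -qxF "$api" || echo "$api"
  done
}

# Usage (requires gcloud; prints nothing when all six APIs are enabled):
#   gcloud services list --enabled --format='value(config.name)' | missing_apis
```

Any name it prints still needs `gcloud services enable`.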

Step 1.2: Configure Terraform (2 minutes)

# Navigate to Terraform directory
cd /workspace/PROJECTS/t2/infrastructure/terraform

# Copy example configuration
cp terraform.tfvars.example terraform.tfvars

# Generate JWT secret
JWT_SECRET=$(openssl rand -base64 32)
echo "Generated JWT Secret: $JWT_SECRET"

# Update terraform.tfvars with your values
cat > terraform.tfvars << EOF
# Project Configuration
project_id = "$PROJECT_ID"
region = "us-central1"
zone = "us-central1-a"

# Network Configuration
network_name = "coditect-vpc"
subnet_cidr_range = "10.128.0.0/20"
pods_cidr_range = "10.4.0.0/14"
services_cidr_range = "10.0.32.0/20"
allowed_ip_ranges = ["$(curl -s ifconfig.me)/32"] # Your IP only

# GKE Configuration
cluster_name = "codi-poc-e2-cluster"
node_pool_config = {
name = "default-pool"
machine_type = "e2-medium"
disk_size_gb = 50
disk_type = "pd-standard"
initial_node_count = 3
min_node_count = 1
max_node_count = 10
preemptible = false
}

# FoundationDB Configuration
fdb_namespace = "foundationdb"
fdb_cluster_name = "fdb-cluster"
fdb_replicas = 3
fdb_storage_class = "standard-rwo"
fdb_storage_size = "10Gi"
fdb_cpu_request = "500m"
fdb_memory_request = "2Gi"
fdb_cpu_limit = "2000m"
fdb_memory_limit = "4Gi"

# API Configuration
api_namespace = "coditect-app"
api_deployment_name = "coditect-api-v5"
api_replicas = 3
image_registry = "us-central1-docker.pkg.dev/$PROJECT_ID/coditect"
api_image_tag = "latest"
jwt_secret = "$JWT_SECRET"
api_service_type = "LoadBalancer"
api_service_port = 80
api_cpu_request = "100m"
api_memory_request = "256Mi"
api_cpu_limit = "1000m"
api_memory_limit = "512Mi"

# Domain Configuration
domains = ["coditect.ai", "www.coditect.ai"]

# Labels
labels = {
environment = "production"
project = "coditect-v5"
managed_by = "terraform"
}
EOF

echo "✅ Configuration complete!"
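
The `allowed_ip_ranges` line embeds whatever `curl -s ifconfig.me` returns. If the lookup fails or returns something other than an IPv4 address, the file ends up with an invalid CIDR such as "/32". The sketch below is one way to validate the value first; `is_ipv4` and `MY_IP` are hypothetical names, not part of the project.

```shell
# is_ipv4: succeeds only for four dot-separated decimal octets, each 0-255.
is_ipv4() {
  case "$1" in
    ""|*[!0-9.]*) return 1 ;;   # empty, or contains anything but digits and dots
  esac
  # Shape check: exactly four groups of 1-3 digits
  printf '%s\n' "$1" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$' || return 1
  # Range check on each octet
  for octet in $(printf '%s' "$1" | tr '.' ' '); do
    [ "$octet" -le 255 ] || return 1
  done
  return 0
}

# Usage before writing terraform.tfvars (MY_IP is a hypothetical variable):
#   MY_IP=$(curl -s ifconfig.me)
#   is_ipv4 "$MY_IP" || { echo "Could not determine public IPv4 address" >&2; exit 1; }
```

With the check in place, the heredoc could reference `$MY_IP` instead of calling curl inline.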

Step 1.3: Initialize Terraform (1 minute)

# Initialize Terraform (downloads providers)
terraform init

# Expected output:
# Terraform has been successfully initialized!

What this does:

  • Downloads Google Cloud provider plugins
  • Initializes backend (local state)
  • Validates module structure

Step 1.4: Plan Deployment (2 minutes)

# Create execution plan
terraform plan -out=tfplan

# Review output - you should see:
# - ~35-40 resources to be created
# - VPC network, subnet, firewall rules
# - GKE cluster and node pool
# - FoundationDB StatefulSet (3 pods)
# - API deployment (3 pods)
# - Services, ConfigMaps, Secrets

Expected Resources:

Plan: 38 to add, 0 to change, 0 to destroy.

⚠️ IMPORTANT: Review the plan carefully. If the summary shows a non-zero "to destroy" count or any unexpected changes, STOP and investigate.
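
The review can be backed by a small automated guard. This sketch extracts the "to destroy" number from the plan summary line. The function name `destroy_count` is hypothetical, and it assumes the summary line has the same shape as the example above.

```shell
# destroy_count: reads Terraform plan text on stdin and prints the number
# from the "N to destroy" part of the first "Plan:" summary line.
destroy_count() {
  sed -n '/^Plan:/s/.* \([0-9][0-9]*\) to destroy.*/\1/p' | head -n 1
}

# Usage (requires terraform and the tfplan file from this step):
#   n=$(terraform show -no-color tfplan | destroy_count)
#   [ "$n" = "0" ] || echo "Plan is not purely additive; stop and investigate" >&2
```

An empty result means no summary line was found, which the usage line also treats as a reason to stop.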

Step 1.5: Apply Infrastructure (6 minutes)

# Apply the plan
terraform apply tfplan

# This typically takes 6-12 minutes; GKE cluster creation dominates
# Terraform logs each resource as it is created. Rough order:
# 1. VPC network, subnet, firewall rules
# 2. GKE cluster and node pool
# 3. FoundationDB pods
# 4. API deployment
# 5. "Apply complete!" summary

⏳ Grab coffee while this runs (~10 minutes)

Expected Final Output:

Apply complete! Resources: 38 added, 0 changed, 0 destroyed.

Outputs:

api_service_ip = "34.123.45.67"
cluster_endpoint = "https://35.223.45.78"
cluster_name = "codi-poc-e2-cluster"
connection_info = {
"api_url" = "http://34.123.45.67/api/v5"
"cluster_name" = "codi-poc-e2-cluster"
"fdb_coordinator" = "10.128.0.8:4500"
"region" = "us-central1"
}
fdb_cluster_ip = "10.128.0.8"
kubectl_config_command = "gcloud container clusters get-credentials codi-poc-e2-cluster --region us-central1 --project serene-voltage-464305-n2"
load_balancer_ip = "34.123.45.67"

🔍 Phase 2: Verification (Kubernetes)

Duration: 2 minutes | What you'll verify: Cluster, pods, services

Step 2.1: Configure kubectl (30 seconds)

# Get cluster credentials
terraform output -raw kubectl_config_command | bash

# Verify connection
kubectl get nodes

# Expected output:
# NAME STATUS ROLES AGE VERSION
# gke-codi-poc-e2-cluster-default-pool-... Ready <none> 5m30s v1.28.3-gke.1203000
# gke-codi-poc-e2-cluster-default-pool-... Ready <none> 5m28s v1.28.3-gke.1203000
# gke-codi-poc-e2-cluster-default-pool-... Ready <none> 5m29s v1.28.3-gke.1203000
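
To check the node count without reading the table by eye, a minimal sketch follows. `count_ready_nodes` is a hypothetical helper; it assumes the column order shown in the expected output (STATUS in the second column).

```shell
# count_ready_nodes: reads `kubectl get nodes --no-headers` output on stdin
# and prints how many nodes have STATUS exactly "Ready".
count_ready_nodes() {
  awk '$2 == "Ready" { n++ } END { print n + 0 }'
}

# Usage (requires kubectl configured for the cluster):
#   kubectl get nodes --no-headers | count_ready_nodes
```

For the configuration in this guide the expected result is 3. Nodes reporting a combined status such as "Ready,SchedulingDisabled" are deliberately not counted.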

Step 2.2: Check FoundationDB (30 seconds)

# Check FDB pods
kubectl get pods -n foundationdb

# Expected output:
# NAME READY STATUS RESTARTS AGE
# fdb-cluster-0 1/1 Running 0 3m
# fdb-cluster-1 1/1 Running 0 3m
# fdb-cluster-2 1/1 Running 0 3m

# Verify FDB cluster status
kubectl exec -n foundationdb fdb-cluster-0 -- fdbcli --exec "status"

# Expected output (key lines):
# Replication health: Healthy
# Storage server count: 3

✅ SUCCESS INDICATOR: Replication health: Healthy

Step 2.3: Check API Deployment (30 seconds)

# Check API pods
kubectl get pods -n coditect-app

# Expected output:
# NAME READY STATUS RESTARTS AGE
# coditect-api-v5-xxxxxxxxxx-xxxxx 1/1 Running 0 2m
# coditect-api-v5-xxxxxxxxxx-xxxxx 1/1 Running 0 2m
# coditect-api-v5-xxxxxxxxxx-xxxxx 1/1 Running 0 2m

# Check API service
kubectl get svc -n coditect-app

# Expected output:
# NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
# coditect-api-v5 LoadBalancer 10.0.40.123 34.123.45.67 80:30123/TCP 2m

✅ SUCCESS INDICATOR: All pods 1/1 Running, service has EXTERNAL-IP
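
The "all pods 1/1 Running" check can be scripted as well. The sketch below lists any pod that does not meet it. `not_ready_pods` is a hypothetical helper, and it relies on the column order shown above (NAME, READY, STATUS).

```shell
# not_ready_pods: reads `kubectl get pods --no-headers` output on stdin and
# prints the name of every pod that is not Running with all containers ready.
not_ready_pods() {
  awk '{
    split($2, ready, "/")        # READY column, e.g. "1/1"
    if ($3 != "Running" || ready[1] != ready[2]) print $1
  }'
}

# Usage (requires kubectl; prints nothing when every pod is healthy):
#   kubectl get pods -n coditect-app --no-headers | not_ready_pods
#   kubectl get pods -n foundationdb --no-headers | not_ready_pods
```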

Step 2.4: Test API Health (30 seconds)

# Get API IP
API_IP=$(terraform output -raw api_service_ip)

# Test health endpoint
curl http://$API_IP/api/v5/health

# Expected output:
# {"success":true,"data":{"service":"coditect-v5-api","status":"healthy"}}

# Test ready endpoint
curl http://$API_IP/api/v5/ready

# Expected output:
# {"success":true,"data":{"service":"coditect-v5-api","status":"ready","database":"connected"}}

✅ SUCCESS INDICATOR: Both endpoints return "success":true
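
Both endpoint checks can be folded into one pass/fail test. This is a sketch only: `response_ok` is a hypothetical helper that does a plain text match on the response body rather than parsing the JSON.

```shell
# response_ok: reads a response body on stdin and succeeds only if it
# contains "success":true (whitespace around the colon is tolerated).
response_ok() {
  grep -Eq '"success"[[:space:]]*:[[:space:]]*true'
}

# Usage (API_IP as set in Step 2.4; -f makes curl fail on HTTP errors):
#   for path in health ready; do
#     curl -fsS "http://$API_IP/api/v5/$path" | response_ok \
#       || echo "$path check failed" >&2
#   done
```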


📦 Phase 3: Helm Charts (Future Enhancement)

Status: 🚧 Not implemented yet (planned as a future enhancement)

What Helm will provide:

  • Better application packaging
  • Versioned releases
  • Easier rollbacks
  • Template-based configuration

Preview of future Helm workflow:

# Future: Deploy with Helm instead of Terraform Kubernetes provider
helm install coditect-api-v5 ./helm/coditect-api-v5 \
--namespace coditect-app \
--set image.tag=v1.2.3 \
--set jwt.secret="$JWT_SECRET"

# Future: Upgrade (same namespace; --reuse-values keeps the JWT secret set at install)
helm upgrade coditect-api-v5 ./helm/coditect-api-v5 \
--namespace coditect-app \
--reuse-values \
--set image.tag=v1.2.4

# Future: Roll back to revision 1
helm rollback coditect-api-v5 1 --namespace coditect-app

When to implement: After production deployment is stable


📊 Quick Reference

Important URLs and IPs

# Get all connection info
terraform output connection_info

# Get specific values
terraform output api_service_ip # API external IP
terraform output cluster_endpoint # GKE master endpoint
terraform output fdb_cluster_ip # FoundationDB internal IP

Useful Commands

# View logs
kubectl logs -n coditect-app -l app=coditect-api-v5 --tail=50 -f

# Scale API
kubectl scale deployment -n coditect-app coditect-api-v5 --replicas=5

# Restart API pods
kubectl rollout restart deployment -n coditect-app coditect-api-v5

# Check HPA status
kubectl get hpa -n coditect-app

# Check pod resources
kubectl top pods -n coditect-app
kubectl top pods -n foundationdb

Terraform Commands

# View all outputs
terraform output

# Refresh state and outputs (if resources were changed outside Terraform)
# Replaces the deprecated `terraform refresh`
terraform apply -refresh-only

# Check for drift
terraform plan

# Apply changes
terraform apply

# Destroy everything (careful!)
terraform destroy

🐛 Troubleshooting

Issue 1: API Pods in CrashLoopBackOff

Symptom:

kubectl get pods -n coditect-app
# coditect-api-v5-xxx 0/1 CrashLoopBackOff 3 2m

Solution:

# Check logs
kubectl logs -n coditect-app <pod-name>

# Common causes:
# 1. FDB not ready - Wait 2-3 minutes for FDB cluster
# 2. Wrong FDB cluster file - Check ConfigMap
kubectl get configmap -n coditect-app coditect-api-v5-fdb-config -o yaml

# 3. JWT secret missing - Check Secret
kubectl get secret -n coditect-app coditect-api-v5-jwt -o yaml

Issue 2: LoadBalancer Has No External IP

Symptom:

kubectl get svc -n coditect-app
# coditect-api-v5 LoadBalancer 10.0.40.123 <pending> 80:30123/TCP 5m

Solution:

# Wait up to 5 minutes for IP allocation
kubectl get svc -n coditect-app coditect-api-v5 --watch

# Check events
kubectl describe svc -n coditect-app coditect-api-v5

# Verify quota (if stuck on <pending>)
gcloud compute project-info describe --project $PROJECT_ID | grep -A 5 EXTERNAL

Issue 3: FoundationDB "Replication Unhealthy"

Symptom:

kubectl exec -n foundationdb fdb-cluster-0 -- fdbcli --exec "status"
# Replication health: UNHEALTHY

Solution:

# Check all FDB pods are running
kubectl get pods -n foundationdb

# If pod is stuck in Pending:
kubectl describe pod -n foundationdb fdb-cluster-0

# Common cause: PVC not bound
kubectl get pvc -n foundationdb

# Wait 2-3 minutes for cluster to stabilize
# FDB cluster initialization takes time
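
Instead of waiting a fixed 2-3 minutes, the status check can be polled until it passes. The sketch below is a generic retry loop. `retry` and `fdb_healthy` are hypothetical helpers, and the grep pattern assumes the status text resembles the "Replication health: Healthy" line shown in this guide.

```shell
# retry: run a command until it succeeds, up to a fixed number of attempts.
#   $1 = maximum attempts, $2 = seconds to sleep between attempts,
#   remaining arguments = the command to run.
retry() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    "$@" && return 0
    # Sleep only if another attempt follows
    [ "$i" -lt "$attempts" ] && sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# fdb_healthy: hypothetical wrapper around the status command from this guide.
fdb_healthy() {
  kubectl exec -n foundationdb fdb-cluster-0 -- fdbcli --exec "status" \
    | grep -q "Replication health.*Healthy"
}

# Usage: poll every 15 seconds, 12 attempts (just under 3 minutes)
#   retry 12 15 fdb_healthy || echo "FDB still unhealthy; check pods and PVCs" >&2
```

The same loop works for the other "wait a few minutes" steps, such as a LoadBalancer that is still pending.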

Issue 4: Terraform Apply Fails

Symptom:

Error: Error creating Network: googleapi: Error 409: Already exists

Solution:

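# Option 1 (preferred): Import the existing resource into Terraform state
terraform import module.networking.google_compute_network.vpc \
projects/$PROJECT_ID/global/networks/coditect-vpc

# Option 2: Destroy and recreate the networking module
# Note: destroy only removes resources recorded in Terraform state. A network
# created outside Terraform must be imported (Option 1) or deleted manually.
terraform destroy -target=module.networking
terraform apply

# Option 3: Start fresh (nuclear option; the same state-only caveat applies)
terraform destroy
terraform apply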

Issue 5: kubectl Can't Connect to Cluster

Symptom:

Unable to connect to the server: dial tcp: lookup xxx on 8.8.8.8:53: no such host

Solution:

# Re-authenticate
gcloud auth login
gcloud config set project $PROJECT_ID

# Get credentials again
gcloud container clusters get-credentials codi-poc-e2-cluster \
--region us-central1 \
--project $PROJECT_ID

# Verify
kubectl cluster-info

🎯 Next Steps

Immediate (Post-Deployment)

  1. Configure DNS (if using custom domain):

    # Get LoadBalancer IP
    terraform output load_balancer_ip

    # Create A record in your DNS provider:
    # coditect.ai -> <LoadBalancer IP>

    # Wait for SSL certificate to provision (~15 minutes)
    kubectl get managedcertificate -n coditect-app
  2. Set up Monitoring:

    # View in Cloud Console
    echo "https://console.cloud.google.com/kubernetes/clusters/details/us-central1/codi-poc-e2-cluster?project=$PROJECT_ID"

    # Create dashboard
    echo "https://console.cloud.google.com/monitoring/dashboards?project=$PROJECT_ID"
  3. Test API Endpoints:

    API_IP=$(terraform output -raw api_service_ip)

    # Health check
    curl http://$API_IP/api/v5/health

    # Register user (example)
    curl -X POST http://$API_IP/api/v5/auth/register \
    -H "Content-Type: application/json" \
    -d '{
    "email": "test@example.com",
    "password": "SecurePass123!",
    "first_name": "Test",
    "last_name": "User"
    }'

Short-Term (Week 1)

  1. Set Up Remote State:

    # Create GCS bucket for Terraform state
    gsutil mb gs://$PROJECT_ID-terraform-state
    gsutil versioning set on gs://$PROJECT_ID-terraform-state

    # Update terraform/main.tf backend configuration
    # Uncomment backend "gcs" block

    # Migrate state
    terraform init -migrate-state
  2. Configure Backups:

    • FoundationDB: Set up continuous backup to GCS
    • Terraform state: Already versioned in GCS
    • Configuration: Git repository backups
  3. Set Up Alerts:

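    # Create alert policy for pod failures
    # Note: the flags below only name the policy and its condition. The command
    # also needs the condition itself (filter and threshold), for example
    # supplied through --policy-from-file.
    gcloud alpha monitoring policies create \
    --notification-channels=<channel-id> \
    --display-name="API Pod Failures" \
    --condition-display-name="Pod crash rate > 2/min"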

Medium-Term (Month 1)

  1. Implement GitOps with ArgoCD:

    • Install ArgoCD on cluster
    • Configure application sync
    • Set up GitHub Actions for CI/CD
  2. Migrate to Helm Charts:

    • Create Helm chart for API deployment
    • Replace Terraform Kubernetes provider
    • Implement versioned releases
  3. Multi-Environment Setup:

    • Create environments/dev/
    • Create environments/staging/
    • Establish promotion workflow
  4. Cost Optimization:

    • Review actual resource usage
    • Right-size node pool and pods
    • Apply committed use discounts

💰 Cost Breakdown

Monthly Costs (current configuration):

Resource                    Cost
GKE Cluster Management      ~$73/month
3× e2-medium nodes          ~$67/month
Persistent Disks (30GB)     ~$12/month
LoadBalancer                ~$18/month
Logging/Monitoring          ~$10-20/month
Egress Traffic (est.)       ~$10-30/month
Total                       ~$190-220/month

Optimization Tips:

  • Apply 1-year committed use: Save ~$25/month (37% discount)
  • Right-size after monitoring: Potential 20-30% savings
  • Use preemptible nodes for dev: Save 60-80% (not for production)
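
The arithmetic behind the total and the committed-use figure can be reproduced as follows. This is only a worked sum of the estimates listed above, not a pricing calculation. Applying the 37% discount to the node cost alone is an assumption, chosen because it matches the ~$25 figure.

```shell
# Line items from the cost table, in USD per month.
fixed=$((73 + 67 + 12 + 18))   # GKE management + nodes + disks + load balancer
low=$((fixed + 10 + 10))       # low end of the logging and egress estimates
high=$((fixed + 20 + 30))      # high end of the logging and egress estimates
echo "Estimated total: \$${low}-\$${high}/month"

# 1-year committed use discount (37%), assumed to apply to node cost only.
cud_saving=$(awk 'BEGIN { printf "%.0f", 67 * 0.37 }')
echo "Committed use saving: ~\$${cud_saving}/month"
```

This prints a total of $190-$220/month and a saving of about $25/month, matching the figures above.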


✅ Success Checklist

After completing all phases, verify:

  • All 3 GKE nodes are Ready
  • All 3 FDB pods are 1/1 Running
  • All 3 API pods are 1/1 Running
  • FDB cluster status shows Replication Healthy
  • LoadBalancer has external IP
  • /api/v5/health returns {"success":true}
  • /api/v5/ready returns {"success":true}
  • Terraform state is saved (local or GCS)
  • JWT secret is stored securely
  • Monitoring is configured
  • DNS is pointed to LoadBalancer (if applicable)

If all checked: 🎉 Deployment successful!


🆘 Getting Help

Issues with deployment?

  1. Check troubleshooting section above ⬆️
  2. Review logs:
    kubectl logs -n coditect-app -l app=coditect-api-v5
    kubectl logs -n foundationdb fdb-cluster-0
  3. Check Terraform state:
    terraform plan  # Look for drift
    terraform show # View current state
  4. Consult detailed docs:

Still stuck?

  • Check GitHub issues in project repository
  • Review GCP Console for resource status
  • Verify GCP quotas and billing

⏱️ Total Time: 15 minutes | 🎯 Result: Production-ready Coditect V5 infrastructure

Happy deploying! 🚀