Starter Configuration - Quick Deployment Guide

  • Target: 10-20 concurrent users
  • Resources: 4 vCPU, 8 GB RAM per pod
  • Storage: 100 GB workspace + 10 GB config per pod
  • Cost: ~$400-600/month


Git Submodules Status​

All 6 submodules are initialized and active:

| Submodule | Location | Purpose | Status |
| --- | --- | --- | --- |
| agents-reference | .claude/agents-reference/ | 84 agent templates | ✅ Active |
| commands-reference | .claude/commands-reference/ | 42 command tools | ✅ Active |
| skills-anthropic | .claude/skills-anthropic/ | Anthropic skills format | ✅ Active |
| sub-agent-collective | .claude/sub-agent-collective/ | Sub-agent templates | ✅ Active |
| agents-research-plan-code | archive/agents-research-plan-code/ | Multi-LLM research | ✅ Active |
| coditect-v4 | archive/coditect-v4/ | V4 reference (19 FDB models, 88 ADRs) | ✅ Active |

Quick Start (2 Commands)​

# Step 1: Pre-flight check (verify cluster capacity)
./scripts/preflight-check-starter.sh

# Step 2: Deploy Starter configuration
./scripts/deploy-starter-config.sh

That's it! The scripts handle everything automatically.


What Gets Deployed​

StatefulSet Changes​

| Component | Before (Minimal) | After (Starter) | Change |
| --- | --- | --- | --- |
| Replicas | 3 fixed | 10-30 (autoscaled) | +233% |
| CPU per pod | 500m-2000m | 2000m-4000m | +100% |
| RAM per pod | 512Mi-2Gi | 4Gi-8Gi | +300% |
| Workspace | 50Gi | 100Gi | +100% |
| Config | 5Gi | 10Gi | +100% |
| User capacity | 3 users | 10-20 users | +567% |

New Components​

HorizontalPodAutoscaler (HPA):

  • Monitors CPU (target: 70%) and RAM (target: 75%)
  • Scales pods: 10 min → 30 max
  • Scale up: Immediate (100%/min or +5 pods/min)
  • Scale down: Gradual (50%/min, 5 min stabilization)
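The behavior above maps onto a standard `autoscaling/v2` manifest. The sketch below mirrors this guide's numbers and the HPA name used elsewhere on this page; the YAML shipped in the repo is authoritative.

```yaml
# Sketch only; the repo's HPA manifest is the source of truth.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: coditect-combined-hpa
  namespace: coditect-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: StatefulSet
    name: coditect-combined
  minReplicas: 10
  maxReplicas: 30
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 75
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 0     # scale up immediately
      policies:                         # largest result wins (default selectPolicy: Max)
        - type: Percent
          value: 100
          periodSeconds: 60
        - type: Pods
          value: 5
          periodSeconds: 60
    scaleDown:
      stabilizationWindowSeconds: 300   # 5 min stabilization before shrinking
      policies:
        - type: Percent
          value: 50
          periodSeconds: 60
```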

Pre-Flight Check​

The preflight-check-starter.sh script verifies:

  1. ✅ Cluster connectivity - Can connect to GKE
  2. ✅ Cluster capacity - Minimum 40 vCPU, 80 GB RAM
  3. ✅ Storage class - standard-rwo available
  4. ✅ Persistent disk quota - 1,100 GB available (10 pods × 110 GB)
  5. ✅ Required files - All YAML configs present
  6. ✅ Current deployment - Shows upgrade path
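The thresholds in steps 2 and 4 fall straight out of the Starter sizing; a quick sanity check of the arithmetic (all numbers taken from this guide):

```shell
# Derive the pre-flight thresholds from the Starter sizing
PODS=10; VCPU_PER_POD=4; RAM_PER_POD=8; DISK_PER_POD=110   # GB: 100 workspace + 10 config
echo "CPU:  $((PODS * VCPU_PER_POD)) vCPU"    # 40 vCPU
echo "RAM:  $((PODS * RAM_PER_POD)) GB"       # 80 GB
echo "Disk: $((PODS * DISK_PER_POD)) GB"      # 1100 GB
```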

Example output:

✅ Cluster ready for Starter deployment (with autoscaling)
Capacity: 60 vCPU, 120 GB RAM

Ready to deploy! Run:
./scripts/deploy-starter-config.sh

Deployment Script​

The deploy-starter-config.sh script:

  1. Checks cluster access - Verifies kubectl connectivity
  2. Checks cluster capacity - Shows current node resources
  3. Checks namespace - Creates coditect-app if needed
  4. Applies BackendConfig - Session affinity configuration
  5. Checks current deployment - Shows upgrade path, asks confirmation
  6. Applies StatefulSet - Deploys Starter configuration with autoscaling
  7. Applies Ingress - Updates WebSocket and session affinity
  8. Verifies deployment - Shows pod, PVC, and HPA status
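For reference, a minimal sketch of what the BackendConfig applied in step 4 might look like, assuming GKE cookie-based affinity. The metadata name, affinity type, TTL, and timeout here are illustrative assumptions; the YAML in the repo is authoritative.

```yaml
# Illustrative sketch; name and values are assumptions, not the repo's actual config.
apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: coditect-backendconfig
  namespace: coditect-app
spec:
  sessionAffinity:
    affinityType: "GENERATED_COOKIE"   # route each user back to the same pod
    affinityCookieTtlSec: 3600
  timeoutSec: 3600                     # keep long-lived WebSocket sessions open
```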

Rolling update:

  • Pods restart one at a time (graceful 120s shutdown)
  • Users on a restarting pod are briefly disconnected (~30s)
  • Session affinity routes users back to the same pod after restart
  • Total deployment time: 10-20 minutes, depending on cluster size

Cluster Requirements​

Minimum (10 pods, no autoscaling)​

  • Nodes: 3-5 nodes
  • Machine type: e2-standard-8 (8 vCPU, 32 GB RAM)
  • Total: 40 vCPU, 80 GB RAM
  • Cost: ~$400/month

Recommended (autoscaling to 30 pods)​

  • Nodes: 8-10 nodes
  • Machine type: e2-standard-8 or e2-standard-16
  • Total: 60-120 vCPU, 120-240 GB RAM
  • Cost: ~$500-700/month
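The node counts follow from dividing required capacity by per-machine capacity; a minimal sketch, ignoring kubelet and system reservations (which is why real clusters need headroom beyond this figure):

```shell
# Node count for the minimum tier: ceil(required / per-node), CPU and RAM checked separately
NEED_VCPU=40; NEED_RAM=80                    # Starter totals from this guide
NODE_VCPU=8;  NODE_RAM=32                    # e2-standard-8
cpu_nodes=$(( (NEED_VCPU + NODE_VCPU - 1) / NODE_VCPU ))
ram_nodes=$(( (NEED_RAM  + NODE_RAM  - 1) / NODE_RAM  ))
nodes=$(( cpu_nodes > ram_nodes ? cpu_nodes : ram_nodes ))
echo "minimum nodes: $nodes"                 # CPU-bound: 40/8 = 5 nodes
```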

Check Current Cluster​

# List nodes
kubectl get nodes

# Check node capacity
kubectl top nodes

# Check machine types
gcloud container node-pools describe default-pool \
  --cluster=codi-poc-e2-cluster --zone=us-central1-a

Scale Cluster (if needed)​

# Increase node count
gcloud container clusters resize codi-poc-e2-cluster \
  --num-nodes=8 --zone=us-central1-a

# Or add new node pool with larger machines
gcloud container node-pools create starter-pool \
  --cluster=codi-poc-e2-cluster --zone=us-central1-a \
  --machine-type=e2-standard-16 --num-nodes=5

Monitoring After Deployment​

Check Pod Status​

# Watch pods come online
kubectl get pods -n coditect-app -l app=coditect-combined --watch

# Expected output:
# coditect-combined-0 1/1 Running 0 5m
# coditect-combined-1 1/1 Running 0 4m
# ...
# coditect-combined-9 1/1 Running 0 2m

Check Autoscaling​

# Watch HPA status
kubectl get hpa -n coditect-app --watch

# Expected output:
# NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS
# coditect-combined-hpa StatefulSet/coditect-combined 15%/70%, 20%/75% 10 30 10

Check Resource Usage​

# Current CPU/RAM usage
kubectl top pods -n coditect-app

# Expected per pod:
# NAME CPU(cores) MEMORY(bytes)
# coditect-combined-0 500m 2Gi
# coditect-combined-1 450m 1.8Gi

Check PVCs​

# List all persistent volume claims
kubectl get pvc -n coditect-app

# Expected: 20 PVCs (10 pods × 2 volumes each)
# workspace-coditect-combined-0 Bound 100Gi
# theia-config-coditect-combined-0 Bound 10Gi
# ...

Testing​

Test 1: Persistence (Critical)​

./scripts/test-persistence.sh

Expected: ✅ File survives pod restart

Test 2: Session Affinity​

./scripts/test-session-affinity.sh

Expected: ✅ Users route to same pod consistently

Test 3: Autoscaling Behavior​

Simulate load:

# Generate CPU load on pod
kubectl exec -n coditect-app coditect-combined-0 -- \
  sh -c 'while true; do :; done' &

# Watch HPA scale up
kubectl get hpa -n coditect-app --watch

Expected: After 1-2 minutes, HPA creates new pods
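The scale-up target comes from the standard HPA formula, desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric); for example, 10 pods averaging a hypothetical 90% CPU against the 70% target:

```shell
# HPA sizing: desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric)
awk 'BEGIN {
  current = 10; util = 90; target = 70       # example load vs. this guide CPU target
  raw = current * util / target              # 12.86
  desired = (raw == int(raw)) ? raw : int(raw) + 1
  printf "desired replicas: %d\n", desired   # 13
}'
```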

Stop load:

# Kill the load loop (if killall is missing in the image, pkill -f 'while true' also works)
kubectl exec -n coditect-app coditect-combined-0 -- killall sh

# Watch HPA scale down (after 5 min stabilization)
kubectl get hpa -n coditect-app --watch

Test 4: Multi-User Scenario​

User 1 (Browser 1):

  1. Visit https://coditect.ai/theia
  2. Create /workspace/user1-test.txt
  3. Note pod name from logs
  4. Logout → Login
  5. ✅ Same pod + file still exists

User 2 (Browser 2, different IP):

  1. Visit https://coditect.ai/theia
  2. Create /workspace/user2-test.txt
  3. Note pod name (may differ from User 1)
  4. ✅ User 1's file NOT visible (separate workspaces)

Troubleshooting​

Pods Stuck in Pending​

Cause: Insufficient cluster capacity

Check:

kubectl describe pod coditect-combined-7 -n coditect-app
# Look for: "0/X nodes are available: insufficient cpu/memory"

Solution:

# Scale cluster
gcloud container clusters resize codi-poc-e2-cluster \
--num-nodes=8 --zone=us-central1-a

PVC Provisioning Slow​

Cause: GCE Persistent Disk provisioning delay (normal)

Check:

kubectl get events -n coditect-app --sort-by='.lastTimestamp' | grep -i provision

Solution: Wait 2-5 minutes for automatic provisioning

Autoscaling Not Working​

Cause: metrics-server not installed

Check:

kubectl top nodes
# If error: metrics-server not found

Solution:

kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

High CPU/RAM Usage​

Check actual vs configured:

kubectl top pods -n coditect-app

If consistently > 80%:

  • Either: Increase pod resources (4 vCPU → 6 vCPU)
  • Or: Let autoscaling add more pods (distributes load)

Pod Crashes / OOMKilled​

Check logs:

kubectl logs coditect-combined-3 -n coditect-app --previous

If OOMKilled:

  • Increase memory limits: 8Gi → 12Gi
  • Or reduce NODE_OPTIONS heap size
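Capping the V8 heap below the container limit is the usual lever when the Node process itself is the consumer. A sketch of the StatefulSet env entry; the 6144 MB value is an assumption to tune per workload:

```yaml
# StatefulSet container env sketch (heap size is an assumption, not the repo's value)
env:
  - name: NODE_OPTIONS
    value: "--max-old-space-size=6144"   # ~6 GiB V8 heap, leaving headroom under an 8Gi limit
```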

Rollback​

If issues occur, rollback to minimal (3-pod) configuration:

# Apply original StatefulSet
kubectl apply -f k8s/theia-statefulset.yaml

# Delete HPA
kubectl delete hpa coditect-combined-hpa -n coditect-app

# Verify rollback
kubectl get statefulset coditect-combined -n coditect-app
# Should show 3 replicas

Note: PVCs for pods 3-9 will remain. Delete them explicitly if no longer needed. Do not delete by label selector (e.g. -l app=coditect-combined), which would also remove the PVCs, and with them the data, for pods 0-2:

# Delete only the Starter-era volumes (bash brace expansion)
kubectl delete pvc -n coditect-app \
  workspace-coditect-combined-{3..9} theia-config-coditect-combined-{3..9}

Cost Tracking​

Current Deployment​

# Check actual resource usage
kubectl top pods -n coditect-app

# Estimate cost
# 10 pods × (4 vCPU, 8 GB RAM) × $0.04/vCPU-hour × 730 hours/month
# = 40 vCPU × $29.20/month = ~$1,168/month (compute only)
#
# Storage: 10 pods × 110 GB × $0.17/GB-month = ~$187/month
# Load Balancer: ~$20/month
#
# Total: ~$1,375/month (if all 10 pods running 24/7)
#
# With autoscaling (10 min, 20 avg, 30 max):
# Average: ~$400-600/month
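The worst-case figure above can be checked directly (rates are this guide's assumptions, not live GCP pricing):

```shell
# Reproduce the 24/7 worst-case estimate
awk 'BEGIN {
  compute = 10 * 4 * 0.04 * 730        # pods * vCPU * $/vCPU-hour * hours/month = 1168
  storage = 10 * 110 * 0.17            # pods * GB * $/GB-month = 187
  lb      = 20                         # load balancer flat estimate
  printf "compute $%.0f + storage $%.0f + LB $%.0f = $%.0f/month\n",
         compute, storage, lb, compute + storage + lb
}'
```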

Optimize Costs​

  1. Right-size resources - Reduce if usage < 50%
  2. Adjust autoscaling - Lower max replicas if not needed
  3. Use preemptible nodes - 60-80% savings (dev/staging only)

Next Steps After Deployment​

  1. Monitor for 24 hours - Check autoscaling behavior
  2. Optimize resources - Adjust based on actual usage
  3. Set up alerts - Configure Prometheus/Grafana
  4. Load testing - Simulate 20 concurrent users
  5. Document runbooks - Incident response procedures

Related documentation:

  • Capacity Planning: docs/11-analysis/gke-capacity-planning.md
  • StatefulSet Migration: docs/11-analysis/STATEFULSET-migration-guide.md
  • Terraform Setup: terraform/environments/prod/README.md

Quick Reference​

# Deploy Starter config
./scripts/preflight-check-starter.sh
./scripts/deploy-starter-config.sh

# Monitor
kubectl get pods -n coditect-app -l app=coditect-combined --watch
kubectl get hpa -n coditect-app --watch
kubectl top pods -n coditect-app

# Test
./scripts/test-persistence.sh
./scripts/test-session-affinity.sh

# Logs
kubectl logs -f coditect-combined-0 -n coditect-app

# Scale manually (if needed)
kubectl scale statefulset coditect-combined -n coditect-app --replicas=15

# Rollback
kubectl apply -f k8s/theia-statefulset.yaml
kubectl delete hpa coditect-combined-hpa -n coditect-app

Ready to deploy?

./scripts/preflight-check-starter.sh