Starter Configuration - Quick Deployment Guide

  • Target: 10-20 concurrent users
  • Resources: 4 vCPU, 8 GB RAM per pod
  • Storage: 100 GB workspace + 10 GB config per pod
  • Cost: ~$400-600/month


Git Submodules Status​

All 6 submodules are initialized and active:

| Submodule | Location | Purpose | Status |
| --- | --- | --- | --- |
| agents-reference | .claude/agents-reference/ | 84 agent templates | ✅ Active |
| commands-reference | .claude/commands-reference/ | 42 command tools | ✅ Active |
| skills-anthropic | .claude/skills-anthropic/ | Anthropic skills format | ✅ Active |
| sub-agent-collective | .claude/sub-agent-collective/ | Sub-agent templates | ✅ Active |
| agents-research-plan-code | archive/agents-research-plan-code/ | Multi-LLM research | ✅ Active |
| coditect-v4 | archive/coditect-v4/ | V4 reference (19 FDB models, 88 ADRs) | ✅ Active |

Quick Start (2 Commands)​

# Step 1: Pre-flight check (verify cluster capacity)
./scripts/preflight-check-starter.sh

# Step 2: Deploy Starter configuration
./scripts/deploy-starter-config.sh

That's it! The scripts handle everything automatically.


What Gets Deployed​

StatefulSet Changes​

| Component | Before (Minimal) | After (Starter) | Change |
| --- | --- | --- | --- |
| Replicas | 3 fixed | 10-30 (autoscaled) | +233% |
| CPU per pod | 500m-2000m | 2000m-4000m | +100% |
| RAM per pod | 512Mi-2Gi | 4Gi-8Gi | +300% |
| Workspace | 50Gi | 100Gi | +100% |
| Config | 5Gi | 10Gi | +100% |
| User capacity | 3 users | 10-20 users | +567% |

New Components​

HorizontalPodAutoscaler (HPA):

  • Monitors CPU (target: 70%) and RAM (target: 75%)
  • Scales pods: 10 min → 30 max
  • Scale up: Immediate (100%/min or +5 pods/min)
  • Scale down: Gradual (50%/min, 5 min stabilization)
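The behavior above maps onto a standard `autoscaling/v2` manifest. The sketch below mirrors this guide's numbers and the HPA name used elsewhere on this page; the YAML shipped in the repo is authoritative.

```yaml
# Sketch only; the repo's HPA manifest is the source of truth.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: coditect-combined-hpa
  namespace: coditect-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: StatefulSet
    name: coditect-combined
  minReplicas: 10
  maxReplicas: 30
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 75
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 0     # scale up immediately
      policies:                         # largest result wins (default selectPolicy: Max)
        - type: Percent
          value: 100
          periodSeconds: 60
        - type: Pods
          value: 5
          periodSeconds: 60
    scaleDown:
      stabilizationWindowSeconds: 300   # 5 min stabilization before shrinking
      policies:
        - type: Percent
          value: 50
          periodSeconds: 60
```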

Pre-Flight Check​

The preflight-check-starter.sh script verifies:

  1. ✅ Cluster connectivity - Can connect to GKE
  2. ✅ Cluster capacity - Minimum 40 vCPU, 80 GB RAM
  3. ✅ Storage class - standard-rwo available
  4. ✅ Persistent disk quota - 1,100 GB available (10 pods × 110 GB)
  5. ✅ Required files - All YAML configs present
  6. ✅ Current deployment - Shows upgrade path
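The thresholds in steps 2 and 4 fall straight out of the Starter sizing; a quick sanity check of the arithmetic (all numbers taken from this guide):

```shell
# Derive the pre-flight thresholds from the Starter sizing
PODS=10; VCPU_PER_POD=4; RAM_PER_POD=8; DISK_PER_POD=110   # GB: 100 workspace + 10 config
echo "CPU:  $((PODS * VCPU_PER_POD)) vCPU"    # 40 vCPU
echo "RAM:  $((PODS * RAM_PER_POD)) GB"       # 80 GB
echo "Disk: $((PODS * DISK_PER_POD)) GB"      # 1100 GB
```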

Example output:

✅ Cluster ready for Starter deployment (with autoscaling)
Capacity: 60 vCPU, 120 GB RAM

Ready to deploy! Run:
./scripts/deploy-starter-config.sh

Deployment Script​

The deploy-starter-config.sh script:

  1. Checks cluster access - Verifies kubectl connectivity
  2. Checks cluster capacity - Shows current node resources
  3. Checks namespace - Creates coditect-app if needed
  4. Applies BackendConfig - Session affinity configuration
  5. Checks current deployment - Shows upgrade path, asks confirmation
  6. Applies StatefulSet - Deploys Starter configuration with autoscaling
  7. Applies Ingress - Updates WebSocket and session affinity
  8. Verifies deployment - Shows pod, PVC, and HPA status
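For reference, a minimal sketch of what the BackendConfig applied in step 4 might look like, assuming GKE cookie-based affinity. The metadata name, affinity type, TTL, and timeout here are illustrative assumptions; the YAML in the repo is authoritative.

```yaml
# Illustrative sketch; name and values are assumptions, not the repo's actual config.
apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: coditect-backendconfig
  namespace: coditect-app
spec:
  sessionAffinity:
    affinityType: "GENERATED_COOKIE"   # route each user back to the same pod
    affinityCookieTtlSec: 3600
  timeoutSec: 3600                     # keep long-lived WebSocket sessions open
```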

Rolling update:

  • Pods restart one at a time (graceful 120s shutdown)
  • Users on a restarting pod are briefly disconnected (~30s)
  • Session affinity routes users back to the same pod after restart
  • Total deployment time: 10-20 minutes, depending on cluster size

Cluster Requirements​

Minimum (10 pods, no autoscaling)​

  • Nodes: 3-5 nodes
  • Machine type: e2-standard-8 (8 vCPU, 32 GB RAM)
  • Total: 40 vCPU, 80 GB RAM
  • Cost: ~$400/month

Recommended (autoscaling to 30 pods)​

  • Nodes: 8-10 nodes
  • Machine type: e2-standard-8 or e2-standard-16
  • Total: 60-120 vCPU, 120-240 GB RAM
  • Cost: ~$500-700/month
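The node counts follow from dividing required capacity by per-machine capacity; a minimal sketch, ignoring kubelet and system reservations (which is why real clusters need headroom beyond this figure):

```shell
# Node count for the minimum tier: ceil(required / per-node), CPU and RAM checked separately
NEED_VCPU=40; NEED_RAM=80                    # Starter totals from this guide
NODE_VCPU=8;  NODE_RAM=32                    # e2-standard-8
cpu_nodes=$(( (NEED_VCPU + NODE_VCPU - 1) / NODE_VCPU ))
ram_nodes=$(( (NEED_RAM  + NODE_RAM  - 1) / NODE_RAM  ))
nodes=$(( cpu_nodes > ram_nodes ? cpu_nodes : ram_nodes ))
echo "minimum nodes: $nodes"                 # CPU-bound: 40/8 = 5 nodes
```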

Check Current Cluster​

# List nodes
kubectl get nodes

# Check node capacity
kubectl top nodes

# Check machine types
gcloud container node-pools describe default-pool \
  --cluster=codi-poc-e2-cluster --zone=us-central1-a

Scale Cluster (if needed)​

# Increase node count
gcloud container clusters resize codi-poc-e2-cluster \
  --num-nodes=8 --zone=us-central1-a

# Or add new node pool with larger machines
gcloud container node-pools create starter-pool \
  --cluster=codi-poc-e2-cluster --zone=us-central1-a \
  --machine-type=e2-standard-16 --num-nodes=5

Monitoring After Deployment​

Check Pod Status​

# Watch pods come online
kubectl get pods -n coditect-app -l app=coditect-combined --watch

# Expected output:
# coditect-combined-0 1/1 Running 0 5m
# coditect-combined-1 1/1 Running 0 4m
# ...
# coditect-combined-9 1/1 Running 0 2m

Check Autoscaling​

# Watch HPA status
kubectl get hpa -n coditect-app --watch

# Expected output:
# NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS
# coditect-combined-hpa StatefulSet/coditect-combined 15%/70%, 20%/75% 10 30 10

Check Resource Usage​

# Current CPU/RAM usage
kubectl top pods -n coditect-app

# Expected per pod:
# NAME CPU(cores) MEMORY(bytes)
# coditect-combined-0 500m 2Gi
# coditect-combined-1 450m 1.8Gi

Check PVCs​

# List all persistent volume claims
kubectl get pvc -n coditect-app

# Expected: 20 PVCs (10 pods × 2 volumes each)
# workspace-coditect-combined-0 Bound 100Gi
# theia-config-coditect-combined-0 Bound 10Gi
# ...

Testing​

Test 1: Persistence (Critical)​

./scripts/test-persistence.sh

Expected: ✅ File survives pod restart

Test 2: Session Affinity​

./scripts/test-session-affinity.sh

Expected: ✅ Users route to same pod consistently

Test 3: Autoscaling Behavior​

Simulate load:

# Generate CPU load on pod
kubectl exec -n coditect-app coditect-combined-0 -- \
  sh -c 'while true; do :; done' &

# Watch HPA scale up
kubectl get hpa -n coditect-app --watch

Expected: After 1-2 minutes, HPA creates new pods
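The scale-up target comes from the standard HPA formula, desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric); for example, 10 pods averaging a hypothetical 90% CPU against the 70% target:

```shell
# HPA sizing: desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric)
awk 'BEGIN {
  current = 10; util = 90; target = 70       # example load vs. this guide CPU target
  raw = current * util / target              # 12.86
  desired = (raw == int(raw)) ? raw : int(raw) + 1
  printf "desired replicas: %d\n", desired   # 13
}'
```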

Stop load:

# Kill the load loop (if killall is missing in the image, pkill -f 'while true' also works)
kubectl exec -n coditect-app coditect-combined-0 -- killall sh

# Watch HPA scale down (after 5 min stabilization)
kubectl get hpa -n coditect-app --watch

Test 4: Multi-User Scenario​

User 1 (Browser 1):

  1. Visit https://coditect.ai/theia
  2. Create /workspace/user1-test.txt
  3. Note pod name from logs
  4. Logout → Login
  5. ✅ Same pod + file still exists

User 2 (Browser 2, different IP):

  1. Visit https://coditect.ai/theia
  2. Create /workspace/user2-test.txt
  3. Note pod name (may differ from User 1)
  4. ✅ User 1's file NOT visible (separate workspaces)

Troubleshooting​

Pods Stuck in Pending​

Cause: Insufficient cluster capacity

Check:

kubectl describe pod coditect-combined-7 -n coditect-app
# Look for: "0/X nodes are available: insufficient cpu/memory"

Solution:

# Scale cluster
gcloud container clusters resize codi-poc-e2-cluster \
--num-nodes=8 --zone=us-central1-a

PVC Provisioning Slow​

Cause: GCE Persistent Disk provisioning delay (normal)

Check:

kubectl get events -n coditect-app --sort-by='.lastTimestamp' | grep -i provision

Solution: Wait 2-5 minutes for automatic provisioning

Autoscaling Not Working​

Cause: metrics-server not installed

Check:

kubectl top nodes
# If error: metrics-server not found

Solution:

kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

High CPU/RAM Usage​

Check actual vs configured:

kubectl top pods -n coditect-app

If consistently > 80%:

  • Either: Increase pod resources (4 vCPU → 6 vCPU)
  • Or: Let autoscaling add more pods (distributes load)

Pod Crashes / OOMKilled​

Check logs:

kubectl logs coditect-combined-3 -n coditect-app --previous

If OOMKilled:

  • Increase memory limits: 8Gi → 12Gi
  • Or reduce NODE_OPTIONS heap size
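Capping the V8 heap below the container limit is the usual lever when the Node process itself is the consumer. A sketch of the StatefulSet env entry; the 6144 MB value is an assumption to tune per workload:

```yaml
# StatefulSet container env sketch (heap size is an assumption, not the repo's value)
env:
  - name: NODE_OPTIONS
    value: "--max-old-space-size=6144"   # ~6 GiB V8 heap, leaving headroom under an 8Gi limit
```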

Rollback​

If issues occur, rollback to minimal (3-pod) configuration:

# Apply original StatefulSet
kubectl apply -f k8s/theia-statefulset.yaml

# Delete HPA
kubectl delete hpa coditect-combined-hpa -n coditect-app

# Verify rollback
kubectl get statefulset coditect-combined -n coditect-app
# Should show 3 replicas

Note: PVCs for pods 3-9 will remain. Delete them explicitly if no longer needed. Do not delete by label selector (e.g. -l app=coditect-combined), which would also remove the PVCs, and with them the data, for pods 0-2:

# Delete only the Starter-era volumes (bash brace expansion)
kubectl delete pvc -n coditect-app \
  workspace-coditect-combined-{3..9} theia-config-coditect-combined-{3..9}

Cost Tracking​

Current Deployment​

# Check actual resource usage
kubectl top pods -n coditect-app

# Estimate cost
# 10 pods × (4 vCPU, 8 GB RAM) × $0.04/vCPU-hour × 730 hours/month
# = 40 vCPU × $29.20/month = ~$1,168/month (compute only)
#
# Storage: 10 pods × 110 GB × $0.17/GB-month = ~$187/month
# Load Balancer: ~$20/month
#
# Total: ~$1,375/month (if all 10 pods running 24/7)
#
# With autoscaling (10 min, 20 avg, 30 max):
# Average: ~$400-600/month
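The worst-case figure above can be checked directly (rates are this guide's assumptions, not live GCP pricing):

```shell
# Reproduce the 24/7 worst-case estimate
awk 'BEGIN {
  compute = 10 * 4 * 0.04 * 730        # pods * vCPU * $/vCPU-hour * hours/month = 1168
  storage = 10 * 110 * 0.17            # pods * GB * $/GB-month = 187
  lb      = 20                         # load balancer flat estimate
  printf "compute $%.0f + storage $%.0f + LB $%.0f = $%.0f/month\n",
         compute, storage, lb, compute + storage + lb
}'
```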

Optimize Costs​

  1. Right-size resources - Reduce if usage < 50%
  2. Adjust autoscaling - Lower max replicas if not needed
  3. Use preemptible nodes - 60-80% savings (dev/staging only)

Next Steps After Deployment​

  1. Monitor for 24 hours - Check autoscaling behavior
  2. Optimize resources - Adjust based on actual usage
  3. Set up alerts - Configure Prometheus/Grafana
  4. Load testing - Simulate 20 concurrent users
  5. Document runbooks - Incident response procedures

Related documentation:

  • Capacity Planning: docs/11-analysis/gke-capacity-planning.md
  • StatefulSet Migration: docs/11-analysis/STATEFULSET-migration-guide.md
  • Terraform Setup: terraform/environments/prod/README.md

Quick Reference​

# Deploy Starter config
./scripts/preflight-check-starter.sh
./scripts/deploy-starter-config.sh

# Monitor
kubectl get pods -n coditect-app -l app=coditect-combined --watch
kubectl get hpa -n coditect-app --watch
kubectl top pods -n coditect-app

# Test
./scripts/test-persistence.sh
./scripts/test-session-affinity.sh

# Logs
kubectl logs -f coditect-combined-0 -n coditect-app

# Scale manually (if needed)
kubectl scale statefulset coditect-combined -n coditect-app --replicas=15

# Rollback
kubectl apply -f k8s/theia-statefulset.yaml
kubectl delete hpa coditect-combined-hpa -n coditect-app

Ready to deploy?

./scripts/preflight-check-starter.sh