Starter Configuration - Quick Deployment Guide
- Target: 10-20 concurrent users
- Resources: 4 vCPU, 8 GB RAM per pod
- Storage: 100 GB workspace + 10 GB config per pod
- Cost: ~$400-600/month
Git Submodules Status
All 6 submodules are initialized and active:
| Submodule | Location | Purpose | Status |
|---|---|---|---|
| agents-reference | .claude/agents-reference/ | 84 agent templates | ✅ Active |
| commands-reference | .claude/commands-reference/ | 42 command tools | ✅ Active |
| skills-anthropic | .claude/skills-anthropic/ | Anthropic skills format | ✅ Active |
| sub-agent-collective | .claude/sub-agent-collective/ | Sub-agent templates | ✅ Active |
| agents-research-plan-code | archive/agents-research-plan-code/ | Multi-LLM research | ✅ Active |
| coditect-v4 | archive/coditect-v4/ | V4 reference (19 FDB models, 88 ADRs) | ✅ Active |
Quick Start (2 Commands)
# Step 1: Pre-flight check (verify cluster capacity)
./scripts/preflight-check-starter.sh
# Step 2: Deploy Starter configuration
./scripts/deploy-starter-config.sh
That's it! The scripts handle everything automatically.
What Gets Deployed
StatefulSet Changes
| Component | Before (Minimal) | After (Starter) | Change |
|---|---|---|---|
| Replicas | 3 fixed | 10-30 autoscale | +233% (at min replicas) |
| CPU per pod | 500m-2000m | 2000m-4000m | +100% |
| RAM per pod | 512Mi-2Gi | 4Gi-8Gi | +300% |
| workspace | 50Gi | 100Gi | +100% |
| Config | 5Gi | 10Gi | +100% |
| User capacity | 3 users | 10-20 users | +567% (at max) |
New Components
HorizontalPodAutoscaler (HPA):
- Monitors CPU (target: 70%) and RAM (target: 75%)
- Scales pods: 10 min → 30 max
- Scale up: Immediate (100%/min or +5 pods/min)
- Scale down: Gradual (50%/min, 5 min stabilization)
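The policy above maps directly onto an `autoscaling/v2` HorizontalPodAutoscaler. The following manifest is a sketch that matches the numbers listed (the resource name follows the conventions used elsewhere in this guide; check the shipped `k8s/` manifests for the authoritative version):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: coditect-combined-hpa
  namespace: coditect-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: StatefulSet
    name: coditect-combined
  minReplicas: 10
  maxReplicas: 30
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 75
  behavior:
    scaleUp:                 # no stabilization window = immediate scale-up
      policies:
        - type: Percent
          value: 100
          periodSeconds: 60
        - type: Pods
          value: 5
          periodSeconds: 60
      selectPolicy: Max      # whichever policy allows more pods wins
    scaleDown:
      stabilizationWindowSeconds: 300   # 5 min stabilization
      policies:
        - type: Percent
          value: 50
          periodSeconds: 60
```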
Pre-Flight Check
The preflight-check-starter.sh script verifies:
- ✅ Cluster connectivity - Can connect to GKE
- ✅ Cluster capacity - Minimum 40 vCPU, 80 GB RAM
- ✅ Storage class - `standard-rwo` available
- ✅ Persistent disk quota - 1,100 GB available (10 pods × 110 GB)
- ✅ Required files - All YAML configs present
- ✅ Current deployment - Shows upgrade path
Example output:
✅ Cluster ready for Starter deployment (with autoscaling)
Capacity: 60 vCPU, 120 GB RAM
Ready to deploy! Run:
./scripts/deploy-starter-config.sh
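The real script queries the live cluster, but the capacity gate it applies boils down to arithmetic like the following (a simplified sketch with a hypothetical `check_capacity` helper, not the actual implementation):

```shell
#!/bin/sh
# Simplified sketch of the Starter pre-flight capacity math.
# The real script is ./scripts/preflight-check-starter.sh.

REQUIRED_VCPU=40        # 10 pods x 4 vCPU
REQUIRED_RAM_GB=80      # 10 pods x 8 GB
REQUIRED_DISK_GB=1100   # 10 pods x (100 GB workspace + 10 GB config)

check_capacity() {
  # $1 = available vCPU, $2 = available RAM (GB), $3 = available disk quota (GB)
  if [ "$1" -ge "$REQUIRED_VCPU" ] && [ "$2" -ge "$REQUIRED_RAM_GB" ] \
     && [ "$3" -ge "$REQUIRED_DISK_GB" ]; then
    echo "✅ Cluster ready for Starter deployment"
  else
    echo "❌ Insufficient capacity (need ${REQUIRED_VCPU} vCPU, ${REQUIRED_RAM_GB} GB RAM, ${REQUIRED_DISK_GB} GB disk)"
    return 1
  fi
}

# Example: the cluster from the sample output above (60 vCPU, 120 GB RAM)
check_capacity 60 120 2000
```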
Deployment Script
The deploy-starter-config.sh script:
- Checks cluster access - Verifies kubectl connectivity
- Checks cluster capacity - Shows current node resources
- Checks namespace - Creates `coditect-app` if needed
- Applies BackendConfig - Session affinity configuration
- Checks current deployment - Shows upgrade path, asks confirmation
- Applies StatefulSet - Deploys Starter configuration with autoscaling
- Applies Ingress - Updates WebSocket and session affinity
- Verifies deployment - Shows pod, PVC, and HPA status
Rolling update:
- Pods restart one at a time (graceful 120s shutdown)
- Users on restarting pods briefly disconnected (~30s)
- Session affinity routes users back to same pod after restart
- Total deployment time: 10-20 minutes (depending on cluster)
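On GKE, the session affinity that routes users back to the same pod is typically declared through a BackendConfig resource. The fragment below is an illustrative sketch (the name and TTL values are assumptions; the real manifest applied by the deploy script may differ):

```yaml
apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: coditect-backendconfig   # hypothetical name; see the repo's k8s/ manifests
  namespace: coditect-app
spec:
  sessionAffinity:
    affinityType: "GENERATED_COOKIE"   # cookie-based stickiness per user
    affinityCookieTtlSec: 86400        # keep users on the same pod for 24h
  timeoutSec: 3600                     # long backend timeout for WebSocket sessions
```

For the config to take effect, the backing Service must reference it via the `cloud.google.com/backend-config` annotation.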
Cluster Requirements
Minimum (10 pods, no autoscaling)
- Nodes: 3-5 nodes
- Machine type: `e2-standard-8` (8 vCPU, 32 GB RAM)
- Total: 40 vCPU, 80 GB RAM
- Cost: ~$400/month
Recommended (30 pods with autoscaling)
- Nodes: 8-10 nodes
- Machine type: `e2-standard-8` or `e2-standard-16`
- Total: 60-120 vCPU, 120-240 GB RAM
- Cost: ~$500-700/month
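The node counts above follow from ceiling division of total vCPU demand by per-node capacity. A quick sanity-check (illustrative arithmetic only; real planning must also account for system pods and allocatable vs. raw capacity):

```shell
#!/bin/sh
# Rough node-count estimate for the Starter tiers.
nodes_needed() {
  # $1 = pods, $2 = vCPU per pod, $3 = vCPU per node -> ceiling division
  echo $(( ($1 * $2 + $3 - 1) / $3 ))
}

nodes_needed 10 4 8    # minimum tier on e2-standard-8
nodes_needed 30 4 16   # max autoscale on e2-standard-16
```

The first call yields 5 nodes (matching the minimum tier's 40 vCPU) and the second 8 nodes (matching the recommended 8-10 node range).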
Check Current Cluster
# List nodes
kubectl get nodes
# Check node capacity
kubectl top nodes
# Check machine types
gcloud container node-pools describe default-pool \
--cluster=codi-poc-e2-cluster --zone=us-central1-a
Scale Cluster (if needed)
# Increase node count
gcloud container clusters resize codi-poc-e2-cluster \
--num-nodes=8 --zone=us-central1-a
# Or add new node pool with larger machines
gcloud container node-pools create starter-pool \
--cluster=codi-poc-e2-cluster --zone=us-central1-a \
--machine-type=e2-standard-16 --num-nodes=5
Monitoring After Deployment
Check Pod Status
# Watch pods come online
kubectl get pods -n coditect-app -l app=coditect-combined --watch
# Expected output:
# coditect-combined-0 1/1 Running 0 5m
# coditect-combined-1 1/1 Running 0 4m
# ...
# coditect-combined-9 1/1 Running 0 2m
Check Autoscaling
# Watch HPA status
kubectl get hpa -n coditect-app --watch
# Expected output:
# NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS
# coditect-combined-hpa StatefulSet/coditect-combined 15%/70%, 20%/75% 10 30 10
Check Resource Usage
# Current CPU/RAM usage
kubectl top pods -n coditect-app
# Expected per pod:
# NAME CPU(cores) MEMORY(bytes)
# coditect-combined-0 500m 2Gi
# coditect-combined-1 450m 1.8Gi
Check PVCs
# List all persistent volume claims
kubectl get pvc -n coditect-app
# Expected: 20 PVCs (10 pods × 2 volumes each)
# workspace-coditect-combined-0 Bound 100Gi
# theia-config-coditect-combined-0 Bound 10Gi
# ...
Testing
Test 1: Persistence (Critical)
./scripts/test-persistence.sh
Expected: ✅ File survives pod restart
Test 2: Session Affinity
./scripts/test-session-affinity.sh
Expected: ✅ Users route to same pod consistently
Test 3: Autoscaling Behavior
Simulate load:
# Generate CPU load on pod (the trailing & backgrounds the local kubectl;
# the busy loop keeps running inside the pod until explicitly killed)
kubectl exec -n coditect-app coditect-combined-0 -- \
sh -c 'while true; do :; done' &
# Watch HPA scale up
kubectl get hpa -n coditect-app --watch
Expected: After 1-2 minutes, HPA creates new pods
Stop load:
# Stop the CPU loop inside the pod (killing only the local backgrounded
# kubectl would NOT stop it; pkill/killall must exist in the container image)
kubectl exec -n coditect-app coditect-combined-0 -- pkill -f 'while true' || \
  kubectl exec -n coditect-app coditect-combined-0 -- killall sh
# Watch HPA scale down (after 5 min stabilization)
kubectl get hpa -n coditect-app --watch
Test 4: Multi-User Scenario
User 1 (Browser 1):
- Visit https://coditect.ai/theia
- Create `/workspace/user1-test.txt`
- Note pod name from logs
- Logout → Login
- ✅ Same pod + file still exists
User 2 (Browser 2, different IP):
- Visit https://coditect.ai/theia
- Create `/workspace/user2-test.txt`
- Note pod name (may differ from User 1)
- ✅ User 1's file NOT visible (separate workspaces)
Troubleshooting
Pods Stuck in Pending
Cause: Insufficient cluster capacity
Check:
kubectl describe pod coditect-combined-7 -n coditect-app
# Look for: "0/X nodes are available: insufficient cpu/memory"
Solution:
# Scale cluster
gcloud container clusters resize codi-poc-e2-cluster \
--num-nodes=8 --zone=us-central1-a
PVC Provisioning Slow
Cause: GCE Persistent Disk provisioning delay (normal)
Check:
kubectl get events -n coditect-app --sort-by='.lastTimestamp' | grep -i provision
Solution: Wait 2-5 minutes for automatic provisioning
Autoscaling Not Working
Cause: metrics-server not installed
Check:
kubectl top nodes
# If error: metrics-server not found
Solution:
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
High CPU/RAM Usage
Check actual vs configured:
kubectl top pods -n coditect-app
If consistently > 80%:
- Either: Increase pod resources (4 vCPU → 6 vCPU)
- Or: Let autoscaling add more pods (distributes load)
Pod Crashes / OOMKilled
Check logs:
kubectl logs coditect-combined-3 -n coditect-app --previous
If OOMKilled:
- Increase memory limits: 8Gi → 12Gi
- Or reduce `NODE_OPTIONS` heap size
Rollback
If issues occur, rollback to minimal (3-pod) configuration:
# Apply original StatefulSet
kubectl apply -f k8s/theia-statefulset.yaml
# Delete HPA
kubectl delete hpa coditect-combined-hpa -n coditect-app
# Verify rollback
kubectl get statefulset coditect-combined -n coditect-app
# Should show 3 replicas
Note: PVCs for pods 3-9 will remain. Delete them individually if needed — a label-selector delete (`-l app=coditect-combined`) would also remove the still-active data for pods 0-2:
kubectl delete pvc workspace-coditect-combined-3 theia-config-coditect-combined-3 -n coditect-app
# ...repeat for pods 4 through 9
Cost Tracking
Current Deployment
# Check actual resource usage
kubectl top pods -n coditect-app
# Estimate cost
# 10 pods × (4 vCPU, 8 GB RAM) × $0.04/vCPU-hour × 730 hours/month
# = 40 vCPU × $29.20/month = ~$1,168/month (compute only)
#
# Storage: 10 pods × 110 GB × $0.17/GB-month = ~$187/month
# Load Balancer: ~$20/month
#
# Total: ~$1,375/month (if all 10 pods running 24/7)
#
# With autoscaling (10 min, 20 avg, 30 max):
# Average: ~$400-600/month
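The 24/7 worst-case figure in the comment above follows from simple arithmetic. A few lines of shell reproduce it (unit prices are the assumed figures from the comment, held in cents to avoid floating point; they are not live GCP billing data):

```shell
#!/bin/sh
# Sanity-check the back-of-envelope Starter cost estimate.
PODS=10
VCPU_PER_POD=4
DISK_GB_PER_POD=110
VCPU_CENTS_MONTH=2920      # $0.04/vCPU-hour x 730 h = $29.20/vCPU-month
DISK_CENTS_GB_MONTH=17     # $0.17/GB-month
LB_CENTS_MONTH=2000        # ~$20/month load balancer

compute=$(( PODS * VCPU_PER_POD * VCPU_CENTS_MONTH ))
storage=$(( PODS * DISK_GB_PER_POD * DISK_CENTS_GB_MONTH ))
total=$(( compute + storage + LB_CENTS_MONTH ))
echo "compute=\$$(( compute / 100 )) storage=\$$(( storage / 100 )) total=\$$(( total / 100 ))"
# -> compute=$1168 storage=$187 total=$1375
```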
Optimize Costs
- Right-size resources - Reduce if usage < 50%
- Adjust autoscaling - Lower max replicas if not needed
- Use preemptible nodes - 60-80% savings (dev/staging only)
Next Steps After Deployment
- Monitor for 24 hours - Check autoscaling behavior
- Optimize resources - Adjust based on actual usage
- Set up alerts - Configure Prometheus/Grafana
- Load testing - Simulate 20 concurrent users
- Document runbooks - Incident response procedures
Related Documentation
- Capacity Planning: `docs/11-analysis/gke-capacity-planning.md`
- StatefulSet Migration: `docs/11-analysis/STATEFULSET-migration-guide.md`
- Terraform Setup: `terraform/environments/prod/README.md`
Quick Reference
# Deploy Starter config
./scripts/preflight-check-starter.sh
./scripts/deploy-starter-config.sh
# Monitor
kubectl get pods -n coditect-app -l app=coditect-combined --watch
kubectl get hpa -n coditect-app --watch
kubectl top pods -n coditect-app
# Test
./scripts/test-persistence.sh
./scripts/test-session-affinity.sh
# Logs
kubectl logs -f coditect-combined-0 -n coditect-app
# Scale manually (if needed — note that while the HPA exists, it will
# reconcile the replica count back toward its own targets)
kubectl scale statefulset coditect-combined -n coditect-app --replicas=15
# Rollback
kubectl apply -f k8s/theia-statefulset.yaml
kubectl delete hpa coditect-combined-hpa -n coditect-app
Ready to deploy?
./scripts/preflight-check-starter.sh