Coditect MVP Scaling Analysis - 20+ User Capacity
Date: 2025-10-27
Target: Support 20 concurrent users for pilot/beta testing (MVP launch)
Goals: Registration, sessions, ICP analysis, product-market fit validation
Current Configuration
StatefulSet Deployment
- Replicas: 3 pods
- Pod Management: Parallel (start all pods simultaneously)
- Session Affinity: ClientIP (3-hour sticky sessions)
- Graceful Shutdown: 120 seconds
Resources Per Pod
| Resource | Request | Limit |
|---|---|---|
| Memory | 512 MB | 2 GB |
| CPU | 0.5 cores | 2 cores |
| Workspace Storage | 50 GB | 50 GB |
| Config Storage | 5 GB | 5 GB |
Current Capacity
- 3 pods = 3-6 concurrent users (conservative estimate)
- Assumption: 1-2 users per pod for IDE workloads running multiple LLMs
- Bottleneck: CPU- and memory-intensive workloads (Monaco editor, TypeScript language services, LLM inference)
❌ Current Gap: NOT Ready for 20 Users
You need 10-14 pods for 20 users (at 1.5-2 users per pod)
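The pod counts above follow from a simple ceiling division; a minimal sketch, where the 1.5-2 users-per-pod figures are assumptions from the workload estimate above, not measured values:

```python
import math

def pods_needed(users: int, users_per_pod: float) -> int:
    # users_per_pod is an assumption (1.5-2 for IDE + LLM workloads),
    # not a measured figure; revisit after load testing.
    return math.ceil(users / users_per_pod)

print(pods_needed(20, 1.5))  # conservative sizing -> 14 pods
print(pods_needed(20, 2.0))  # optimistic sizing -> 10 pods
```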
Scaling Recommendations for MVP Launch
Option 1: Manual Scaling (Quick Fix)
```bash
# Scale to 10 pods (supports ~15-20 users)
kubectl scale statefulset/coditect-combined --replicas=10 -n coditect-app

# Verify scaling
kubectl get pods -n coditect-app -l app=coditect-combined

# Check rollout status
kubectl rollout status statefulset/coditect-combined -n coditect-app
```
Time: 5-10 minutes
Cost Impact: ~3.3x current compute cost (3 pods → 10 pods)
Option 2: Horizontal Pod Autoscaler (HPA) - RECOMMENDED
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: coditect-combined-hpa
  namespace: coditect-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: StatefulSet
    name: coditect-combined
  minReplicas: 10   # Minimum for 20 users
  maxReplicas: 30   # Headroom for peak usage
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # Scale up when CPU >70%
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 75   # Scale up when memory >75%
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 60    # Wait 1 min before scaling up
      policies:
        - type: Pods
          value: 3                      # Add up to 3 pods at a time
          periodSeconds: 60
    scaleDown:
      stabilizationWindowSeconds: 300   # Wait 5 min before scaling down
      policies:
        - type: Pods
          value: 1                      # Remove 1 pod at a time
          periodSeconds: 60
```
Benefits:
- Auto-scales based on CPU/Memory load
- Handles traffic spikes automatically
- Cost-efficient (scales down during low usage)
Deployment:
```bash
kubectl apply -f k8s/coditect-combined-hpa.yaml
kubectl get hpa -n coditect-app -w   # Watch autoscaling in action
```
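Under the hood, the HPA computes its desired replica count with the documented formula `desired = ceil(current × currentMetric / targetMetric)`; a quick sketch of how the 70% CPU target above drives scaling:

```python
import math

def hpa_desired_replicas(current_replicas: int,
                         current_utilization: float,
                         target_utilization: float) -> int:
    """Kubernetes HPA scaling formula:
    desired = ceil(current * currentMetric / targetMetric)."""
    return math.ceil(current_replicas * current_utilization / target_utilization)

# 10 pods averaging 90% CPU against a 70% target -> HPA wants 13 pods
print(hpa_desired_replicas(10, 90, 70))
# 10 pods averaging 50% CPU -> HPA wants 8 pods (scale down, after the
# 5-minute stabilization window above)
print(hpa_desired_replicas(10, 50, 70))
```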
Resource Requirements for 20 Users
Conservative Estimate (1.5 users/pod)
- Pods needed: 14 pods
- Total CPU: 7-28 cores (0.5-2 per pod)
- Total Memory: 7-28 GB (0.5-2 per pod)
- Total Storage: 770 GB (55 GB per pod × 14)
Optimistic Estimate (2 users/pod)
- Pods needed: 10 pods
- Total CPU: 5-20 cores
- Total Memory: 5-20 GB
- Total Storage: 550 GB (55 GB per pod × 10)
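Both estimates come from multiplying the per-pod figures in the resource table by the pod count; a small sizing sketch (the users-per-pod values are assumptions carried over from above):

```python
import math

# Per-pod figures from the resource table above.
CPU_REQUEST, CPU_LIMIT = 0.5, 2   # cores
MEM_REQUEST, MEM_LIMIT = 0.5, 2   # GB
STORAGE_GB = 55                   # 50 GB workspace + 5 GB config

def cluster_totals(users: int, users_per_pod: float) -> dict:
    pods = math.ceil(users / users_per_pod)
    return {
        "pods": pods,
        "cpu_cores": (pods * CPU_REQUEST, pods * CPU_LIMIT),
        "memory_gb": (pods * MEM_REQUEST, pods * MEM_LIMIT),
        "storage_gb": pods * STORAGE_GB,
    }

print(cluster_totals(20, 1.5))  # conservative: 14 pods, 770 GB storage
print(cluster_totals(20, 2.0))  # optimistic: 10 pods, 550 GB storage
```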
Cost Analysis (GCP us-central1)
Current Cost (3 pods)
- Compute: $0.031/core-hour × 2 cores × 3 pods × 24 h = ~$4.46/day (~$134/month)
- Storage: $0.10/GB/month × 165 GB = $16.50/month
- Total: ~$150/month
Scaled Cost (10 pods for 20 users)
- Compute: $0.031/core-hour × 2 cores × 10 pods × 24 h = ~$14.88/day (~$446/month)
- Storage: $0.10/GB/month × 550 GB = $55/month
- Total: ~$500/month
Scaled Cost (15 pods with headroom)
- Compute: ~$22.32/day ($670/month)
- Storage: $82.50/month
- Total: ~$750/month
Cost per user: $25-37.50/month (at 20 users)
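All three cost tiers above can be reproduced with one small model; the rates are the assumed GCP us-central1 prices used in this analysis, and real billing will vary with machine type and committed-use discounts:

```python
# Assumed GCP us-central1 rates used throughout this analysis.
HOURLY_PER_CORE = 0.031   # $/core-hour
CPU_LIMIT_CORES = 2       # cores per pod (resource limit)
STORAGE_RATE = 0.10       # $/GB/month
GB_PER_POD = 55           # 50 GB workspace + 5 GB config

def monthly_cost(pods: int, hours_per_day: int = 24, days: int = 30) -> float:
    compute = HOURLY_PER_CORE * CPU_LIMIT_CORES * pods * hours_per_day * days
    storage = STORAGE_RATE * GB_PER_POD * pods
    return compute + storage

for pods in (3, 10, 15):
    print(f"{pods} pods: ~${monthly_cost(pods):,.0f}/month")
```

Running this reproduces the ~$150 / ~$500 / ~$750 monthly figures for 3, 10, and 15 pods.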
Architecture Considerations
Session Affinity (CRITICAL)
- Current: ClientIP with 3-hour timeout
- Impact: Users stick to same pod for 3 hours
- Load Distribution: May not be perfectly balanced
- Recommendation: Monitor pod CPU/Memory to detect imbalances
Persistent Storage (StatefulSet)
- Each pod gets: 50 GB workspace + 5 GB config
- Challenge: Scaling creates NEW storage volumes
- Limitation: Users can't switch pods (data is pod-specific)
- Consideration: May need shared storage (NFS/GCS) for user data portability
Session Management
- Sessions stored in: FoundationDB (not pod-local)
- User can log in from any pod: Yes (FDB-backed auth)
- Workspace data: Pod-local (50 GB PVC per pod)
- Recommendation: Consider migrating to shared storage for multi-pod access
Load Testing Checklist
Before MVP launch, validate capacity with:
- Concurrent User Test

  ```bash
  # Simulate 20 concurrent users
  # - Each user: 1 Theia session, 2-3 LLM requests/min, file edits
  # - Duration: 1 hour
  # - Monitor: CPU, memory, response times
  ```

- Stress Test

  ```bash
  # Peak load: 30 users (50% over capacity)
  # Verify HPA scales up within 2-3 minutes
  ```

- Storage Test

  ```bash
  # Each user: 1 GB project, 100 file edits
  # Verify no storage exhaustion
  ```
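As a starting point for the concurrent-user test, a thread-pool harness along these lines can drive 20 simulated sessions; `run_session` is a placeholder to be replaced with real HTTP calls against the staging endpoint:

```python
import concurrent.futures
import random
import time

def run_session(user_id: int) -> int:
    # Placeholder for one user's workload (Theia session + LLM requests);
    # swap in real requests against the staging endpoint.
    time.sleep(random.uniform(0.01, 0.05))
    return user_id

# Simulate 20 concurrent users, as in the checklist above.
with concurrent.futures.ThreadPoolExecutor(max_workers=20) as pool:
    completed = list(pool.map(run_session, range(20)))

print(f"{len(completed)} sessions completed")
```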
MVP Launch Recommendations
Phase 1: Pre-Launch (Now)
- ✅ Scale to 10 pods immediately
  `kubectl scale statefulset/coditect-combined --replicas=10 -n coditect-app`
- ✅ Deploy HPA for auto-scaling (10-30 pods)
- ✅ Add monitoring alerts (CPU >80%, Memory >85%)
- ✅ Test with 5-10 beta users first
Phase 2: Beta Launch (Week 1)
- Onboard 10-15 users
- Monitor resource usage patterns
- Adjust HPA thresholds if needed
- Collect user feedback on performance
Phase 3: Full MVP (Week 2-4)
- Onboard remaining users to 20 total
- Monitor cost vs. actual usage
- Optimize pod resources based on real data
- Plan for next scaling tier (50-100 users)
Monitoring & Alerts
Critical Metrics
```bash
# CPU usage per pod
kubectl top pods -n coditect-app -l app=coditect-combined

# Memory usage per pod
kubectl top pods -n coditect-app -l app=coditect-combined --sort-by=memory

# Pod count and status
kubectl get pods -n coditect-app -l app=coditect-combined -o wide

# HPA status
kubectl get hpa -n coditect-app

# Session affinity settings (check the load balancer backend service)
gcloud compute backend-services describe <backend-service-name> --global \
  --format="yaml(sessionAffinity,affinityCookieTtlSec)"
```
Recommended Alerts
- CPU >85% for 5 minutes → Scale up urgently
- Memory >90% for 5 minutes → Scale up urgently
- Pod count = maxReplicas → Increase HPA max
- Storage >90% on any PVC → Increase volume size
- User sessions > (pod_count × 2) → Not enough capacity
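The last alert is just a capacity ratio; a sketch of the check, where the 2-users-per-pod ceiling is the optimistic assumption from earlier:

```python
def capacity_alert(active_sessions: int, pod_count: int,
                   users_per_pod: int = 2) -> bool:
    # Fires when sessions exceed the assumed per-pod ceiling (2 users/pod).
    return active_sessions > pod_count * users_per_pod

print(capacity_alert(25, 10))  # over capacity -> True
print(capacity_alert(18, 10))  # within capacity -> False
```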
Next Steps
- Immediate: Scale to 10 pods (5 min)
- Deploy HPA: Create HPA config and apply (15 min)
- Load Test: Simulate 20 concurrent users (1-2 hours)
- Monitor: Track metrics for 24-48 hours before launch
- Adjust: Fine-tune HPA thresholds based on real usage
Conclusion
Current Status: ❌ NOT ready for 20 users (3 pods = 3-6 user capacity)
Recommended Action:
- Scale to 10 pods immediately
- Deploy HPA (10-30 range)
- Test with 5-10 beta users first
- Monitor and adjust before full 20-user launch
Estimated Cost: $500-750/month for 20 users (vs. $150/month for 3-6 users)
Timeline: Ready for 20 users in 1-2 days after scaling + testing