Coditect MVP Scaling Analysis - 20+ User Capacity

Date: 2025-10-27
Target: Support 20 concurrent users for pilot/beta testing, MVP launch
Goals: Registration, sessions, ICP analysis, product-market fit validation

Current Configuration

StatefulSet Deployment

  • Replicas: 3 pods
  • Pod Management: Parallel (start all pods simultaneously)
  • Session Affinity: ClientIP (3-hour sticky sessions)
  • Graceful Shutdown: 120 seconds

Resources Per Pod

| Resource          | Request   | Limit   |
|-------------------|-----------|---------|
| Memory            | 512 MB    | 2 GB    |
| CPU               | 0.5 cores | 2 cores |
| Workspace Storage | 50 GB     | 50 GB   |
| Config Storage    | 5 GB      | 5 GB    |

Current Capacity

  • 3 pods = 3-6 concurrent users (conservative estimate)
  • Assumption: 1-2 users per pod for IDE workloads running multiple LLMs
  • Bottleneck: CPU/Memory intensive (Monaco editor, TypeScript, LLM inference)

❌ Current Gap: NOT Ready for 20 Users

You need 10-15 pods for 20 users (assuming 1.5-2 users per pod)
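The 10-15 pod figure is ceiling division over the assumed per-pod density; a quick sketch of the arithmetic (the user count and densities are the planning assumptions above, not measured values):

```shell
# Pods needed = ceil(users / users_per_pod); densities scaled by 10 to stay in shell integer math
users=20
for density_x10 in 15 20; do   # 1.5 and 2.0 users per pod
  pods=$(( (users * 10 + density_x10 - 1) / density_x10 ))
  echo "users=${users} users_per_pod=$((density_x10 / 10)).$((density_x10 % 10)) -> pods=${pods}"
done
```

At 1.5 users/pod this yields 14 pods and at 2 users/pod it yields 10, matching the 10-15 range.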

Scaling Recommendations for MVP Launch

Option 1: Manual Scaling (Quick Fix)

# Scale to 10 pods (supports ~15-20 users)
kubectl scale statefulset/coditect-combined --replicas=10 -n coditect-app

# Verify scaling
kubectl get pods -n coditect-app -l app=coditect-combined

# Check rollout status
kubectl rollout status statefulset/coditect-combined -n coditect-app

Time: 5-10 minutes
Cost Impact: 3x current cost (3 pods → 10 pods)

Option 2: Horizontal Pod Autoscaler (HPA)

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: coditect-combined-hpa
  namespace: coditect-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: StatefulSet
    name: coditect-combined
  minReplicas: 10   # Minimum for 20 users
  maxReplicas: 30   # Headroom for peak usage
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # Scale up when CPU >70%
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 75   # Scale up when Memory >75%
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 60    # Wait 1 min before scaling up
      policies:
        - type: Pods
          value: 3          # Add 3 pods at a time
          periodSeconds: 60
    scaleDown:
      stabilizationWindowSeconds: 300   # Wait 5 min before scaling down
      policies:
        - type: Pods
          value: 1          # Remove 1 pod at a time
          periodSeconds: 60

Benefits:

  • Auto-scales based on CPU/Memory load
  • Handles traffic spikes automatically
  • Cost-efficient (scales down during low usage)

Deployment:

kubectl apply -f k8s/coditect-combined-hpa.yaml
kubectl get hpa -n coditect-app -w # Watch autoscaling in action

Resource Requirements for 20 Users

Conservative Estimate (1.5 users/pod)

  • Pods needed: 14 pods
  • Total CPU: 7-28 cores (0.5-2 cores per pod)
  • Total Memory: 7-28 GB (0.5-2 GB per pod)
  • Total Storage: 770 GB (55 GB per pod × 14)

Optimistic Estimate (2 users/pod)

  • Pods needed: 10 pods
  • Total CPU: 5-20 cores
  • Total Memory: 5-20 GB
  • Total Storage: 550 GB (55 GB per pod × 10)

Cost Analysis (GCP us-central1)

Current Cost (3 pods)

  • Compute: $0.031/hour × 2 cores × 3 pods = ~$4.46/day
  • Storage: $0.10/GB/month × 165 GB = $16.50/month
  • Total: ~$150/month

Scaled Cost (10 pods for 20 users)

  • Compute: $0.031/hour × 2 cores × 10 pods = ~$14.88/day ($446/month)
  • Storage: $0.10/GB/month × 550 GB = $55/month
  • Total: ~$500/month

Scaled Cost (15 pods with headroom)

  • Compute: ~$22.32/day ($670/month)
  • Storage: $82.50/month
  • Total: ~$750/month

Cost per user: $25-37.50/month (at 20 users)
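The monthly totals above can be recomputed from the same per-unit rates; a sketch using this document's estimated rates (awk for floating-point math):

```shell
# Recompute the 10-pod monthly estimate from the per-unit rates used above
awk 'BEGIN {
  core_hour = 0.031            # USD per core-hour (estimate used in this analysis)
  cores = 2; pods = 10         # per-pod CPU limit x scaled pod count
  storage_gb = 550; gb_month = 0.10   # USD per GB-month
  compute = core_hour * cores * pods * 24 * 30
  storage = storage_gb * gb_month
  printf "compute=$%.0f/mo storage=$%.0f/mo total=$%.0f/mo per-user=$%.2f\n",
         compute, storage, compute + storage, (compute + storage) / 20
}'
```

This lands at roughly $500/month total and about $25/user at 20 users, consistent with the figures above.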

Architecture Considerations

Session Affinity (CRITICAL)

  • Current: ClientIP with 3-hour timeout
  • Impact: Users stick to same pod for 3 hours
  • Load Distribution: May not be perfectly balanced
  • Recommendation: Monitor pod CPU/Memory to detect imbalances
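One way to spot ClientIP skew is to compare per-pod CPU from `kubectl top pods`; a sketch using awk over illustrative sample output (the pod numbers below are made up — in a live cluster, pipe the real command's `--no-headers` output instead):

```shell
# Compute the max/min CPU ratio across pods; a high ratio suggests sticky-session skew.
# Live usage: kubectl top pods -n coditect-app -l app=coditect-combined --no-headers | awk '...'
cat <<'EOF' | awk '{ gsub(/m/, "", $2); v = $2 + 0
    if (v > max) max = v
    if (min == 0 || v < min) min = v }
  END { printf "max=%dm min=%dm ratio=%.1f\n", max, min, (min ? max / min : 0) }'
coditect-combined-0   1800m   900Mi
coditect-combined-1   300m    400Mi
coditect-combined-2   450m    500Mi
EOF
```

On this sample it reports a 6.0x spread, which would justify rebalancing or scaling.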

Persistent Storage (StatefulSet)

  • Each pod gets: 50 GB workspace + 5 GB config
  • Challenge: Scaling creates NEW storage volumes
  • Users can't switch pods (data is pod-specific)
  • Consideration: May need shared storage (NFS/GCS) for user data portability
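StatefulSet `volumeClaimTemplates` name each PVC `<template>-<statefulset>-<ordinal>`, so scaling to 10 replicas provisions fresh, empty volumes for ordinals 3-9. A sketch of the PVC names that scale-up would own (the template names `workspace` and `config` are assumptions — verify with `kubectl get pvc -n coditect-app`):

```shell
# PVC names for a 10-replica StatefulSet (claim-template names assumed)
sts=coditect-combined
for i in $(seq 0 9); do
  echo "workspace-${sts}-${i} (50Gi)   config-${sts}-${i} (5Gi)"
done
```

Ordinals 0-2 reattach to the existing volumes; 3-9 start with empty workspaces, which is why users cannot move between pods without shared storage.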

Session Management

  • Sessions stored in: FoundationDB (not pod-local)
  • User can log in from any pod: Yes (FDB-backed auth)
  • Workspace data: Pod-local (50 GB PVC per pod)
  • Recommendation: Consider migrating to shared storage for multi-pod access

Load Testing Checklist

Before MVP launch, validate capacity with:

  1. Concurrent User Test

    # Simulate 20 concurrent users
    # - Each user: 1 theia session, 2-3 llm requests/min, file edits
    # - Duration: 1 hour
    # - Monitor: CPU, Memory, response times
  2. Stress Test

    # Peak load: 30 users (50% over capacity)
    # Verify HPA scales up within 2-3 minutes
  3. Storage Test

    # Each user: 1 GB project, 100 file edits
    # Verify no storage exhaustion
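The concurrent-user test above can be parameterized directly from the checklist; a sketch that computes the planned request volume and shows one way it might be driven (the target URL is a placeholder, not a real endpoint — run only against a test environment):

```shell
# Checklist parameters: 20 users, 3 LLM requests/min each, 1 hour
users=20; req_per_min=3; duration_min=60
total=$(( users * req_per_min * duration_min ))
echo "planned load: ${users} users x ${req_per_min} req/min x ${duration_min} min = ${total} requests"

# One way to drive it (TARGET is hypothetical; uncomment against a test environment):
# TARGET="https://coditect.example.com/api/health"
# for u in $(seq 1 "$users"); do
#   ( for r in $(seq 1 $(( req_per_min * duration_min )) ); do
#       curl -s -o /dev/null -w "%{http_code}\n" "$TARGET"
#       sleep $(( 60 / req_per_min ))
#     done ) &
# done
# wait   # then inspect `kubectl top pods` for CPU/Memory under load
```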

MVP Launch Recommendations

Phase 1: Pre-Launch (Now)

  1. ✅ Scale to 10 pods immediately
    kubectl scale statefulset/coditect-combined --replicas=10 -n coditect-app
  2. ✅ Deploy HPA for auto-scaling (10-30 pods)
  3. ✅ Add monitoring alerts (CPU >80%, Memory >85%)
  4. ✅ Test with 5-10 beta users first

Phase 2: Beta Launch (Week 1)

  1. Onboard 10-15 users
  2. Monitor resource usage patterns
  3. Adjust HPA thresholds if needed
  4. Collect user feedback on performance

Phase 3: Full MVP (Week 2-4)

  1. Onboard remaining users to 20 total
  2. Monitor cost vs. actual usage
  3. Optimize pod resources based on real data
  4. Plan for next scaling tier (50-100 users)

Monitoring & Alerts

Critical Metrics

# CPU usage per pod
kubectl top pods -n coditect-app -l app=coditect-combined

# Memory usage per pod
kubectl top pods -n coditect-app -l app=coditect-combined --sort-by=memory

# Pod count and status
kubectl get pods -n coditect-app -l app=coditect-combined -o wide

# HPA status
kubectl get hpa -n coditect-app

# Session affinity distribution (check load balancer)
gcloud compute backend-services describe <backend-service-name> --global --format="yaml(sessionAffinity,affinityTimeout)"

Alert Thresholds

  1. CPU >85% for 5 minutes → Scale up urgently
  2. Memory >90% for 5 minutes → Scale up urgently
  3. Pod count = maxReplicas → Increase HPA max
  4. Storage >90% on any PVC → Increase volume size
  5. User sessions > (pod_count × 2) → Not enough capacity

Next Steps

  1. Immediate: Scale to 10 pods (5 min)
  2. Deploy HPA: Create HPA config and apply (15 min)
  3. Load Test: Simulate 20 concurrent users (1-2 hours)
  4. Monitor: Track metrics for 24-48 hours before launch
  5. Adjust: Fine-tune HPA thresholds based on real usage

Conclusion

Current Status: ❌ NOT ready for 20 users (3 pods = 3-6 user capacity)

Recommended Action:

  1. Scale to 10 pods immediately
  2. Deploy HPA (10-30 range)
  3. Test with 5-10 beta users first
  4. Monitor and adjust before full 20-user launch

Estimated Cost: $500-750/month for 20 users (vs. $150/month for 3-6 users)

Timeline: Ready for 20 users in 1-2 days after scaling + testing