Coditect MVP Scaling Analysis - 20+ User Capacity

Date: 2025-10-27
Target: Support 20 concurrent users for pilot/beta testing, MVP launch
Goals: Registration, sessions, ICP analysis, product-market fit validation

Current Configuration

StatefulSet Deployment

  • Replicas: 3 pods
  • Pod Management: Parallel (start all pods simultaneously)
  • Session Affinity: ClientIP (3-hour sticky sessions)
  • Graceful Shutdown: 120 seconds

Resources Per Pod

| Resource          | Request   | Limit   |
|-------------------|-----------|---------|
| Memory            | 512 MB    | 2 GB    |
| CPU               | 0.5 cores | 2 cores |
| Workspace Storage | 50 GB     | 50 GB   |
| Config Storage    | 5 GB      | 5 GB    |

Current Capacity

  • 3 pods = 3-6 concurrent users (conservative estimate)
  • Assumption: 1-2 users per pod for IDE workloads running multiple LLMs
  • Bottleneck: CPU/Memory intensive (Monaco editor, TypeScript, LLM inference)

❌ Current Gap: NOT Ready for 20 Users

You need 10-15 pods for 20 users (assuming 1.5-2 users per pod)
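The 10-15 pod figure is ceiling division over the assumed per-pod density; a quick sketch of the arithmetic (the user count and densities are the planning assumptions above, not measured values):

```shell
# Pods needed = ceil(users / users_per_pod); densities scaled by 10 to stay in shell integer math
users=20
for density_x10 in 15 20; do   # 1.5 and 2.0 users per pod
  pods=$(( (users * 10 + density_x10 - 1) / density_x10 ))
  echo "users=${users} users_per_pod=$((density_x10 / 10)).$((density_x10 % 10)) -> pods=${pods}"
done
```

At 1.5 users/pod this yields 14 pods and at 2 users/pod it yields 10, matching the 10-15 range.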

Scaling Recommendations for MVP Launch

Option 1: Manual Scaling (Quick Fix)

# Scale to 10 pods (supports ~15-20 users)
kubectl scale statefulset/coditect-combined --replicas=10 -n coditect-app

# Verify scaling
kubectl get pods -n coditect-app -l app=coditect-combined

# Check rollout status
kubectl rollout status statefulset/coditect-combined -n coditect-app

Time: 5-10 minutes
Cost Impact: 3x current cost (3 pods → 10 pods)

Option 2: Horizontal Pod Autoscaler (HPA)

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: coditect-combined-hpa
  namespace: coditect-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: StatefulSet
    name: coditect-combined
  minReplicas: 10   # Minimum for 20 users
  maxReplicas: 30   # Headroom for peak usage
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # Scale up when CPU >70%
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 75   # Scale up when Memory >75%
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 60    # Wait 1 min before scaling up
      policies:
        - type: Pods
          value: 3          # Add 3 pods at a time
          periodSeconds: 60
    scaleDown:
      stabilizationWindowSeconds: 300   # Wait 5 min before scaling down
      policies:
        - type: Pods
          value: 1          # Remove 1 pod at a time
          periodSeconds: 60

Benefits:

  • Auto-scales based on CPU/Memory load
  • Handles traffic spikes automatically
  • Cost-efficient (scales down during low usage)

Deployment:

kubectl apply -f k8s/coditect-combined-hpa.yaml
kubectl get hpa -n coditect-app -w # Watch autoscaling in action

Resource Requirements for 20 Users

Conservative Estimate (1.5 users/pod)

  • Pods needed: 14 pods
  • Total CPU: 7-28 cores (0.5-2 cores per pod)
  • Total Memory: 7-28 GB (0.5-2 GB per pod)
  • Total Storage: 770 GB (55 GB per pod × 14)

Optimistic Estimate (2 users/pod)

  • Pods needed: 10 pods
  • Total CPU: 5-20 cores
  • Total Memory: 5-20 GB
  • Total Storage: 550 GB (55 GB per pod × 10)

Cost Analysis (GCP us-central1)

Current Cost (3 pods)

  • Compute: $0.031/hour × 2 cores × 3 pods = ~$4.46/day
  • Storage: $0.10/GB/month × 165 GB = $16.50/month
  • Total: ~$150/month

Scaled Cost (10 pods for 20 users)

  • Compute: $0.031/hour × 2 cores × 10 pods = ~$14.88/day ($446/month)
  • Storage: $0.10/GB/month × 550 GB = $55/month
  • Total: ~$500/month

Scaled Cost (15 pods with headroom)

  • Compute: ~$22.32/day ($670/month)
  • Storage: $82.50/month
  • Total: ~$750/month

Cost per user: $25-37.50/month (at 20 users)
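The monthly totals above can be recomputed from the same per-unit rates; a sketch using this document's estimated rates (awk for floating-point math):

```shell
# Recompute the 10-pod monthly estimate from the per-unit rates used above
awk 'BEGIN {
  core_hour = 0.031            # USD per core-hour (estimate used in this analysis)
  cores = 2; pods = 10         # per-pod CPU limit x scaled pod count
  storage_gb = 550; gb_month = 0.10   # USD per GB-month
  compute = core_hour * cores * pods * 24 * 30
  storage = storage_gb * gb_month
  printf "compute=$%.0f/mo storage=$%.0f/mo total=$%.0f/mo per-user=$%.2f\n",
         compute, storage, compute + storage, (compute + storage) / 20
}'
```

This lands at roughly $500/month total and about $25/user at 20 users, consistent with the figures above.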

Architecture Considerations

Session Affinity (CRITICAL)

  • Current: ClientIP with 3-hour timeout
  • Impact: Users stick to same pod for 3 hours
  • Load Distribution: May not be perfectly balanced
  • Recommendation: Monitor pod CPU/Memory to detect imbalances
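One way to spot ClientIP skew is to compare per-pod CPU from `kubectl top pods`; a sketch using awk over illustrative sample output (the pod numbers below are made up — in a live cluster, pipe the real command's `--no-headers` output instead):

```shell
# Compute the max/min CPU ratio across pods; a high ratio suggests sticky-session skew.
# Live usage: kubectl top pods -n coditect-app -l app=coditect-combined --no-headers | awk '...'
cat <<'EOF' | awk '{ gsub(/m/, "", $2); v = $2 + 0
    if (v > max) max = v
    if (min == 0 || v < min) min = v }
  END { printf "max=%dm min=%dm ratio=%.1f\n", max, min, (min ? max / min : 0) }'
coditect-combined-0   1800m   900Mi
coditect-combined-1   300m    400Mi
coditect-combined-2   450m    500Mi
EOF
```

On this sample it reports a 6.0x spread, which would justify rebalancing or scaling.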

Persistent Storage (StatefulSet)

  • Each pod gets: 50 GB workspace + 5 GB config
  • Challenge: Scaling creates NEW storage volumes
  • Users can't switch pods (data is pod-specific)
  • Consideration: May need shared storage (NFS/GCS) for user data portability
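StatefulSet `volumeClaimTemplates` name each PVC `<template>-<statefulset>-<ordinal>`, so scaling to 10 replicas provisions fresh, empty volumes for ordinals 3-9. A sketch of the PVC names that scale-up would own (the template names `workspace` and `config` are assumptions — verify with `kubectl get pvc -n coditect-app`):

```shell
# PVC names for a 10-replica StatefulSet (claim-template names assumed)
sts=coditect-combined
for i in $(seq 0 9); do
  echo "workspace-${sts}-${i} (50Gi)   config-${sts}-${i} (5Gi)"
done
```

Ordinals 0-2 reattach to the existing volumes; 3-9 start with empty workspaces, which is why users cannot move between pods without shared storage.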

Session Management

  • Sessions stored in: FoundationDB (not pod-local)
  • User can log in from any pod: Yes (FDB-backed auth)
  • Workspace data: Pod-local (50 GB PVC per pod)
  • Recommendation: Consider migrating to shared storage for multi-pod access

Load Testing Checklist

Before MVP launch, validate capacity with:

  1. Concurrent User Test

    # Simulate 20 concurrent users
    # - Each user: 1 theia session, 2-3 llm requests/min, file edits
    # - Duration: 1 hour
    # - Monitor: CPU, Memory, response times
  2. Stress Test

    # Peak load: 30 users (50% over capacity)
    # Verify HPA scales up within 2-3 minutes
  3. Storage Test

    # Each user: 1 GB project, 100 file edits
    # Verify no storage exhaustion
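The concurrent-user test above can be parameterized directly from the checklist; a sketch that computes the planned request volume and shows one way it might be driven (the target URL is a placeholder, not a real endpoint — run only against a test environment):

```shell
# Checklist parameters: 20 users, 3 LLM requests/min each, 1 hour
users=20; req_per_min=3; duration_min=60
total=$(( users * req_per_min * duration_min ))
echo "planned load: ${users} users x ${req_per_min} req/min x ${duration_min} min = ${total} requests"

# One way to drive it (TARGET is hypothetical; uncomment against a test environment):
# TARGET="https://coditect.example.com/api/health"
# for u in $(seq 1 "$users"); do
#   ( for r in $(seq 1 $(( req_per_min * duration_min )) ); do
#       curl -s -o /dev/null -w "%{http_code}\n" "$TARGET"
#       sleep $(( 60 / req_per_min ))
#     done ) &
# done
# wait   # then inspect `kubectl top pods` for CPU/Memory under load
```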

MVP Launch Recommendations

Phase 1: Pre-Launch (Now)

  1. ✅ Scale to 10 pods immediately
    kubectl scale statefulset/coditect-combined --replicas=10 -n coditect-app
  2. ✅ Deploy HPA for auto-scaling (10-30 pods)
  3. ✅ Add monitoring alerts (CPU >80%, Memory >85%)
  4. ✅ Test with 5-10 beta users first

Phase 2: Beta Launch (Week 1)

  1. Onboard 10-15 users
  2. Monitor resource usage patterns
  3. Adjust HPA thresholds if needed
  4. Collect user feedback on performance

Phase 3: Full MVP (Week 2-4)

  1. Onboard remaining users to 20 total
  2. Monitor cost vs. actual usage
  3. Optimize pod resources based on real data
  4. Plan for next scaling tier (50-100 users)

Monitoring & Alerts

Critical Metrics

# CPU usage per pod
kubectl top pods -n coditect-app -l app=coditect-combined

# Memory usage per pod
kubectl top pods -n coditect-app -l app=coditect-combined --sort-by=memory

# Pod count and status
kubectl get pods -n coditect-app -l app=coditect-combined -o wide

# HPA status
kubectl get hpa -n coditect-app

# Session affinity distribution (check load balancer)
gcloud compute backend-services describe <backend-service-name> --global --format="yaml(sessionAffinity,affinityTimeout)"

Alert Thresholds

  1. CPU >85% for 5 minutes → Scale up urgently
  2. Memory >90% for 5 minutes → Scale up urgently
  3. Pod count = maxReplicas → Increase HPA max
  4. Storage >90% on any PVC → Increase volume size
  5. User sessions > (pod_count × 2) → Not enough capacity

Next Steps

  1. Immediate: Scale to 10 pods (5 min)
  2. Deploy HPA: Create HPA config and apply (15 min)
  3. Load Test: Simulate 20 concurrent users (1-2 hours)
  4. Monitor: Track metrics for 24-48 hours before launch
  5. Adjust: Fine-tune HPA thresholds based on real usage

Conclusion

Current Status: ❌ NOT ready for 20 users (3 pods = 3-6 user capacity)

Recommended Action:

  1. Scale to 10 pods immediately
  2. Deploy HPA (10-30 range)
  3. Test with 5-10 beta users first
  4. Monitor and adjust before full 20-user launch

Estimated Cost: $500-750/month for 20 users (vs. $150/month for 3-6 users)

Timeline: Ready for 20 users in 1-2 days after scaling + testing