
Coditect V5 - MVP Automation Roadmap

Date: 2025-10-07
Purpose: Complete automation plan for a multi-tenant, multi-user, multi-LLM IDE with auto-provisioning
Target: Production-ready MVP for beta testing
Timeline: 4 weeks (can be parallelized with a team)


Executive Summary

This roadmap addresses the complete automation stack required for a production MVP:

  • User Auto-Provisioning: Kubernetes operator creates user pods on signup
  • Payment Integration: Stripe credit card registration and subscription management
  • GCP Pod Auto-Provisioning: Automated namespace, RBAC, PVC, and pod creation
  • Data Persistence: FoundationDB for sessions/metadata, PVC for user files
  • Multi-Session Management: Per-user workspace isolation with session tabs
  • Multi-Tenant Architecture: Complete tenant isolation in FoundationDB
  • Multi-LLM Support: Multiple LLM providers in the Theia IDE
  • CI/CD Pipeline: Helm + ArgoCD + Cloud Build for GitOps
  • Monitoring & Scaling: Prometheus, Grafana, auto-scaling policies

Current Status: Backend API built and deployed (pods currently crash on an FDB connection issue)


Table of Contents

  1. Phase 0: Critical Path (This Week)
  2. Phase 1: User Registration & Payment (Week 1)
  3. Phase 2: Automated Pod Provisioning (Week 2)
  4. Phase 3: CI/CD & GitOps (Week 2-3)
  5. Phase 4: Production Operations (Week 3-4)
  6. Phase 5: Beta Launch (Week 4)
  7. Architecture Diagrams
  8. Implementation Details
  9. Risk Mitigation
  10. Success Metrics

Phase 0: Critical Path (This Week)

Goal: Fix backend pods, build frontend wrapper, establish working end-to-end flow

0.1 Fix Backend Pod CrashLoopBackOff 🔴 URGENT

Issue: Backend API pods crash immediately due to FDB connection failure

Debugging Steps:

  1. Get pod logs with --previous flag:

    kubectl logs -n coditect-app coditect-api-v5-xxxxx --previous
  2. Test FDB connectivity from debug pod:

    kubectl run -n coditect-app fdb-test --image=busybox --rm -it -- sh
    # Inside pod:
    nc -zv foundationdb-0.fdb-cluster.coditect-app.svc.cluster.local 4500
  3. Check FDB cluster status:

    kubectl exec -n coditect-app foundationdb-0 -- fdbcli --exec "status"

Potential Fixes:

  • Option A: Make FDB connection optional in startup (fail gracefully)
  • Option B: Add retry logic with exponential backoff
  • Option C: Use FDB proxy service instead of direct pod connection

Priority: 🔴 HIGHEST - Blocking all downstream work
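Option B above (retry with exponential backoff) can be sketched with a small std-only helper; the attempt count, delay constants, and the closure being retried are illustrative placeholders, not the real FDB client call:

```rust
use std::thread::sleep;
use std::time::Duration;

/// Retry `op` up to `max_attempts` times, doubling the delay after each
/// failure (100ms, 200ms, 400ms, ...). Returns the first success or the
/// last error once attempts are exhausted.
fn retry_with_backoff<T, E>(
    max_attempts: u32,
    mut op: impl FnMut() -> Result<T, E>,
) -> Result<T, E> {
    let mut delay = Duration::from_millis(100);
    let mut attempt = 1;
    loop {
        match op() {
            Ok(v) => return Ok(v),
            // Out of attempts: surface the last error to the caller
            Err(e) if attempt >= max_attempts => return Err(e),
            Err(_) => {
                sleep(delay);
                delay *= 2; // exponential backoff
                attempt += 1;
            }
        }
    }
}
```

On startup the backend would wrap its FDB open call in something like `retry_with_backoff(5, || open_fdb())` and only exit (triggering a pod restart) after the final attempt fails.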

0.2 Build Frontend Wrapper (React + Chakra UI) ⬜

Components:

frontend/
├── src/
│ ├── app.tsx # Root app with routing
│ ├── components/
│ │ ├── header.tsx # Top navigation with user menu
│ │ ├── footer.tsx # Footer with links
│ │ ├── AuthForm.tsx # Login/Register modal
│ │ ├── theia-embed.tsx # theia iframe embed
│ │ └── session-tabs.tsx # Multi-session tabs
│ ├── pages/
│ │ ├── Landing.tsx # Marketing landing page
│ │ ├── Dashboard.tsx # User dashboard
│ │ ├── IDE.tsx # Main IDE interface
│ │ ├── Settings.tsx # User settings
│ │ └── Billing.tsx # Stripe billing portal
│ ├── services/
│ │ ├── authService.ts # JWT authentication
│ │ ├── apiService.ts # Backend API client
│ │ └── stripeService.ts # Stripe integration
│ └── store/
│ ├── auth-store.ts # Zustand auth state
│ ├── session-store.ts # Zustand session state
│ └── userStore.ts # Zustand user data
├── Dockerfile
├── package.json
└── vite.config.ts

Key Features:

  • Header: Logo, navigation, user avatar dropdown
  • Footer: Links, copyright, social media
  • Auth Modal: Login/register with Google/GitHub OAuth
  • Theia Embed: Full-screen iframe with session isolation
  • Session Tabs: Browser tabs for multiple workspaces

Deliverable: Working frontend that connects to the backend API and embeds Theia

0.3 Build Theia Container Image ⬜

Dockerfile (theia-app/):

FROM node:20-slim

# Install Theia build dependencies
RUN apt-get update && apt-get install -y \
    git \
    openssh-client \
    python3 \
    make \
    g++ \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /workspace

# Copy package manifests and install dependencies
COPY theia-app/package.json theia-app/package-lock.json ./
RUN npm ci

# Build the Theia application
RUN npm run prepare

# Expose the Theia port
EXPOSE 3000

# Start Theia
CMD ["npm", "start"]

Image Tag: us-central1-docker.pkg.dev/serene-voltage-464305-n2/coditect/theia-ide:latest

0.4 Build WebSocket Sidecar ⬜

Purpose: Acts as a proxy between Theia and the backend: validates JWTs and forwards authenticated requests

Dockerfile (websocket-sidecar/):

FROM rust:1.90-slim AS builder

WORKDIR /build
COPY websocket-sidecar/Cargo.toml websocket-sidecar/Cargo.lock ./
COPY websocket-sidecar/src ./src

RUN cargo build --release

FROM debian:bookworm-slim
COPY --from=builder /build/target/release/ws-sidecar /usr/local/bin/
EXPOSE 8080
CMD ["ws-sidecar"]

Functionality:

  • Listen on port 8080 for WebSocket connections
  • Validate JWT from Authorization header
  • Forward authenticated requests to backend API
  • Maintain a persistent connection to Theia (localhost:3000)

Image Tag: us-central1-docker.pkg.dev/serene-voltage-464305-n2/coditect/ws-sidecar:latest
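The JWT step above starts with pulling the bearer token out of the `Authorization` header. A minimal sketch of that extraction (the `Bearer` scheme follows RFC 6750; the actual signature validation would use a JWT library and is omitted here):

```rust
/// Extract the token from an `Authorization: Bearer <token>` header value.
/// Returns None for a missing/wrong scheme or an empty token.
fn extract_bearer(header_value: &str) -> Option<&str> {
    let token = header_value.strip_prefix("Bearer ")?.trim();
    if token.is_empty() { None } else { Some(token) }
}
```

The sidecar would call this on every new WebSocket upgrade request and reject the connection with 401 when it returns `None`.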

Timeline: Phase 0 completion = 3-5 days


Phase 1: User Registration & Payment (Week 1)

Goal: Complete user signup flow with Stripe payment integration

1.1 Stripe Account Setup ⬜

Steps:

  1. Create Stripe account: https://dashboard.stripe.com/register

  2. Get API keys (test + production):

    • Test Secret Key: sk_test_...
    • Production Secret Key: sk_live_...
    • Publishable Key: pk_test_... / pk_live_...
  3. Create Products & Prices:

    Free Tier:
    - Price: $0
    - Limits: 100 LLM requests/month, 1GB storage, 10 workspace hours

    Starter Tier:
    - Price: $29/month
    - Price ID: price_starter_monthly
    - Limits: 10K LLM requests, 10GB storage, 100 workspace hours

    Pro Tier:
    - Price: $99/month
    - Price ID: price_pro_monthly
    - Limits: 100K LLM requests, 100GB storage, 500 workspace hours
  4. Configure Webhooks:

    • Endpoint: https://api.coditect.ai/webhooks/stripe
    • Events: customer.subscription.created, customer.subscription.updated, customer.subscription.deleted, invoice.payment_failed, invoice.payment_succeeded
  5. Store keys in Google Secret Manager:

    echo -n "sk_live_..." | gcloud secrets create stripe-secret-key --data-file=-
    echo -n "pk_live_..." | gcloud secrets create stripe-publishable-key --data-file=-
    echo -n "whsec_..." | gcloud secrets create stripe-webhook-secret --data-file=-
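The plan-to-price mapping is used by both the frontend (`getPriceId`) and the backend; keeping it in one place avoids drift. A minimal Rust sketch using the Price IDs configured above (these IDs are the placeholders from this plan, not live Stripe IDs):

```rust
/// Map a plan name to its Stripe Price ID. The free tier has no price,
/// and unknown plan names are treated the same way.
fn price_id_for_plan(plan: &str) -> Option<&'static str> {
    match plan {
        "starter" => Some("price_starter_monthly"),
        "pro" => Some("price_pro_monthly"),
        _ => None, // "free" and unrecognized plans never touch Stripe
    }
}
```

Returning `Option` (rather than a default price) makes the "no Stripe interaction for free plans" branch explicit at every call site.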

1.2 Frontend Registration Form ⬜

Component: frontend/src/components/RegisterForm.tsx

import { useState } from 'react';
import { useNavigate } from 'react-router-dom';
import { loadStripe } from '@stripe/stripe-js';
import { Elements, CardElement, useStripe, useElements } from '@stripe/react-stripe-js';
import { apiService } from '../services/apiService';

// Vite exposes env vars on import.meta.env, not process.env
const stripePromise = loadStripe(import.meta.env.VITE_STRIPE_PUBLISHABLE_KEY);

// Note: this component must be rendered inside <Elements stripe={stripePromise}>
// for useStripe()/useElements() to return non-null values.
export function RegisterForm() {
  const stripe = useStripe();
  const elements = useElements();
  const navigate = useNavigate();
  const [email, setEmail] = useState('');
  const [password, setPassword] = useState('');
  const [selectedPlan, setSelectedPlan] = useState<'free' | 'starter' | 'pro'>('free');

  const handleSubmit = async (e: React.FormEvent) => {
    e.preventDefault();

    // Step 1: Create user account (backend)
    const { user } = await apiService.register({ email, password });

    // Step 2: If paid plan, collect payment
    if (selectedPlan !== 'free') {
      const cardElement = elements!.getElement(CardElement)!;
      const { error, paymentMethod } = await stripe!.createPaymentMethod({
        type: 'card',
        card: cardElement,
      });

      if (error) {
        console.error(error);
        return;
      }

      // Step 3: Create Stripe subscription
      await apiService.createSubscription({
        user_id: user.user_id,
        payment_method_id: paymentMethod.id,
        price_id: getPriceId(selectedPlan),
      });
    }

    // Step 4: Redirect to dashboard
    navigate('/dashboard');
  };

  return (
    <form onSubmit={handleSubmit}>
      <input type="email" value={email} onChange={(e) => setEmail(e.target.value)} />
      <input type="password" value={password} onChange={(e) => setPassword(e.target.value)} />

      <select value={selectedPlan} onChange={(e) => setSelectedPlan(e.target.value as any)}>
        <option value="free">Free ($0/month)</option>
        <option value="starter">Starter ($29/month)</option>
        <option value="pro">Pro ($99/month)</option>
      </select>

      {selectedPlan !== 'free' && <CardElement />}

      <button type="submit">Create Account</button>
    </form>
  );
}

1.3 Backend Stripe Integration ⬜

Endpoint: POST /api/auth/register

// backend/src/handlers/auth.rs

use stripe::{
    Client, CreateCustomer, CreateSubscription, CreateSubscriptionItems,
    CreateSubscriptionPaymentSettings, Customer, Subscription,
};

pub async fn register_user(
    Json(payload): Json<RegisterRequest>,
    State(state): State<Arc<AppState>>,
) -> Result<Json<RegisterResponse>, ApiError> {
    // Step 1: Create user in FDB
    let user = User::new(
        payload.email.clone(),
        payload.first_name,
        payload.last_name,
        hash_password(&payload.password)?,
    );

    state.user_repo.create(&user).await?;

    // Step 2: Generate JWT tokens
    let access_token = generate_access_token(&user)?;
    let refresh_token = generate_refresh_token(&user)?;

    // Step 3: If paid plan, create Stripe customer and subscription
    if payload.plan != "free" {
        let stripe_client = Client::new(std::env::var("STRIPE_SECRET_KEY")?);

        // Create Stripe customer
        let customer = Customer::create(&stripe_client, CreateCustomer {
            email: Some(&payload.email),
            metadata: Some([("user_id".to_string(), user.user_id.to_string())].into()),
            ..Default::default()
        }).await?;

        // Create subscription (payment is confirmed client-side via the client secret)
        let subscription = Subscription::create(&stripe_client, CreateSubscription {
            customer: customer.id,
            items: vec![CreateSubscriptionItems {
                price: payload.price_id,
                ..Default::default()
            }],
            payment_behavior: Some("default_incomplete"),
            payment_settings: Some(CreateSubscriptionPaymentSettings {
                payment_method_types: Some(vec!["card"]),
                ..Default::default()
            }),
            ..Default::default()
        }).await?;

        // Save subscription to FDB
        let user_license = UserLicense::new(user.user_id, Uuid::new_v4(), None);
        state.license_repo.save(&user_license).await?;

        return Ok(Json(RegisterResponse {
            user,
            access_token,
            refresh_token,
            subscription_client_secret: subscription.latest_invoice.payment_intent.client_secret,
        }));
    }

    // Free plan - no Stripe interaction
    Ok(Json(RegisterResponse {
        user,
        access_token,
        refresh_token,
        subscription_client_secret: None,
    }))
}
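`generate_access_token` is referenced above but not shown. Whatever JWT library ends up being used (e.g. the `jsonwebtoken` crate), the claims it signs look roughly like this; the field names follow RFC 7519, while the 1-hour lifetime is an assumption, not a decision from this plan:

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Standard JWT claims the access token would carry.
/// Signing/encoding is intentionally omitted from this sketch.
struct Claims {
    sub: String, // subject: the user_id
    iat: u64,    // issued-at (unix seconds)
    exp: u64,    // expiry (unix seconds)
}

fn build_claims(user_id: &str, lifetime_secs: u64) -> Claims {
    let now = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .expect("system clock before unix epoch")
        .as_secs();
    Claims {
        sub: user_id.to_string(),
        iat: now,
        exp: now + lifetime_secs,
    }
}
```

The sidecar and backend only need to agree on the signing secret and these claim names for workspace requests to validate end to end.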

1.4 Stripe Webhook Handler ⬜

Endpoint: POST /webhooks/stripe

// backend/src/handlers/webhooks.rs

pub async fn stripe_webhook(
    State(state): State<Arc<AppState>>,
    headers: HeaderMap,
    body: String,
) -> Result<StatusCode, ApiError> {
    let signature = headers
        .get("stripe-signature")
        .and_then(|v| v.to_str().ok())
        .ok_or(ApiError::BadRequest("Missing signature"))?;

    let webhook_secret = std::env::var("STRIPE_WEBHOOK_SECRET")?;

    // Verify webhook signature
    let event = stripe::Webhook::construct_event(
        &body,
        signature,
        &webhook_secret,
    ).map_err(|_| ApiError::Unauthorized("Invalid signature"))?;

    match event.type_ {
        EventType::CustomerSubscriptionCreated => {
            let subscription: Subscription = serde_json::from_value(event.data.object)?;
            handle_subscription_created(&state, subscription).await?;
        }
        EventType::CustomerSubscriptionUpdated => {
            let subscription: Subscription = serde_json::from_value(event.data.object)?;
            handle_subscription_updated(&state, subscription).await?;
        }
        EventType::CustomerSubscriptionDeleted => {
            let subscription: Subscription = serde_json::from_value(event.data.object)?;
            handle_subscription_cancelled(&state, subscription).await?;
        }
        EventType::InvoicePaymentFailed => {
            let invoice: Invoice = serde_json::from_value(event.data.object)?;
            handle_payment_failed(&state, invoice).await?;
        }
        _ => {}
    }

    Ok(StatusCode::OK)
}

async fn handle_subscription_created(
    state: &AppState,
    subscription: Subscription,
) -> Result<()> {
    // Fail with an error instead of panicking if the metadata is missing
    let user_id = subscription
        .metadata
        .get("user_id")
        .ok_or_else(|| anyhow::anyhow!("subscription missing user_id metadata"))?;

    // Update user license in FDB
    let license = UserLicense {
        user_id: Uuid::parse_str(user_id)?,
        license_id: Uuid::new_v4(),
        assigned_at: Utc::now(),
        previous_license_id: None,
    };

    state.license_repo.save(&license).await?;

    // Create license history
    let history = LicenseHistory::new(
        Uuid::parse_str(user_id)?,
        license.license_id,
        LicenseAction::Created,
        "Subscription created via Stripe".to_string(),
    );

    state.license_repo.save_history(&history).await?;

    Ok(())
}

1.5 OAuth Integration (Google + GitHub) ⬜

Google OAuth:

  1. Create OAuth client: https://console.cloud.google.com/apis/credentials
  2. Authorized redirect URI: https://coditect.ai/auth/google/callback
  3. Store credentials in Secret Manager

GitHub OAuth:

  1. Create OAuth app: https://github.com/settings/developers
  2. Authorization callback URL: https://coditect.ai/auth/github/callback
  3. Store credentials in Secret Manager

Backend Handler:

// backend/src/handlers/oauth.rs

pub async fn google_oauth_callback(
    Query(params): Query<OAuthCallbackParams>,
    State(state): State<Arc<AppState>>,
) -> Result<Redirect, ApiError> {
    // Exchange code for token
    let token_response = exchange_google_code(params.code).await?;

    // Get user info from Google
    let user_info = get_google_user_info(&token_response.access_token).await?;

    // Check if user exists
    let existing_user = state.user_repo.get_by_email(&user_info.email).await?;

    let user = if let Some(user) = existing_user {
        // Link OAuth provider to existing user
        user
    } else {
        // Create new user
        let new_user = User::new(
            user_info.email,
            user_info.given_name,
            user_info.family_name,
            String::new(), // No password for OAuth users
        );
        state.user_repo.create(&new_user).await?;
        new_user
    };

    // Generate JWT
    let access_token = generate_access_token(&user)?;

    // Redirect to dashboard with token
    Ok(Redirect::to(&format!("/dashboard?token={}", access_token)))
}
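`exchange_google_code` above is left undefined; what it does is POST a form-encoded body to Google's token endpoint. A sketch of just the body construction (the redirect URI matches the one registered in step 2; the client credentials come from Secret Manager, and the HTTP call itself would use whatever client the backend already has):

```rust
/// Build the application/x-www-form-urlencoded body for the
/// authorization-code exchange. Values are NOT percent-encoded here;
/// a real client must encode them before sending.
fn google_token_request_body(code: &str, client_id: &str, client_secret: &str) -> String {
    format!(
        "code={}&client_id={}&client_secret={}&redirect_uri={}&grant_type=authorization_code",
        code,
        client_id,
        client_secret,
        "https://coditect.ai/auth/google/callback"
    )
}
```

The response to this POST is the JSON carrying `access_token`, which `get_google_user_info` then presents to Google's userinfo endpoint.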

Timeline: Phase 1 completion = 5-7 days


Phase 2: Automated Pod Provisioning (Week 2)

Goal: Fully automated workspace provisioning on user signup/login

2.1 Kubernetes Provisioning Controller ⬜

Architecture:

User Signup/Login → Backend API → Provisioning Request → Provisioning Controller
                                                                   │
                                                                   ▼
                                                Kubernetes API (create resources)

┌──────────────────────────────────────┐
│ 1. Create Namespace (user-{user_id}) │
│ 2. Create ServiceAccount             │
│ 3. Create Role + RoleBinding         │
│ 4. Create PVC (10GB)                 │
│ 5. Create workspace Pod              │
│    - Theia container (port 3000)     │
│    - WS Sidecar (port 8080)          │
│ 6. Wait for Pod Ready                │
│ 7. Create Service (ClusterIP)        │
│ 8. Create Ingress Rule               │
└──────────────────────────────────────┘
                  │
                  ▼
Save workspaceAssignment to FDB
                  │
                  ▼
Return workspace URL to user

Implementation (provisioning-controller/):

// provisioning-controller/src/lib.rs

use std::sync::Arc;
use std::time::{Duration, Instant};

use anyhow::Result;
use chrono::Utc;
use foundationdb::Database;
use k8s_openapi::api::core::v1::{Namespace, PersistentVolumeClaim, Pod, Service, ServiceAccount};
use k8s_openapi::api::networking::v1::Ingress;
use k8s_openapi::api::rbac::v1::{Role, RoleBinding};
use kube::{api::{DeleteParams, PostParams}, Api, Client};
use tracing::info;
use uuid::Uuid;

pub struct ProvisioningController {
    k8s_client: Client,
    fdb_client: Arc<Database>,
}

impl ProvisioningController {
    pub async fn new() -> Result<Self> {
        let k8s_client = Client::try_default().await?;
        let fdb_client = Arc::new(foundationdb::Database::default()?);
        Ok(Self { k8s_client, fdb_client })
    }

    /// Provision a complete workspace for a user
    pub async fn provision_workspace(
        &self,
        user_id: &str,
        user_email: &str,
        tenant_id: &str,
    ) -> Result<workspaceAssignment> {
        let ns_name = format!("user-{}", user_id);

        info!("Provisioning workspace for user: {}", user_id);

        // Step 1: Create namespace
        self.create_namespace(&ns_name).await?;
        info!("✓ Created namespace: {}", ns_name);

        // Step 2: Create ServiceAccount
        self.create_service_account(&ns_name, "workspace-sa").await?;
        info!("✓ Created service account");

        // Step 3: Create RBAC (Role + RoleBinding)
        self.create_role(&ns_name).await?;
        self.create_role_binding(&ns_name, user_email).await?;
        info!("✓ Created RBAC");

        // Step 4: Create PVC for persistent storage
        self.create_pvc(&ns_name, "workspace-pvc", "10Gi").await?;
        info!("✓ Created PVC (10GB)");

        // Step 5: Create workspace pod (Theia + sidecar)
        let pod_name = self.create_workspace_pod(&ns_name, user_id, tenant_id).await?;
        info!("✓ Created workspace pod: {}", pod_name);

        // Step 6: Wait for pod ready (timeout: 2 minutes)
        self.wait_for_pod_ready(&ns_name, &pod_name, Duration::from_secs(120)).await?;
        info!("✓ Pod is ready");

        // Step 7: Create ClusterIP service
        self.create_service(&ns_name, &pod_name).await?;
        info!("✓ Created service");

        // Step 8: Create Ingress rule
        let workspace_url = self.create_ingress_rule(&ns_name, user_id).await?;
        info!("✓ Created ingress: {}", workspace_url);

        // Step 9: Save assignment to FDB
        let assignment = workspaceAssignment {
            id: Uuid::new_v4(),
            user_id: Uuid::parse_str(user_id)?,
            tenant_id: Uuid::parse_str(tenant_id)?,
            pod_name: pod_name.clone(),
            namespace: ns_name.clone(),
            assigned_at: Utc::now(),
            last_active: Utc::now(),
            status: workspaceStatus::Active,
            resource_usage: ResourceUsage::default(),
        };

        self.save_assignment_to_fdb(&assignment).await?;
        info!("✓ Saved assignment to FDB");

        Ok(assignment)
    }

    async fn create_namespace(&self, name: &str) -> Result<()> {
        let namespaces: Api<Namespace> = Api::all(self.k8s_client.clone());

        let ns = serde_json::from_value(serde_json::json!({
            "apiVersion": "v1",
            "kind": "Namespace",
            "metadata": {
                "name": name,
                "labels": {
                    "app": "coditect",
                    "type": "user-workspace"
                }
            }
        }))?;

        namespaces.create(&PostParams::default(), &ns).await?;
        Ok(())
    }

    async fn create_service_account(&self, namespace: &str, sa_name: &str) -> Result<()> {
        let service_accounts: Api<ServiceAccount> = Api::namespaced(
            self.k8s_client.clone(),
            namespace,
        );

        let sa = serde_json::from_value(serde_json::json!({
            "apiVersion": "v1",
            "kind": "ServiceAccount",
            "metadata": {
                "name": sa_name,
                "namespace": namespace
            }
        }))?;

        service_accounts.create(&PostParams::default(), &sa).await?;
        Ok(())
    }

    async fn create_role(&self, namespace: &str) -> Result<()> {
        let roles: Api<Role> = Api::namespaced(self.k8s_client.clone(), namespace);

        let role = serde_json::from_value(serde_json::json!({
            "apiVersion": "rbac.authorization.k8s.io/v1",
            "kind": "Role",
            "metadata": {
                "name": "workspace-role",
                "namespace": namespace
            },
            "rules": [
                {
                    "apiGroups": [""],
                    "resources": ["pods", "pods/log", "pods/exec"],
                    "verbs": ["get", "list", "watch", "create"]
                },
                {
                    "apiGroups": [""],
                    "resources": ["persistentvolumeclaims"],
                    "verbs": ["get", "list"]
                }
            ]
        }))?;

        roles.create(&PostParams::default(), &role).await?;
        Ok(())
    }

    async fn create_role_binding(&self, namespace: &str, user_email: &str) -> Result<()> {
        let role_bindings: Api<RoleBinding> = Api::namespaced(
            self.k8s_client.clone(),
            namespace,
        );

        let rb = serde_json::from_value(serde_json::json!({
            "apiVersion": "rbac.authorization.k8s.io/v1",
            "kind": "RoleBinding",
            "metadata": {
                "name": "workspace-role-binding",
                "namespace": namespace
            },
            "subjects": [
                {
                    "kind": "ServiceAccount",
                    "name": "workspace-sa",
                    "namespace": namespace
                },
                {
                    "kind": "User",
                    "name": user_email,
                    "apiGroup": "rbac.authorization.k8s.io"
                }
            ],
            "roleRef": {
                "kind": "Role",
                "name": "workspace-role",
                "apiGroup": "rbac.authorization.k8s.io"
            }
        }))?;

        role_bindings.create(&PostParams::default(), &rb).await?;
        Ok(())
    }

    async fn create_pvc(&self, namespace: &str, pvc_name: &str, size: &str) -> Result<()> {
        let pvcs: Api<PersistentVolumeClaim> = Api::namespaced(
            self.k8s_client.clone(),
            namespace,
        );

        let pvc = serde_json::from_value(serde_json::json!({
            "apiVersion": "v1",
            "kind": "PersistentVolumeClaim",
            "metadata": {
                "name": pvc_name,
                "namespace": namespace
            },
            "spec": {
                "accessModes": ["ReadWriteOnce"],
                "resources": {
                    "requests": {
                        "storage": size
                    }
                },
                "storageClassName": "standard-rwo" // GCE Persistent Disk
            }
        }))?;

        pvcs.create(&PostParams::default(), &pvc).await?;
        Ok(())
    }

    async fn create_workspace_pod(
        &self,
        namespace: &str,
        user_id: &str,
        tenant_id: &str,
    ) -> Result<String> {
        let pods: Api<Pod> = Api::namespaced(self.k8s_client.clone(), namespace);

        // user_id is a UUID string; the first 8 chars keep names DNS-safe
        let pod_name = format!("workspace-{}", &user_id[..8]);

        let pod = serde_json::from_value(serde_json::json!({
            "apiVersion": "v1",
            "kind": "Pod",
            "metadata": {
                "name": pod_name,
                "namespace": namespace,
                "labels": {
                    "app": "coditect-workspace",
                    "user": user_id,
                    "tenant": tenant_id
                }
            },
            "spec": {
                "serviceAccountName": "workspace-sa",
                "containers": [
                    {
                        "name": "theia",
                        "image": "us-central1-docker.pkg.dev/serene-voltage-464305-n2/coditect/theia-ide:latest",
                        "ports": [{"containerPort": 3000}],
                        "volumeMounts": [
                            {
                                "name": "workspace-storage",
                                "mountPath": "/workspace"
                            }
                        ],
                        "env": [
                            {"name": "USER_ID", "value": user_id},
                            {"name": "TENANT_ID", "value": tenant_id}
                        ],
                        "resources": {
                            "requests": {"memory": "512Mi", "cpu": "500m"},
                            "limits": {"memory": "2Gi", "cpu": "2000m"}
                        }
                    },
                    {
                        "name": "ws-sidecar",
                        "image": "us-central1-docker.pkg.dev/serene-voltage-464305-n2/coditect/ws-sidecar:latest",
                        "ports": [{"containerPort": 8080}],
                        "env": [
                            {"name": "BACKEND_API_URL", "value": "http://coditect-api-v5-service.coditect-app.svc.cluster.local:8000"},
                            {"name": "THEIA_URL", "value": "http://localhost:3000"},
                            {"name": "FDB_CLUSTER_FILE", "value": "/etc/foundationdb/fdb.cluster"}
                        ],
                        "resources": {
                            "requests": {"memory": "128Mi", "cpu": "100m"},
                            "limits": {"memory": "256Mi", "cpu": "200m"}
                        }
                    }
                ],
                "volumes": [
                    {
                        "name": "workspace-storage",
                        "persistentVolumeClaim": {
                            "claimName": "workspace-pvc"
                        }
                    }
                ]
            }
        }))?;

        pods.create(&PostParams::default(), &pod).await?;
        Ok(pod_name)
    }

    async fn wait_for_pod_ready(
        &self,
        namespace: &str,
        pod_name: &str,
        timeout: Duration,
    ) -> Result<()> {
        let pods: Api<Pod> = Api::namespaced(self.k8s_client.clone(), namespace);

        let start = Instant::now();
        loop {
            if start.elapsed() > timeout {
                return Err(anyhow::anyhow!("Timeout waiting for pod to be ready"));
            }

            let pod = pods.get(pod_name).await?;
            if let Some(status) = pod.status {
                if let Some(conditions) = status.conditions {
                    for condition in conditions {
                        if condition.type_ == "Ready" && condition.status == "True" {
                            return Ok(());
                        }
                    }
                }
            }

            tokio::time::sleep(Duration::from_secs(2)).await;
        }
    }

    async fn create_service(&self, namespace: &str, pod_name: &str) -> Result<()> {
        let services: Api<Service> = Api::namespaced(self.k8s_client.clone(), namespace);

        let svc = serde_json::from_value(serde_json::json!({
            "apiVersion": "v1",
            "kind": "Service",
            "metadata": {
                "name": format!("{}-service", pod_name),
                "namespace": namespace
            },
            "spec": {
                "selector": {
                    "app": "coditect-workspace"
                },
                "ports": [
                    {"name": "theia", "port": 3000, "targetPort": 3000},
                    {"name": "ws-sidecar", "port": 8080, "targetPort": 8080}
                ],
                "type": "ClusterIP"
            }
        }))?;

        services.create(&PostParams::default(), &svc).await?;
        Ok(())
    }

    async fn create_ingress_rule(&self, namespace: &str, user_id: &str) -> Result<String> {
        let ingresses: Api<Ingress> = Api::namespaced(self.k8s_client.clone(), namespace);

        let subdomain = format!("{}.coditect.ai", &user_id[..8]);

        let ingress = serde_json::from_value(serde_json::json!({
            "apiVersion": "networking.k8s.io/v1",
            "kind": "Ingress",
            "metadata": {
                "name": "workspace-ingress",
                "namespace": namespace,
                "annotations": {
                    "kubernetes.io/ingress.class": "nginx",
                    "cert-manager.io/cluster-issuer": "letsencrypt-prod",
                    "nginx.ingress.kubernetes.io/websocket-services": format!("workspace-{}-service", &user_id[..8])
                }
            },
            "spec": {
                "tls": [
                    {
                        "hosts": [subdomain.clone()],
                        "secretName": format!("{}-tls", &user_id[..8])
                    }
                ],
                "rules": [
                    {
                        "host": subdomain.clone(),
                        "http": {
                            "paths": [
                                {
                                    "path": "/",
                                    "pathType": "Prefix",
                                    "backend": {
                                        "service": {
                                            "name": format!("workspace-{}-service", &user_id[..8]),
                                            "port": {"number": 3000}
                                        }
                                    }
                                }
                            ]
                        }
                    }
                ]
            }
        }))?;

        ingresses.create(&PostParams::default(), &ingress).await?;
        Ok(format!("https://{}", subdomain))
    }

    async fn save_assignment_to_fdb(&self, assignment: &workspaceAssignment) -> Result<()> {
        // Key by user_id so deprovision_workspace can clear the same key
        let key = format!("workspaces/{}", assignment.user_id);
        let data = serde_json::to_vec(assignment)?;

        let trx = self.fdb_client.create_trx()?;
        trx.set(key.as_bytes(), &data);
        trx.commit().await?;

        Ok(())
    }

    /// Deprovision a workspace (cleanup all resources)
    pub async fn deprovision_workspace(&self, user_id: &str) -> Result<()> {
        let ns_name = format!("user-{}", user_id);

        info!("Deprovisioning workspace for user: {}", user_id);

        let namespaces: Api<Namespace> = Api::all(self.k8s_client.clone());
        namespaces.delete(&ns_name, &DeleteParams::default()).await?;

        info!("✓ Deleted namespace: {}", ns_name);

        // Remove from FDB (same user-keyed scheme as save_assignment_to_fdb)
        let trx = self.fdb_client.create_trx()?;
        let key = format!("workspaces/{}", user_id);
        trx.clear(key.as_bytes());
        trx.commit().await?;

        Ok(())
    }
}

2.2 Integrate Provisioning into Backend API ⬜

Endpoint: POST /api/workspaces/provision

// backend/src/handlers/workspaces.rs

pub async fn provision_workspace(
    Extension(user): Extension<User>,
    State(state): State<Arc<AppState>>,
) -> Result<Json<ProvisionResponse>, ApiError> {
    // Check if user already has a workspace
    let existing = state.workspace_repo.get_by_user(&user.user_id).await?;
    if let Some(assignment) = existing {
        return Ok(Json(ProvisionResponse {
            workspace_url: format!("https://{}.coditect.ai", &user.user_id.to_string()[..8]),
            status: "active",
            pod_name: assignment.pod_name,
        }));
    }

    // Provision new workspace
    let controller = ProvisioningController::new().await?;
    let assignment = controller.provision_workspace(
        &user.user_id.to_string(),
        &user.email,
        &user.primary_tenant_id.to_string(),
    ).await?;

    Ok(Json(ProvisionResponse {
        workspace_url: format!("https://{}.coditect.ai", &user.user_id.to_string()[..8]),
        status: "provisioning",
        pod_name: assignment.pod_name,
    }))
}

2.3 Idle Pod Cleanup CronJob ⬜

CronJob Manifest (k8s/idle-cleanup-cronjob.yaml):

apiVersion: batch/v1
kind: CronJob
metadata:
  name: idle-workspace-cleanup
  namespace: coditect-app
spec:
  schedule: "0 * * * *"  # Run every hour
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: provisioning-controller-sa
          containers:
            - name: cleanup
              image: us-central1-docker.pkg.dev/serene-voltage-464305-n2/coditect/idle-cleanup:latest
              env:
                - name: IDLE_THRESHOLD_HOURS
                  value: "2"
                - name: FDB_CLUSTER_FILE
                  value: "/etc/foundationdb/fdb.cluster"
              volumeMounts:
                - name: fdb-cluster
                  mountPath: /etc/foundationdb
          volumes:
            - name: fdb-cluster
              secret:
                secretName: fdb-cluster-file
          restartPolicy: OnFailure

Cleanup Logic:

// idle-cleanup/src/main.rs

use anyhow::Result;
use chrono::{Duration, Utc};
use foundationdb::RangeOption;
use kube::Client;
use tracing::info;

#[tokio::main]
async fn main() -> Result<()> {
    let idle_threshold = Duration::hours(
        std::env::var("IDLE_THRESHOLD_HOURS")?.parse()?,
    );

    let fdb_client = foundationdb::Database::default()?;
    let k8s_client = Client::try_default().await?;
    let controller = ProvisioningController::new().await?;

    // Query all workspace assignments from FDB
    let trx = fdb_client.create_trx()?;
    let range = trx.get_range(&RangeOption::from("workspaces/".as_bytes()..)).await?;

    let now = Utc::now();
    let mut terminated_count = 0;

    for kv in range {
        let assignment: workspaceAssignment = serde_json::from_slice(kv.value())?;

        if assignment.status == workspaceStatus::Active {
            let idle_duration = now - assignment.last_active;

            if idle_duration > idle_threshold {
                info!(
                    "Terminating idle workspace: user={}, idle_for={}h",
                    assignment.user_id,
                    idle_duration.num_hours()
                );

                // Deprovision workspace
                controller.deprovision_workspace(&assignment.user_id.to_string()).await?;

                terminated_count += 1;
            }
        }
    }

    info!("Cleanup complete. Terminated {} idle workspaces.", terminated_count);
    Ok(())
}
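The threshold comparison in the loop above is worth isolating as a pure function so it can be unit-tested without FDB or Kubernetes; a std-only sketch:

```rust
use std::time::Duration;

/// A workspace counts as idle once the time since its last activity
/// strictly exceeds the configured threshold.
fn is_idle(since_last_active: Duration, threshold: Duration) -> bool {
    since_last_active > threshold
}
```

Keeping the predicate separate also makes it trivial to later swap in per-tier thresholds (e.g. longer idle windows for Pro users) without touching the cleanup loop.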

Timeline: Phase 2 completion = 7-10 days


Phase 3: CI/CD & GitOps (Week 2-3)

Goal: Fully automated deployment pipeline with Helm + ArgoCD

3.1 Helm Chart Structure ⬜

helm/coditect/
├── Chart.yaml
├── values.yaml
├── values-staging.yaml
├── values-prod.yaml
└── templates/
├── _helpers.tpl # Template helpers
├── backend/
│ ├── deployment.yaml # Backend API deployment
│ ├── service.yaml # Backend service
│ ├── hpa.yaml # HorizontalPodAutoscaler
│ └── secrets.yaml # Secrets (JWT, Stripe, etc.)
├── foundationdb/
│ ├── statefulset.yaml # FDB StatefulSet
│ ├── service.yaml # FDB service
│ └── pvc.yaml # FDB persistent volume claims
├── provisioning-controller/
│ ├── deployment.yaml # Controller deployment
│ ├── serviceaccount.yaml # ServiceAccount
│ ├── clusterrole.yaml # ClusterRole (k8s API access)
│ └── clusterrolebinding.yaml # ClusterRoleBinding
├── idle-cleanup/
│ └── cronjob.yaml # Idle workspace cleanup job
├── ingress/
│ ├── ingress.yaml # NGINX Ingress
│ └── certificate.yaml # cert-manager Certificate
└── monitoring/
├── servicemonitor.yaml # Prometheus ServiceMonitor
└── grafana-dashboard.yaml # Grafana dashboard

Chart.yaml:

apiVersion: v2
name: coditect
description: Coditect Multi-LLM IDE Platform
version: 1.0.0
appVersion: "v5.0.0"
dependencies:
  - name: foundationdb
    version: "7.1.x"
    repository: "https://charts.foundationdb.org"

values.yaml (defaults):

# Backend API
backend:
  replicaCount: 3
  image:
    repository: us-central1-docker.pkg.dev/serene-voltage-464305-n2/coditect/backend-api
    tag: "latest"
    pullPolicy: IfNotPresent
  service:
    type: ClusterIP
    port: 8000
  resources:
    requests:
      memory: "512Mi"
      cpu: "500m"
    limits:
      memory: "1Gi"
      cpu: "1000m"
  autoscaling:
    enabled: true
    minReplicas: 3
    maxReplicas: 10
    targetCPUUtilizationPercentage: 70

# Theia IDE (workspace pod template)
theia:
  image:
    repository: us-central1-docker.pkg.dev/serene-voltage-464305-n2/coditect/theia-ide
    tag: "latest"
  resources:
    requests:
      memory: "512Mi"
      cpu: "500m"
    limits:
      memory: "2Gi"
      cpu: "2000m"

# WebSocket Sidecar
wsSidecar:
  image:
    repository: us-central1-docker.pkg.dev/serene-voltage-464305-n2/coditect/ws-sidecar
    tag: "latest"
  resources:
    requests:
      memory: "128Mi"
      cpu: "100m"
    limits:
      memory: "256Mi"
      cpu: "200m"

# Provisioning Controller
provisioningController:
  replicaCount: 2
  image:
    repository: us-central1-docker.pkg.dev/serene-voltage-464305-n2/coditect/provisioning-controller
    tag: "latest"

# FoundationDB
foundationdb:
  replicas: 3
  storageClass: standard-rwo
  storageSize: 100Gi

# Ingress
ingress:
  enabled: true
  className: nginx
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
  hosts:
    - host: api.coditect.ai
      paths:
        - path: /
          pathType: Prefix
  tls:
    - secretName: coditect-api-tls
      hosts:
        - api.coditect.ai

# Secrets (from Google Secret Manager)
secrets:
  jwtSecret: ""            # Injected by CI/CD
  stripeSecretKey: ""      # Injected by CI/CD
  stripeWebhookSecret: ""  # Injected by CI/CD
  fdbClusterFile: ""       # Injected by CI/CD

values-prod.yaml (production overrides):

backend:
  replicaCount: 5
  autoscaling:
    minReplicas: 5
    maxReplicas: 20

foundationdb:
  replicas: 5
  storageSize: 500Gi

ingress:
  hosts:
    - host: api.coditect.ai
      paths:
        - path: /
          pathType: Prefix

3.2 ArgoCD Application Setup ⬜

Install ArgoCD:

kubectl create namespace argocd
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml

# Wait for ArgoCD to be ready
kubectl wait --for=condition=available --timeout=600s deployment/argocd-server -n argocd

# Get initial admin password
kubectl -n argocd get secret argocd-initial-admin-secret -o jsonpath="{.data.password}" | base64 -d

# Port-forward to access UI
kubectl port-forward svc/argocd-server -n argocd 8080:443

Create ArgoCD Application (argocd/coditect-prod.yaml):

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: coditect-prod
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/coditect-ai/Coditect-v5-multiple-llm-IDE
    targetRevision: main
    path: helm/coditect
    helm:
      valueFiles:
        - values-prod.yaml
      parameters:
        - name: backend.image.tag
          value: "${IMAGE_TAG}"  # Injected by CI/CD
  destination:
    server: https://kubernetes.default.svc
    namespace: coditect-app
  syncPolicy:
    automated:
      prune: true     # Delete resources removed from Git
      selfHeal: true  # Auto-sync if cluster state drifts
    syncOptions:
      - CreateNamespace=true
    retry:
      limit: 5
      backoff:
        duration: 5s
        factor: 2
        maxDuration: 3m

Deploy ArgoCD Application:

kubectl apply -f argocd/coditect-prod.yaml

3.3 Cloud Build CI/CD Pipeline ⬜

cloudbuild-v5-full.yaml:

steps:
  # Build all Docker images in parallel
  - name: 'gcr.io/cloud-builders/docker'
    id: build-backend
    args:
      - 'build'
      - '-t'
      - 'us-central1-docker.pkg.dev/$PROJECT_ID/coditect/backend-api:$SHORT_SHA'
      - '-t'
      - 'us-central1-docker.pkg.dev/$PROJECT_ID/coditect/backend-api:latest'
      - './backend'

  - name: 'gcr.io/cloud-builders/docker'
    id: build-theia
    args:
      - 'build'
      - '-t'
      - 'us-central1-docker.pkg.dev/$PROJECT_ID/coditect/theia-ide:$SHORT_SHA'
      - '-t'
      - 'us-central1-docker.pkg.dev/$PROJECT_ID/coditect/theia-ide:latest'
      - './theia-app'

  - name: 'gcr.io/cloud-builders/docker'
    id: build-ws-sidecar
    args:
      - 'build'
      - '-t'
      - 'us-central1-docker.pkg.dev/$PROJECT_ID/coditect/ws-sidecar:$SHORT_SHA'
      - '-t'
      - 'us-central1-docker.pkg.dev/$PROJECT_ID/coditect/ws-sidecar:latest'
      - './websocket-sidecar'

  - name: 'gcr.io/cloud-builders/docker'
    id: build-provisioning-controller
    args:
      - 'build'
      - '-t'
      - 'us-central1-docker.pkg.dev/$PROJECT_ID/coditect/provisioning-controller:$SHORT_SHA'
      - '-t'
      - 'us-central1-docker.pkg.dev/$PROJECT_ID/coditect/provisioning-controller:latest'
      - './provisioning-controller'

  - name: 'gcr.io/cloud-builders/docker'
    id: build-idle-cleanup
    args:
      - 'build'
      - '-t'
      - 'us-central1-docker.pkg.dev/$PROJECT_ID/coditect/idle-cleanup:$SHORT_SHA'
      - '-t'
      - 'us-central1-docker.pkg.dev/$PROJECT_ID/coditect/idle-cleanup:latest'
      - './idle-cleanup'

  # Push all images
  - name: 'gcr.io/cloud-builders/docker'
    id: push-backend
    waitFor: ['build-backend']
    args: ['push', '--all-tags', 'us-central1-docker.pkg.dev/$PROJECT_ID/coditect/backend-api']

  - name: 'gcr.io/cloud-builders/docker'
    id: push-theia
    waitFor: ['build-theia']
    args: ['push', '--all-tags', 'us-central1-docker.pkg.dev/$PROJECT_ID/coditect/theia-ide']

  - name: 'gcr.io/cloud-builders/docker'
    id: push-ws-sidecar
    waitFor: ['build-ws-sidecar']
    args: ['push', '--all-tags', 'us-central1-docker.pkg.dev/$PROJECT_ID/coditect/ws-sidecar']

  - name: 'gcr.io/cloud-builders/docker'
    id: push-provisioning-controller
    waitFor: ['build-provisioning-controller']
    args: ['push', '--all-tags', 'us-central1-docker.pkg.dev/$PROJECT_ID/coditect/provisioning-controller']

  - name: 'gcr.io/cloud-builders/docker'
    id: push-idle-cleanup
    waitFor: ['build-idle-cleanup']
    args: ['push', '--all-tags', 'us-central1-docker.pkg.dev/$PROJECT_ID/coditect/idle-cleanup']

  # Update Helm values with new image tags
  - name: 'gcr.io/cloud-builders/git'
    id: git-config
    waitFor: ['push-backend', 'push-theia', 'push-ws-sidecar', 'push-provisioning-controller', 'push-idle-cleanup']
    entrypoint: 'bash'
    args:
      - '-c'
      - |
        git config user.email "cloud-build@coditect.ai"
        git config user.name "Cloud Build Bot"

  - name: 'gcr.io/cloud-builders/git'
    id: update-helm-values
    waitFor: ['git-config']
    entrypoint: 'bash'
    args:
      - '-c'
      - |
        sed -i "s|backend-api:.*$|backend-api:$SHORT_SHA|" helm/coditect/values-prod.yaml
        sed -i "s|theia-ide:.*$|theia-ide:$SHORT_SHA|" helm/coditect/values-prod.yaml
        sed -i "s|ws-sidecar:.*$|ws-sidecar:$SHORT_SHA|" helm/coditect/values-prod.yaml
        sed -i "s|provisioning-controller:.*$|provisioning-controller:$SHORT_SHA|" helm/coditect/values-prod.yaml
        sed -i "s|idle-cleanup:.*$|idle-cleanup:$SHORT_SHA|" helm/coditect/values-prod.yaml

        git add helm/coditect/values-prod.yaml
        git commit -m "chore: Update production images to $SHORT_SHA [skip ci]"
        git push origin main

  # Trigger ArgoCD sync (if automated sync is disabled)
  - name: 'gcr.io/cloud-builders/gcloud'
    id: argocd-sync
    waitFor: ['update-helm-values']
    entrypoint: 'bash'
    args:
      - '-c'
      - |
        # Fetch GKE credentials first; kubectl has no cluster context inside Cloud Build by default
        gcloud container clusters get-credentials codi-poc-e2-cluster --zone us-central1-a --project $PROJECT_ID
        kubectl exec -n argocd deployment/argocd-server -- \
          argocd app sync coditect-prod --prune --force

timeout: '3600s'
options:
  machineType: 'N1_HIGHCPU_8'
  diskSizeGb: 100
  logging: CLOUD_LOGGING_ONLY

Cloud Build Trigger (automatic on Git push):

gcloud builds triggers create github \
--repo-name=Coditect-v5-multiple-llm-IDE \
--repo-owner=coditect-ai \
--branch-pattern="^main$" \
--build-config=cloudbuild-v5-full.yaml \
--description="Coditect V5 CI/CD Pipeline"
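The update-helm-values step rewrites the image tags in values-prod.yaml with sed. Its effect can be checked offline with a rough Python equivalent (the sample file content below is hypothetical; the image names come from the pipeline above):

```python
import re

IMAGES = ["backend-api", "theia-ide", "ws-sidecar", "provisioning-controller", "idle-cleanup"]

def bump_tags(values_yaml: str, short_sha: str) -> str:
    """Replace everything after '<image>:' to end-of-line with the new tag,
    mirroring the sed expressions s|<image>:.*$|<image>:$SHORT_SHA|."""
    for image in IMAGES:
        values_yaml = re.sub(rf"{image}:.*$", f"{image}:{short_sha}", values_yaml, flags=re.M)
    return values_yaml

sample = "image: us-central1-docker.pkg.dev/p/coditect/backend-api:old\n"
print(bump_tags(sample, "abc1234"))
# image: us-central1-docker.pkg.dev/p/coditect/backend-api:abc1234
```

Note that, like the sed version, this matches any line containing the image name followed by a colon, so the tag must appear on the same line as the image reference in values-prod.yaml.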

Timeline: Phase 3 completion = 7-10 days (can run parallel with Phase 2)


Phase 4: Production Operations (Week 3-4)

4.1 Monitoring Stack (Prometheus + Grafana) ⬜

Install Prometheus Operator:

kubectl create namespace monitoring
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm install prometheus prometheus-community/kube-prometheus-stack \
--namespace monitoring \
--set prometheus.prometheusSpec.retention=30d \
--set prometheus.prometheusSpec.storageSpec.volumeClaimTemplate.spec.resources.requests.storage=100Gi

ServiceMonitor for Backend API:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: coditect-backend
  namespace: coditect-app
spec:
  selector:
    matchLabels:
      app: coditect-api-v5
  endpoints:
    - port: http
      path: /metrics
      interval: 30s

Grafana Dashboard (import dashboard JSON):

  • Pod CPU/Memory usage
  • Request latency (p50, p95, p99)
  • Error rates (4xx, 5xx)
  • FoundationDB metrics
  • workspace provisioning duration

4.2 Logging (Cloud Logging + Fluentd) ⬜

Fluentd DaemonSet:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: fluentd
  template:
    metadata:
      labels:
        app: fluentd
    spec:
      containers:
        - name: fluentd
          image: fluent/fluentd-kubernetes-daemonset:v1-debian-cloudlogging
          env:
            - name: FLUENT_CLOUDLOGGING_USE_JSON
              value: "true"
            - name: FLUENT_CLOUDLOGGING_USE_METADATA
              value: "true"
          volumeMounts:
            - name: varlog
              mountPath: /var/log
            - name: varlibdockercontainers
              mountPath: /var/lib/docker/containers
              readOnly: true
      volumes:
        - name: varlog
          hostPath:
            path: /var/log
        - name: varlibdockercontainers
          hostPath:
            path: /var/lib/docker/containers

Log Queries (Cloud Logging):

# All backend API errors
resource.type="k8s_container"
resource.labels.namespace_name="coditect-app"
resource.labels.container_name="api"
severity>=ERROR

# workspace provisioning failures
resource.type="k8s_container"
resource.labels.namespace_name="coditect-app"
jsonPayload.message=~"provisioning failed"

# Slow requests (>1s latency)
resource.type="k8s_container"
resource.labels.namespace_name="coditect-app"
jsonPayload.latency_ms>1000
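The slow-request query filters on a structured jsonPayload field. The same filter can be applied locally to exported log lines; a minimal sketch, assuming one JSON log entry per line (the log shape here is hypothetical):

```python
import json

def slow_requests(log_lines: list[str], threshold_ms: float = 1000) -> list[dict]:
    """Keep entries whose jsonPayload.latency_ms exceeds the threshold,
    mirroring the Cloud Logging filter jsonPayload.latency_ms>1000."""
    hits = []
    for line in log_lines:
        entry = json.loads(line)
        if entry.get("jsonPayload", {}).get("latency_ms", 0) > threshold_ms:
            hits.append(entry)
    return hits

lines = [
    '{"jsonPayload": {"latency_ms": 250, "path": "/api/users"}}',
    '{"jsonPayload": {"latency_ms": 1800, "path": "/api/workspaces/provision"}}',
]
print(len(slow_requests(lines)))  # 1
```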

4.3 Alerting (Prometheus Alertmanager + PagerDuty) ⬜

PrometheusRule (alerts.yaml):

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: coditect-alerts
  namespace: coditect-app
spec:
  groups:
    - name: coditect
      interval: 30s
      rules:
        # High error rate
        - alert: HighErrorRate
          expr: |
            sum(rate(http_requests_total{status=~"5.."}[5m])) /
            sum(rate(http_requests_total[5m])) > 0.05
          for: 5m
          labels:
            severity: critical
          annotations:
            summary: "High error rate detected (>5%)"
            description: "Backend API error rate is {{ $value | humanizePercentage }}"

        # Pod crash loop
        - alert: PodCrashLooping
          expr: rate(kube_pod_container_status_restarts_total[15m]) > 0
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: "Pod {{ $labels.pod }} is crash looping"

        # High memory usage
        - alert: HighMemoryUsage
          expr: |
            container_memory_usage_bytes{namespace="coditect-app"} /
            container_spec_memory_limit_bytes{namespace="coditect-app"} > 0.9
          for: 10m
          labels:
            severity: warning
          annotations:
            summary: "Container {{ $labels.container }} high memory usage"
            description: "Memory usage is {{ $value | humanizePercentage }}"

        # FoundationDB down
        - alert: FoundationDBDown
          expr: up{job="foundationdb"} == 0
          for: 2m
          labels:
            severity: critical
          annotations:
            summary: "FoundationDB instance down"
            description: "FoundationDB pod {{ $labels.pod }} is down"

        # workspace provisioning failures
        - alert: ProvisioningFailures
          expr: rate(provisioning_failures_total[10m]) > 0.1
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: "High workspace provisioning failure rate"
            description: "Failure rate: {{ $value }} per second"  # rate() is per-second
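The HighErrorRate expression is simply the ratio of the 5xx request rate to the total request rate; the 5% threshold can be sanity-checked with plain arithmetic (the sample rates below are hypothetical):

```python
def error_rate(errors_per_s: float, total_per_s: float) -> float:
    """Mirror of sum(rate(http_requests_total{status=~"5.."}[5m])) / sum(rate(http_requests_total[5m]))."""
    return errors_per_s / total_per_s

# 12 errors/s out of 200 req/s = 6% -> breaches the 0.05 threshold
assert error_rate(12, 200) > 0.05
# 8 errors/s out of 200 req/s = 4% -> within the error budget
assert error_rate(8, 200) < 0.05
```

Because both numerator and denominator are 5-minute rates, brief single-request spikes are smoothed out; only sustained breaches survive the additional `for: 5m` hold before the alert fires.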

PagerDuty Integration:

# alertmanager-config.yaml
global:
  resolve_timeout: 5m

route:
  receiver: 'pagerduty'
  group_by: ['alertname', 'cluster', 'service']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 12h

receivers:
  - name: 'pagerduty'
    pagerduty_configs:
      - service_key: '<PAGERDUTY_SERVICE_KEY>'
        description: '{{ .GroupLabels.alertname }}: {{ .CommonAnnotations.summary }}'

4.4 Auto-Scaling Policies ⬜

HorizontalPodAutoscaler (Backend API):

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: coditect-api-hpa
  namespace: coditect-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: coditect-api-v5
  minReplicas: 3
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
    - type: Pods
      pods:
        metric:
          name: http_requests_per_second
        target:
          type: AverageValue
          averageValue: "1000"
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
        - type: Percent
          value: 100  # At most double the pod count per period
          periodSeconds: 60
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
        - type: Percent
          value: 50   # Remove at most half the pods per period
          periodSeconds: 60
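The core HPA calculation is desired = ceil(currentReplicas × currentMetric / targetMetric), clamped to min/maxReplicas. A sketch of that math with the values above (sketch only; the real controller additionally applies the behavior policies and stabilization windows, and takes the max across all three metrics):

```python
import math

def hpa_desired(current: int, metric: float, target: float, min_r: int = 3, max_r: int = 20) -> int:
    """desired = ceil(current * currentMetric / targetMetric), clamped to [minReplicas, maxReplicas]."""
    return max(min_r, min(max_r, math.ceil(current * metric / target)))

# 5 pods at 90% average CPU against the 70% target -> 7 pods
print(hpa_desired(5, 90, 70))  # 7
# 5 pods at 5000 req/s each against the 1000 req/s target -> clamped to maxReplicas
print(hpa_desired(5, 5000, 1000))  # 20
```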

Cluster Autoscaler (GKE):

gcloud container clusters update codi-poc-e2-cluster \
--enable-autoscaling \
--min-nodes=3 \
--max-nodes=50 \
--zone=us-central1-a

Timeline: Phase 4 completion = 5-7 days (can run parallel with Phase 3)


Phase 5: Beta Launch (Week 4)

5.1 Pre-Launch Checklist ✅

  • Backend API pods running without crashes
  • Frontend deployed and accessible
  • Stripe billing integrated and tested
  • OAuth (Google + GitHub) working
  • Automated pod provisioning tested
  • Multi-session architecture working
  • Multi-LLM support in Theia verified
  • FoundationDB replication (3+ nodes)
  • SSL certificates valid (coditect.ai)
  • Monitoring dashboards configured
  • Alerting rules active (PagerDuty)
  • Backup/restore procedures documented
  • Load testing completed (500+ concurrent users)
  • Security audit passed
  • Terms of Service + Privacy Policy published

5.2 Beta User Onboarding Flow ⬜

Landing Page (coditect.ai):

  1. Hero section: "Multi-LLM IDE in Your Browser"
  2. Features: 16+ LLMs, multi-session tabs, AI agents, cloud persistence
  3. Pricing: Free, Starter ($29), Pro ($99)
  4. CTA: "Start Free Trial"

Registration Flow:

  1. User clicks "Start Free Trial"
  2. Modal: Email/password OR Google/GitHub OAuth
  3. If paid plan: Stripe Checkout (collect credit card)
  4. Create account → Generate JWT
  5. Provision workspace (automated via controller)
  6. Redirect to dashboard → Show workspace provisioning progress
  7. When ready: Redirect to theia IDE

Dashboard (dashboard.coditect.ai):

  • workspace status (Active/Idle/Provisioning)
  • Usage stats (llm requests, storage, hours)
  • Billing info (current plan, next invoice)
  • Settings (API keys, OAuth connections)

5.3 Beta Testing Plan ⬜

Phase 1: Internal Testing (Days 1-3)

  • 10 team members
  • Test all features end-to-end
  • Document bugs in GitHub Issues

Phase 2: Closed Beta (Days 4-10)

  • 50 invited beta testers (developers, AI enthusiasts)
  • Collect feedback via Typeform survey
  • Monitor errors in Sentry

Phase 3: Open Beta (Days 11-30)

  • Open to public (200+ users)
  • Post on Product Hunt, Hacker News, Reddit
  • Monitor scaling and performance

Success Metrics:

  • Signup Rate: >50% of landing page visitors
  • Activation Rate: >80% successfully provision workspace
  • Retention (7-day): >40% return after first session
  • Error Rate: <1% of API requests
  • Latency (p95): <500ms for API calls
  • Provisioning Time (p95): <2 minutes
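Several of these targets are p95 percentiles. A minimal sketch of computing p95 from raw latency samples using the nearest-rank method (the sample data is hypothetical):

```python
import math

def percentile(samples: list[float], p: float) -> float:
    """Nearest-rank percentile: smallest value with at least p% of samples at or below it."""
    ranked = sorted(samples)
    k = math.ceil(p / 100 * len(ranked))
    return ranked[k - 1]

# Hypothetical API latencies in ms; p95 here is 510ms, which would miss the <500ms target
latencies_ms = [120, 95, 480, 210, 150, 90, 300, 510, 130, 160]
print(percentile(latencies_ms, 95))  # 510
```

In production these numbers would come from the Prometheus histogram quantiles rather than raw samples, but the interpretation is the same: 95% of requests must finish at or below the target.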

5.4 Launch Communications ⬜

Channels:

  1. Product Hunt: Prepare launch post + demo video
  2. Hacker News: "Show HN: Multi-LLM IDE with 16+ models in browser"
  3. Reddit: r/programming, r/MachineLearning, r/webdev
  4. Twitter/X: Announcement thread with screenshots
  5. LinkedIn: Professional announcement
  6. Email: Notify waitlist subscribers

Press Kit:

  • Logo (PNG, SVG)
  • Screenshots (dark + light themes)
  • Demo video (2-3 minutes)
  • Founder bios
  • Press release

Timeline: Phase 5 completion = 30 days (beta testing period)


Architecture Diagrams

End-to-End System Architecture

┌─────────────────────────────────────────────────────────────────────┐
│ User's Browser │
│ ┌──────────────────────────────────────────────────────────────┐ │
│ │ React Frontend (coditect.ai) │ │
│ │ ┌────────────┐ ┌────────────┐ ┌────────────────────┐ │ │
│ │ │ Header │ │ Session │ │ theia Embed │ │ │
│ │ │ Nav/Auth │ │ Tabs │ │ (iframe) │ │ │
│ │ └────────────┘ └────────────┘ └────────────────────┘ │ │
│ └──────────────────────────────────────────────────────────────┘ │
│ │ │ │
│ │ HTTPS (JWT) │ WSS (WebSocket) │
└─────────┼────────────────────────────────────┼──────────────────────┘
│ │
▼ ▼
┌──────────────────────────┐ ┌──────────────────────────────────────┐
│ Backend API │ │ User workspace Pod │
│ (GKE - coditect-app) │ │ (GKE - user-{user_id} namespace) │
│ ┌────────────────────┐ │ │ ┌────────────────┐ ┌─────────────┐ │
│ │ Auth Service │ │ │ │ theia IDE │ │ WS Sidecar │ │
│ │ (JWT/OAuth) │ │ │ │ (port 3000) │ │ (port 8080) │ │
│ ├────────────────────┤ │ │ └────────────────┘ └─────────────┘ │
│ │ User Service │ │ │ ↓ ↓ │
│ │ (CRUD) │ │ │ localhost:3000 localhost:8080 │
│ ├────────────────────┤ │ │ │ │ │
│ │ workspace Service │◄─┼────┼─────────┼─────────────────┘ │
│ │ (Provisioning) │ │ │ │ │
│ ├────────────────────┤ │ │ ▼ │
│ │ Billing Service │ │ │ ┌──────────────────────┐ │
│ │ (Stripe) │ │ │ │ PVC (10GB) │ │
│ └────────────────────┘ │ │ │ /workspace │ │
│ │ │ │ └──────────────────────┘ │
│ │ │ └──────────────────────────────────────┘
│ ▼ │
│ ┌────────────────────┐ │ ┌──────────────────────────────────────┐
│ │ FoundationDB │◄─┼────│ Provisioning Controller │
│ │ (StatefulSet) │ │ │ (Kubernetes Operator) │
│ │ - Users │ │ │ ┌────────────────────────────────┐ │
│ │ - Sessions │ │ │ │ Watches: User registrations │ │
│ │ - workspaces │ │ │ │ Creates: NS + RBAC + PVC + Pod│ │
│ │ - Licenses │ │ │ └────────────────────────────────┘ │
│ └────────────────────┘ │ └──────────────────────────────────────┘
└──────────────────────────┘

Monitoring & Observability
┌──────────────────────────────────────────────────────────────────────┐
│ Prometheus + Grafana + Alertmanager → PagerDuty │
│ Cloud Logging + Fluentd → BigQuery │
└──────────────────────────────────────────────────────────────────────┘

User Registration & Provisioning Flow

1. User Visits coditect.ai

2. Clicks "Sign Up" → Selects Plan (Free/Starter/Pro)

3. Enters Email/Password OR Google/GitHub OAuth

4. If Paid Plan: Stripe Checkout (collect credit card)

5. Backend API: Create User in FDB

6. If Paid: Create Stripe Customer + Subscription

7. Generate JWT (access + refresh tokens)

8. Trigger workspace Provisioning (POST /api/workspaces/provision)

9. Provisioning Controller Executes:
├─ Create Namespace (user-{user_id})
├─ Create ServiceAccount + RBAC
├─ Create PVC (10GB)
├─ Create Pod (theia + WS Sidecar)
├─ Wait for Pod Ready (2 min timeout)
├─ Create Service (ClusterIP)
├─ Create Ingress ({user_id}.coditect.ai)
└─ Save workspaceAssignment to FDB

10. Return workspace URL to frontend

11. Frontend polls /api/workspaces/status until "ready"

12. Redirect user to IDE: https://{user_id}.coditect.ai

13. User opens theia → Multi-session IDE ready
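Step 11's polling loop can be sketched as follows. The `get_status` callable is a stand-in for an HTTP call to /api/workspaces/status (the function names here are illustrative, not the actual frontend code); the 2-minute ceiling matches the pod-ready timeout in step 9:

```python
import time

def wait_for_workspace(get_status, timeout_s: float = 120, poll_s: float = 2) -> str:
    """Poll until the workspace reports 'ready' or 'failed', or the timeout elapses."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        status = get_status()  # e.g. GET /api/workspaces/status -> "provisioning" | "ready" | "failed"
        if status in ("ready", "failed"):
            return status
        time.sleep(poll_s)
    return "timeout"

# Stubbed status source: ready on the third poll
states = iter(["provisioning", "provisioning", "ready"])
print(wait_for_workspace(lambda: next(states), poll_s=0))  # ready
```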

Implementation Details

Technologies & Tools

| Component | Technology | Version | Purpose |
|---|---|---|---|
| Frontend | React + TypeScript | 18 + 5.3 | UI layer |
| UI Framework | Chakra UI | 2.8 | Component library |
| State Management | Zustand | 4.4 | Client state |
| IDE | Eclipse Theia | 1.65+ | Browser IDE |
| Backend | Rust + Actix-web | 1.90 + 4.9 | API server |
| Database | FoundationDB | 7.1.27 | Primary persistence |
| Auth | JWT + OAuth 2.0 | - | Authentication |
| Payment | Stripe | Latest | Billing |
| Container Orchestration | Kubernetes (GKE) | 1.31+ | Pod management |
| Provisioning | Rust + kube-rs | 0.95+ | K8s operator |
| CI/CD | Cloud Build + ArgoCD | - | GitOps |
| Package Manager | Helm | 3.16+ | K8s deployments |
| Monitoring | Prometheus + Grafana | - | Metrics |
| Logging | Cloud Logging + Fluentd | - | Log aggregation |
| Alerting | Alertmanager + PagerDuty | - | Incident management |

Development Workflow

Local Development:

# Backend API
cd backend
cargo run

# Frontend
cd frontend
npm run dev

# theia (local testing)
cd theia-app
npm run start

Testing:

# Unit tests
cd backend && cargo test
cd frontend && npm test

# Integration tests
cd tests && pytest

# E2E tests (Playwright)
cd tests/e2e && npx playwright test

Deployment:

# Push to main branch → Cloud Build triggers automatically
git push origin main

# Or manual deployment:
gcloud builds submit --config cloudbuild-v5-full.yaml

# ArgoCD auto-syncs (or manual sync):
argocd app sync coditect-prod

Risk Mitigation

Technical Risks

| Risk | Impact | Likelihood | Mitigation |
|---|---|---|---|
| FoundationDB outage | Critical | Low | 5-node replication, automatic failover, daily backups |
| Pod provisioning failures | High | Medium | Retry logic, timeout handling, fallback to shared pods |
| Stripe payment failures | High | Low | Webhook retry, grace period enforcement |
| OAuth provider downtime | Medium | Low | Support email/password fallback |
| Cluster auto-scaling delays | Medium | Medium | Pre-warm nodes, monitor queue depth |
| SSL certificate expiry | High | Low | cert-manager auto-renewal, 30-day expiry alerts |

Operational Risks

| Risk | Impact | Likelihood | Mitigation |
|---|---|---|---|
| High user surge (HN front page) | High | Medium | Auto-scaling (3 → 50 pods), load testing |
| On-call fatigue | Medium | High | PagerDuty rotation, runbooks, automated remediation |
| Cost overruns | High | Medium | GCP budget alerts, resource quotas, idle pod cleanup |
| Data loss | Critical | Low | FDB replication, daily backups to GCS, point-in-time recovery |

Success Metrics

MVP Launch (Week 4)

Technical Metrics:

  • ✅ Backend API: 99% uptime
  • ✅ Provisioning: <2 min p95 latency
  • ✅ API Latency: <500ms p95
  • ✅ Error Rate: <1% of requests
  • ✅ Pod Crashes: <5 per day

Business Metrics:

  • 🎯 100+ beta signups (week 1)
  • 🎯 50+ active users (week 2)
  • 🎯 10+ paid conversions (week 4)
  • 🎯 20+ daily active users (week 4)

User Metrics:

  • 🎯 60% signup → workspace provisioning completion
  • 🎯 40% 7-day retention
  • 🎯 30 min average session duration

90-Day Post-Launch

Growth:

  • 🎯 1000+ total signups
  • 🎯 200+ active users
  • 🎯 50+ paying customers
  • 🎯 $2500+ MRR (Monthly Recurring Revenue)
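The $2500 MRR target lines up with the 50-customer target at the plan prices listed in the onboarding section ($29 Starter, $99 Pro). One illustrative mix:

```python
STARTER, PRO = 29, 99  # monthly plan prices from the pricing section

def mrr(starter_customers: int, pro_customers: int) -> int:
    """Monthly recurring revenue for a given mix of paying customers."""
    return starter_customers * STARTER + pro_customers * PRO

# 35 Starter + 15 Pro = 50 paying customers
print(mrr(35, 15))  # 2500
```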

Infrastructure:

  • ✅ 99.9% uptime (SLA)
  • ✅ <1 hour mean time to recovery (MTTR)
  • ✅ Zero data loss incidents
  • ✅ Auto-scaling to 500+ concurrent users

Conclusion

This roadmap provides complete automation for:

  • ✅ User registration with Stripe payment
  • ✅ Automated pod provisioning (Kubernetes operator)
  • ✅ Multi-tenant, multi-session, multi-llm IDE
  • ✅ CI/CD pipeline (Helm + ArgoCD + Cloud Build)
  • ✅ Production monitoring and alerting
  • ✅ Auto-scaling and idle cleanup

Timeline Summary:

  • Phase 0 (Critical): 3-5 days (fix pods, build frontend/theia)
  • Phase 1 (Payments): 5-7 days (Stripe + OAuth)
  • Phase 2 (Provisioning): 7-10 days (K8s operator + idle cleanup)
  • Phase 3 (CI/CD): 7-10 days (Helm + ArgoCD) [parallel with Phase 2]
  • Phase 4 (Operations): 5-7 days (monitoring + alerting) [parallel with Phase 3]
  • Phase 5 (Beta): 30 days (testing + launch)

Total: 4 weeks to MVP + 30 days beta = Production-ready in 8 weeks

With a 2-person team: ~6 weeks to MVP. With a 4-person team: ~4 weeks to MVP.


Document Status: ✅ Complete
Last Updated: 2025-10-07
Next Review: Weekly during implementation
Owner: Engineering Team