Production Deployment Checklist

Date: 2025-10-26
Status: ✅ READY FOR TESTING
Build Configuration: Production-ready with all components


📦 What's Included in This Build

✅ COMPLETED: Configuration Updates

dockerfile.combined-fixed (Updated 2025-10-26):

  • ✅ Stage 1: V5 Frontend (React + Vite)
  • ✅ Stage 2: theia IDE with custom branding and icon themes
  • ✅ Stage 3: V5 Backend API (Rust/Actix-web)
  • ✅ Stage 4: Codi2 Monitoring System (Rust)
  • ✅ Stage 5: File Monitor Service (Rust)
  • ✅ Stage 6: Runtime image with all components

cloudbuild-combined.yaml (Updated 2025-10-26):

  • ✅ E2_HIGHCPU_32 machine type (32 CPUs)
  • ✅ 8GB Node heap (NODE_OPTIONS=--max_old_space_size=8192)
  • ✅ 60-minute timeout
  • ✅ 100GB disk size
  • ✅ StatefulSet deployment (persistent storage)

📋 Production Components

1. Rust Binaries (3 total):

/usr/local/bin/coditect-v5-api    # V5 Backend API (Actix-web + FDB)
/usr/local/bin/codi2 # Monitoring system (WebSocket + MCP)
/usr/local/bin/file-monitor # File system monitoring

2. .coditect Configuration (DUAL LAYER - ~800 KB total):

Base Layer (from claude-code-initial-setup):

/app/.coditect/
├── agents/ # 5 agents (sub-agent orchestration)
├── skills/ # 2 skills (code-editor, deployment)
├── scripts/ # 15 scripts (session management, logging)
└── workflows/ # 3 workflows (multi-agent coordination)

T2 Layer (from T2 project):

/app/.coditect/
├── agents-t2/ # 15 agents (codebase-analyzer, orchestrator, etc.)
├── skills-t2/ # 19 skills (build-deploy-workflow, foundationdb-queries, etc.)
├── commands-t2/ # 64 commands (create_plan, implement_plan, etc.)
└── hooks/ # Pre/post execution hooks

Total: 20 agents, 21 skills, 64 commands, 15+ scripts

3. Debian Packages (31 total):

  • TIER 1: build-essential, jq, wget, tree, htop, vim, nano
  • TIER 2: git-lfs, ripgrep, fzf, tmux, rsync, zip, unzip, silversearcher-ag
  • Additional: nginx, curl, ca-certificates, python3, python3-pip, python3-venv, python3-dev

4. Node.js Global Packages (17 total):

  • TypeScript: typescript, ts-node, @types/node
  • Linting: eslint, prettier
  • Build Tools: vite, esbuild, webpack, webpack-cli
  • Package Managers: pnpm, yarn
  • Development: http-server, nodemon, concurrently
  • React: react-devtools, create-react-app, create-next-app
  • AI Tools: @google/gemini-cli

🚀 Phase 1: Pre-Build Verification

✅ File Verification Checklist

Run these commands to verify all required files exist:

cd /home/hal/v4/PROJECTS/t2

# Check Dockerfile
[ -f dockerfile.combined-fixed ] && echo "✓ Dockerfile present" || echo "✗ Dockerfile missing"

# Check Cloud Build config
[ -f cloudbuild-combined.yaml ] && echo "✓ Cloud Build config present" || echo "✗ Config missing"

# Check Rust source directories
[ -d backend/ ] && echo "✓ V5 API source present" || echo "✗ V5 API missing"
[ -d archive/coditect-v4/codi2/ ] && echo "✓ Codi2 source present" || echo "✗ Codi2 missing"
[ -d src/file-monitor/ ] && echo "✓ File monitor source present" || echo "✗ File monitor missing"

# Check .coditect configs (both layers)
[ -d archive/claude-code-initial-setup/.claude ] && echo "✓ Base .coditect configs present" || echo "✗ Base configs missing"
[ -d .claude/agents ] && [ -d .claude/skills ] && [ -d .claude/commands ] && echo "✓ T2 .coditect configs present" || echo "✗ T2 configs missing"

# Check frontend build artifact
[ -d dist/ ] && echo "✓ Frontend dist/ present" || echo "⚠ Run 'npm run build' first"

# Check theia source
[ -d theia-app/ ] && echo "✓ theia source present" || echo "✗ theia missing"

# Check NGINX config
[ -f nginx-combined.conf ] && echo "✓ NGINX config present" || echo "✗ NGINX config missing"

# Check start script
[ -f start-combined.sh ] && echo "✓ Start script present" || echo "✗ Start script missing"
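
The individual checks above only print a result; they do not stop you from submitting a build with something missing. A minimal sketch of a wrapper that fails as a whole is shown below. `check_required` is a hypothetical helper, not part of the project, and the `f:`/`d:` prefixes are an illustrative convention for "file" and "directory".

```shell
# Sketch: verify a list of required paths and fail if any is missing.
# check_required is a hypothetical helper; entries are "f:<file>" or "d:<dir>".
check_required() {
  local root=$1
  shift
  local missing=0
  local entry kind path
  for entry in "$@"; do
    kind=${entry%%:*}   # text before the first colon
    path=${entry#*:}    # text after the first colon
    if [ "$kind" = "f" ] && [ -f "$root/$path" ]; then
      echo "✓ $path"
    elif [ "$kind" = "d" ] && [ -d "$root/$path" ]; then
      echo "✓ $path"
    else
      echo "✗ $path missing"
      missing=$((missing + 1))
    fi
  done
  # Succeed only when nothing was missing.
  [ "$missing" -eq 0 ]
}

# Example call, using the paths from the checklist:
# check_required /home/hal/v4/PROJECTS/t2 \
#   f:dockerfile.combined-fixed f:cloudbuild-combined.yaml \
#   d:backend d:archive/coditect-v4/codi2 d:src/file-monitor \
#   d:dist d:theia-app f:nginx-combined.conf f:start-combined.sh \
#   || { echo "Pre-build verification failed"; exit 1; }
```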

⚠️ CRITICAL: Build Frontend First

IMPORTANT: The frontend must be built before the Cloud Build is submitted:

# Navigate to project root
cd /home/hal/v4/PROJECTS/t2

# Install dependencies (if not already done)
npm install --legacy-peer-deps

# Build frontend (creates dist/ directory)
npm run build

# Verify dist/ exists and contains files
ls -la dist/
# Should show: index.html, assets/, vite.svg

Why this matters: The Dockerfile copies dist/ from your local filesystem. If dist/ doesn't exist or is stale, the deployed frontend will be broken.
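
Whether dist/ is stale can be estimated from file timestamps. The sketch below assumes that the frontend sources live under src/ and that the build writes dist/index.html; the source directory is an assumption, so adjust it to the real layout. `dist_is_stale` is a hypothetical helper.

```shell
# Sketch: succeed (exit 0) when dist/ is missing or older than the sources.
# Assumption: frontend sources are under src/ and the build emits dist/index.html.
dist_is_stale() {
  local root=$1
  # No build output at all counts as stale.
  [ -f "$root/dist/index.html" ] || return 0
  # Any source file modified after dist/index.html means a rebuild is needed.
  [ -n "$(find "$root/src" -type f -newer "$root/dist/index.html" -print 2>/dev/null | head -n 1)" ]
}

# Example call:
# if dist_is_stale /home/hal/v4/PROJECTS/t2; then
#   echo "dist/ is stale, run 'npm run build' first"
# fi
```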


🏗️ Phase 2: Cloud Build Submission

Build Submission Command

# Set GCP project
gcloud config set project serene-voltage-464305-n2

# Navigate to project root
cd /home/hal/v4/PROJECTS/t2

# Submit build
gcloud builds submit \
--config cloudbuild-combined.yaml \
--project serene-voltage-464305-n2

Expected Build Timeline

| Stage | Duration | Description |
|---|---|---|
| Stage 1: Frontend | ~2 min | Already built locally, just copied |
| Stage 2: theia | ~15-20 min | npm install + webpack build |
| Stage 3: V5 API | ~5-8 min | Rust cargo build --release |
| Stage 4: Codi2 | ~8-12 min | Rust cargo build --release --all-features |
| Stage 5: File Monitor | ~3-5 min | Rust cargo build --release --examples |
| Stage 6: Runtime | ~3-5 min | apt-get install + npm global install |
| TOTAL | 30-40 min | With E2_HIGHCPU_32 machine type |
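
As a sanity check on the estimates above, the per-stage ranges can be added up. Run strictly one after another they come to more than the stated total, so the 30-40 min figure presumably assumes that some stages overlap; the source does not say which, so treat that reading as an assumption.

```shell
# Sum the lower and upper per-stage estimates (minutes), stage 1 through 6.
low=0
high=0
for range in 2-2 15-20 5-8 8-12 3-5 3-5; do
  low=$((low + ${range%-*}))    # part before the dash
  high=$((high + ${range#*-}))  # part after the dash
done
echo "sequential total: ${low}-${high} min"   # prints: sequential total: 36-52 min
```

If a build runs noticeably longer than 40 minutes, the 60-minute timeout still leaves room for the sequential worst case.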

Monitor Build Progress

# View builds in console
echo "https://console.cloud.google.com/cloud-build/builds?project=serene-voltage-464305-n2"

# Or monitor via CLI
gcloud builds list --ongoing --limit=5

# View specific build logs
gcloud builds log <BUILD_ID> --stream
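
For unattended runs it can help to block until the build reaches a terminal state rather than watching the log stream. The following is a sketch only: `wait_for_build` is a hypothetical helper, and it assumes that `gcloud builds describe <BUILD_ID> --format='value(status)'` prints the build status as a single word.

```shell
# Sketch: poll a Cloud Build until it finishes.
# Assumption: the describe command below prints one status word
# (for example QUEUED, WORKING, SUCCESS, FAILURE, TIMEOUT, CANCELLED).
wait_for_build() {
  local build_id=$1
  local interval=${2:-30}   # seconds between polls
  local status
  while :; do
    status=$(gcloud builds describe "$build_id" --format='value(status)')
    case "$status" in
      SUCCESS)
        echo "Build $build_id succeeded"
        return 0 ;;
      PENDING|QUEUED|WORKING)
        sleep "$interval" ;;
      *)
        echo "Build $build_id ended with status: ${status:-unknown}" >&2
        return 1 ;;
    esac
  done
}

# Example call:
# wait_for_build "<BUILD_ID>" 30 && echo "Continue with Phase 3"
```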

🎯 Phase 3: Build Verification

After Build Completes

# Get the BUILD_ID from Cloud Build output
BUILD_ID="<your-build-id>"

# Verify image exists in Artifact Registry
gcloud artifacts docker images list \
us-central1-docker.pkg.dev/serene-voltage-464305-n2/coditect/coditect-combined \
--include-tags \
--limit=5

# Expected output:
# IMAGE TAGS
# us-central1-docker.pkg.dev/.../coditect-combined latest, <BUILD_ID>

Test Image Locally (Optional)

# Pull the image
docker pull us-central1-docker.pkg.dev/serene-voltage-464305-n2/coditect/coditect-combined:$BUILD_ID

# Run container locally
docker run --rm -p 8080:80 \
us-central1-docker.pkg.dev/serene-voltage-464305-n2/coditect/coditect-combined:$BUILD_ID

# Test in browser: http://localhost:8080

Verify Binaries Inside Image

# Check Rust binaries
docker run --rm us-central1-docker.pkg.dev/serene-voltage-464305-n2/coditect/coditect-combined:$BUILD_ID \
bash -c "which coditect-v5-api codi2 file-monitor"

# Expected output:
# /usr/local/bin/coditect-v5-api
# /usr/local/bin/codi2
# /usr/local/bin/file-monitor

# Check binary versions
docker run --rm us-central1-docker.pkg.dev/serene-voltage-464305-n2/coditect/coditect-combined:$BUILD_ID \
bash -c "coditect-v5-api --version; codi2 --version; file-monitor --help"

# Check .coditect configs
docker run --rm us-central1-docker.pkg.dev/serene-voltage-464305-n2/coditect/coditect-combined:$BUILD_ID \
ls -la /app/.coditect/

# Expected output includes: agents/, skills/, scripts/, workflows/, logs/
# plus the T2 layer: agents-t2/, skills-t2/, commands-t2/, hooks/

🚢 Phase 4: GKE Deployment

Deployment Command

Cloud Build deploys to the GKE StatefulSet automatically. To deploy manually:

# Set kubectl context
gcloud container clusters get-credentials codi-poc-e2-cluster \
--zone us-central1-a \
--project serene-voltage-464305-n2

# Update StatefulSet image
kubectl set image statefulset/coditect-combined \
combined=us-central1-docker.pkg.dev/serene-voltage-464305-n2/coditect/coditect-combined:$BUILD_ID \
--namespace=coditect-app

# Monitor rollout (StatefulSet updates one pod at a time)
kubectl rollout status statefulset/coditect-combined \
--namespace=coditect-app \
--timeout=10m

Verify Deployment

# Check pod status
kubectl get pods -n coditect-app -l app=coditect-combined

# Expected output:
# NAME READY STATUS RESTARTS AGE
# coditect-combined-0 1/1 Running 0 5m
# coditect-combined-1 1/1 Running 0 3m
# coditect-combined-2 1/1 Running 0 1m

# Check pod logs
kubectl logs -f statefulset/coditect-combined -n coditect-app

# Check service
kubectl get svc coditect-combined-service -n coditect-app

# Check ingress
kubectl get ingress -n coditect-app
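
The pod listing can be reduced to a single number for scripts. The sketch below reads `kubectl get pods --no-headers` output on standard input and counts pods that are not Running or not fully ready; `count_unready_pods` is a hypothetical helper, and it assumes the default column order (NAME, READY, STATUS, ...).

```shell
# Sketch: count pods that are not Running or whose READY column is not n/n.
# Input: lines from `kubectl get pods --no-headers` (default column layout).
count_unready_pods() {
  awk '{
    split($2, ready, "/")   # READY column, e.g. "1/1"
    if ($3 != "Running" || ready[1] != ready[2]) bad++
  } END { print bad + 0 }'
}

# Example call:
# kubectl get pods -n coditect-app -l app=coditect-combined --no-headers \
#   | count_unready_pods
```

A result of 0 corresponds to the expected output shown above, where all three pods are 1/1 Running.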

✅ Phase 5: Functionality Testing

Test 1: Frontend Access

# Test via ingress
curl -I https://coditect.ai

# Expected: HTTP/2 200 (a 301 redirect to HTTPS is expected only when requesting http://coditect.ai)

# Test frontend loading
curl -s https://coditect.ai | grep "<title>"

# Expected: <title>Coditect AI IDE</title>

Test 2: theia IDE Access

# Test theia endpoint
curl -I https://coditect.ai/theia

# Expected: HTTP/2 200

# Check for theia-specific assets
curl -s https://coditect.ai/theia | grep "theia"

Test 3: Backend API Access

# Test health endpoint
curl https://api.coditect.ai/api/v5/health

# Expected: {"status":"ok","timestamp":"..."}

# Test API version
curl https://api.coditect.ai/api/v5/version

# Expected: {"version":"0.1.0","rust_version":"..."}
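
Right after a rollout the API may need a moment before it answers, so a single curl can fail spuriously. A minimal retry sketch follows. `wait_for_healthy` is a hypothetical helper, and the string match assumes the compact `"status":"ok"` form shown in the expected output; a JSON parser such as jq would be more robust if the response is formatted differently.

```shell
# Sketch: retry a health endpoint until the body contains "status":"ok".
# Assumption: the response is compact JSON as in the expected output.
wait_for_healthy() {
  local url=$1
  local attempts=${2:-10}
  local delay=${3:-5}       # seconds between attempts
  local i=1
  local body
  while [ "$i" -le "$attempts" ]; do
    # -f: treat HTTP errors as failures; -sS: quiet, but keep error messages
    body=$(curl -fsS --max-time 10 "$url" 2>/dev/null) || body=""
    case "$body" in
      *'"status":"ok"'*)
        echo "healthy after $i attempt(s)"
        return 0 ;;
    esac
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  echo "not healthy after $attempts attempts" >&2
  return 1
}

# Example call:
# wait_for_healthy https://api.coditect.ai/api/v5/health 10 5
```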

Test 4: Rust Binaries Inside Pods

# Open a shell in the first pod (kubectl exec, not SSH)
POD=$(kubectl get pods -n coditect-app -l app=coditect-combined -o jsonpath='{.items[0].metadata.name}')
kubectl exec -it $POD -n coditect-app -- bash

# Inside pod, test binaries:
which coditect-v5-api
coditect-v5-api --version

which codi2
codi2 --version

which file-monitor
file-monitor --help

# Check .coditect configs (both layers)
ls -la /app/.coditect/

# Base layer
echo "Base layer configs:"
ls -1 /app/.coditect/agents/*.md | wc -l # Should show 5
ls -1d /app/.coditect/skills/*/ | wc -l # Should show 2

# T2 layer
echo "T2 layer configs:"
ls -1 /app/.coditect/agents-t2/*.md | wc -l # Should show 15
ls -1d /app/.coditect/skills-t2/*/ | wc -l # Should show 19
ls -1 /app/.coditect/commands-t2/*.md | wc -l # Should show 64

# Total check
cat /app/.coditect/agents/README.md || cat /app/.coditect/agents-t2/README.md

# Exit pod
exit
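
The "Should show N" comments above have to be compared by eye. Inside the pod, the same checks can be made self-verifying with a small helper; `expect_count` is hypothetical and simply counts the paths a glob expands to.

```shell
# Sketch: compare the number of existing paths against an expected count.
# Usage: expect_count <label> <expected> <glob...>
expect_count() {
  local label=$1
  local expected=$2
  shift 2
  local actual=0
  local path
  for path in "$@"; do
    # An unmatched glob is passed through literally, so count only real paths.
    if [ -e "$path" ]; then
      actual=$((actual + 1))
    fi
  done
  if [ "$actual" -eq "$expected" ]; then
    echo "✓ $label: $actual"
  else
    echo "✗ $label: expected $expected, found $actual"
    return 1
  fi
}

# Example calls, using the counts from this checklist:
# expect_count "base agents" 5  /app/.coditect/agents/*.md
# expect_count "base skills" 2  /app/.coditect/skills/*/
# expect_count "T2 agents"   15 /app/.coditect/agents-t2/*.md
# expect_count "T2 skills"   19 /app/.coditect/skills-t2/*/
# expect_count "T2 commands" 64 /app/.coditect/commands-t2/*.md
```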

Test 5: Development Tools Availability

# Open a shell in the pod (reuses $POD from Test 4)
kubectl exec -it $POD -n coditect-app -- bash

# Test Debian packages
# Note: the ripgrep package installs its binary as rg
which jq rg fzf tmux tree htop vim

# Test npm global packages
which tsc eslint prettier vite webpack

# Test Python
python3 --version
pip3 --version

# Exit pod
exit
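
`which` with many arguments makes a single missing tool easy to overlook. A sketch that names every missing command explicitly is shown below; `missing_tools` is a hypothetical helper built on the POSIX `command -v` lookup.

```shell
# Sketch: print the names of all tools that are not on PATH.
# Succeeds only when every tool was found.
missing_tools() {
  local tool
  local missing=""
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || missing="$missing $tool"
  done
  # Strip the leading space; empty output means nothing is missing.
  echo "${missing# }"
  [ -z "$missing" ]
}

# Example call inside the pod (ripgrep is checked under its binary name rg):
# missing_tools jq rg fzf tmux tree htop vim tsc eslint prettier vite webpack python3 pip3 \
#   || echo "Some development tools are missing"
```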

🔍 Troubleshooting

Build Failures

Issue: Rust compilation timeout

  • Fix: The build already uses E2_HIGHCPU_32 (32 CPUs) and should complete in 30-40 min
  • Alternative: Increase timeout in cloudbuild-combined.yaml (currently 60 min)

Issue: Out of memory during build

  • Fix: Already set NODE_OPTIONS=--max_old_space_size=8192 (8GB)
  • Check: Verify machine type is E2_HIGHCPU_32 in build logs

Issue: Docker layer limit exceeded (127 layers)

  • Fix: Dockerfile already optimized with combined RUN statements
  • Check: Count layers with docker history <image>

Deployment Failures

Issue: Image pull errors

  • Fix: Verify image exists in Artifact Registry
  • Check: gcloud artifacts docker images list ...

Issue: StatefulSet not rolling out

  • Fix: Check pod events with kubectl describe pod <pod-name> -n coditect-app
  • Common causes: Insufficient resources, failing health checks, missing volumes

Issue: Health check failures

  • Fix: Check NGINX config and start script
  • Verify: curl http://localhost/health from inside pod

Functionality Issues

Issue: Rust binaries not found

  • Fix: Verify COPY commands in Dockerfile
  • Check: Binary paths should be /usr/local/bin/coditect-v5-api, etc.

Issue: .coditect configs missing

  • Fix: Verify archive/claude-code-initial-setup/.claude exists before build
  • Check: ls -la archive/claude-code-initial-setup/.claude

Issue: Development tools missing

  • Fix: Verify apt-get and npm install commands succeeded in build logs
  • Check: Search build logs for "npm install -g" and verify no errors

📊 Success Criteria

Build Success

  • ✅ Build completes in 30-40 minutes (not 60+)
  • ✅ No errors in build logs
  • ✅ Image size ~4.5 GB (includes dev tools)
  • ✅ Image pushed to Artifact Registry with BUILD_ID and latest tags

Deployment Success

  • ✅ All 3 StatefulSet pods Running (coditect-combined-0/1/2)
  • ✅ Health checks passing (30s interval, 3 retries)
  • ✅ Ingress routing correctly (coditect.ai → frontend, /theia → theia)
  • ✅ No CrashLoopBackOff or ImagePullBackOff errors

Functionality Success

  • ✅ Frontend loads at https://coditect.ai
  • ✅ theia IDE loads at https://coditect.ai/theia
  • ✅ Backend API responds at https://api.coditect.ai/api/v5/health
  • ✅ All 3 Rust binaries functional inside pods
  • ✅ .coditect configs accessible at /app/.coditect/ (both base and T2 layers)
    • Base layer: 5 agents, 2 skills, 15 scripts
    • T2 layer: 15 agents, 19 skills, 64 commands
  • ✅ All 31 Debian packages installed and working
  • ✅ All 17 npm global packages available

No Regressions

  • ✅ Existing frontend functionality unchanged
  • ✅ theia icon themes still working (vs-seti)
  • ✅ Custom Coditect AI branding preserved
  • ✅ NGINX routing unchanged
  • ✅ StatefulSet persistent volumes still mounted

📝 Next Steps After Successful Deployment

  1. Monitor Production Performance

    • Watch pod resource usage: kubectl top pods -n coditect-app
    • Check logs for errors: kubectl logs -f -l app=coditect-combined -n coditect-app
    • Monitor user access patterns
  2. Test User Workflows

    • Create test user account
    • Launch theia IDE
    • Test file monitor (watch file changes)
    • Test Codi2 monitoring (audit logging)
  3. Documentation Updates

    • Update CLAUDE.md with new binary locations
    • Document .coditect usage in user guide
    • Create runbook for binary management
  4. Future Optimizations

    • Consider caching Rust build artifacts (reduce build time)
    • Optimize Docker layer count further
    • Implement health check endpoints for binaries
    • Add monitoring for Rust services

Status: ✅ CONFIGURATION COMPLETE - READY FOR BUILD SUBMISSION
Next Action: Run Phase 1 verification, then submit Cloud Build
Expected Completion: 30-40 minutes after submission