Google Cloud Build Skill - Coditect Modules

How to Use This Skill

  1. Review the patterns and examples below
  2. Apply the relevant patterns to your implementation
  3. Follow the best practices outlined in this skill

Purpose: Successfully build and deploy Coditect modules (backend, frontend+Theia combined) to Google Cloud Platform using Cloud Build.

When to Use

Use this skill when:

  • Deploying backend API (Rust/Actix-web) to GKE
  • Deploying combined frontend+Theia to GKE
  • Troubleshooting failed Cloud Build deployments
  • Optimizing build times and upload sizes
  • Setting up new Coditect modules for GCP deployment

Core Capabilities

  • Backend Deployment: Rust/Actix-web API builds with E2_HIGHCPU_8 machine type (~6 min)
  • Frontend+Theia Deployment: Combined builds with E2_HIGHCPU_32 for parallel webpack (~10-15 min)
  • Build Optimization: .gcloudignore support reducing upload from 13,698 to 8,627 files (5-10 min savings)
  • Troubleshooting: Comprehensive error recovery for 7+ common build failures
  • GKE Integration: kubectl deployment automation with pod rollout verification
  • Safety Verification: Deployment name validation, rollout timeout management, pod health checks

Pre-Flight Checklist

ALWAYS verify these before gcloud builds submit:

# 1. Check authentication
gcloud auth list
# If expired: gcloud auth login

# 2. Verify required files exist
ls -1 Dockerfile* cloudbuild*.yaml nginx*.conf start*.sh dist/

# 3. Check .gcloudignore exists (saves 5-10 min upload time!)
ls -la .gcloudignore

# 4. For combined builds: Frontend must be built first
ls -lh dist/
# Should show dist/assets/ and dist/index.html with recent timestamp

# 5. Verify Dockerfile COPY commands (no wildcards!)
grep "COPY.*\*" Dockerfile*
# Should return empty apart from intentional globs such as package*.json - a pattern that matches no uploaded file fails the build

# 6. Verify deployment name matches existing K8s deployment
kubectl get deployment -n coditect-app
# Match the name exactly in cloudbuild.yaml (e.g., coditect-combined, not coditect-combined-v5)
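
Check 5 is easy to rerun if it is wrapped in a small helper. The following is a minimal sketch, assuming each Dockerfile keeps one instruction per line; the function name find_wildcard_copies is illustrative and not part of the original tooling.

```shell
# find_wildcard_copies FILE...
# Prints "file:line:instruction" for every COPY instruction whose
# arguments contain a glob character (* or ?).
# Exit status: 0 if at least one wildcard COPY was found, 1 if none.
find_wildcard_copies() {
  grep -HnE '^[[:space:]]*COPY[[:space:]].*[*?]' "$@"
}
```

For example, `find_wildcard_copies Dockerfile*` lists every COPY line that still relies on a glob, so each one can be reviewed or replaced with explicit paths.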

Module Build Patterns

1. Backend API (Rust/Actix-web)

Directory: backend/
Config: backend/cloudbuild-gke.yaml
Machine: E2_HIGHCPU_8 (sufficient for Rust compilation)
Build Time: ~6 minutes

Command:

cd backend
gcloud builds submit --config cloudbuild-gke.yaml .

Key Configuration:

# backend/cloudbuild-gke.yaml
steps:
  - name: 'gcr.io/cloud-builders/docker'
    args: ['build', '-t', 'IMAGE:$BUILD_ID', '-f', 'Dockerfile', '.']

  - name: 'gcr.io/cloud-builders/kubectl'
    args: ['set', 'image', 'deployment/coditect-api-v5', 'api=IMAGE:$BUILD_ID']
    env:
      - 'CLOUDSDK_COMPUTE_ZONE=us-central1-a'
      - 'CLOUDSDK_CONTAINER_CLUSTER=codi-poc-e2-cluster'

options:
  machineType: 'E2_HIGHCPU_8'

Common Issues:

  • ❌ Wrong container name in kubectl set image
  • ❌ Missing FDB environment variables
  • ✅ Fix: Match container name in deployment YAML

2. Combined Frontend+Theia

Directory: / (project root)
Config: cloudbuild-combined.yaml
Machine: E2_HIGHCPU_32 (32 CPUs for Theia webpack)
Build Time: ~10-15 minutes

Prerequisites:

# 1. Build V5 frontend first (CRITICAL!)
npx vite build # Creates dist/ folder (~1.3 MB)

# 2. Verify dist/ exists and is fresh
ls -lh dist/
# dist/assets/index-*.js should be recent timestamp

# 3. Check .gcloudignore to reduce upload size
cat .gcloudignore # Should exclude node_modules/, docs/, tests/

Command:

# From project root
gcloud builds submit --config cloudbuild-combined.yaml .

Key Configuration:

# cloudbuild-combined.yaml
steps:
  - name: 'gcr.io/cloud-builders/docker'
    args: ['build', '-f', 'Dockerfile.local-test', '-t', 'IMAGE:$BUILD_ID', '.']

options:
  machineType: 'E2_HIGHCPU_32' # 32 CPUs for Theia
  diskSizeGb: 100
  env:
    - 'NODE_OPTIONS=--max_old_space_size=8192' # 8GB heap for webpack

Dockerfile Pattern (Dockerfile.local-test):

# Stage 1: Build Theia
FROM node:20 AS theia-builder
WORKDIR /app/theia

# ⚠️ IMPORTANT: Explicit file list for config files (package*.json is the only glob kept)
COPY theia-app/package*.json ./
COPY theia-app/tsconfig.json ./
COPY theia-app/gen-webpack.config.js ./
COPY theia-app/gen-webpack.node.config.js ./
COPY theia-app/webpack.config.js ./

RUN npm install
COPY theia-app/src ./src
COPY theia-app/plugins ./plugins

ENV NODE_OPTIONS="--max_old_space_size=8192"
RUN npm run prepare

# Stage 2: Runtime
FROM node:20-slim
# Pre-built frontend (Dockerfile comments must be on their own line)
COPY dist /app/v5-frontend
COPY --from=theia-builder /app/theia /app/theia
COPY nginx-combined.conf /etc/nginx/sites-available/default
COPY start-combined.sh /start.sh

Optimization Techniques

1. .gcloudignore File (Saves 5-10 minutes!)

Impact: Reduces upload from 13,698 files (1.6 GB) to 8,627 files (1.5 GB)

Create: .gcloudignore in project root

# Exclude heavy/unnecessary files
.git/
node_modules/
docs/
thoughts/
archive/
coverage/
*.test.ts
*.test.tsx
*.log
.vscode/
.idea/

# Keep these for the build
!dist/
!theia-app/
!backend/
!nginx-combined.conf
!start-combined.sh
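
An overly aggressive .gcloudignore can drop files the build needs (attempt #3 in the example below excluded the webpack configs). The following is a minimal sketch of a check for that, assuming `gcloud meta list-files-for-upload` prints one relative path per line; the helper name missing_from_upload is hypothetical.

```shell
# missing_from_upload PATH...
# Reads a newline-separated file list on stdin and prints every
# required PATH that does not appear in it as an exact line.
missing_from_upload() {
  local list req
  list=$(cat)
  for req in "$@"; do
    printf '%s\n' "$list" | grep -qxF -- "$req" || printf '%s\n' "$req"
  done
}
```

Usage: `gcloud meta list-files-for-upload | missing_from_upload dist/index.html theia-app/webpack.config.js`. Empty output means every required file will be uploaded.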

2. Pre-Build Frontend Locally

Why: Cloud Build doesn't need to build the frontend (already done locally)
Benefit: Saves ~2 minutes, smaller Docker image

# Build locally before gcloud builds submit
npx vite build # ~21 seconds
# Then Dockerfile just copies dist/ folder

3. Use Proven Machine Types

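Module   | Machine Type  | CPUs | RAM   | Why
---------|---------------|------|-------|--------------------------------
Backend  | E2_HIGHCPU_8  | 8    | 8 GB  | Rust compilation
Combined | E2_HIGHCPU_32 | 32   | 32 GB | Theia webpack (parallel builds)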

Don't use:

  • ❌ N1_HIGHCPU_8 (older generation, slower)
  • ❌ E2_HIGHCPU_8 for Theia (too slow, webpack timeouts)

4. Explicit Docker COPY (No Wildcards!)

❌ FAILS in Cloud Build context:

COPY theia-app/*.config.js ./
# Error: "COPY failed: no source files were specified"

✅ WORKS:

COPY theia-app/gen-webpack.config.js ./
COPY theia-app/gen-webpack.node.config.js ./
COPY theia-app/webpack.config.js ./

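Why: COPY fails when a wildcard matches nothing in the build context. Cloud Build only sees the files that gcloud uploads, so files excluded by .gcloudignore (or by .gitignore when no .gcloudignore exists) are missing even though the same pattern matches locally. Explicit paths make it obvious which file is absent.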

Common Build Failures & Fixes

Error 1: "COPY failed: no source files were specified"

Symptom:

Step 5/22 : COPY theia-app/*.config.js ./
COPY failed: no source files were specified

Root Cause: The wildcard pattern in COPY matched no files in the Cloud Build context, most likely because the matching files were excluded from the upload

Fix: Use explicit file lists

# Before
COPY theia-app/*.config.js ./

# After
COPY theia-app/gen-webpack.config.js ./
COPY theia-app/gen-webpack.node.config.js ./
COPY theia-app/webpack.config.js ./

Error 2: "no such file or directory: dist/"

Symptom:

COPY dist /app/v5-frontend
COPY failed: stat dist: file does not exist

Root Cause: Frontend not built before Docker build

Fix: Build frontend first

# Build V5 frontend
npx vite build

# Verify dist/ exists
ls -lh dist/

# Then run Cloud Build
gcloud builds submit --config cloudbuild-combined.yaml .

Error 3: Upload taking 10+ minutes

Symptom:

Creating temporary archive of 13698 file(s) totalling 1.6 GiB...
(hangs for 10+ minutes)

Root Cause: No .gcloudignore file - uploading unnecessary files (node_modules, docs, tests)

Fix: Create .gcloudignore (see Optimization section above)

Error 4: Webpack out of memory during Theia build

Symptom:

FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory

Root Cause: Insufficient Node.js heap size or machine type

Fix: Use E2_HIGHCPU_32 with 8GB heap

# cloudbuild-combined.yaml
options:
  machineType: 'E2_HIGHCPU_32'
  env:
    - 'NODE_OPTIONS=--max_old_space_size=8192'

Error 5: Wrong container name in kubectl

Symptom:

error: unable to find container named "api-server"

Root Cause: Container name mismatch between cloudbuild.yaml and k8s deployment

Fix: Match container name exactly

# Get actual container name from deployment
kubectl get deployment coditect-api-v5 -n coditect-app -o yaml | grep "name:"

# Update cloudbuild-gke.yaml to match
- 'api=IMAGE:$BUILD_ID' # NOT api-server=
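
Grepping for "name:" also returns the deployment and label names, so a jsonpath query is a tighter way to list only the container names. The following is a minimal sketch, assuming kubectl is configured for the cluster; the function name container_in_deployment is illustrative.

```shell
# container_in_deployment DEPLOYMENT NAMESPACE CONTAINER
# Succeeds (exit 0) only if CONTAINER is one of the container names
# in the deployment's pod template.
container_in_deployment() {
  local deployment=$1 namespace=$2 container=$3
  kubectl get deployment "$deployment" -n "$namespace" \
    -o jsonpath='{.spec.template.spec.containers[*].name}' |
    tr ' ' '\n' | grep -qxF -- "$container"
}
```

Usage: `container_in_deployment coditect-api-v5 coditect-app api || echo "container name mismatch"`.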

Error 6: Deployment not found (wrong name)

Symptom:

Error from server (NotFound): deployments.apps "coditect-combined-v5" not found

Root Cause: cloudbuild.yaml references a deployment name that doesn't exist in the cluster

Fix: Check existing deployments and match the name

# 1. List all deployments
kubectl get deployment -n coditect-app

# 2. Find the correct deployment name (e.g., coditect-combined, not coditect-combined-v5)
# Output shows: coditect-combined (not coditect-combined-v5)

# 3. Update cloudbuild-combined.yaml to match
# Change: deployment/coditect-combined-v5
# To: deployment/coditect-combined

Key lesson: Always verify deployment exists before using kubectl set image. Use kubectl get deployment -n <namespace> to list actual names.
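
The existence check can be scripted so that a wrong name fails fast, before a 12-minute build is submitted. The following is a minimal sketch, assuming kubectl already points at the target cluster; the function name deployment_exists is hypothetical.

```shell
# deployment_exists NAME NAMESPACE
# Succeeds (exit 0) if the deployment is present in the namespace;
# kubectl exits non-zero on NotFound, which is passed through.
deployment_exists() {
  local name=$1 namespace=$2
  kubectl get deployment "$name" -n "$namespace" >/dev/null 2>&1
}
```

Usage: `deployment_exists coditect-combined coditect-app || { echo "deployment not found"; exit 1; }`.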

Error 7: Rollout timeout (not always a failure!)

Symptom:

error: timed out waiting for the condition
BUILD FAILURE: Build step failure: build step 4 "gcr.io/cloud-builders/kubectl" failed

Root Cause: Rollout verification timed out (5-minute --timeout in this config) - pods might still be starting successfully

Fix: Distinguish between timeout and actual failure

# 1. Check if image was updated (even if verification timed out)
kubectl get deployment coditect-combined -n coditect-app -o yaml | grep "image:"

# 2. Check pod status
kubectl get pods -n coditect-app -l app=coditect-combined

# 3. If pods are Running, deployment succeeded (just slow)
# If pods are CrashLoopBackOff or Error, investigate with:
kubectl describe pod -n coditect-app <pod-name>
kubectl logs -n coditect-app <pod-name>

# 4. For slow-starting containers (like Theia), increase timeout
# In cloudbuild.yaml:
- name: 'gcr.io/cloud-builders/kubectl'
  args: ['rollout', 'status', 'deployment/X', '--timeout=10m'] # Increase from 5m

Key distinction:

  • Timeout during verification - Image updated, pods starting (may still succeed)
  • Actual deployment failure - Pods in CrashLoopBackOff or Error state
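
The distinction above can be turned into a small classifier over pod status output. The following is a minimal sketch, assuming the default column layout of `kubectl get pods --no-headers` (NAME, READY, STATUS, ...); the function name classify_rollout and its three labels are illustrative.

```shell
# classify_rollout
# Reads `kubectl get pods --no-headers` output on stdin and prints:
#   failed   - at least one pod is in CrashLoopBackOff or Error
#   healthy  - every pod is Running with all containers ready
#   starting - anything else (pods still coming up, or no pods listed)
classify_rollout() {
  awk '
    { total++ }
    $3 ~ /^(CrashLoopBackOff|Error)$/ { bad++ }
    $3 == "Running" { split($2, r, "/"); if (r[1] == r[2]) ready++ }
    END {
      if (bad > 0) print "failed"
      else if (total > 0 && ready == total) print "healthy"
      else print "starting"
    }'
}
```

Usage: `kubectl get pods -n coditect-app -l app=coditect-combined --no-headers | classify_rollout`. A result of "starting" after a rollout timeout suggests waiting and rechecking rather than redeploying.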

Troubleshooting Workflow

When a build fails:

# 1. Get Build ID from error message
BUILD_ID="<id from error>"

# 2. View detailed logs
gcloud builds log $BUILD_ID | tail -100

# 3. Identify failure step
# Look for "Step #N" and "ERROR" lines

# 4. Common checks based on step:
# - Step 0 (build-image): Docker/Dockerfile issue
# - Step 1-2 (push): Registry permissions
# - Step 3-4 (deploy-gke): kubectl/GKE connectivity

# 5. Check Cloud Console for visual logs
echo "https://console.cloud.google.com/cloud-build/builds/$BUILD_ID"

# 6. Fix and retry
gcloud builds submit --config <config.yaml> .
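
Step 3 can be partly automated by scanning the log for the last step marker seen before the first error. The following is a minimal sketch, assuming log lines contain "Step #N" markers and the literal word "ERROR" as described above; the exact log layout is an assumption, and the function name last_step_before_error is hypothetical.

```shell
# last_step_before_error
# Reads a build log on stdin and prints the number of the most recent
# "Step #N" marker seen at or before the first line containing "ERROR".
# Prints "unknown" if an error appears before any step marker,
# and "none" if the log contains no ERROR line.
last_step_before_error() {
  awk '
    match($0, /Step #[0-9]+/) { step = substr($0, RSTART + 6, RLENGTH - 6) }
    /ERROR/ { print (step == "" ? "unknown" : step); found = 1; exit }
    END { if (!found) print "none" }'
}
```

Usage: `gcloud builds log "$BUILD_ID" | last_step_before_error`, then map the step number to the checks listed in step 4.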

Real-World Examples

Example 1: Combined Build Success (Oct 18, 2025)

Scenario: Deploy combined frontend+Theia after multiple failures

Journey (5 attempts):

  1. ❌ Attempt #1: Docker wildcard COPY error (COPY theia-app/*.config.js failed)
  2. ❌ Attempt #2: Slow upload (13,698 files, 10+ min without .gcloudignore)
  3. ❌ Attempt #3: .gcloudignore too aggressive (excluded needed webpack configs)
  4. ❌ Attempt #4: Build success, deployment name mismatch (coditect-combined-v5 not found)
  5. ⚠️ Attempt #5: Build success, image pushed, rollout timeout (but image updated!)

What Went Well (Attempt #5):

  • ✅ Docker image built successfully (Theia webpack compiled with 32 CPUs)
  • ✅ Image pushed to Artifact Registry (both BUILD_ID and latest tags)
  • ✅ Deployment image updated (kubectl set image succeeded)
  • ✅ Upload optimized: 8,627 files (1.5 GB) in ~2 minutes
  • ✅ Theia webpack compiled: 11 MB frontend + 9.34 MB backend bundles
  • ✅ Node.js 8GB heap prevented OOM errors
  • ✅ Explicit COPY file lists worked perfectly

Key Fixes Applied:

# 1. Fix Dockerfile wildcards (explicit file list)
COPY theia-app/gen-webpack.config.js ./
COPY theia-app/gen-webpack.node.config.js ./
COPY theia-app/webpack.config.js ./

# 2. Create optimized .gcloudignore with explicit inclusions
!theia-app/
!theia-app/**/*.config.js
!dist/
!vite.config.ts

# 3. Build frontend first
npx vite build

# 4. Fix deployment name (match existing K8s deployment)
kubectl get deployment -n coditect-app # Shows: coditect-combined
# Changed cloudbuild.yaml from coditect-combined-v5 → coditect-combined

# 5. Deploy with corrected config
gcloud builds submit --config cloudbuild-combined.yaml .

Build Metrics:

  • Upload: 2 min (8,627 files optimized)
  • Docker build: 7 min (Theia webpack compilation)
  • Image push: 2 min (BUILD_ID + latest tags)
  • Deployment: Image updated (rollout verification timed out, but likely still starting)
  • Total Cloud Build time: ~12 minutes

Status: Build SUCCESS ✅ - Image deployed to cluster, pods likely starting (rollout timeout is verification step, not build failure)

Next step: Check pod status with kubectl get pods -n coditect-app -l app=coditect-combined to confirm healthy startup

Example 2: Backend Deployment (Oct 18, 2025)

Scenario: Deploy Rust backend API with FDB fixes

Command:

cd backend
gcloud builds submit --config cloudbuild-gke.yaml .

Build Time: ~6 minutes
Result: ✅ Success - API deployed and tested with JWT auth

Automation Script

#!/bin/bash
# deploy-coditect-module.sh - Smart deployment script
# Run from the project root: paths below (dist/, backend/) are relative to it.

MODULE=$1 # "backend" or "combined"

if [ "$MODULE" == "combined" ]; then
  echo "=== Combined Frontend+Theia Deployment ==="

  # Pre-flight checks
  echo "1. Checking frontend build..."
  # Rebuild if dist/ is missing or has no index-*.js modified in the last 24 hours
  if [ ! -d "dist" ] || [ -z "$(find dist -name 'index-*.js' -mtime -1)" ]; then
    echo "   Building frontend..."
    npx vite build || exit 1
  else
    echo "   ✅ Frontend already built (recent)"
  fi

  echo "2. Checking .gcloudignore..."
  if [ ! -f ".gcloudignore" ]; then
    echo "   ⚠️ No .gcloudignore - upload will be slow!"
    read -p "   Continue anyway? (y/n) " -n 1 -r
    echo
    if [[ ! $REPLY =~ ^[Yy]$ ]]; then exit 1; fi
  else
    echo "   ✅ .gcloudignore exists"
  fi

  echo "3. Starting Cloud Build..."
  gcloud builds submit --config cloudbuild-combined.yaml .

elif [ "$MODULE" == "backend" ]; then
  echo "=== Backend API Deployment ==="

  echo "1. Starting Cloud Build..."
  # Stop if backend/ is missing rather than submitting from the wrong directory
  cd backend || exit 1
  gcloud builds submit --config cloudbuild-gke.yaml .

else
  echo "Usage: $0 [backend|combined]"
  exit 1
fi

Usage:

chmod +x scripts/deploy-coditect-module.sh
./scripts/deploy-coditect-module.sh combined # Deploy frontend+Theia
./scripts/deploy-coditect-module.sh backend # Deploy backend API

Tips & Best Practices

  1. Always build frontend first for combined deployments (npx vite build)
  2. Use .gcloudignore to exclude node_modules/, docs/, tests/ (saves 5-10 min)
  3. Avoid Docker COPY wildcards - use explicit file lists
  4. Match machine type to workload:
    • Backend: E2_HIGHCPU_8
    • Combined (Theia): E2_HIGHCPU_32
  5. Set NODE_OPTIONS for Theia builds: --max_old_space_size=8192
  6. Monitor builds with gcloud builds log <BUILD_ID>
  7. Use deployment archeology to find previous successful configs
  8. Keep cloudbuild.yaml DRY - use substitutions for repeated values
  9. Verify deployment names match existing K8s resources (kubectl get deployment -n <namespace>)
  10. Rollout timeout ≠ failure - Check pod status to confirm actual deployment health
  11. Test .gcloudignore before submitting: gcloud meta list-files-for-upload | grep <pattern>
  12. Increase rollout timeout for slow-starting containers (Theia: 10m instead of 5m)

Integration with Other Skills

  • deployment-archeology: Find previous successful build configs
  • codebase-locator: Find Dockerfiles and cloudbuild configs
  • web-search-researcher: Research Cloud Build error messages

Success Output

When this skill is successfully applied, you should see:

✅ SKILL COMPLETE: google-cloud-build

Completed:
- [x] Frontend built locally (dist/ created, <2 min)
- [x] .gcloudignore created (8,627 files vs 13,698 baseline)
- [x] Docker image built successfully (Theia webpack compiled)
- [x] Image pushed to Artifact Registry (BUILD_ID + latest tags)
- [x] GKE deployment updated (kubectl set image succeeded)
- [x] Pod rollout verified (3/3 pods Running)

Outputs:
- Docker image: us-central1-docker.pkg.dev/PROJECT_ID/coditect/MODULE:BUILD_ID
- GKE deployment: Updated deployment/coditect-combined
- Build logs: https://console.cloud.google.com/cloud-build/builds/BUILD_ID
- Pod status: kubectl get pods -n coditect-app (3/3 Running)
- Build time: ~12 minutes (frontend+Theia combined)

Performance Metrics:
- Upload time: ~2 min (optimized with .gcloudignore)
- Docker build: ~7 min (Theia webpack with 32 CPUs)
- Image push: ~2 min (BUILD_ID + latest tags)
- Deployment: Image updated (pods may still be starting)

Completion Checklist

Before marking this skill as complete, verify:

  • Frontend built locally (npx vite build completed, dist/ exists)
  • .gcloudignore file exists (excludes node_modules/, docs/, tests/)
  • Upload size optimized (<10,000 files, ~1.5 GB)
  • Docker image built (no wildcard COPY errors)
  • Theia webpack compiled (11 MB frontend + 9.34 MB backend bundles)
  • Image pushed to Artifact Registry (both BUILD_ID and latest tags)
  • Deployment name matches existing K8s resource (kubectl get deployment verified)
  • Pods starting or Running (kubectl get pods shows progress)
  • Build logs accessible (gcloud builds log BUILD_ID or Console URL)
  • No critical errors in build steps (all steps green or yellow with explanation)

Failure Indicators

This skill has FAILED if:

  • ❌ Docker COPY wildcards failed ("no source files were specified")
  • ❌ Frontend dist/ missing (COPY dist failed)
  • ❌ Upload taking >10 min (no .gcloudignore)
  • ❌ Theia webpack out of memory (insufficient heap size or machine type)
  • ❌ Wrong container name in kubectl set image (container not found)
  • ❌ Deployment not found (name mismatch with K8s)
  • ❌ Pods in CrashLoopBackOff or Error state
  • ❌ Build steps failed (red X in Cloud Build console)
  • ❌ Image not in Artifact Registry after successful build

When NOT to Use

Do NOT use this skill when:

  • Local Docker builds - For local testing, use docker build directly
  • Non-GCP infrastructure - AWS/Azure have different CI/CD patterns
  • Simple backend-only - Backend builds don't need 32 CPUs, use E2_HIGHCPU_8
  • Frontend-only static sites - Use Netlify/Vercel, not Cloud Build + GKE
  • Development branches - Cloud Build costs money, test locally first
  • Quick hotfixes - Small changes don't justify 12-minute build cycle
  • Non-containerized apps - This workflow assumes Docker images deployed to GKE; use App Engine for non-Docker apps

Use alternatives:

  • Local testing → docker build -f Dockerfile.local-test .
  • Backend-only → Use backend-specific cloudbuild-gke.yaml (6 min, cheaper)
  • Frontend static → Netlify deploy (faster, cheaper for static sites)
  • Development → Local dev server (npm run dev, cargo run)

Anti-Patterns (Avoid)

Anti-Pattern | Problem | Solution
-------------|---------|---------
No .gcloudignore | Upload 13,698 files (10+ min) | Create .gcloudignore, exclude node_modules/, docs/, tests/
Wildcards in COPY | COPY theia-app/*.config.js fails | Use explicit file lists: COPY theia-app/gen-webpack.config.js ./
No frontend pre-build | Cloud Build runs vite build (+2 min) | Run npx vite build locally, COPY dist/ in Dockerfile
Wrong machine type | E2_HIGHCPU_8 for Theia (OOM errors) | Use E2_HIGHCPU_32 for Theia, E2_HIGHCPU_8 for backend
Small Node heap | Webpack OOM during Theia build | Set NODE_OPTIONS=--max_old_space_size=8192
Deployment name mismatch | kubectl fails (coditect-combined-v5 not found) | Verify with kubectl get deployment -n coditect-app first
Rollout timeout = failure | Build marked failed, but pods actually starting | Check pod status, distinguish timeout from actual failure
No build archeology | Repeating past mistakes | Review previous successful builds for config patterns

Principles

This skill embodies CODITECT automation principles:

#1 Recycle → Extend → Re-Use → Create

  • Recycle successful builds - Copy cloudbuild.yaml from working deployments
  • Extend machine types - Use proven E2_HIGHCPU_32 for Theia, not experimental sizes
  • Re-use .gcloudignore patterns - Standard exclusions (node_modules/, *.log, etc.)
  • Create custom configs - Only when standard patterns don't work

#2 First Principles Thinking

  • Understand build context - Cloud Build != local Docker (wildcard behavior differs)
  • Know resource limits - 8GB heap for webpack, 32 CPUs for parallel builds
  • Measure build time - Optimize bottlenecks (upload, webpack, image push)

#5 Eliminate Ambiguity

  • Explicit file paths - No wildcards, list each file individually
  • Clear deployment names - Match K8s exactly (coditect-combined, not coditect-combined-v5)
  • Unambiguous machine type - E2_HIGHCPU_32 is specific, "powerful machine" is vague

#6 Clear, Understandable, Explainable

  • Build logs accessible - Provide Console URL for visual debugging
  • Step-by-step validation - Pre-flight checklist prevents common errors
  • Error context - Explain why wildcard COPY fails (Cloud Build context differs from local)

#8 No Assumptions

  • Verify deployment exists - Don't assume coditect-combined-v5 is in cluster
  • Test .gcloudignore - Use gcloud meta list-files-for-upload to preview
  • Check rollout status - Pod timeout doesn't mean deployment failed

#10 Automation First

  • Automated pre-flight checks - Script verifies dist/, .gcloudignore, deployment name
  • Auto-upload optimization - .gcloudignore reduces files by 37%
  • Auto-retry on transient errors - Cloud Build handles network glitches
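
The 37% figure follows from the file counts reported earlier (13,698 before, 8,627 after). The following is a small worked calculation using integer shell arithmetic; the function name percent_reduction is illustrative.

```shell
# percent_reduction BEFORE AFTER
# Prints the reduction from BEFORE to AFTER as a whole percentage,
# truncated by integer division.
percent_reduction() {
  local before=$1 after=$2
  echo $(( ($before - $after) * 100 / $before ))
}

percent_reduction 13698 8627   # file count: prints 37
percent_reduction 1600 1500    # size in MB (1.6 GB to 1.5 GB): prints 6
```

The file count drops by about 37%, while the archive size drops by only about 6%, so most of the time saved comes from archiving and uploading fewer files rather than fewer bytes.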

Multi-Context Window Support

This skill supports long-running GCP deployment tasks across multiple context windows using Claude 4.5's enhanced state management capabilities.

State Tracking

Checkpoint State (JSON):

{
  "checkpoint_id": "gcp_deploy_20251129_151500",
  "deployment_target": "coditect-combined",
  "build_stages": [
    {"stage": "frontend_build", "status": "complete", "duration_sec": 21},
    {"stage": "theia_webpack", "status": "complete", "duration_sec": 420},
    {"stage": "docker_image", "status": "complete", "build_id": "abc123"},
    {"stage": "gke_deploy", "status": "in_progress", "pods_ready": "1/3"}
  ],
  "optimizations_applied": [
    ".gcloudignore created (8627 files vs 13698)",
    "E2_HIGHCPU_32 machine type",
    "NODE_OPTIONS=--max_old_space_size=8192"
  ],
  "upload_size_mb": 1500,
  "total_build_time_min": 12,
  "token_usage": 8000,
  "created_at": "2025-11-29T15:15:00Z"
}

Progress Notes (Markdown):

# GCP Cloud Build Progress - 2025-11-29

## Completed Stages
- Frontend build: ✅ 21 seconds (dist/ created, 1.3 MB)
- .gcloudignore optimization: ✅ Reduced upload from 13698 to 8627 files
- Theia webpack: ✅ 7 minutes (11 MB frontend + 9.34 MB backend bundles)
- Docker image build: ✅ Build ID abc123, pushed to Artifact Registry

## In Progress
- GKE deployment: Pods starting (1/3 ready)
- Rollout status: Image updated successfully
- Waiting for pod health checks

## Optimizations Applied
- Used E2_HIGHCPU_32 (32 CPUs for Theia parallel builds)
- Set NODE_OPTIONS to 8GB heap (prevents OOM)
- Explicit COPY file lists (no wildcards)
- Pre-built frontend locally (saves 2 min in Cloud Build)

## Next Actions
- Monitor pod rollout completion (2/3 pods remaining)
- Verify pod health with kubectl get pods
- Test deployed application endpoints
- Document successful build configuration

Session Recovery

When starting a fresh context window after GCP deployment work:

  1. Load Checkpoint State: Read .coditect/checkpoints/gcp-deploy-latest.json
  2. Review Progress Notes: Check gcp-deployment-progress.md for build stage status
  3. Check Cloud Build Status: gcloud builds list --limit=1
  4. Verify GKE Pod Status: kubectl get pods -n coditect-app
  5. Resume Operations: Continue from last completed build stage

Recovery Commands:

# 1. Check latest GCP deployment checkpoint
cat .coditect/checkpoints/gcp-deploy-latest.json | jq '.build_stages'

# 2. Review progress notes
tail -30 gcp-deployment-progress.md

# 3. Check Cloud Build status
gcloud builds list --limit=5

# 4. Verify GKE deployment
kubectl get deployment -n coditect-app
kubectl get pods -n coditect-app -l app=coditect-combined

# 5. Check image in registry
gcloud artifacts docker images list us-central1-docker.pkg.dev/PROJECT_ID/coditect

State Management Best Practices

Checkpoint Files (JSON Schema):

  • Store in .coditect/checkpoints/gcp-deploy-{timestamp}.json
  • Track build stages completed vs in-progress with duration metrics
  • Record Cloud Build IDs for troubleshooting and rollback
  • Include optimization flags applied (.gcloudignore, machine type, etc.)
  • Document GKE pod readiness status for deployment verification

Progress Tracking (Markdown Narrative):

  • Maintain gcp-deployment-progress.md with stage-level status
  • Document build optimization decisions (machine type, heap size, file exclusions)
  • Note build failures and resolution strategies applied
  • List pod health check results and rollout status
  • Track deployment configuration changes with file:line references

Git Integration:

  • Create checkpoint after each successful build stage
  • Commit cloudbuild.yaml changes with descriptive deployment tags
  • Use conventional commits: ci(gcp): Optimize Theia build with E2_HIGHCPU_32
  • Tag successful deployments: git tag deploy-combined-build-abc123

Progress Checkpoints

Natural Breaking Points:

  1. After frontend build completes locally
  2. After .gcloudignore optimization (pre-upload)
  3. After Docker image build and push succeed
  4. After GKE deployment image update
  5. After pod rollout verification completes

Checkpoint Creation Pattern:

# Automatic checkpoint creation at critical phases
# (pseudocode: create_checkpoint and the variables below stand in for the caller's own state)
if build_stages_complete > 0 or total_build_time_min > 5:
    create_checkpoint({
        "stages": build_stage_status,
        "optimizations": optimizations_list,
        "build_metrics": {
            "upload_mb": upload_size,
            "build_time_min": total_time,
        },
        "tokens": current_token_usage,
    })
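
A shell version of the same idea fits alongside the deployment commands. The following is a minimal sketch that writes a timestamped checkpoint plus a gcp-deploy-latest.json copy, following the file naming described under State Management Best Practices. The function name write_checkpoint and its argument order are hypothetical, and the values are not JSON-escaped, so it assumes they contain no quotes or backslashes.

```shell
# write_checkpoint DIR TARGET STAGE STATUS
# Writes DIR/gcp-deploy-<UTC timestamp>.json, copies it to
# DIR/gcp-deploy-latest.json, and prints the path of the timestamped file.
write_checkpoint() {
  local dir=$1 target=$2 stage=$3 status=$4
  local ts file
  ts=$(date -u +%Y%m%d_%H%M%S)
  file="$dir/gcp-deploy-$ts.json"
  mkdir -p "$dir"
  cat > "$file" <<EOF
{
  "checkpoint_id": "gcp_deploy_$ts",
  "deployment_target": "$target",
  "build_stages": [
    {"stage": "$stage", "status": "$status"}
  ],
  "created_at": "$(date -u +%Y-%m-%dT%H:%M:%SZ)"
}
EOF
  cp "$file" "$dir/gcp-deploy-latest.json"
  printf '%s\n' "$file"
}
```

Usage: `write_checkpoint .coditect/checkpoints coditect-combined docker_image complete` after the image push succeeds, so a later session can read gcp-deploy-latest.json and resume from that stage.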

Example: Multi-Context GCP Deployment

Context Window 1: Build Preparation + Docker Image

{
  "checkpoint_id": "gcp_docker_complete",
  "phase": "docker_image_built",
  "stages_complete": ["frontend_build", "gcloudignore_setup", "theia_webpack", "docker_build"],
  "build_id": "abc123",
  "image_pushed": true,
  "next_action": "Deploy to GKE and verify pods",
  "token_usage": 6000
}

Context Window 2: GKE Deployment + Verification

# Resume from checkpoint
cat .coditect/checkpoints/gcp_docker_complete.json

# Continue with GKE deployment
# (Context restored in 1 minute vs 10 minutes from scratch)

# Complete deployment and verification, then record the final checkpoint:
{
  "checkpoint_id": "gcp_deploy_complete",
  "phase": "deployment_verified",
  "all_pods_ready": true,
  "endpoints_tested": true,
  "rollout_success": true,
  "token_usage": 3000
}

Token Savings: 6000 (first context) + 3000 (second context) = 9000 total vs. 15000 without checkpoint = 40% reduction

See docs/CLAUDE-4.5-BEST-PRACTICES.md for complete multi-context patterns.


See Also