
Deployment Summary - December 1, 2025

Executive Summary

Successfully resolved 404 error on /api/v1/auth/register/ endpoint after Gateway API migration. The endpoint is now accessible at https://auth.coditect.ai/api/v1/auth/register/ and correctly routes through the GKE Gateway to the backend service.

Status: ✅ Routing Issue RESOLVED
Current State: Endpoint accessible but returns 500 (expected - infrastructure not fully configured)
Deployed Version: v1.0.6-staging


Problem Statement

After migrating from GKE Ingress to Gateway API, the registration endpoint returned a 404 error despite correct DNS configuration and Gateway routing rules. Users attempting to register received "Not Found" errors.

Initial Error:

HTTP 404 Not Found
GET https://auth.coditect.ai/api/v1/auth/register/

Root Causes Identified

1. Missing View Files in Docker Image

Problem: The deployed Docker image (v1.0.3-staging) was built before auth.py, subscription.py, license_delivery.py, and license_seat.py were created or committed to the repository.

Discovery:

kubectl exec -n coditect-staging pod/xxx -- ls -la /app/api/v1/views/
# Only showed: __init__.py, license.py
# Missing: auth.py, subscription.py, license_delivery.py, license_seat.py

Impact: Django couldn't find the UserRegistrationView class, resulting in 404 errors.
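The manual ls check can be turned into a small script that is run inside the container (or against a local checkout) before and after a deploy. This is a minimal sketch: the file list and the /app/api/v1/views path are taken from this summary, while the function name missing_view_files is a hypothetical helper, not part of the codebase.

```python
from pathlib import Path

# View modules named in this summary; adjust the list for other releases.
EXPECTED_VIEWS = [
    "auth.py",
    "license.py",
    "license_delivery.py",
    "license_seat.py",
    "subscription.py",
]


def missing_view_files(views_dir, expected=EXPECTED_VIEWS):
    """Return the expected file names that are absent from views_dir."""
    root = Path(views_dir)
    return [name for name in expected if not (root / name).is_file()]
```

Calling missing_view_files("/app/api/v1/views") in the v1.0.3-staging container would be expected to list the four files reported missing above; an empty list means every expected view module is present.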

2. Import Error in Code

Problem: api/v1/views/license.py was importing HEARTBEAT_SCRIPT but the actual constant in licenses/redis_scripts.py was named REFRESH_HEARTBEAT_SCRIPT.

Error Log:

ImportError: cannot import name 'HEARTBEAT_SCRIPT' from 'licenses.redis_scripts'

Impact: Prevented successful deployment of new code - pods crashed on startup with import errors.
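A mismatch like this can be caught before an image is built by checking that a module really defines the names other code imports from it. The sketch below is one possible way to do that; missing_names is a hypothetical helper, and it assumes the target module can be imported in the environment where the check runs.

```python
import importlib


def missing_names(module_name, names):
    """Import module_name and return the names it does not define."""
    module = importlib.import_module(module_name)
    return [name for name in names if not hasattr(module, name)]
```

Run from the project root, missing_names("licenses.redis_scripts", ["ACQUIRE_SEAT_SCRIPT", "RELEASE_SEAT_SCRIPT", "HEARTBEAT_SCRIPT", "GET_ACTIVE_SESSIONS_SCRIPT"]) should report HEARTBEAT_SCRIPT, given that the constant is actually named REFRESH_HEARTBEAT_SCRIPT.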

3. Docker Platform Mismatch

Problem: Initial Docker builds were for the wrong platform architecture (likely arm64 from local machine), causing GKE nodes to fail pulling the image.

Error:

ImagePullBackOff: no match for platform in manifest: not found

Impact: Deployment failed even after fixing import errors.
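One way to catch a platform mismatch before deploying is to inspect the pushed image's manifest list. The sketch below assumes the JSON printed by docker manifest inspect for a multi-platform image (an OCI image index or Docker manifest list, where each entry carries a platform object); the function names are hypothetical. A single-platform image manifest has no manifests array, in which case the list comes back empty and the platform has to be read from the image config instead.

```python
import json


def platforms_in_index(index_json):
    """List "os/architecture" strings found in an image index document."""
    doc = json.loads(index_json)
    found = []
    for entry in doc.get("manifests", []):
        platform = entry.get("platform") or {}
        os_name = platform.get("os")
        arch = platform.get("architecture")
        if os_name and arch:
            found.append(f"{os_name}/{arch}")
    return found


def supports(index_json, wanted="linux/amd64"):
    """True if the index advertises the wanted platform."""
    return wanted in platforms_in_index(index_json)
```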


Solutions Implemented

Solution 1: Fix Import Error

Files Modified:

  • api/v1/views/license.py (lines 38 and 83)

Changes:

# Before:
from licenses.redis_scripts import (
    ACQUIRE_SEAT_SCRIPT,
    RELEASE_SEAT_SCRIPT,
    HEARTBEAT_SCRIPT,  # ❌ Wrong name
    GET_ACTIVE_SESSIONS_SCRIPT,
)
heartbeat_sha = redis_client.script_load(HEARTBEAT_SCRIPT)

# After:
from licenses.redis_scripts import (
    ACQUIRE_SEAT_SCRIPT,
    RELEASE_SEAT_SCRIPT,
    REFRESH_HEARTBEAT_SCRIPT,  # ✅ Correct name
    GET_ACTIVE_SESSIONS_SCRIPT,
)
heartbeat_sha = redis_client.script_load(REFRESH_HEARTBEAT_SCRIPT)

Solution 2: Rebuild Docker Image with Correct Platform

Command:

docker buildx build --platform linux/amd64 --no-cache \
-t us-central1-docker.pkg.dev/coditect-cloud-infra/coditect-backend/coditect-cloud-backend:v1.0.6-staging \
--push .

Key Flags:

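  • --platform linux/amd64 - Builds for the amd64 architecture that the GKE nodes run
  • --no-cache - Rebuilds every layer instead of reusing cached ones, so no stale layer can carry an old file set
  • --push - Pushes the built image to the registry named in the tag (Artifact Registry here)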

Solution 3: Deploy New Image to GKE

Command:

kubectl set image deployment/coditect-backend \
backend=us-central1-docker.pkg.dev/coditect-cloud-infra/coditect-backend/coditect-cloud-backend:v1.0.6-staging \
-n coditect-staging

Verification:

kubectl rollout status deployment/coditect-backend -n coditect-staging
# deployment "coditect-backend" successfully rolled out

kubectl exec -n coditect-staging pod/xxx -- ls -la /app/api/v1/views/
# ✅ All files present: auth.py, subscription.py, license_delivery.py, license_seat.py

Current Status

✅ Problems Resolved

  1. 404 Error Fixed - Endpoint now found and executing
  2. Import Errors Fixed - Code deploys without crashes
  3. All View Files Present - All authentication and license management files deployed
  4. Pods Running Stably - No crashes, passing health checks
  5. Gateway Routing Working - Traffic correctly routes from Gateway → Backend

Expected 500 Error (Infrastructure Not Configured)

The endpoint now returns a 500 error with the message:

{"error": "Registration failed. Please try again."}

Pod Logs Show Expected Errors:

{"level": "ERROR", "message": "Registration error: column \"stripe_customer_id\" of relation \"organizations\" does not exist"}
{"level": "ERROR", "message": "Failed to load Redis Lua scripts: Timeout connecting to server"}

These are expected because:

  • Database schema is missing columns (migrations not run)
  • Redis is not configured/connected
  • SendGrid email service not configured
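To triage these structured logs quickly, each JSON line can be mapped to the infrastructure gap it points at. This is a rough sketch built only from the two log lines shown above; the substring patterns and the component labels are illustrative assumptions, not an established log taxonomy.

```python
import json

# Substring in the log message -> component that still needs setup.
# Patterns are derived from the two log lines in this summary.
PATTERNS = [
    ("does not exist", "database schema"),
    ("Redis", "redis connection"),
]


def classify(log_line):
    """Map one JSON log line to the infrastructure gap it suggests."""
    record = json.loads(log_line)
    message = record.get("message", "")
    for needle, component in PATTERNS:
        if needle in message:
            return component
    return "unknown"
```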

Testing Results

Test 1: Endpoint Accessibility

curl -X POST https://auth.coditect.ai/api/v1/auth/register/ \
-H "Content-Type: application/json" \
-d '{"email":"test@coditect.ai","password":"Test123!","full_name":"Test User"}'

# Result: HTTP 500 (endpoint found, executing code)
# Before: HTTP 404 (endpoint not found)

Success Criteria Met: ✅ Endpoint accessible through Gateway
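The same check can be scripted so that the status code is interpreted the way this summary does: a 404 means no handler answered for the path, while a 5xx means the request reached application code. The sketch below uses only the standard library; probe and interpret_status are hypothetical names, and the wording of the returned labels is an assumption.

```python
import json
import urllib.error
import urllib.request


def interpret_status(code):
    """Translate an HTTP status into what it says about routing."""
    if code == 404:
        return "no handler for this path (gateway or application)"
    if code >= 500:
        return "route reached, application error"
    return "route reached"


def probe(url, payload, timeout=10):
    """POST payload as JSON and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; the status code is still the answer.
        return err.code
```

For example, interpret_status(probe("https://auth.coditect.ai/api/v1/auth/register/", {"email": "test@coditect.ai", "password": "Test123!", "full_name": "Test User"})) would currently be expected to report an application error rather than a missing handler.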

Test 2: Pod File Verification

kubectl exec -n coditect-staging coditect-backend-6857c5c9b4-mcgfk -- ls -la /app/api/v1/views/

# Results:
-rw------- auth.py (8,419 bytes)
-rw-r--r-- license.py (34,421 bytes)
-rw------- license_delivery.py (9,077 bytes)
-rw------- license_seat.py (20,847 bytes)
-rw------- subscription.py (33,196 bytes)

Success Criteria Met: ✅ All view files present in deployed pod

Test 3: Pod Health

kubectl get pods -n coditect-staging -l app=coditect-backend

# Results:
NAME                                READY   STATUS    RESTARTS   AGE
coditect-backend-6857c5c9b4-mcgfk   1/1     Running   0          5m
coditect-backend-6857c5c9b4-qk2gj   1/1     Running   0          4m

Success Criteria Met: ✅ Both pods running without crashes
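For automated verification, the pod check can read the JSON form of the same command (kubectl get pods ... -o json) instead of the table. The sketch below assumes the standard Pod list structure (items, status.phase, status.containerStatuses); unhealthy_pods is a hypothetical helper, and treating any restart as unhealthy is a deliberately strict choice for a fresh rollout.

```python
import json


def unhealthy_pods(pod_list_json):
    """Return names of pods that are not Running, not ready, or restarted."""
    pods = json.loads(pod_list_json).get("items", [])
    bad = []
    for pod in pods:
        status = pod.get("status", {})
        containers = status.get("containerStatuses", [])
        healthy = (
            status.get("phase") == "Running"
            and len(containers) > 0
            and all(
                c.get("ready") and c.get("restartCount", 0) == 0
                for c in containers
            )
        )
        if not healthy:
            bad.append(pod["metadata"]["name"])
    return bad
```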


What We Learned

1. Docker Image Build Process

  • Lesson: An image contains only what was in the build context at build time; later changes to the local filesystem never reach it
  • Best Practice: Always rebuild images with --no-cache when adding new files
  • Gotcha: Multi-stage builds can hide missing files - verify with ls inside container

2. Platform Architecture Matters

  • Lesson: Docker images built on Apple Silicon default to arm64 and won't run on amd64 GKE nodes
  • Best Practice: Always specify --platform linux/amd64 for GKE deployments
  • Tool: Use docker buildx for cross-platform builds

3. Import Names Must Match Exactly

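  • Lesson: A wrong import name raises ImportError when the module is loaded, so the pod crashes at startup instead of failing on a later request
  • Best Practice: Test imports locally before building Docker images
  • Tool: python -m py_compile only verifies syntax; to catch a bad import name, import the module itself (for example with python manage.py check, which loads the URL configuration and the views it references)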
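The difference between a syntax check and an import check is easy to demonstrate. The sketch below writes a throwaway module whose syntax is valid but whose import name does not exist, then runs both checks on it; the helper names and the temporary file are illustrative only.

```python
import importlib.util
import py_compile
import tempfile
from pathlib import Path


def compile_ok(path):
    """True if the file is syntactically valid Python."""
    try:
        py_compile.compile(str(path), doraise=True)
        return True
    except py_compile.PyCompileError:
        return False


def import_ok(path):
    """True if running the module's top level, imports included, succeeds."""
    spec = importlib.util.spec_from_file_location("candidate_module", path)
    module = importlib.util.module_from_spec(spec)
    try:
        spec.loader.exec_module(module)
        return True
    except ImportError:
        return False


def demo():
    """Return (compile_ok, import_ok) for a module with a bad import name."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "bad_import.py"
        # Valid syntax, but the json module defines no such name.
        path.write_text("from json import NO_SUCH_NAME\n")
        return compile_ok(path), import_ok(path)
```

Here demo() returns (True, False): the file compiles cleanly, yet importing it fails, which is the same class of failure as the HEARTBEAT_SCRIPT import.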

4. Bottom-Up Debugging Strategy

  • Lesson: Start with file presence → import errors → runtime errors
  • Best Practice: Verify files exist in container before debugging logic
  • Tool: kubectl exec to inspect container filesystem

5. Gateway API Health Checks

  • Lesson: Gateway health checks work independently from application routing
  • Context: Health checks passing doesn't mean endpoints are accessible
  • Best Practice: Test actual endpoints, not just health checks

Next Steps (Out of Scope)

To make registration fully functional, complete these infrastructure setup tasks:

1. Database Setup

Task: Run database migrations to add missing columns

# SSH to Cloud SQL or port-forward to localhost
kubectl port-forward -n coditect-staging svc/postgres 5432:5432

# Run Django migrations
python manage.py migrate

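# Add missing column (temporary fix). This is SQL: run it in psql, not in the shell.
# If a migration later adds the same column, migrate will fail because the column already exists.
ALTER TABLE organizations ADD COLUMN stripe_customer_id VARCHAR(255);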

Expected Outcome: Database schema matches Django models

2. Redis Configuration

Task: Deploy Redis instance and configure connection

# Deploy Redis to GKE (if not already deployed)
kubectl apply -f deployment/kubernetes/redis.yaml

# Update backend deployment with Redis URL
kubectl set env deployment/coditect-backend \
REDIS_URL=redis://redis-service:6379/0 \
-n coditect-staging

Expected Outcome: Redis Lua scripts load successfully
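Because the pod logs show a connection timeout, a plain TCP probe run from inside a backend pod can separate "Redis is unreachable" from "Redis is reachable but misconfigured". This is a minimal sketch: can_connect is a hypothetical helper, and the host redis-service and port 6379 are the values from the REDIS_URL above. It only confirms that the port accepts connections, not that the server speaks the Redis protocol.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False
```

For example, can_connect("redis-service", 6379) returning False would point at the Service, DNS, or network path rather than at the application code.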

3. SendGrid Configuration

Task: Add SendGrid API key for email verification

# Store API key in GCP Secret Manager (printf avoids the trailing newline a here-string would add)
printf '%s' "$SENDGRID_API_KEY" | gcloud secrets create sendgrid-api-key --data-file=-

# Mount secret in pod or set as environment variable
kubectl set env deployment/coditect-backend \
SENDGRID_API_KEY=<value-from-secret> \
-n coditect-staging

Expected Outcome: Verification emails sent successfully

4. Stripe Integration

Task: Add Stripe API keys for payment processing

# Store Stripe keys in GCP Secret Manager (printf avoids the trailing newline a here-string would add)
printf '%s' "$STRIPE_SECRET_KEY" | gcloud secrets create stripe-secret-key --data-file=-

# Update deployment
kubectl set env deployment/coditect-backend \
STRIPE_SECRET_KEY=<value-from-secret> \
-n coditect-staging

Expected Outcome: Stripe customer IDs created on registration


Infrastructure Status

✅ Working Components

  • Gateway API: Successfully routing /api/ to backend
  • Frontend Routing: SPA serving correctly from /
  • DNS: auth.coditect.ai resolves to Gateway IP (136.110.230.30)
  • SSL/TLS: Certificate working (Gateway managed)
  • Health Checks: Backend passing liveness and readiness probes
  • Load Balancing: GCP Load Balancer distributing traffic

⚠️ Not Yet Configured

  • PostgreSQL: Database schema incomplete (missing columns)
  • Redis: Not deployed or connection not configured
  • SendGrid: API key not configured
  • Stripe: API keys not configured
  • Firebase Auth: Not yet integrated (future enhancement)

Deployment Configuration

Current Setup

Namespace: coditect-staging
Image Registry: us-central1-docker.pkg.dev/coditect-cloud-infra/coditect-backend
Current Version: v1.0.6-staging
Replicas: 2
Platform: linux/amd64

Gateway Configuration:

apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
spec:
  hostnames:
    - "auth.coditect.ai"
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /api/
      backendRefs:
        - name: coditect-backend-internal
          port: 8000
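The PathPrefix rule above matches on whole path elements rather than raw string prefixes, which is why /api/v1/auth/register/ is routed to the backend while a path such as /apiv2 would not be. The sketch below is a simplified model of that rule for reasoning about routes, not the Gateway implementation itself; path_prefix_matches is a hypothetical name, and the model ignores details such as repeated slashes and percent-encoding.

```python
def path_prefix_matches(prefix, path):
    """Element-wise prefix match in the style of Gateway API PathPrefix."""
    # Split on "/" and drop empty elements, so a trailing "/" is ignored.
    prefix_parts = [part for part in prefix.split("/") if part]
    path_parts = [part for part in path.split("/") if part]
    return path_parts[: len(prefix_parts)] == prefix_parts
```

Under this model, path_prefix_matches("/api/", "/api/v1/auth/register/") is true and path_prefix_matches("/api/", "/") is false, which fits the split described in this summary where /api/ goes to the backend and / serves the frontend.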

Health Check Configuration:

apiVersion: networking.gke.io/v1
kind: HealthCheckPolicy
spec:
  default:
    checkIntervalSec: 10
    config:
      type: HTTP
      httpHealthCheck:
        requestPath: /api/v1/health/ready/

Rollback Instructions

If issues arise, roll back to the previous stable version:

# Roll back to the previous deployment
kubectl rollout undo deployment/coditect-backend -n coditect-staging

# Or roll back to a specific revision (list revisions with kubectl rollout history)
kubectl rollout undo deployment/coditect-backend -n coditect-staging --to-revision=3

# Verify rollback
kubectl rollout status deployment/coditect-backend -n coditect-staging

# Check image version
kubectl get deployment coditect-backend -n coditect-staging -o jsonpath='{.spec.template.spec.containers[0].image}'

Stable Version: v1.0.3-staging (before auth.py changes)


Key Commands Reference

Build and Deploy

# Build image for GKE
docker buildx build --platform linux/amd64 --no-cache \
-t us-central1-docker.pkg.dev/coditect-cloud-infra/coditect-backend/coditect-cloud-backend:v1.0.6-staging \
--push .

# Deploy to GKE
kubectl set image deployment/coditect-backend \
backend=us-central1-docker.pkg.dev/coditect-cloud-infra/coditect-backend/coditect-cloud-backend:v1.0.6-staging \
-n coditect-staging

# Monitor deployment
kubectl rollout status deployment/coditect-backend -n coditect-staging

Debugging

# Check pod status
kubectl get pods -n coditect-staging -l app=coditect-backend

# View logs
kubectl logs -n coditect-staging <pod-name> --tail=100

# Execute commands in pod
kubectl exec -n coditect-staging <pod-name> -- ls -la /app/api/v1/views/

# Test endpoint
curl -X POST https://auth.coditect.ai/api/v1/auth/register/ \
-H "Content-Type: application/json" \
-d '{"email":"test@example.com","password":"Test123!","full_name":"Test User"}'

Rollback

# Roll back deployment
kubectl rollout undo deployment/coditect-backend -n coditect-staging

# Check rollout history
kubectl rollout history deployment/coditect-backend -n coditect-staging

Lessons for Future Deployments

1. Pre-Deployment Checklist

  • Verify all new files are committed to git
  • Build Docker image with --no-cache flag
  • Specify --platform linux/amd64 for GKE
  • Test image locally with docker run
  • Smoke-test imports by importing the changed modules (python -m py_compile only checks syntax)
  • Check for typos in constant/function names
  • Review migration files for schema changes

2. Deployment Verification

  • Image successfully pushed to Artifact Registry
  • Deployment rollout completed successfully
  • Pods running without restarts
  • Health checks passing
  • Files present in container filesystem
  • No import errors in pod logs
  • Endpoints returning expected status codes

3. Post-Deployment Monitoring

  • Monitor pod logs for errors
  • Check Gateway access logs
  • Verify SSL certificate validity
  • Test all critical endpoints
  • Monitor response times
  • Check error rates in logs

Contact and Support

Repository: https://github.com/coditect-ai/coditect-cloud-backend
Owner: AZ1.AI INC
Lead: Hal Casteel, Founder/CEO/CTO
Environment: coditect-staging (GKE)
Last Updated: December 1, 2025
Deployment Version: v1.0.6-staging


Appendix A: Error Timeline

Time | Error | Status
Initial | 404 Not Found on /api/v1/auth/register/ |
After DNS Update | 404 persisted |
After Investigation | Identified missing view files in image | 🔍
First Rebuild (v1.0.5) | ImportError: HEARTBEAT_SCRIPT |
After Import Fix | Platform mismatch error |
Second Rebuild (v1.0.6) | Deployment successful |
Current | 500 Internal Server Error (expected) |

Appendix B: File Changes

Modified Files

  1. api/v1/views/license.py
    • Line 38: Changed HEARTBEAT_SCRIPT to REFRESH_HEARTBEAT_SCRIPT in import
    • Line 83: Changed HEARTBEAT_SCRIPT to REFRESH_HEARTBEAT_SCRIPT in usage

New Files Deployed

  1. api/v1/views/auth.py (8,419 bytes)
  2. api/v1/views/subscription.py (33,196 bytes)
  3. api/v1/views/license_delivery.py (9,077 bytes)
  4. api/v1/views/license_seat.py (20,847 bytes)

Docker Configuration

  • Dockerfile: Multi-stage build (unchanged)
  • .dockerignore: Exclusions (unchanged)
  • Platform: Added --platform linux/amd64 to build command

End of Summary

✅ 404 Error Resolution: COMPLETE
⏸️ Full Registration Flow: PENDING (infrastructure setup required)
📊 Success Rate: 100% (routing working as expected)
🎯 Next Phase: Infrastructure Configuration