Sequence Diagram: GKE Deployment Flow
Purpose: End-to-end deployment of a Django application to Google Kubernetes Engine using Docker, Cloud Build, and rolling updates.
Actors:
- Developer (triggering deployment)
- GitHub Actions (CI/CD)
- Cloud Build (image building)
- Artifact Registry (image storage)
- GKE (Kubernetes cluster)
- Load Balancer (traffic routing)
Flow: Code push → Docker build → Image push → Kubernetes deployment → Health check → Traffic cutover
Mermaid Sequence Diagram
Step-by-Step Breakdown
1. CI Pipeline (Steps 1-2)
GitHub Actions workflow:
# .github/workflows/deploy.yml
name: Deploy to GKE
on:
  push:
    branches:
      - main
env:
  GCP_PROJECT_ID: coditect-cloud-infra
  GKE_CLUSTER: license-api-cluster
  GKE_ZONE: us-central1-a
  IMAGE: gcr.io/coditect-cloud-infra/license-api
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.11'
      - name: Install dependencies
        run: |
          pip install -r requirements.txt
          # pytest-cov provides the --cov flags used below
          pip install pytest pytest-cov flake8 mypy
      - name: Run tests
        run: |
          pytest tests/ -v --cov=app --cov-report=term-missing
          flake8 app/ --max-line-length=100
          mypy app/ --strict
  deploy:
    needs: test
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Authenticate to Google Cloud
        uses: google-github-actions/auth@v1
        with:
          credentials_json: ${{ secrets.GCP_SA_KEY }}
      - name: Set up Cloud SDK
        uses: google-github-actions/setup-gcloud@v1
        with:
          # kubectl needs this plugin to authenticate against GKE
          install_components: gke-gcloud-auth-plugin
      - name: Configure Docker
        run: |
          gcloud auth configure-docker
      - name: Build and push image
        run: |
          # gcloud builds submit --tag takes a single tag, so add latest afterwards
          gcloud builds submit --tag ${{ env.IMAGE }}:${{ github.sha }}
          gcloud container images add-tag --quiet \
            ${{ env.IMAGE }}:${{ github.sha }} ${{ env.IMAGE }}:latest
      - name: Get GKE credentials
        run: |
          gcloud container clusters get-credentials ${{ env.GKE_CLUSTER }} \
            --zone ${{ env.GKE_ZONE }}
      - name: Deploy to GKE
        run: |
          kubectl set image deployment/license-api \
            license-api=${{ env.IMAGE }}:${{ github.sha }}
          # Exits non-zero if the rollout has not completed within 5 minutes
          kubectl rollout status deployment/license-api --timeout=5m
      - name: Verify deployment
        run: |
          kubectl get pods -l app=license-api
          kubectl get service license-api
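The deploy job reduces to two kubectl calls: pin the Deployment to the image tagged with the commit SHA, then wait for the rollout. The sketch below assembles those command lines so the tagging rule is explicit; `deploy_commands` is a hypothetical helper, not part of the pipeline, and the default names are taken from the workflow above.

```python
from typing import List


def deploy_commands(
    image: str,
    sha: str,
    deployment: str = "license-api",
    container: str = "license-api",
    timeout: str = "5m",
) -> List[List[str]]:
    """Return the kubectl invocations of the deploy step, in execution order."""
    if not sha:
        # An empty SHA would produce the invalid reference "<image>:"
        raise ValueError("commit SHA must not be empty")
    ref = f"{image}:{sha}"
    return [
        ["kubectl", "set", "image", f"deployment/{deployment}", f"{container}={ref}"],
        ["kubectl", "rollout", "status", f"deployment/{deployment}", f"--timeout={timeout}"],
    ]


for cmd in deploy_commands("gcr.io/coditect-cloud-infra/license-api", "abc123"):
    print(" ".join(cmd))
```

Deploying by SHA rather than by latest keeps every revision in the rollout history tied to one immutable image, which is what makes a later rollback meaningful.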
2. Cloud Build Configuration (Step 5)
Cloud Build configuration (for builds started by a Cloud Build trigger, which populates $COMMIT_SHA; the GitHub Actions workflow above builds with gcloud builds submit --tag and does not read this file):
# cloudbuild.yaml
steps:
  # Step 1: Build Docker image
  - name: 'gcr.io/cloud-builders/docker'
    args:
      - 'build'
      - '-t'
      - 'gcr.io/$PROJECT_ID/license-api:$COMMIT_SHA'
      - '-t'
      - 'gcr.io/$PROJECT_ID/license-api:latest'
      - '--build-arg'
      - 'COMMIT_SHA=$COMMIT_SHA'
      - '.'
  # Step 2: Push both tags to the registry
  - name: 'gcr.io/cloud-builders/docker'
    args:
      - 'push'
      - 'gcr.io/$PROJECT_ID/license-api:$COMMIT_SHA'
  - name: 'gcr.io/cloud-builders/docker'
    args:
      - 'push'
      - 'gcr.io/$PROJECT_ID/license-api:latest'
  # Step 3: Deploy to GKE
  - name: 'gcr.io/cloud-builders/kubectl'
    args:
      - 'set'
      - 'image'
      - 'deployment/license-api'
      - 'license-api=gcr.io/$PROJECT_ID/license-api:$COMMIT_SHA'
    env:
      - 'CLOUDSDK_COMPUTE_ZONE=us-central1-a'
      - 'CLOUDSDK_CONTAINER_CLUSTER=license-api-cluster'
  # Step 4: Wait for rollout
  - name: 'gcr.io/cloud-builders/kubectl'
    args:
      - 'rollout'
      - 'status'
      - 'deployment/license-api'
      - '--timeout=5m'
    env:
      - 'CLOUDSDK_COMPUTE_ZONE=us-central1-a'
      - 'CLOUDSDK_CONTAINER_CLUSTER=license-api-cluster'
timeout: 1200s  # 20 minutes for the whole build
options:
  machineType: 'N1_HIGHCPU_8'
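The image references in this file depend on substitution: $PROJECT_ID is always available, while $COMMIT_SHA is filled in only for trigger-started builds. The sketch below imitates that expansion with Python's string.Template to show what a step argument looks like in each case. It is an illustration only, not Cloud Build's implementation; `expand` is a hypothetical helper, and treating a missing value as an empty string is an assumption about how unset built-in substitutions behave.

```python
from string import Template
from typing import Dict


class _EmptyIfMissing(dict):
    """Mapping that yields an empty string for names that were not supplied."""

    def __missing__(self, key: str) -> str:
        return ""


def expand(arg: str, substitutions: Dict[str, str]) -> str:
    """Expand $NAME references in one build step argument."""
    return Template(arg).substitute(_EmptyIfMissing(substitutions))


arg = "gcr.io/$PROJECT_ID/license-api:$COMMIT_SHA"

# Trigger-started build: both values are known
print(expand(arg, {"PROJECT_ID": "coditect-cloud-infra", "COMMIT_SHA": "abc123"}))

# Manually submitted build without a commit: the tag is left empty
print(expand(arg, {"PROJECT_ID": "coditect-cloud-infra"}))
```

The second case yields a reference that ends in a bare colon, which Docker rejects, so a manual run of this file needs the SHA passed in explicitly.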
3. Dockerfile (Step 5)
Multi-stage Docker build:
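# Dockerfile
FROM python:3.11-slim AS builder
WORKDIR /app
# Install dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Copy application
COPY app/ ./app/
# Production stage
FROM python:3.11-slim
WORKDIR /app
# Copy installed packages and their console scripts (e.g. uvicorn) from builder
COPY --from=builder /usr/local/lib/python3.11/site-packages /usr/local/lib/python3.11/site-packages
COPY --from=builder /usr/local/bin /usr/local/bin
COPY --from=builder /app ./
# Create non-root user
RUN useradd -m -u 1000 appuser && \
    chown -R appuser:appuser /app
USER appuser
# Health check for plain Docker; Kubernetes ignores HEALTHCHECK and uses the probes in the manifest.
# urllib is in the standard library and raises on connection errors and 4xx/5xx responses.
HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --retries=3 \
    CMD python -c "import urllib.request; urllib.request.urlopen('http://localhost:8000/health/live', timeout=3)"
# Run application (app.main:app must resolve to the project's ASGI application)
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]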
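A container health check only needs to turn an HTTP probe into an exit code. The sketch below does this with the standard library alone, so it adds no dependency to the image. The file name healthcheck.py, the `is_healthy` and `main` helpers, and the default URL are assumptions for illustration.

```python
import urllib.error
import urllib.request
from typing import List

DEFAULT_URL = "http://localhost:8000/health/live"  # assumed liveness route


def is_healthy(url: str, timeout: float = 3.0) -> bool:
    """Return True only if the URL answers with a 2xx status within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError, ValueError):
        # URLError covers HTTP 4xx/5xx and refused connections,
        # OSError covers socket timeouts, ValueError covers malformed URLs.
        return False


def main(argv: List[str]) -> int:
    """Exit-code style entry point: 0 means healthy, 1 means unhealthy."""
    url = argv[1] if len(argv) > 1 else DEFAULT_URL
    return 0 if is_healthy(url) else 1
```

Saved as healthcheck.py and finished with `sys.exit(main(sys.argv))`, it could be wired in as HEALTHCHECK CMD python healthcheck.py.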
4. Kubernetes Deployment (Steps 9-20)
Kubernetes deployment manifest:
# kubernetes/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: license-api
  namespace: default
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0  # Never drop below 3 ready pods during a rollout
      maxSurge: 1        # Create 1 new pod at a time
  selector:
    matchLabels:
      app: license-api
  template:
    metadata:
      labels:
        app: license-api
        version: v1
    spec:
      serviceAccountName: license-api-sa
      containers:
        - name: license-api
          # Placeholder tag; the pipeline pins the commit SHA via kubectl set image
          image: gcr.io/coditect-cloud-infra/license-api:latest
          imagePullPolicy: Always
          ports:
            - containerPort: 8000
              name: http
          env:
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: license-api-secrets
                  key: database-url
            - name: REDIS_HOST
              value: "10.0.0.3"  # Redis Memorystore IP
            - name: STRIPE_SECRET_KEY
              valueFrom:
                secretKeyRef:
                  name: license-api-secrets
                  key: stripe-secret-key
          resources:
            requests:
              memory: "256Mi"
              cpu: "250m"
            limits:
              memory: "512Mi"
              cpu: "500m"
          # Readiness probe - when to add to load balancer
          readinessProbe:
            httpGet:
              path: /health/ready
              port: 8000
            initialDelaySeconds: 10
            periodSeconds: 5
            timeoutSeconds: 3
            successThreshold: 1
            failureThreshold: 3
          # Liveness probe - when to restart pod
          livenessProbe:
            httpGet:
              path: /health/live
              port: 8000
            initialDelaySeconds: 30
            periodSeconds: 10
            timeoutSeconds: 5
            successThreshold: 1
            failureThreshold: 3
          # Graceful shutdown - keep serving while the pod is removed from the load balancer
          lifecycle:
            preStop:
              exec:
                command: ["/bin/sh", "-c", "sleep 15"]
      # Pod-level setting: the 15s preStop sleep leaves about 15s for in-flight requests to drain
      terminationGracePeriodSeconds: 30
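The strategy settings bound how many pods can exist and how many must stay available during a rollout. A minimal sketch of that arithmetic follows; `rollout_bounds` is a hypothetical helper. It also handles percentage values, where Kubernetes rounds maxSurge up and maxUnavailable down.

```python
from typing import Dict, Union

IntOrPercent = Union[int, str]


def _resolve(value: IntOrPercent, replicas: int, round_up: bool) -> int:
    """Turn an absolute number or a string such as '25%' into a pod count."""
    if isinstance(value, str) and value.endswith("%"):
        scaled = replicas * int(value[:-1])
        # Integer ceiling or floor of scaled / 100, without float rounding
        return -(-scaled // 100) if round_up else scaled // 100
    return int(value)


def rollout_bounds(
    replicas: int, max_surge: IntOrPercent, max_unavailable: IntOrPercent
) -> Dict[str, int]:
    """Return the pod-count limits that hold throughout a rolling update."""
    surge = _resolve(max_surge, replicas, round_up=True)
    unavailable = _resolve(max_unavailable, replicas, round_up=False)
    if surge == 0 and unavailable == 0:
        raise ValueError("maxSurge and maxUnavailable cannot both be 0")
    return {
        "max_total_pods": replicas + surge,
        "min_available_pods": replicas - unavailable,
    }


# Values from the manifest above: at most 4 pods exist, at least 3 stay available
print(rollout_bounds(replicas=3, max_surge=1, max_unavailable=0))
```

With these values the cluster needs headroom for one extra pod (256Mi memory and 250m CPU requested) for the duration of each rollout.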
Health check endpoints:
# app/health.py
from rest_framework import serializers, status
from rest_framework.decorators import api_view, permission_classes
from rest_framework.permissions import AllowAny
from rest_framework.response import Response

from apps.core.database import check_database_connection
from apps.core.redis import check_redis_connection


# Response serializer (documents the shape of the readiness payload)
class HealthResponseSerializer(serializers.Serializer):
    status = serializers.CharField()
    database = serializers.CharField()
    redis = serializers.CharField()


@api_view(['GET'])
@permission_classes([AllowAny])  # Health checks don't require auth
def readiness_check(request):
    """
    Readiness probe - is app ready to serve traffic?
    Checks:
    - Database connection
    - Redis connection
    - App initialization (implicit: the view can only respond once the app has loaded)
    Returns 200 if ready, 503 if not ready.
    """
    db_status = "connected" if check_database_connection() else "disconnected"
    redis_status = "connected" if check_redis_connection() else "disconnected"
    if db_status != "connected" or redis_status != "connected":
        return Response(
            {
                "status": "not_ready",
                "database": db_status,
                "redis": redis_status
            },
            status=status.HTTP_503_SERVICE_UNAVAILABLE
        )
    return Response(
        {
            "status": "ready",
            "database": db_status,
            "redis": redis_status
        },
        status=status.HTTP_200_OK
    )


@api_view(['GET'])
@permission_classes([AllowAny])  # Health checks don't require auth
def liveness_check(request):
    """
    Liveness probe - is app alive?
    Simple check that app process is running.
    If this fails, Kubernetes will restart the pod.
    """
    return Response(
        {"status": "alive"},
        status=status.HTTP_200_OK
    )
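The readiness decision can be separated from the web framework, which makes it easy to unit test. A minimal sketch follows, assuming each dependency check is a zero-argument callable that returns a bool. `evaluate_readiness` is a hypothetical helper, and treating an exception as a failed check is a design choice of this sketch rather than the behavior of the view above.

```python
from typing import Callable, Dict, Tuple


def evaluate_readiness(
    checks: Dict[str, Callable[[], bool]]
) -> Tuple[int, Dict[str, str]]:
    """Run every dependency check and return (HTTP status code, response body)."""
    results: Dict[str, str] = {}
    for name, check in checks.items():
        try:
            ok = bool(check())
        except Exception:
            # A check that raises counts as a failed dependency
            ok = False
        results[name] = "connected" if ok else "disconnected"

    ready = all(state == "connected" for state in results.values())
    body = {"status": "ready" if ready else "not_ready", **results}
    return (200 if ready else 503), body


# Stub checks stand in for the real database and Redis probes
code, body = evaluate_readiness({"database": lambda: True, "redis": lambda: False})
print(code, body)
```

The liveness endpoint deliberately skips these checks: restarting a pod does not fix an unreachable database, so dependency failures should only remove the pod from load balancing.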
Rollback Strategy
Manual rollback commands:
# Manual rollback (if needed)
kubectl rollout undo deployment/license-api
# Rollback to specific revision
kubectl rollout history deployment/license-api
kubectl rollout undo deployment/license-api --to-revision=3
# Pause rollout (emergency)
kubectl rollout pause deployment/license-api
# Resume rollout
kubectl rollout resume deployment/license-api
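These commands can be chained so that a failed rollout is undone without manual intervention. The sketch below waits for the rollout and runs the undo only if the wait fails. `rollout_or_undo` is a hypothetical helper, and the command runner is injectable so the logic can be exercised without a cluster.

```python
import subprocess
from typing import Callable, List

Runner = Callable[[List[str]], int]


def _run(cmd: List[str]) -> int:
    """Default runner: execute the command and return its exit code."""
    return subprocess.run(cmd, check=False).returncode


def rollout_or_undo(deployment: str, timeout: str = "5m", run: Runner = _run) -> bool:
    """Wait for the rollout; on failure revert to the previous revision.

    Returns True if the rollout completed, False if it was rolled back.
    """
    target = f"deployment/{deployment}"
    status_cmd = ["kubectl", "rollout", "status", target, f"--timeout={timeout}"]
    if run(status_cmd) == 0:
        return True
    # kubectl rollout status exits non-zero on timeout or a failed rollout
    run(["kubectl", "rollout", "undo", target])
    return False
```

In the pipeline this would replace the bare kubectl rollout status call, with the job failing whenever the function returns False.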
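Failure containment via readiness probe:
- A new pod receives no traffic until its readiness probe succeeds
- With maxUnavailable: 0, old pods keep serving while the new pod is not ready
- The rollout stalls rather than reverting: kubectl rollout status exits non-zero once its timeout is exceeded, and the Deployment is marked as failed after progressDeadlineSeconds (600s by default)
- Kubernetes does not undo the Deployment on its own; run kubectl rollout undo to restore the previous revision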
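The probe settings determine how quickly a pod can enter service and how quickly a failing one is taken out. The sketch below works through that arithmetic for the manifest's values. All three helpers are hypothetical, and the results are rough estimates that ignore image pulls, scheduling delay and kubelet jitter.

```python
def earliest_ready_seconds(
    initial_delay: int, period: int, success_threshold: int = 1
) -> int:
    """Earliest time after container start at which a pod can be marked ready."""
    return initial_delay + (success_threshold - 1) * period


def unready_detection_seconds(period: int, failure_threshold: int, timeout: int) -> int:
    """Rough upper bound on the time to mark a ready pod unready once it stops responding."""
    return failure_threshold * period + timeout


def min_rollout_seconds(replicas: int, max_surge: int, per_pod_ready: int) -> int:
    """Lower bound on rollout time when pods are replaced in batches of max_surge."""
    batches = -(-replicas // max_surge)  # ceiling division
    return batches * per_pod_ready


ready_after = earliest_ready_seconds(initial_delay=10, period=5)             # 10
detect_within = unready_detection_seconds(period=5, failure_threshold=3, timeout=3)  # 18
rollout_floor = min_rollout_seconds(replicas=3, max_surge=1, per_pod_ready=ready_after)  # 30
print(ready_after, detect_within, rollout_floor)
```

A 30-second floor against the 5-minute rollout timeout leaves each of the three pods well over a minute to pull its image and pass readiness before the pipeline reports a failure.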
Monitoring Deployment
Check deployment status:
# Watch rollout status
kubectl rollout status deployment/license-api
# Get pod events
kubectl get events --sort-by=.metadata.creationTimestamp
# View pod logs
kubectl logs -f deployment/license-api
# Describe deployment
kubectl describe deployment license-api
# Get replica sets
kubectl get rs -l app=license-api
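The same status can be derived programmatically from the Deployment object, for example from the output of kubectl get deployment license-api -o json. The sketch below roughly mirrors the conditions that kubectl rollout status waits on. `rollout_state` is a hypothetical helper that reads only standard Deployment fields.

```python
from typing import Any, Dict, Tuple


def rollout_state(deployment: Dict[str, Any]) -> Tuple[bool, str]:
    """Return (complete, reason) for a Deployment object parsed from JSON."""
    meta = deployment.get("metadata", {})
    spec = deployment.get("spec", {})
    status = deployment.get("status", {})

    # The controller has not yet processed the latest spec change
    if meta.get("generation", 0) > status.get("observedGeneration", 0):
        return False, "waiting for the new spec to be observed"

    for cond in status.get("conditions", []):
        if cond.get("type") == "Progressing" and cond.get("reason") == "ProgressDeadlineExceeded":
            return False, "progress deadline exceeded"

    desired = spec.get("replicas", 1)
    updated = status.get("updatedReplicas", 0)
    total = status.get("replicas", 0)
    available = status.get("availableReplicas", 0)

    if updated < desired:
        return False, f"{updated} of {desired} replicas updated"
    if total > updated:
        return False, f"{total - updated} old replicas pending termination"
    if available < updated:
        return False, f"{available} of {updated} updated replicas available"
    return True, "rollout complete"
```

Passing json.loads of the kubectl output to this function gives a boolean that a dashboard or alert can consume directly.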
Related Documentation
- ADR-009: GCP Infrastructure Architecture
- ADR-019: Monitoring and Observability
- ADR-020: Security Hardening
Last Updated: 2025-11-30
Diagram Type: Sequence (Mermaid)
Scope: Infrastructure - GKE deployment