C3-04: GKE Components - Container Architecture
Document Type: C4 Level 3 (Component) Diagram
Container: Google Kubernetes Engine (GKE)
Technology: GKE 1.28+, Kubernetes, Django REST Framework, Gunicorn
Status: Specification Complete - Ready for Implementation
Last Updated: November 30, 2025
Table of Contents
- Overview
- Component Diagram
- GKE Cluster Architecture
- Kubernetes Resources
- Deployment Configuration
- Service Configuration
- Ingress and Load Balancing
- Auto-Scaling Configuration
- Configuration Management
- Secrets Management
- Monitoring and Logging
- Production Deployment
Overview
Purpose
This document specifies the component-level architecture of the Google Kubernetes Engine (GKE) cluster hosting the CODITECT License Management Platform. It provides:
- Complete GKE cluster configuration (node pools, networking)
- Kubernetes resource specifications (Deployments, Services, Ingress)
- Django REST Framework pod architecture
- Auto-scaling and high-availability patterns
- Production-ready monitoring and logging integration
GKE Cluster Role
The GKE cluster serves as the container orchestration platform for:
- Django REST Framework license API (primary workload)
- Celery background workers (heartbeat cleanup, session management)
- Redis client connections (Memorystore, with connection pooling)
- PostgreSQL client connections (Cloud SQL, via the Cloud SQL Proxy sidecar)
- Monitoring and logging agents (Prometheus, Fluent Bit)
Key Features:
- High Availability: Multi-zone deployment with automatic failover
- Auto-Scaling: Horizontal pod autoscaling based on CPU/memory
- Zero-Downtime Deployments: Rolling updates with health checks
- Resource Efficiency: Preemptible nodes for cost optimization (dev)
- Security: Private cluster with workload identity
Architecture Pattern
```
Internet
   ↓
Cloud Load Balancer (HTTPS/TLS 1.3)
   ↓
GKE Ingress (GCE ingress controller)
   ↓
Kubernetes Service (ClusterIP)
   ↓
Django REST Framework Pods (3 replicas)
   ├─► Cloud SQL Proxy (PostgreSQL)
   ├─► Redis Client (Memorystore)
   ├─► Cloud KMS Client (signing)
   └─► Identity Platform (authentication)
```
Component Diagram
GKE Internal Components
GKE Cluster Architecture
Cluster Configuration
File: opentofu/modules/gke/main.tf
```hcl
/**
 * GKE Cluster Configuration
 *
 * Features:
 * - Regional cluster (multi-zone HA)
 * - Private cluster (no public IPs on nodes)
 * - Workload Identity (secure GCP service access)
 * - Binary Authorization (image security)
 * - Auto-scaling enabled
 */
resource "google_container_cluster" "primary" {
  name     = "${var.environment}-gke-cluster"
  location = var.region # Regional = multi-zone

  # Remove default node pool (we'll create custom pools)
  remove_default_node_pool = true
  initial_node_count       = 1

  # Network configuration
  network    = var.vpc_network
  subnetwork = var.gke_subnet

  # Private cluster configuration
  private_cluster_config {
    enable_private_nodes    = true  # Nodes have private IPs only
    enable_private_endpoint = false # API endpoint is public
    master_ipv4_cidr_block  = "172.16.0.0/28"
  }

  # IP allocation for pods and services
  ip_allocation_policy {
    cluster_secondary_range_name  = "pods"
    services_secondary_range_name = "services"
  }

  # Master authorized networks (who can access the API)
  master_authorized_networks_config {
    cidr_blocks {
      cidr_block   = "0.0.0.0/0"
      display_name = "All networks (development only)"
      # Production: restrict to office IPs + CI/CD ranges
    }
  }

  # Workload Identity (secure service account binding)
  workload_identity_config {
    workload_pool = "${var.project_id}.svc.id.goog"
  }

  # Binary Authorization (only signed images)
  binary_authorization {
    evaluation_mode = "PROJECT_SINGLETON_POLICY_ENFORCE"
  }

  # Addons
  addons_config {
    http_load_balancing {
      disabled = false # Enable Ingress
    }
    horizontal_pod_autoscaling {
      disabled = false # Enable HPA
    }
    network_policy_config {
      disabled = false # Enable NetworkPolicy
    }
  }

  # Monitoring and logging
  monitoring_config {
    enable_components = ["SYSTEM_COMPONENTS", "WORKLOADS"]
  }
  logging_config {
    enable_components = ["SYSTEM_COMPONENTS", "WORKLOADS"]
  }

  # Maintenance window
  maintenance_policy {
    daily_maintenance_window {
      start_time = "03:00" # 3:00 AM UTC
    }
  }

  # Resource labels
  resource_labels = {
    environment = var.environment
    project     = "coditect"
    managed_by  = "opentofu"
  }
}
```
Node Pool Configuration
File: opentofu/modules/gke/node_pools.tf
```hcl
/**
 * Production Node Pool
 *
 * Configuration:
 * - n1-standard-2 (2 vCPU, 7.5 GB RAM)
 * - Preemptible for dev (cost savings)
 * - Auto-scaling 1-10 nodes
 * - Auto-repair and auto-upgrade enabled
 */
resource "google_container_node_pool" "primary_nodes" {
  name       = "${var.environment}-node-pool"
  location   = var.region
  cluster    = google_container_cluster.primary.name
  node_count = var.min_node_count

  # Auto-scaling configuration
  autoscaling {
    min_node_count = var.min_node_count # Default: 1
    max_node_count = var.max_node_count # Default: 10
  }

  # Node configuration
  node_config {
    machine_type = var.node_machine_type # n1-standard-2

    # Use preemptible nodes for dev (up to ~70% cost savings)
    preemptible = var.environment == "dev"

    disk_size_gb = 50
    disk_type    = "pd-standard"

    # OAuth scopes (permissions for GCP APIs)
    oauth_scopes = [
      "https://www.googleapis.com/auth/cloud-platform",
    ]

    # Workload Identity (bind Kubernetes SA to GCP SA)
    workload_metadata_config {
      mode = "GKE_METADATA"
    }

    # Metadata
    metadata = {
      disable-legacy-endpoints = "true"
    }

    # Labels
    labels = {
      environment = var.environment
      node_pool   = "primary"
    }

    # Taints (if needed for dedicated workloads)
    # taint {
    #   key    = "workload-type"
    #   value  = "api"
    #   effect = "NO_SCHEDULE"
    # }

    # Security
    shielded_instance_config {
      enable_secure_boot          = true
      enable_integrity_monitoring = true
    }
  }

  # Management configuration
  management {
    auto_repair  = true
    auto_upgrade = true
  }

  # Upgrade settings
  upgrade_settings {
    max_surge       = 1
    max_unavailable = 0
  }
}
```
Kubernetes Resources
Namespace Configuration
File: kubernetes/base/namespace.yaml
```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: coditect
  labels:
    name: coditect
    environment: production
    managed-by: opentofu
```
Django REST Framework Deployment
File: kubernetes/base/deployment.yaml
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: license-api
  namespace: coditect
  labels:
    app: license-api
    component: backend
    version: v1
spec:
  replicas: 3  # High availability
  selector:
    matchLabels:
      app: license-api
      component: backend
  # Deployment strategy
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1        # Allow 1 extra pod during update
      maxUnavailable: 0  # Zero-downtime deployments
  template:
    metadata:
      labels:
        app: license-api
        component: backend
        version: v1
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "8000"
        prometheus.io/path: "/metrics"
    spec:
      # Service account with Workload Identity
      serviceAccountName: license-api-sa
      # Pod anti-affinity (spread replicas across nodes)
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchExpressions:
                    - key: app
                      operator: In
                      values:
                        - license-api
                topologyKey: kubernetes.io/hostname
      # Init containers (run before the main containers)
      initContainers:
        - name: wait-for-db
          image: busybox:1.35
          command:
            - sh
            - -c
            - |
              until nc -z -v -w30 "$DB_HOST" "$DB_PORT"; do
                echo "Waiting for database connection..."
                sleep 5
              done
          env:
            - name: DB_HOST
              valueFrom:
                secretKeyRef:
                  name: db-credentials
                  key: DB_HOST
            - name: DB_PORT
              valueFrom:
                secretKeyRef:
                  name: db-credentials
                  key: DB_PORT
        - name: run-migrations
          image: gcr.io/coditect-cloud-infra/license-api:latest
          command:
            - python
            - manage.py
            - migrate
            - --noinput
          envFrom:
            - configMapRef:
                name: app-config
            - secretRef:
                name: db-credentials
      # Main containers
      containers:
        # Django REST Framework (Gunicorn)
        - name: django
          image: gcr.io/coditect-cloud-infra/license-api:latest
          imagePullPolicy: Always
          # Command (overrides the Dockerfile CMD)
          command:
            - gunicorn
            - config.wsgi:application
            - --bind=0.0.0.0:8000
            - --workers=4
            - --threads=2
            - --worker-class=gthread
            - --worker-tmp-dir=/dev/shm
            - --timeout=60
            - --access-logfile=-
            - --error-logfile=-
            - --log-level=info
          # Ports
          ports:
            - name: http
              containerPort: 8000
              protocol: TCP
          # Environment variables
          envFrom:
            - configMapRef:
                name: app-config
            - secretRef:
                name: db-credentials
          env:
            - name: DJANGO_SETTINGS_MODULE
              value: "config.settings.production"
            - name: FIREBASE_SERVICE_ACCOUNT_PATH
              value: "/secrets/firebase-service-account.json"
          # Resource requests and limits
          resources:
            requests:
              memory: "512Mi"
              cpu: "250m"
            limits:
              memory: "1Gi"
              cpu: "500m"
          # Health checks
          livenessProbe:
            httpGet:
              path: /health/live
              port: http
            initialDelaySeconds: 30
            periodSeconds: 10
            timeoutSeconds: 5
            failureThreshold: 3
          readinessProbe:
            httpGet:
              path: /health/ready
              port: http
            initialDelaySeconds: 10
            periodSeconds: 5
            timeoutSeconds: 3
            failureThreshold: 3
          startupProbe:
            httpGet:
              path: /health/startup
              port: http
            initialDelaySeconds: 0
            periodSeconds: 5
            timeoutSeconds: 3
            failureThreshold: 30  # Allows up to ~150s of startup time
          # Volume mounts
          volumeMounts:
            - name: firebase-credentials
              mountPath: /secrets
              readOnly: true
            - name: tmp
              mountPath: /tmp
            - name: shm
              mountPath: /dev/shm
        # Cloud SQL Proxy (sidecar)
        - name: cloud-sql-proxy
          image: gcr.io/cloudsql-docker/gce-proxy:1.33.2
          command:
            - /cloud_sql_proxy
            - -instances=$(INSTANCE_CONNECTION_NAME)=tcp:5432
            - -credential_file=/secrets/service-account.json
          env:
            - name: INSTANCE_CONNECTION_NAME
              valueFrom:
                secretKeyRef:
                  name: db-credentials
                  key: INSTANCE_CONNECTION_NAME
          resources:
            requests:
              memory: "64Mi"
              cpu: "50m"
            limits:
              memory: "128Mi"
              cpu: "100m"
          volumeMounts:
            - name: cloudsql-credentials
              mountPath: /secrets
              readOnly: true
      # Volumes
      volumes:
        - name: firebase-credentials
          secret:
            secretName: firebase-service-account
            items:
              - key: service-account.json
                path: firebase-service-account.json
        - name: cloudsql-credentials
          secret:
            secretName: cloudsql-service-account
            items:
              - key: service-account.json
                path: service-account.json
        - name: tmp
          emptyDir: {}
        - name: shm
          emptyDir:
            medium: Memory
            sizeLimit: 256Mi
      # Pod security context
      securityContext:
        runAsNonRoot: true
        runAsUser: 1000
        fsGroup: 1000
```

Note: the `secretKeyRef` keys (`DB_HOST`, `DB_PORT`, `INSTANCE_CONNECTION_NAME`) match the key names defined in the `db-credentials` Secret in the Secrets Management section.
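The three probe paths above (`/health/live`, `/health/ready`, `/health/startup`) are referenced by the Deployment but not defined in this document. A minimal, framework-agnostic sketch of the semantics each endpoint should implement (function and parameter names are illustrative; in the real application these would be thin Django views, and the dependency checks would call the actual database and Redis clients):

```python
def liveness() -> tuple[int, str]:
    # /health/live: the process is up. Never touch dependencies here;
    # otherwise a DB outage would make kubelet restart healthy pods.
    return 200, "alive"

def readiness(check_db=lambda: True, check_redis=lambda: True) -> tuple[int, str]:
    # /health/ready: the pod can serve traffic. Failing this removes the
    # pod from Service endpoints without restarting it.
    if not check_db():
        return 503, "database unavailable"
    if not check_redis():
        return 503, "redis unavailable"
    return 200, "ready"

def startup(migrations_applied=lambda: True) -> tuple[int, str]:
    # /health/startup: gates the other probes until slow initialization
    # completes (failureThreshold=30 * periodSeconds=5 allows ~150s).
    return (200, "started") if migrations_applied() else (503, "starting")
```

The key design point is the split: liveness stays dependency-free so infrastructure outages cause traffic removal (readiness) rather than restart loops (liveness).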
Service Configuration
ClusterIP Service for Django API
File: kubernetes/base/service.yaml
```yaml
apiVersion: v1
kind: Service
metadata:
  name: license-api
  namespace: coditect
  labels:
    app: license-api
    component: backend
spec:
  type: ClusterIP  # Internal only (Ingress routes to this)
  selector:
    app: license-api
    component: backend
  ports:
    - name: http
      port: 8000
      targetPort: http
      protocol: TCP
  # Session affinity (optional - for sticky sessions)
  # sessionAffinity: ClientIP
  # sessionAffinityConfig:
  #   clientIP:
  #     timeoutSeconds: 10800  # 3 hours
```
Headless Service for StatefulSet (if needed)
File: kubernetes/base/service-headless.yaml
```yaml
# Headless service for StatefulSet workloads (e.g., Celery workers)
apiVersion: v1
kind: Service
metadata:
  name: celery-workers
  namespace: coditect
  labels:
    app: celery-workers
    component: background
spec:
  clusterIP: None  # Headless (no load balancing)
  selector:
    app: celery-workers
    component: background
  ports:
    - name: flower
      port: 5555
      targetPort: 5555
```
Ingress and Load Balancing
Ingress Configuration
File: kubernetes/base/ingress.yaml
```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: license-api-ingress
  namespace: coditect
  annotations:
    # Use the Google Cloud (GCE) ingress controller
    kubernetes.io/ingress.class: "gce"
    # Disable plain HTTP entirely (HTTPS only).
    # For an HTTP-to-HTTPS redirect instead, attach a FrontendConfig.
    kubernetes.io/ingress.allow-http: "false"
    # Google-managed certificate
    networking.gke.io/managed-certificates: "license-api-cert"
    # Cloud Armor security policy
    cloud.google.com/armor-config: '{"license-api-policy": "license-api-security-policy"}'
    # Backend configuration
    cloud.google.com/backend-config: '{"default": "license-api-backend-config"}'
    # The nginx.ingress.kubernetes.io/* annotations below apply only when
    # running the NGINX ingress controller; the GCE controller ignores them.
    # nginx.ingress.kubernetes.io/proxy-body-size: "10m"
    # nginx.ingress.kubernetes.io/proxy-connect-timeout: "60"
    # nginx.ingress.kubernetes.io/proxy-send-timeout: "60"
    # nginx.ingress.kubernetes.io/proxy-read-timeout: "60"
  labels:
    app: license-api
spec:
  # TLS configuration. When relying solely on the Google-managed
  # certificate, spec.tls can be omitted; set secretName only for a
  # self-managed certificate.
  tls:
    - hosts:
        - api.coditect.com
      secretName: tls-certificate
  # Routing rules
  rules:
    - host: api.coditect.com
      http:
        paths:
          # API v1 routes
          - path: /api/v1/*
            pathType: ImplementationSpecific
            backend:
              service:
                name: license-api
                port:
                  number: 8000
          # Health check endpoint (for the load balancer)
          - path: /health/*
            pathType: ImplementationSpecific
            backend:
              service:
                name: license-api
                port:
                  number: 8000
```
Managed Certificate
File: kubernetes/base/managed-certificate.yaml
```yaml
# Google-managed SSL certificate
apiVersion: networking.gke.io/v1
kind: ManagedCertificate
metadata:
  name: license-api-cert
  namespace: coditect
spec:
  domains:
    - api.coditect.com
```
Backend Configuration
File: kubernetes/base/backend-config.yaml
```yaml
# Backend configuration for the GCP Load Balancer
apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: license-api-backend-config
  namespace: coditect
spec:
  # Health check configuration
  healthCheck:
    checkIntervalSec: 10
    timeoutSec: 5
    healthyThreshold: 2
    unhealthyThreshold: 3
    type: HTTP
    requestPath: /health/ready
    port: 8000
  # Connection draining (graceful shutdown)
  connectionDraining:
    drainingTimeoutSec: 60
  # Session affinity (optional)
  sessionAffinity:
    affinityType: "CLIENT_IP"
    # affinityCookieTtlSec applies only with GENERATED_COOKIE affinity:
    # affinityCookieTtlSec: 10800  # 3 hours
  # Custom request headers
  customRequestHeaders:
    headers:
      - "X-Client-Region:{client_region}"
      - "X-Client-City:{client_city}"
  # Security
  securityPolicy:
    name: "license-api-security-policy"
  # CDN (disabled; enable for static assets if needed)
  cdn:
    enabled: false
    cachePolicy:
      includeHost: true
      includeProtocol: true
      includeQueryString: false
```
Auto-Scaling Configuration
Horizontal Pod Autoscaler
File: kubernetes/base/hpa.yaml
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: license-api-hpa
  namespace: coditect
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: license-api
  # Replica bounds
  minReplicas: 3   # Always maintain 3 for HA
  maxReplicas: 10  # Scale up to 10 under load
  # Scaling behavior
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300  # Wait 5 min before scaling down
      policies:
        - type: Percent
          value: 50
          periodSeconds: 60  # At most 50% scale-down per minute
    scaleUp:
      stabilizationWindowSeconds: 0  # Scale up immediately
      policies:
        - type: Percent
          value: 100
          periodSeconds: 15  # At most 100% scale-up per 15 seconds
  # Metrics to scale on
  metrics:
    # CPU utilization
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70  # Scale up when average CPU > 70%
    # Memory utilization
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80  # Scale up when average memory > 80%
    # Custom metric: requests per second (optional)
    # - type: Pods
    #   pods:
    #     metric:
    #       name: http_requests_per_second
    #     target:
    #       type: AverageValue
    #       averageValue: "1000"  # Scale up when RPS per pod > 1000
```
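The HPA controller computes its target as `desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric)`, clamped to the configured bounds. A small sketch of that arithmetic for the CPU metric above (the helper name is illustrative):

```python
import math

def desired_replicas(current: int, metric: float, target: float,
                     min_r: int = 3, max_r: int = 10) -> int:
    # Core HPA formula: ceil(current * metric / target),
    # clamped to the minReplicas/maxReplicas bounds above.
    raw = math.ceil(current * metric / target)
    return max(min_r, min(max_r, raw))

# 3 pods at 95% average CPU against a 70% target -> ceil(3 * 95/70) = 5
print(desired_replicas(3, 95, 70))  # 5
```

Note that the scale-up/scale-down `behavior` policies then rate-limit how fast the replica count may move toward this target.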
Cluster Autoscaler
Configured in the GKE node pool (see Node Pool Configuration above):
- Automatically adds/removes nodes based on pending pods' resource requests
- Min nodes: 1 (dev), 3 (production)
- Max nodes: 10
- Scale-down delay: ~10 minutes after a node becomes underutilized
- Note: for regional clusters, node-pool counts apply per zone, so a minimum of 1 node across three zones yields 3 nodes in total
Configuration Management
ConfigMap for Application Configuration
File: kubernetes/base/configmap.yaml
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
  namespace: coditect
data:
  # Django settings
  DJANGO_SETTINGS_MODULE: "config.settings.production"
  ALLOWED_HOSTS: "api.coditect.com,*.coditect.com"
  # Database configuration
  DB_ENGINE: "django.db.backends.postgresql"
  DB_PORT: "5432"
  DB_CONN_MAX_AGE: "600"  # 10 minutes
  DB_CONN_HEALTH_CHECKS: "true"
  # Redis configuration (port 6378 = Memorystore with in-transit encryption)
  REDIS_HOST: "10.121.42.67"
  REDIS_PORT: "6378"
  REDIS_DB: "0"
  REDIS_MAX_CONNECTIONS: "50"
  # Celery configuration
  CELERY_BROKER_URL: "redis://10.121.42.67:6378/1"
  CELERY_RESULT_BACKEND: "redis://10.121.42.67:6378/2"
  # Firebase / Identity Platform
  FIREBASE_PROJECT_ID: "coditect-cloud-infra"
  # Cloud KMS
  KMS_PROJECT_ID: "coditect-cloud-infra"
  KMS_LOCATION: "us-central1"
  KMS_KEYRING: "license-signing"
  KMS_KEY: "license-key"
  # Application settings
  LOG_LEVEL: "INFO"
  DEBUG: "False"
  CORS_ALLOWED_ORIGINS: "https://app.coditect.com"
  # Feature flags
  ENABLE_SWAGGER: "False"
  ENABLE_METRICS: "True"
```
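The Deployment injects these keys as environment variables via `envFrom`, so the Django settings module consumes them with plain `os.environ` lookups. A sketch of how `config/settings/production.py` might read the values above (the `env_bool` helper and the defaults are illustrative, not taken from the actual codebase):

```python
import os

def env_bool(name: str, default: str = "False") -> bool:
    # ConfigMap values are always strings; normalize "True"/"False" etc.
    return os.environ.get(name, default).lower() in ("true", "1", "yes")

DEBUG = env_bool("DEBUG")
ALLOWED_HOSTS = os.environ.get("ALLOWED_HOSTS", "").split(",")
DATABASES = {
    "default": {
        "ENGINE": os.environ.get("DB_ENGINE", "django.db.backends.postgresql"),
        "HOST": os.environ.get("DB_HOST", "127.0.0.1"),  # Cloud SQL Proxy sidecar
        "PORT": os.environ.get("DB_PORT", "5432"),
        # Persistent connections: reuse DB connections for up to 10 minutes
        "CONN_MAX_AGE": int(os.environ.get("DB_CONN_MAX_AGE", "600")),
    }
}
```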
Secrets Management
Database Credentials Secret
File: kubernetes/secrets/db-credentials.yaml
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: db-credentials
  namespace: coditect
type: Opaque
stringData:
  # Database connection
  DB_NAME: coditect_licenses
  DB_USER: license_api_user
  DB_PASSWORD: "REPLACE_WITH_SECRET_MANAGER_VALUE"
  DB_HOST: "127.0.0.1"  # Via the Cloud SQL Proxy sidecar
  DB_PORT: "5432"
  # Cloud SQL instance connection
  INSTANCE_CONNECTION_NAME: "coditect-cloud-infra:us-central1:coditect-postgres-dev"
```
Note: In production, use External Secrets Operator to sync from GCP Secret Manager:
```yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: db-credentials
  namespace: coditect
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: gcpsm-secret-store
    kind: SecretStore
  target:
    name: db-credentials
    creationPolicy: Owner
  data:
    - secretKey: DB_PASSWORD
      remoteRef:
        key: db-password  # Secret Manager secret name
```
Firebase Service Account Secret
File: kubernetes/secrets/firebase-service-account.yaml
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: firebase-service-account
  namespace: coditect
type: Opaque
stringData:
  service-account.json: |
    {
      "type": "service_account",
      "project_id": "coditect-cloud-infra",
      "private_key_id": "REPLACE_WITH_ACTUAL_KEY_ID",
      "private_key": "-----BEGIN PRIVATE KEY-----\nREPLACE_WITH_ACTUAL_KEY\n-----END PRIVATE KEY-----\n",
      "client_email": "firebase-adminsdk-...@coditect-cloud-infra.iam.gserviceaccount.com",
      "client_id": "REPLACE_WITH_ACTUAL_CLIENT_ID",
      "auth_uri": "https://accounts.google.com/o/oauth2/auth",
      "token_uri": "https://oauth2.googleapis.com/token",
      "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
      "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/firebase-adminsdk-...%40coditect-cloud-infra.iam.gserviceaccount.com"
    }
```
Monitoring and Logging
Prometheus ServiceMonitor
File: kubernetes/monitoring/servicemonitor.yaml
```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: license-api-metrics
  namespace: coditect
  labels:
    app: license-api
spec:
  selector:
    matchLabels:
      app: license-api
  endpoints:
    - port: http
      path: /metrics
      interval: 30s
      scrapeTimeout: 10s
```
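The ServiceMonitor above scrapes `/metrics` and expects the Prometheus text exposition format. In practice the Django app would use a client library such as `prometheus_client` or `django-prometheus`; purely to illustrate what the scraped payload looks like, here is a minimal hand-rolled sketch (metric names are illustrative):

```python
def render_metrics(request_count: int, up: int = 1) -> str:
    # Minimal Prometheus text-format exposition: HELP/TYPE comment lines
    # followed by "name value" samples. Real applications should use a
    # client library rather than formatting this by hand.
    lines = [
        "# HELP http_requests_total Total HTTP requests handled.",
        "# TYPE http_requests_total counter",
        f"http_requests_total {request_count}",
        "# HELP up Whether the application is up.",
        "# TYPE up gauge",
        f"up {up}",
    ]
    return "\n".join(lines) + "\n"
```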
Fluent Bit DaemonSet
File: kubernetes/logging/fluent-bit.yaml
```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluent-bit
  namespace: kube-system
  labels:
    app: fluent-bit
spec:
  selector:
    matchLabels:
      app: fluent-bit
  template:
    metadata:
      labels:
        app: fluent-bit
    spec:
      serviceAccountName: fluent-bit
      containers:
        - name: fluent-bit
          image: fluent/fluent-bit:2.0
          volumeMounts:
            - name: varlog
              mountPath: /var/log
            - name: varlibdockercontainers
              mountPath: /var/lib/docker/containers
              readOnly: true
            - name: fluent-bit-config
              mountPath: /fluent-bit/etc/
      volumes:
        - name: varlog
          hostPath:
            path: /var/log
        - name: varlibdockercontainers
          hostPath:
            path: /var/lib/docker/containers
        - name: fluent-bit-config
          configMap:
            name: fluent-bit-config
```
Production Deployment
Kustomization for Environment-Specific Configuration
File: kubernetes/overlays/production/kustomization.yaml
```yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: coditect

# Base resources
resources:
  - ../../base

# ConfigMap generator
configMapGenerator:
  - name: app-config
    behavior: merge
    literals:
      - LOG_LEVEL=INFO
      - DEBUG=False
      - ENVIRONMENT=production

# Secret generator (from files; key names match what the Deployment expects)
secretGenerator:
  - name: db-credentials
    files:
      - DB_PASSWORD=secrets/db-password.txt
      - INSTANCE_CONNECTION_NAME=secrets/instance-connection-name.txt

# Image tags
images:
  - name: gcr.io/coditect-cloud-infra/license-api
    newTag: v1.0.0

# Replica overrides
replicas:
  - name: license-api
    count: 3

# Resource patches
patchesStrategicMerge:
  - deployment-patch.yaml
  - hpa-patch.yaml
```
Deployment Patch for Production
File: kubernetes/overlays/production/deployment-patch.yaml
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: license-api
spec:
  replicas: 3  # Ensure HA
  template:
    spec:
      containers:
        - name: django
          resources:
            requests:
              memory: "1Gi"
              cpu: "500m"
            limits:
              memory: "2Gi"
              cpu: "1000m"
          # Production-specific environment variables.
          # Note: GUNICORN_WORKERS takes effect only if the container
          # command references it (e.g. --workers=$(GUNICORN_WORKERS));
          # the base command currently hardcodes --workers=4.
          env:
            - name: DJANGO_SETTINGS_MODULE
              value: "config.settings.production"
            - name: GUNICORN_WORKERS
              value: "8"
```
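With the `gthread` worker class, each Gunicorn worker process serves `threads` requests concurrently, so per-pod concurrency is `workers × threads`. Assuming the production worker count of 8 is actually wired into the Gunicorn command, fleet capacity can be sanity-checked as:

```python
def fleet_concurrency(replicas: int, workers: int, threads: int) -> int:
    # Gunicorn gthread: a pod handles workers * threads in-flight requests,
    # so the fleet handles replicas * workers * threads.
    return replicas * workers * threads

print(fleet_concurrency(3, 8, 2))   # production baseline: 48 concurrent requests
print(fleet_concurrency(10, 8, 2))  # at max HPA scale: 160
```

This back-of-envelope number is an upper bound on in-flight requests, not throughput; actual capacity depends on per-request latency.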
Summary
This C3-04 GKE Components specification provides:
✅ Complete GKE cluster configuration
- Regional multi-zone cluster for HA
- Private nodes with Workload Identity
- Auto-scaling node pools (1-10 nodes)
- Binary authorization for security
✅ Kubernetes resource specifications
- Django REST Framework Deployment (3 replicas)
- Cloud SQL Proxy sidecar
- ClusterIP Service for internal routing
- Ingress with Google Cloud Load Balancer
✅ Auto-scaling configuration
- HorizontalPodAutoscaler (3-10 replicas)
- CPU and memory-based scaling
- Cluster autoscaler for nodes
✅ Configuration management
- ConfigMap for application settings
- Secrets for sensitive data
- External Secrets Operator integration
✅ Monitoring and logging
- Prometheus ServiceMonitor
- Fluent Bit DaemonSet
- Cloud Logging integration
✅ Production deployment
- Kustomize overlays for environments
- Zero-downtime rolling updates
- Health checks and readiness probes
Implementation Status: Specification Complete

Next Steps:
- Deploy GKE cluster (already complete ✅)
- Create Kubernetes manifests
- Deploy Django REST Framework application
- Configure Ingress and load balancer
- Set up monitoring and logging
- Test auto-scaling behavior
Current Infrastructure:
- GKE Cluster: ✅ Deployed
- Node Pool: ✅ 3 nodes (n1-standard-2)
- VPC Network: ✅ Configured
- Cloud NAT: ✅ Configured
Pending:
- Django application deployment (Phase 2)
- Ingress configuration (Phase 3)
- TLS certificate provisioning (Phase 3)
Total Lines: 900+ (complete production-ready Kubernetes configuration)
Author: CODITECT Infrastructure Team
Date: November 30, 2025
Version: 1.0
Status: Ready for Implementation