🎓 Complete Beginner's Guide to Deploying Your Chat App on GKE

Table of Contents​

  1. What You'll Need
  2. Setting Up GKE
  3. Preparing Your App
  4. Deploying to Kubernetes
  5. Setting Up Ingress
  6. Testing Everything
  7. Troubleshooting

Prerequisites (What You Need First)

1. Install Required Tools​

# Install Google Cloud SDK (gcloud command)
# Mac:
brew install --cask google-cloud-sdk

# Windows: Download from https://cloud.google.com/sdk/docs/install

# Linux:
curl https://sdk.cloud.google.com | bash

# Install kubectl (Kubernetes command-line tool)
gcloud components install kubectl

# Install Docker
# Mac: Download Docker Desktop from docker.com
# Windows: Download Docker Desktop from docker.com
# Linux:
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh

2. Set Up Your Google Cloud Account​

# Login to Google Cloud
gcloud auth login

# Create a new project (or use an existing one)
# Note: project IDs are globally unique, so replace my-chat-app-2025 with your own ID
gcloud projects create my-chat-app-2025 --name="Chat App"

# Set it as your active project
gcloud config set project my-chat-app-2025

# Enable required APIs
gcloud services enable container.googleapis.com
gcloud services enable compute.googleapis.com

3. Create Your Project Folder​

mkdir chat-app-kubernetes
cd chat-app-kubernetes

# Create these files:
touch Dockerfile
touch app.py
touch requirements.txt
touch kubernetes-config.yaml

Setting Up GKE (Google Kubernetes Engine)

Step 1: Create a Kubernetes Cluster​

# Create a small cluster for testing
gcloud container clusters create chat-cluster \
  --zone us-central1-a \
  --num-nodes 3 \
  --machine-type e2-medium \
  --disk-size 20 \
  --enable-autoscaling \
  --min-nodes 3 \
  --max-nodes 10

# This will take 5-10 minutes... ☕
# You're creating 3 virtual computers in Google's data center!

What just happened?

  • Created 3 "nodes" (virtual computers) in Google's cloud
  • Each node can run multiple pods
  • Set up autoscaling (can grow to 10 nodes if needed)
  • Located in us-central1-a (Iowa, USA)

Step 2: Connect kubectl to Your Cluster​

# Get credentials to talk to your cluster
gcloud container clusters get-credentials chat-cluster --zone us-central1-a

# Test the connection
kubectl get nodes

# You should see 3 nodes listed!

Analogy: kubectl is like a remote control for your Kubernetes cluster.


Preparing Your App

Step 1: Create requirements.txt​

fastapi==0.104.1
uvicorn[standard]==0.24.0
websockets==12.0

Step 2: Copy the App Code​

Copy the chat-app.py file we created earlier into your folder and rename it to app.py.
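
If you no longer have that file handy, the heart of it is a connection manager that tracks open WebSockets and broadcasts each message to every connected client. Here is a framework-agnostic sketch of just that piece (in the real app.py these would be FastAPI WebSocket objects; the names here are illustrative):

```python
import asyncio

class ConnectionManager:
    """Tracks active chat connections and broadcasts messages to all of them."""

    def __init__(self):
        self.active = []

    async def connect(self, ws):
        # In app.py you'd also call `await ws.accept()` on a FastAPI WebSocket.
        self.active.append(ws)

    def disconnect(self, ws):
        self.active.remove(ws)

    async def broadcast(self, message: str):
        # Iterate over a copy so a disconnect during sending doesn't break the loop.
        for ws in list(self.active):
            await ws.send_text(message)
```

The real app would call connect() when a client opens the WebSocket route and broadcast() for each message it receives.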

Step 3: Create a Dockerfile​

# Start with a Python base image
FROM python:3.11-slim

# Set working directory
WORKDIR /app

# Copy requirements first (Docker caching trick)
COPY requirements.txt .

# Install Python packages
RUN pip install --no-cache-dir -r requirements.txt

# Copy your app code
COPY app.py .

# Expose port 8000
EXPOSE 8000

# Command to run your app
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]

Step 4: Build and Push Docker Image​

# Configure Docker to use Google's container registry
gcloud auth configure-docker

# Build your image
docker build -t gcr.io/my-chat-app-2025/chat-app:v1 .

# This will take a few minutes the first time...

# Push to Google Container Registry
docker push gcr.io/my-chat-app-2025/chat-app:v1

# Now your app is stored in Google's cloud!

What's happening?

  1. Docker packages your app and all its dependencies into a "container"
  2. Think of it like a shipping container - everything needed is inside
  3. We upload it to Google's storage so Kubernetes can download it

Deploying to Kubernetes

Step 1: Update kubernetes-config.yaml​

Change this line in the YAML file:

image: gcr.io/YOUR-PROJECT-ID/chat-app:latest

To:

image: gcr.io/my-chat-app-2025/chat-app:v1
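
If you don't have the kubernetes-config.yaml from earlier, here is a sketch consistent with the names this guide uses (chat-app, chat-container, chat-service, chat-ingress, chat-hpa, and the app: chat label). The resource sizes, probe path, and CPU target are assumptions to tune for your app:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: chat-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: chat
  template:
    metadata:
      labels:
        app: chat
    spec:
      containers:
        - name: chat-container
          image: gcr.io/my-chat-app-2025/chat-app:v1
          ports:
            - containerPort: 8000
          readinessProbe:
            httpGet:
              path: /          # assumes the chat page is served at /
              port: 8000
---
apiVersion: v1
kind: Service
metadata:
  name: chat-service
spec:
  selector:
    app: chat
  sessionAffinity: ClientIP    # IP-based stickiness at the Service level
  ports:
    - port: 80
      targetPort: 8000
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: chat-ingress
  annotations:
    nginx.ingress.kubernetes.io/affinity: "cookie"
    nginx.ingress.kubernetes.io/session-cookie-name: "chat-pod"
    nginx.ingress.kubernetes.io/proxy-read-timeout: "3600"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "3600"
spec:
  ingressClassName: nginx
  rules:
    - host: chat.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: chat-service
                port:
                  number: 80
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: chat-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: chat-app
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```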

Step 2: Deploy Everything​

# Apply the configuration
kubectl apply -f kubernetes-config.yaml

# Watch your pods start up
kubectl get pods -w

# Wait until all pods show "Running"
# Press Ctrl+C to stop watching

What you'll see:

NAME                        READY   STATUS    RESTARTS   AGE
chat-app-5f7d8c9b6d-abc12   1/1     Running   0          30s
chat-app-5f7d8c9b6d-def34   1/1     Running   0          30s
chat-app-5f7d8c9b6d-ghi56   1/1     Running   0          30s

Step 3: Check the Service​

# See your service
kubectl get service chat-service

# Check if it has endpoints (connected pods)
kubectl get endpoints chat-service

Setting Up Ingress

Step 1: Install NGINX Ingress Controller​

# Install the NGINX Ingress Controller from its official cloud-provider manifest
kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v1.9.4/deploy/static/provider/cloud/deploy.yaml

# Wait for it to get an external IP (this takes 2-3 minutes)
kubectl get service -n ingress-nginx

# Look for EXTERNAL-IP (not <pending>)

Step 2: Get Your External IP Address​

# Get the load balancer IP
kubectl get ingress chat-ingress

# You'll see something like:
# NAME           HOSTS              ADDRESS        PORTS   AGE
# chat-ingress   chat.example.com   35.123.45.67   80      2m

Step 3: Configure DNS (Optional for Testing)​

For testing, you can skip DNS and use the IP directly:

# Get the external IP
EXTERNAL_IP=$(kubectl get ingress chat-ingress -o jsonpath='{.status.loadBalancer.ingress[0].ip}')

echo "Your app will be at: http://$EXTERNAL_IP"

For real production:

  1. Buy a domain name (example.com)
  2. Add an A record pointing to the external IP
  3. Wait for DNS to propagate (5 minutes to 48 hours)
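
Using the example IP from the ingress output above, the A record you add at your DNS provider looks roughly like this zone-file fragment (the 300-second TTL is just a common choice):

```
chat.example.com.   300   IN   A   35.123.45.67
```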

Testing Everything

Step 1: Open Your Browser​

# If using the IP directly (`open` is macOS; on Linux use xdg-open,
# or just paste the URL into your browser):
open http://YOUR_EXTERNAL_IP

# If using a domain:
open http://chat.example.com

Step 2: Test WebSocket Connection​

  1. Open the chat page in two different browser windows
  2. Type your name in both
  3. Send messages
  4. You should see messages appear in both windows!

Step 3: Verify Session Affinity​

Open browser developer console (F12) and check:

// Look at cookies
document.cookie

// You should see a cookie like: "chat-pod=sha256~..."
// This cookie ensures you stay connected to the same pod!
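
That cookie is created by the NGINX Ingress controller's cookie-based affinity feature. The annotations that turn it on (assuming the chat-ingress resource used in this guide; the exact cookie value format varies by controller version) look like:

```yaml
nginx.ingress.kubernetes.io/affinity: "cookie"
nginx.ingress.kubernetes.io/session-cookie-name: "chat-pod"
```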

Step 4: Check Which Pods Are Handling Traffic​

# Watch logs from all pods at once
kubectl logs -l app=chat --all-containers=true -f

# You'll see connection messages showing which pod each user connects to

Step 5: Test Scaling​

# Manually scale up
kubectl scale deployment chat-app --replicas=5

# Watch new pods start
kubectl get pods -w

# Check the autoscaler status
kubectl get hpa chat-hpa

# It will show current and desired number of pods

Troubleshooting

Problem: Pods Won't Start​

# Check pod status
kubectl get pods

# See detailed error messages
kubectl describe pod <pod-name>

# Check logs
kubectl logs <pod-name>

Common causes:

  • Wrong image name
  • Image not pushed to registry
  • Port mismatch
  • Not enough resources

Problem: Can't Connect to App​

# Check if service exists
kubectl get service chat-service

# Check if ingress has an IP
kubectl get ingress chat-ingress

# Test the connection from inside the cluster
kubectl run test-pod --rm -it --image=busybox -- sh
# Then, at the shell prompt inside the test pod:
wget -O- http://chat-service

Common causes:

  • Ingress controller not installed
  • DNS not configured
  • Firewall blocking traffic
  • Service selector doesn't match pod labels

Problem: WebSocket Closes Immediately​

# Check ingress annotations
kubectl describe ingress chat-ingress

# Look for WebSocket timeout settings

Solution: Make sure these annotations exist. The community ingress-nginx controller proxies WebSockets out of the box; what usually kills the connection is the default proxy timeout, so raise it:

nginx.ingress.kubernetes.io/proxy-read-timeout: "3600"
nginx.ingress.kubernetes.io/proxy-send-timeout: "3600"

Problem: Messages Only Reach Some Users​

This is EXPECTED with our current setup!

Each pod only knows about users connected to IT. To fix this, you need:

  1. Redis Pub/Sub - Pods publish messages to Redis, all pods subscribe
  2. Message Queue - Use RabbitMQ or Google Pub/Sub
  3. Database - Store messages and poll from all pods
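
The Redis option follows a simple publish/subscribe pattern: every pod publishes each incoming message to one shared channel, and every pod also subscribes to that channel and forwards what it receives to its locally connected users. A minimal sketch of the pattern, using an in-process stand-in for the broker so it runs anywhere (in production you would replace Broker with Redis pub/sub, e.g. via the redis-py async client):

```python
import asyncio

class Broker:
    """In-process stand-in for Redis pub/sub: one channel, many subscribers."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self):
        queue = asyncio.Queue()
        self.subscribers.append(queue)
        return queue

    async def publish(self, message):
        # Fan out to every subscriber, i.e. every pod.
        for queue in self.subscribers:
            await queue.put(message)

async def pod(name, broker, delivered):
    # Each "pod" subscribes once, then forwards every published message
    # to its locally connected users (recorded in `delivered` here).
    queue = broker.subscribe()
    while True:
        msg = await queue.get()
        if msg is None:  # shutdown signal, only needed for this sketch
            break
        delivered.append(f"{name}: {msg}")

async def demo():
    broker = Broker()
    delivered = []
    pods = [asyncio.create_task(pod(f"pod-{i}", broker, delivered)) for i in range(3)]
    await asyncio.sleep(0)               # let each pod task subscribe
    await broker.publish("hi everyone")  # a user on any one pod sends a message
    await broker.publish(None)           # stop the sketch's pods
    await asyncio.gather(*pods)
    return delivered

print(asyncio.run(demo()))
```

Because every pod receives the published message, a user's chat message reaches users on all pods, not just the pod that user happens to be connected to.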

Problem: High CPU/Memory Usage​

# Check resource usage
kubectl top pods

# Check autoscaler
kubectl get hpa

# Scale manually if needed
kubectl scale deployment chat-app --replicas=10

Problem: Pods Keep Restarting​

# Check restart count
kubectl get pods

# See why they're restarting
kubectl describe pod <pod-name>

# Check logs from previous pod instance
kubectl logs <pod-name> --previous

Common causes:

  • Health check failing (too aggressive timeout)
  • App crashing on startup
  • Out of memory
  • Missing environment variables

Useful Commands Reference​

Viewing Resources​

# See all resources
kubectl get all

# Watch pods in real-time
kubectl get pods -w

# See pod details
kubectl describe pod <pod-name>

# View logs
kubectl logs <pod-name>
kubectl logs -f <pod-name> # Follow logs

# See resource usage
kubectl top pods
kubectl top nodes

Updating Your App​

# Build new version
docker build -t gcr.io/my-chat-app-2025/chat-app:v2 .
docker push gcr.io/my-chat-app-2025/chat-app:v2

# Update deployment
kubectl set image deployment/chat-app \
  chat-container=gcr.io/my-chat-app-2025/chat-app:v2

# Watch rolling update
kubectl rollout status deployment/chat-app

# Undo if something breaks
kubectl rollout undo deployment/chat-app

Debugging​

# Run a shell inside a pod
kubectl exec -it <pod-name> -- /bin/sh

# Port forward to test locally
kubectl port-forward service/chat-service 8080:80
# Then visit http://localhost:8080

# See events
kubectl get events --sort-by='.lastTimestamp'

# Check ingress details
kubectl describe ingress chat-ingress

Cleaning Up​

# Delete everything
kubectl delete -f kubernetes-config.yaml

# Delete the cluster (stops billing!)
gcloud container clusters delete chat-cluster --zone us-central1-a

What's Next?​

For Production, You Should Add:​

  1. Redis - For cross-pod messaging

    helm repo add bitnami https://charts.bitnami.com/bitnami
    helm install redis bitnami/redis
  2. Database - For message persistence

    # Cloud SQL, or PostgreSQL on Kubernetes
  3. HTTPS - Real SSL certificates

    # Use cert-manager with Let's Encrypt
    kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.13.0/cert-manager.yaml
  4. Monitoring - See what's happening

    # Install Prometheus and Grafana
    helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
    helm install prometheus prometheus-community/kube-prometheus-stack
  5. Logging - Debug problems

    # Use Google Cloud Logging (automatically available on GKE)
  6. CI/CD - Automatic deployments

    • GitHub Actions
    • Google Cloud Build
    • Jenkins

Cost Estimation​

Running this setup 24/7 on GKE:

  • 3 e2-medium nodes (~$73/month)
  • Load balancer (~$18/month)
  • Bandwidth (~$0.12/GB outbound)
  • Total: ~$100-150/month

💡 Cost-saving tips:

  • Use Spot (formerly preemptible) nodes (up to ~70% cheaper!)
  • Scale down at night
  • Use smaller machine types for development
  • Delete when not in use

Summary​

You now know:

  • ✅ What each Kubernetes component does
  • ✅ Why WebSockets need session affinity
  • ✅ How Ingress routes traffic
  • ✅ How to deploy and manage your app
  • ✅ How to troubleshoot common problems

The key insight: WebSockets need to stick to one pod, and we achieve this through:

  1. Service-level session affinity (IP-based)
  2. Ingress-level cookie affinity (more reliable)
  3. Health checks to ensure pods are healthy
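
For reference, the Service-level half of that is a single field on the Service spec (shown here with an explicit timeout; 10800 seconds is the Kubernetes default):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: chat-service
spec:
  selector:
    app: chat
  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 10800
  ports:
    - port: 80
      targetPort: 8000
```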

Additional Resources​

Remember: Start small, test everything, and gradually add complexity!