🎓 Complete Beginner's Guide to Deploying Your Chat App on GKE
Table of Contents
- What You'll Need
- Setting Up GKE
- Preparing Your App
- Deploying to Kubernetes
- Setting Up Ingress
- Testing Everything
- Troubleshooting
Prerequisites (What You Need First)
1. Install Required Tools
# Install Google Cloud SDK (gcloud command)
# Mac:
brew install --cask google-cloud-sdk
# Windows: Download from https://cloud.google.com/sdk/docs/install
# Linux:
curl https://sdk.cloud.google.com | bash
# Install kubectl (Kubernetes command-line tool)
gcloud components install kubectl
# Install Docker
# Mac: Download Docker Desktop from docker.com
# Windows: Download Docker Desktop from docker.com
# Linux:
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh
2. Set Up Your Google Cloud Account
# Login to Google Cloud
gcloud auth login
# Create a new project (or use existing)
gcloud projects create my-chat-app-2025 --name="Chat App"
# Set it as your active project
gcloud config set project my-chat-app-2025
# Enable required APIs
gcloud services enable container.googleapis.com
gcloud services enable compute.googleapis.com
3. Create Your Project Folder
mkdir chat-app-kubernetes
cd chat-app-kubernetes
# Create these files:
touch Dockerfile
touch app.py
touch requirements.txt
touch kubernetes-config.yaml
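This guide assumes a kubernetes-config.yaml from an earlier part of the series. If you don't have it, here is a minimal sketch of the Deployment and Service it would contain, using the resource names (chat-app, chat-service, chat-container, label app=chat) that the commands later in this guide assume; the Ingress and autoscaler can live in the same file:

```yaml
# Minimal sketch of kubernetes-config.yaml (illustrative, not the
# exact file from the earlier chapter).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: chat-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: chat
  template:
    metadata:
      labels:
        app: chat
    spec:
      containers:
        - name: chat-container
          image: gcr.io/YOUR-PROJECT-ID/chat-app:latest
          ports:
            - containerPort: 8000
---
apiVersion: v1
kind: Service
metadata:
  name: chat-service
spec:
  selector:
    app: chat
  ports:
    - port: 80
      targetPort: 8000
```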
Setting Up GKE (Google Kubernetes Engine)
Step 1: Create a Kubernetes Cluster
# Create a small cluster for testing
gcloud container clusters create chat-cluster \
--zone us-central1-a \
--num-nodes 3 \
--machine-type e2-medium \
--disk-size 20 \
--enable-autoscaling \
--min-nodes 3 \
--max-nodes 10
# This will take 5-10 minutes... ☕
# You're creating 3 virtual computers in Google's data center!
What just happened?
- Created 3 "nodes" (virtual computers) in Google's cloud
- Each node can run multiple pods
- Set up autoscaling (can grow to 10 nodes if needed)
- Located in us-central1-a (Iowa, USA)
Step 2: Connect kubectl to Your Cluster
# Get credentials to talk to your cluster
gcloud container clusters get-credentials chat-cluster --zone us-central1-a
# Test the connection
kubectl get nodes
# You should see 3 nodes listed!
Analogy: kubectl is like a remote control for your Kubernetes cluster.
Preparing Your App
Step 1: Create requirements.txt
fastapi==0.104.1
uvicorn[standard]==0.24.0
websockets==12.0
Step 2: Copy the App Code
Copy the chat-app.py file we created earlier into your folder and rename it to app.py.
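If you don't have that file handy, the heart of such a chat app is a connection manager: it tracks every open WebSocket on this pod and broadcasts each message to all of them. Here is that pattern with the framework stripped out (the class and method names are illustrative, not FastAPI APIs; in the real app each `ws` would be a FastAPI WebSocket):

```python
import asyncio

class ConnectionManager:
    """Tracks connected clients and broadcasts messages to all of them."""
    def __init__(self):
        self.active = []

    def connect(self, ws):
        self.active.append(ws)

    def disconnect(self, ws):
        self.active.remove(ws)

    async def broadcast(self, message):
        # Every user connected to THIS pod gets the message.
        for ws in self.active:
            await ws.send_text(message)

# Quick demo with a stand-in for a WebSocket connection
class FakeSocket:
    def __init__(self):
        self.received = []
    async def send_text(self, message):
        self.received.append(message)

manager = ConnectionManager()
a, b = FakeSocket(), FakeSocket()
manager.connect(a)
manager.connect(b)
asyncio.run(manager.broadcast("hello"))
print(a.received, b.received)  # ['hello'] ['hello']
```

Keep this pattern in mind: it is exactly why session affinity matters later, since each manager only knows about its own pod's sockets.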
Step 3: Create a Dockerfile
# Start with a Python base image
FROM python:3.11-slim
# Set working directory
WORKDIR /app
# Copy requirements first (Docker caching trick)
COPY requirements.txt .
# Install Python packages
RUN pip install --no-cache-dir -r requirements.txt
# Copy your app code
COPY app.py .
# Expose port 8000
EXPOSE 8000
# Command to run your app
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]
Step 4: Build and Push Docker Image
# Configure Docker to use Google's container registry
gcloud auth configure-docker
# Build your image
docker build -t gcr.io/my-chat-app-2025/chat-app:v1 .
# This will take a few minutes the first time...
# Push to Google Container Registry
# (GCR is deprecated in favor of Artifact Registry, but gcr.io paths still work)
docker push gcr.io/my-chat-app-2025/chat-app:v1
# Now your app is stored in Google's cloud!
What's happening?
- Docker packages your app and all its dependencies into a "container"
- Think of it like a shipping container - everything needed is inside
- We upload it to Google's storage so Kubernetes can download it
Deploying to Kubernetes
Step 1: Update kubernetes-config.yaml
Change this line in the YAML file:
image: gcr.io/YOUR-PROJECT-ID/chat-app:latest
To:
image: gcr.io/my-chat-app-2025/chat-app:v1
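If you prefer not to edit by hand, sed can make the substitution. This demo runs against a throwaway copy; point the same command at your actual kubernetes-config.yaml (on macOS, use `sed -i ''` instead of `sed -i`):

```shell
# Create a throwaway file containing the placeholder line
echo 'image: gcr.io/YOUR-PROJECT-ID/chat-app:latest' > demo-config.yaml

# Substitute in the real project ID and tag
sed -i 's|gcr.io/YOUR-PROJECT-ID/chat-app:latest|gcr.io/my-chat-app-2025/chat-app:v1|' demo-config.yaml

cat demo-config.yaml   # image: gcr.io/my-chat-app-2025/chat-app:v1
```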
Step 2: Deploy Everything
# Apply the configuration
kubectl apply -f kubernetes-config.yaml
# Watch your pods start up
kubectl get pods -w
# Wait until all pods show "Running"
# Press Ctrl+C to stop watching
What you'll see:
NAME READY STATUS RESTARTS AGE
chat-app-5f7d8c9b6d-abc12 1/1 Running 0 30s
chat-app-5f7d8c9b6d-def34 1/1 Running 0 30s
chat-app-5f7d8c9b6d-ghi56 1/1 Running 0 30s
Step 3: Check the Service
# See your service
kubectl get service chat-service
# Check if it has endpoints (connected pods)
kubectl get endpoints chat-service
Setting Up Ingress
Step 1: Install the NGINX Ingress Controller
# Install the NGINX Ingress Controller (official cloud-provider manifest)
kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v1.9.4/deploy/static/provider/cloud/deploy.yaml
# Wait for it to get an external IP (this takes 2-3 minutes)
kubectl get service -n ingress-nginx
# Look for EXTERNAL-IP (not <pending>)
Step 2: Get Your External IP Address
# Get the load balancer IP
kubectl get ingress chat-ingress
# You'll see something like:
# NAME HOSTS ADDRESS PORTS AGE
# chat-ingress chat.example.com 35.123.45.67 80 2m
Step 3: Configure DNS (Optional for Testing)
For testing, you can skip DNS and use the IP directly:
# Get the external IP
EXTERNAL_IP=$(kubectl get ingress chat-ingress -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
echo "Your app will be at: http://$EXTERNAL_IP"
For real production:
- Buy a domain name (example.com)
- Add an A record pointing to the external IP
- Wait for DNS to propagate (5 minutes to 48 hours)
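For example, in a BIND-style zone file the A record would look like this (using the illustrative IP from the earlier output):

```
chat.example.com.    300    IN    A    35.123.45.67
```

Most registrars expose the same fields (name, TTL, type, value) in their DNS dashboard.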
Testing Everything
Step 1: Open Your Browser
# If using IP directly:
open http://YOUR_EXTERNAL_IP
# If using domain:
open http://chat.example.com
Step 2: Test WebSocket Connection
- Open the chat page in two different browser windows
- Type your name in both
- Send messages
- You should see messages appear in both windows!
Step 3: Verify Session Affinity
Open browser developer console (F12) and check:
// Look at cookies
document.cookie
// You should see the affinity cookie set by the Ingress, e.g. "chat-pod=..."
// (its exact name comes from the session-cookie-name annotation on the Ingress)
// This cookie ensures you stay connected to the same pod!
Step 4: Check Which Pods Are Handling Traffic
# Watch logs from all pods at once
kubectl logs -l app=chat --all-containers=true -f
# You'll see connection messages showing which pod each user connects to
Step 5: Test Scaling
# Manually scale up
kubectl scale deployment chat-app --replicas=5
# Watch new pods start
kubectl get pods -w
# Check the autoscaler status
kubectl get hpa chat-hpa
# It will show current and desired number of pods
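The chat-hpa autoscaler referenced above would be defined in your YAML config. If yours doesn't include one yet, a minimal sketch looks like this (the 70% CPU threshold is illustrative):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: chat-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: chat-app
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```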
Troubleshooting
Problem: Pods Won't Start
# Check pod status
kubectl get pods
# See detailed error messages
kubectl describe pod <pod-name>
# Check logs
kubectl logs <pod-name>
Common causes:
- Wrong image name
- Image not pushed to registry
- Port mismatch
- Not enough resources
Problem: Can't Connect to App
# Check if service exists
kubectl get service chat-service
# Check if ingress has an IP
kubectl get ingress chat-ingress
# Test connection from inside the cluster
kubectl run test-pod --rm -it --image=busybox -- sh
wget -O- http://chat-service
Common causes:
- Ingress controller not installed
- DNS not configured
- Firewall blocking traffic
- Service selector doesn't match pod labels
Problem: WebSocket Closes Immediately
# Check ingress annotations
kubectl describe ingress chat-ingress
# Look for WebSocket timeout settings
Solution: The ingress-nginx controller proxies WebSockets out of the box, but long-lived connections need longer timeouts. Make sure these annotations exist:
nginx.ingress.kubernetes.io/proxy-read-timeout: "3600"
nginx.ingress.kubernetes.io/proxy-send-timeout: "3600"
(A websocket-services annotation belongs to the separate nginx.org controller, not to ingress-nginx.)
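A minimal chat-ingress manifest with WebSocket-friendly timeouts and cookie affinity might look like this (host and cookie name are illustrative):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: chat-ingress
  annotations:
    nginx.ingress.kubernetes.io/proxy-read-timeout: "3600"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "3600"
    # Cookie-based session affinity, so each browser sticks to one pod
    nginx.ingress.kubernetes.io/affinity: "cookie"
    nginx.ingress.kubernetes.io/session-cookie-name: "chat-pod"
spec:
  ingressClassName: nginx
  rules:
    - host: chat.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: chat-service
                port:
                  number: 80
```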
Problem: Messages Only Reach Some Users
This is EXPECTED with our current setup!
Each pod only knows about users connected to IT. To fix this, you need:
- Redis Pub/Sub - Pods publish messages to Redis, all pods subscribe
- Message Queue - Use RabbitMQ or Google Pub/Sub
- Database - Store messages and poll from all pods
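The Redis Pub/Sub option works like this: every pod publishes each incoming chat message to a shared channel, every pod subscribes, and each pod forwards what it receives to its own local WebSockets. Sketched here with an in-process stand-in for Redis so the flow is visible (a real app would use a Redis client such as redis-py):

```python
import asyncio

class FakePubSub:
    """In-process stand-in for Redis Pub/Sub, just to show the flow."""
    def __init__(self):
        self.subscribers = []
    def subscribe(self, callback):
        self.subscribers.append(callback)
    async def publish(self, message):
        for cb in self.subscribers:
            await cb(message)

class Pod:
    """Each pod forwards published messages to its locally connected users."""
    def __init__(self, name, pubsub):
        self.name = name
        self.local_users = []   # each user is a list collecting messages
        pubsub.subscribe(self.deliver)
    async def deliver(self, message):
        for user in self.local_users:
            user.append(f"{self.name}: {message}")

pubsub = FakePubSub()
pod_a, pod_b = Pod("pod-a", pubsub), Pod("pod-b", pubsub)
alice, bob = [], []             # users connected to different pods
pod_a.local_users.append(alice)
pod_b.local_users.append(bob)

# A message arriving at any pod is published once and reaches everyone:
asyncio.run(pubsub.publish("hello"))
print(alice, bob)  # ['pod-a: hello'] ['pod-b: hello']
```

The key design point: pods never talk to each other directly; the channel is the single fan-out point, so adding a fourth or tenth pod changes nothing.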
Problem: High CPU/Memory Usage
# Check resource usage
kubectl top pods
# Check autoscaler
kubectl get hpa
# Scale manually if needed
kubectl scale deployment chat-app --replicas=10
Problem: Pods Keep Restarting
# Check restart count
kubectl get pods
# See why they're restarting
kubectl describe pod <pod-name>
# Check logs from previous pod instance
kubectl logs <pod-name> --previous
Common causes:
- Health check failing (too aggressive timeout)
- App crashing on startup
- Out of memory
- Missing environment variables
Useful Commands Reference
Viewing Resources
# See all resources
kubectl get all
# Watch pods in real-time
kubectl get pods -w
# See pod details
kubectl describe pod <pod-name>
# View logs
kubectl logs <pod-name>
kubectl logs -f <pod-name> # Follow logs
# See resource usage
kubectl top pods
kubectl top nodes
Updating Your App
# Build new version
docker build -t gcr.io/my-chat-app-2025/chat-app:v2 .
docker push gcr.io/my-chat-app-2025/chat-app:v2
# Update deployment
kubectl set image deployment/chat-app \
chat-container=gcr.io/my-chat-app-2025/chat-app:v2
# Watch rolling update
kubectl rollout status deployment/chat-app
# Undo if something breaks
kubectl rollout undo deployment/chat-app
Debugging
# Run a shell inside a pod
kubectl exec -it <pod-name> -- /bin/sh
# Port forward to test locally
kubectl port-forward service/chat-service 8080:80
# Then visit http://localhost:8080
# See events
kubectl get events --sort-by='.lastTimestamp'
# Check ingress details
kubectl describe ingress chat-ingress
Cleaning Up
# Delete everything
kubectl delete -f kubernetes-config.yaml
# Delete the cluster (stops billing!)
gcloud container clusters delete chat-cluster --zone us-central1-a
What's Next?
For Production, You Should Add:
- Redis - For cross-pod messaging
  # helm repo add bitnami https://charts.bitnami.com/bitnami
  helm install redis bitnami/redis
- Database - For message persistence
  # Cloud SQL, or PostgreSQL on Kubernetes
- HTTPS - Real SSL certificates
  # Use cert-manager with Let's Encrypt
  kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.13.0/cert-manager.yaml
- Monitoring - See what's happening
  # helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
  helm install prometheus prometheus-community/kube-prometheus-stack
- Logging - Debug problems
  # Use Google Cloud Logging (automatically available on GKE)
- CI/CD - Automatic deployments
  - GitHub Actions
  - Google Cloud Build
  - Jenkins
Cost Estimation
Running this setup 24/7 on GKE:
- 3 e2-medium nodes (~$73/month)
- Load balancer (~$18/month)
- Bandwidth (~$0.12/GB outbound)
- Total: ~$100-150/month
💡 Cost-saving tips:
- Use preemptible nodes (70% cheaper!)
- Scale down at night
- Use smaller machine types for development
- Delete when not in use
Summary
You now know:
- ✅ What each Kubernetes component does
- ✅ Why WebSockets need session affinity
- ✅ How Ingress routes traffic
- ✅ How to deploy and manage your app
- ✅ How to troubleshoot common problems
The key insight: WebSockets need to stick to one pod, and we achieve this through:
- Service-level session affinity (IP-based)
- Ingress-level cookie affinity (more reliable)
- Health checks to ensure pods are healthy
Remember: Start small, test everything, and gradually add complexity!