Deployment Guide - AI-Powered PDF Analysis Platform

This guide covers deploying the PDF Analysis Platform to Google Cloud Platform (GCP) using Google Kubernetes Engine (GKE) and Cloud Build.


Table of Contents

  1. Prerequisites
  2. Local Development
  3. GCP Setup
  4. Database Setup
  5. Container Registry
  6. Kubernetes Deployment
  7. CI/CD with Cloud Build
  8. Environment Variables
  9. Monitoring & Logging
  10. Troubleshooting
  11. Backup & Disaster Recovery
  12. Cost Optimization
  13. Security Checklist
  14. Support

Prerequisites

Required Tools
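
The commands later in this guide invoke git, docker, docker-compose, npm, gcloud, gsutil, kubectl, and openssl. That list is inferred from those commands rather than taken from an official requirements list, and the helper name missing_tools is hypothetical. A minimal preflight sketch that reports which of them are not on PATH:

```shell
#!/bin/sh
# Print the name of every listed tool that is not found on PATH, one per line.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Tool list inferred from the commands used in this guide (an assumption).
missing=$(missing_tools git docker docker-compose npm gcloud gsutil kubectl openssl)
if [ -n "$missing" ]; then
  echo "Missing tools:" $missing >&2
fi
```

The check only confirms that each command exists; it does not verify versions or authentication.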

GCP Services

  • Google Kubernetes Engine (GKE)
  • Cloud SQL (PostgreSQL)
  • Cloud Build
  • Cloud Storage
  • Container Registry / Artifact Registry
  • Cloud Memorystore (Redis)

Required Permissions

  • Kubernetes Engine Admin
  • Cloud SQL Admin
  • Cloud Build Editor
  • Storage Admin
  • Service Account Admin

Local Development

1. Clone Repository

git clone https://github.com/coditect-ai/coditect-pdf-convertor.git
cd coditect-pdf-convertor

2. Start Services

# Start PostgreSQL and Redis
docker-compose up -d postgres redis

# Initialize database
docker-compose run --rm backend python scripts/init_database.py

# Start backend
docker-compose up -d backend

# Start frontend
cd frontend
npm install
npm run dev

3. Access Application


GCP Setup

1. Create GCP Project

# Set project ID
export PROJECT_ID="pdf-analysis-prod"
export REGION="us-central1"
export ZONE="us-central1-a"

# Create project
gcloud projects create $PROJECT_ID --name="PDF Analysis Platform"

# Set as default
gcloud config set project $PROJECT_ID

# Enable billing (replace BILLING_ACCOUNT_ID)
gcloud beta billing projects link $PROJECT_ID \
--billing-account=BILLING_ACCOUNT_ID
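
Project creation is rejected when the ID is malformed. GCP project IDs must be 6 to 30 characters long, contain only lowercase letters, digits, and hyphens, start with a letter, and not end with a hyphen. The sketch below checks those rules locally before calling gcloud; valid_project_id is a hypothetical helper name, and the check cannot tell whether the ID is already taken.

```shell
#!/bin/sh
# Return 0 if the argument satisfies the GCP project ID format rules.
# The body runs in a subshell so the LC_ALL override stays local.
valid_project_id() (
  LC_ALL=C   # make the a-z ranges ASCII-only
  id=$1
  # Length must be between 6 and 30 characters.
  [ "${#id}" -ge 6 ] && [ "${#id}" -le 30 ] || exit 1
  case $id in
    [!a-z]*) exit 1 ;;        # must start with a lowercase letter
    *-) exit 1 ;;             # must not end with a hyphen
    *[!a-z0-9-]*) exit 1 ;;   # only lowercase letters, digits, hyphens
  esac
  exit 0
)
```

For example, `valid_project_id "$PROJECT_ID" || echo "invalid project ID" >&2` could run before `gcloud projects create`.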

2. Enable Required APIs

gcloud services enable \
container.googleapis.com \
cloudbuild.googleapis.com \
sql-component.googleapis.com \
sqladmin.googleapis.com \
storage-api.googleapis.com \
redis.googleapis.com \
compute.googleapis.com \
artifactregistry.googleapis.com

3. Create Service Accounts

# Backend service account
gcloud iam service-accounts create pdf-backend \
--display-name="PDF Backend Service"

# Cloud Build service account
gcloud iam service-accounts create pdf-cloudbuild \
--display-name="PDF Cloud Build"

# Grant permissions
gcloud projects add-iam-policy-binding $PROJECT_ID \
--member="serviceAccount:pdf-backend@$PROJECT_ID.iam.gserviceaccount.com" \
--role="roles/cloudsql.client"

gcloud projects add-iam-policy-binding $PROJECT_ID \
--member="serviceAccount:pdf-backend@$PROJECT_ID.iam.gserviceaccount.com" \
--role="roles/storage.objectAdmin"

Database Setup

1. Create Cloud SQL Instance

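# Create PostgreSQL instance (private IP only).
# --no-assign-ip with --network requires private services access
# (servicenetworking.googleapis.com) to be configured on the VPC first.
# --enable-bin-log applies to MySQL only; for PostgreSQL, point-in-time
# recovery is enabled with --enable-point-in-time-recovery.
gcloud sql instances create pdf-postgres \
--database-version=POSTGRES_15 \
--tier=db-custom-2-7680 \
--region=$REGION \
--network=default \
--no-assign-ip \
--enable-point-in-time-recovery \
--backup-start-time=03:00

# Create database
gcloud sql databases create pdfanalysis \
--instance=pdf-postgres

# Create user. Keep the generated password: it is needed later for the
# Kubernetes secret and for DATABASE_URL.
export DB_PASSWORD=$(openssl rand -base64 32)
gcloud sql users create pdfuser \
--instance=pdf-postgres \
--password="$DB_PASSWORD"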

2. Create Redis Instance

# Create Redis instance
gcloud redis instances create pdf-redis \
--size=1 \
--region=$REGION \
--redis-version=redis_6_x \
--tier=basic

3. Create Cloud Storage Bucket

# Create bucket for document storage
gsutil mb -p $PROJECT_ID -c STANDARD -l $REGION gs://$PROJECT_ID-documents

# Set lifecycle policy (as written, this deletes every object in the bucket
# once it is older than 90 days, not only documents marked as deleted)
cat > lifecycle.json <<EOF
{
  "lifecycle": {
    "rule": [
      {
        "action": {"type": "Delete"},
        "condition": {"age": 90}
      }
    ]
  }
}
EOF

gsutil lifecycle set lifecycle.json gs://$PROJECT_ID-documents
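
If the intent is to expire only deleted documents, the rule can be scoped with the matchesPrefix lifecycle condition. The sketch below assumes a hypothetical layout in which the application moves soft-deleted documents under a deleted/ prefix; nothing in this guide confirms that layout, and matchesPrefix needs a reasonably recent gsutil release.

```shell
#!/bin/sh
# Hypothetical layout: soft-deleted documents live under the "deleted/" prefix.
# Only objects under that prefix are removed once they are 90 days old.
cat > lifecycle-deleted.json <<'EOF'
{
  "lifecycle": {
    "rule": [
      {
        "action": {"type": "Delete"},
        "condition": {"age": 90, "matchesPrefix": ["deleted/"]}
      }
    ]
  }
}
EOF

# Apply with:
#   gsutil lifecycle set lifecycle-deleted.json gs://$PROJECT_ID-documents
```

Setting a lifecycle configuration replaces the bucket's existing one, so this file would be used instead of lifecycle.json, not in addition to it.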

Container Registry

1. Setup Artifact Registry

# Create repository
gcloud artifacts repositories create pdf-analysis \
--repository-format=docker \
--location=$REGION \
--description="PDF Analysis Platform containers"

# Configure Docker authentication
gcloud auth configure-docker $REGION-docker.pkg.dev

2. Build and Push Images

# Build backend
docker build -t $REGION-docker.pkg.dev/$PROJECT_ID/pdf-analysis/backend:latest \
-f backend/Dockerfile backend/

# Build frontend
docker build -t $REGION-docker.pkg.dev/$PROJECT_ID/pdf-analysis/frontend:latest \
-f frontend/Dockerfile frontend/

# Push images
docker push $REGION-docker.pkg.dev/$PROJECT_ID/pdf-analysis/backend:latest
docker push $REGION-docker.pkg.dev/$PROJECT_ID/pdf-analysis/frontend:latest
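
The image path is repeated in every build and push command, which makes typos easy. A small helper can assemble it from REGION and PROJECT_ID; image_uri is a hypothetical name, and the short-SHA tag in the usage comment is one possible convention, not something this guide prescribes.

```shell
#!/bin/sh
# Build the Artifact Registry image reference used throughout this guide.
# Arguments: component name (backend or frontend), optional tag (default: latest).
image_uri() {
  component=$1
  tag=${2:-latest}
  printf '%s-docker.pkg.dev/%s/pdf-analysis/%s:%s\n' \
    "$REGION" "$PROJECT_ID" "$component" "$tag"
}

# Usage (tagging with the short commit SHA instead of only "latest"):
#   TAG=$(git rev-parse --short HEAD)
#   docker build -t "$(image_uri backend "$TAG")" -f backend/Dockerfile backend/
#   docker push "$(image_uri backend "$TAG")"
```

Tagging by commit makes it possible to tell which build a pod is running, which a moving latest tag does not.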

Kubernetes Deployment

1. Create GKE Cluster

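# Create a regional cluster. With --region, the --num-nodes, --min-nodes and
# --max-nodes values apply per zone, so 2 nodes per zone is typically 6 in total.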
gcloud container clusters create pdf-analysis \
--region=$REGION \
--num-nodes=2 \
--machine-type=n1-standard-2 \
--disk-size=50 \
--enable-autoscaling \
--min-nodes=1 \
--max-nodes=5 \
--enable-autorepair \
--enable-autoupgrade \
--workload-pool=$PROJECT_ID.svc.id.goog

# Get credentials
gcloud container clusters get-credentials pdf-analysis --region=$REGION
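
The cluster is created with --workload-pool, which enables Workload Identity, but pods only act as the pdf-backend Google service account once a Kubernetes service account is bound to it. The sketch below wraps the two standard binding commands; bind_workload_identity is a hypothetical helper, and the namespace and Kubernetes service account names in the usage line are assumptions about the manifests in k8s/.

```shell
#!/bin/sh
# Bind a Kubernetes service account (KSA) to a Google service account (GSA).
# Arguments: namespace, KSA name, GSA short name. Requires PROJECT_ID to be set.
bind_workload_identity() {
  namespace=$1
  ksa=$2
  gsa_email="$3@$PROJECT_ID.iam.gserviceaccount.com"

  # Allow the KSA to impersonate the GSA ...
  gcloud iam service-accounts add-iam-policy-binding "$gsa_email" \
    --role="roles/iam.workloadIdentityUser" \
    --member="serviceAccount:$PROJECT_ID.svc.id.goog[$namespace/$ksa]" &&
  # ... and tell GKE which GSA the KSA maps to.
  kubectl annotate serviceaccount "$ksa" --namespace "$namespace" \
    "iam.gke.io/gcp-service-account=$gsa_email"
}

# Usage (assumed names): bind_workload_identity default backend pdf-backend
```

The Kubernetes service account must already exist, and the backend Deployment has to reference it through serviceAccountName for the binding to take effect.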

2. Create Secrets

# Database credentials
kubectl create secret generic postgres-secrets \
--from-literal=username=pdfuser \
--from-literal=password=YOUR_PASSWORD \
--from-literal=database=pdfanalysis

# JWT secrets
kubectl create secret generic jwt-secrets \
--from-literal=secret-key=$(openssl rand -base64 32) \
--from-literal=refresh-secret-key=$(openssl rand -base64 32)

# Anthropic API key (optional)
kubectl create secret generic anthropic-secret \
--from-literal=api-key=YOUR_ANTHROPIC_API_KEY
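
Values passed with --from-literal end up in shell history and in the process list. One alternative is to write them to a file that only the current user can read and load it with kubectl's --from-env-file flag. This is a sketch under that assumption; write_secret_env and the file name pg.env are hypothetical.

```shell
#!/bin/sh
# Write KEY=VALUE pairs to a file readable only by the current user.
# Arguments: file path, then one or more KEY=VALUE strings.
write_secret_env() {
  file=$1
  shift
  rm -f "$file"
  # Create the file with mode 600 before any value is written to it.
  ( umask 077 && : > "$file" ) || return 1
  for pair in "$@"; do
    printf '%s\n' "$pair" >> "$file"
  done
}

# Usage:
#   write_secret_env pg.env "username=pdfuser" "password=YOUR_PASSWORD" "database=pdfanalysis"
#   kubectl create secret generic postgres-secrets --from-env-file=pg.env
#   rm -f pg.env
```

The file should be deleted as soon as the secret exists, since it holds the values in plain text.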

3. Deploy Application

# Apply Kubernetes manifests
kubectl apply -f k8s/

# Verify deployments
kubectl get pods
kubectl get services
kubectl get ingress

4. Configure Ingress & SSL

# Reserve static IP
gcloud compute addresses create pdf-analysis-ip \
--global

# Get IP address
gcloud compute addresses describe pdf-analysis-ip --global

# Update DNS records to point to this IP

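# Create a Google-managed SSL certificate (it stays in PROVISIONING until the
# DNS records resolve to the reserved IP and a load balancer uses the certificate)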
gcloud compute ssl-certificates create pdf-analysis-cert \
--domains=yourdomain.com,www.yourdomain.com \
--global

# Apply ingress with SSL
kubectl apply -f k8s/ingress-ssl.yaml

CI/CD with Cloud Build

1. Create Cloud Build Trigger

# Create build trigger for main branch
gcloud builds triggers create github \
--repo-name=coditect-pdf-convertor \
--repo-owner=coditect-ai \
--branch-pattern="^main$" \
--build-config=cloudbuild.yaml

2. Cloud Build Configuration

Create cloudbuild.yaml:

steps:
  # Build backend
  - name: 'gcr.io/cloud-builders/docker'
    args:
      - 'build'
      - '-t'
      - '${_REGION}-docker.pkg.dev/$PROJECT_ID/pdf-analysis/backend:$COMMIT_SHA'
      - '-t'
      - '${_REGION}-docker.pkg.dev/$PROJECT_ID/pdf-analysis/backend:latest'
      - '-f'
      - 'backend/Dockerfile'
      - 'backend/'

  # Build frontend
  - name: 'gcr.io/cloud-builders/docker'
    args:
      - 'build'
      - '-t'
      - '${_REGION}-docker.pkg.dev/$PROJECT_ID/pdf-analysis/frontend:$COMMIT_SHA'
      - '-t'
      - '${_REGION}-docker.pkg.dev/$PROJECT_ID/pdf-analysis/frontend:latest'
      - '-f'
      - 'frontend/Dockerfile'
      - 'frontend/'

  # Push backend
  - name: 'gcr.io/cloud-builders/docker'
    args: ['push', '${_REGION}-docker.pkg.dev/$PROJECT_ID/pdf-analysis/backend:$COMMIT_SHA']

  # Push frontend
  - name: 'gcr.io/cloud-builders/docker'
    args: ['push', '${_REGION}-docker.pkg.dev/$PROJECT_ID/pdf-analysis/frontend:$COMMIT_SHA']

  # Deploy to GKE
  - name: 'gcr.io/cloud-builders/kubectl'
    args:
      - 'set'
      - 'image'
      - 'deployment/backend'
      - 'backend=${_REGION}-docker.pkg.dev/$PROJECT_ID/pdf-analysis/backend:$COMMIT_SHA'
    env:
      - 'CLOUDSDK_COMPUTE_REGION=${_REGION}'
      - 'CLOUDSDK_CONTAINER_CLUSTER=pdf-analysis'

  - name: 'gcr.io/cloud-builders/kubectl'
    args:
      - 'set'
      - 'image'
      - 'deployment/frontend'
      - 'frontend=${_REGION}-docker.pkg.dev/$PROJECT_ID/pdf-analysis/frontend:$COMMIT_SHA'
    env:
      - 'CLOUDSDK_COMPUTE_REGION=${_REGION}'
      - 'CLOUDSDK_CONTAINER_CLUSTER=pdf-analysis'

substitutions:
  _REGION: us-central1

timeout: 1800s
options:
  machineType: 'N1_HIGHCPU_8'

Environment Variables

Backend (.env.production)

# Database
DATABASE_URL=postgresql+asyncpg://pdfuser:PASSWORD@CLOUD_SQL_IP:5432/pdfanalysis

# Redis
REDIS_URL=redis://REDIS_IP:6379

# JWT
JWT_SECRET_KEY=your-secret-key
JWT_REFRESH_SECRET_KEY=your-refresh-secret

# GCP
GCS_BUCKET=pdf-analysis-prod-documents
GOOGLE_APPLICATION_CREDENTIALS=/secrets/gcp-key.json

# Anthropic
ANTHROPIC_API_KEY=sk-ant-api03-xxxxx

# Upload
UPLOAD_DIR=/app/uploads
MAX_FILE_SIZE=52428800

# Environment
ENVIRONMENT=production
DEBUG=false
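
A password generated with openssl rand -base64 can contain +, / and =, which are not safe inside the userinfo part of a URL. SQLAlchemy-style URLs such as the postgresql+asyncpg one above generally expect those characters to be percent-encoded. The sketch below builds DATABASE_URL that way; urlencode and database_url are hypothetical helpers, and the encoder handles ASCII input only.

```shell
#!/bin/sh
# Percent-encode every character outside the RFC 3986 unreserved set.
# ASCII input only (sufficient for base64-generated passwords).
urlencode() {
  rest=$1
  while [ -n "$rest" ]; do
    c=${rest%"${rest#?}"}   # first character of the remaining string
    rest=${rest#?}          # drop that character
    case $c in
      [A-Za-z0-9._~-]) printf '%s' "$c" ;;
      *) printf '%%%02X' "'$c" ;;   # "'c" makes printf emit the character code
    esac
  done
}

# Arguments: user, password, host, database. Port 5432 as in this guide.
database_url() {
  printf 'postgresql+asyncpg://%s:%s@%s:5432/%s\n' \
    "$1" "$(urlencode "$2")" "$3" "$4"
}

# Usage: database_url pdfuser "YOUR_PASSWORD" CLOUD_SQL_IP pdfanalysis
```

Whether the backend's driver decodes the password as expected should be confirmed with a test connection before relying on this.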

Frontend (.env.production)

VITE_API_URL=https://api.yourdomain.com
VITE_APP_NAME="PDF Analysis Platform"
VITE_ENVIRONMENT=production

Monitoring & Logging

1. Enable Cloud Logging

# View logs
gcloud logging read "resource.type=k8s_container" --limit 50

# Create log-based metrics
gcloud logging metrics create error_count \
--description="Count of application errors" \
--log-filter='severity="ERROR"'

2. Setup Cloud Monitoring

# Create uptime check (the host is passed as a resource label)
gcloud monitoring uptime create pdf-backend \
--resource-type=uptime-url \
--resource-labels=host=api.yourdomain.com,project_id=$PROJECT_ID \
--protocol=https \
--path=/health
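
Right after a deployment the /health endpoint may take a few moments to respond, so a single manual probe can fail spuriously. A generic retry wrapper is one way to poll it from a terminal or script; retry is a hypothetical helper, and the attempt count and delay in the usage line are arbitrary.

```shell
#!/bin/sh
# Run a command until it succeeds or the attempt limit is reached.
# Arguments: max attempts, delay in seconds between attempts, then the command.
retry() {
  attempts=$1
  delay=$2
  shift 2
  n=1
  until "$@"; do
    # Give up once the last allowed attempt has failed.
    [ "$n" -ge "$attempts" ] && return 1
    n=$((n + 1))
    sleep "$delay"
  done
}

# Usage: retry 10 6 curl -fsS https://api.yourdomain.com/health
```

This complements the uptime check rather than replacing it: the uptime check monitors continuously, while the wrapper gives an immediate answer during a rollout.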

3. Application Logs

# Backend logs
kubectl logs -f deployment/backend

# Frontend logs
kubectl logs -f deployment/frontend

# All pods
kubectl logs -f -l app=pdf-analysis

Troubleshooting

Common Issues

Pod not starting:

kubectl describe pod <pod-name>
kubectl logs <pod-name>

Database connection failed:

# Test Cloud SQL connectivity
kubectl run -it --rm debug --image=postgres:15 --restart=Never -- \
psql -h CLOUD_SQL_IP -U pdfuser -d pdfanalysis

Image pull errors:

# Check service account permissions
gcloud projects get-iam-policy $PROJECT_ID

# Re-authenticate Docker
gcloud auth configure-docker $REGION-docker.pkg.dev

SSL certificate issues:

# Check certificate status
gcloud compute ssl-certificates describe pdf-analysis-cert --global

# Check ingress
kubectl describe ingress pdf-analysis

Rollback

# Rollback deployment
kubectl rollout undo deployment/backend
kubectl rollout undo deployment/frontend

# Check rollout status
kubectl rollout status deployment/backend
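
The rollback commands above can be combined with the status check so that a failed rollout is reverted automatically. This is a sketch, not the project's deployment script; rollout_or_undo is a hypothetical helper and the 120s default timeout is an arbitrary choice.

```shell
#!/bin/sh
# Wait for a deployment rollout; if it does not complete in time, roll back.
# Arguments: deployment name, optional timeout (default: 120s).
rollout_or_undo() {
  deployment=$1
  timeout=${2:-120s}
  if kubectl rollout status "deployment/$deployment" --timeout="$timeout"; then
    return 0
  fi
  echo "Rollout of $deployment did not complete; rolling back" >&2
  kubectl rollout undo "deployment/$deployment"
  # Report failure even when the rollback itself succeeds.
  return 1
}

# Usage: rollout_or_undo backend 300s && rollout_or_undo frontend 300s
```

Returning a non-zero status after the rollback lets a calling script or CI step still mark the deployment as failed.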

Backup & Disaster Recovery

Database Backups

# Create on-demand backup
gcloud sql backups create \
--instance=pdf-postgres \
--description="Pre-deployment backup"

# List backups
gcloud sql backups list --instance=pdf-postgres

# Restore from backup (overwrites the current data on the target instance)
gcloud sql backups restore BACKUP_ID \
--restore-instance=pdf-postgres

Application Backups

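# Backup Kubernetes resources ("get all" covers workloads and services only;
# it omits ConfigMaps, Secrets, and Ingresses)
kubectl get all --all-namespaces -o yaml > backup.yaml

# Backup secrets (values are only base64-encoded, so store this file encrypted)
kubectl get secrets --all-namespaces -o yaml > secrets-backup.yaml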

Cost Optimization

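  1. Right-size GKE nodes: Use n1-standard-2 for prod and a smaller shared-core type such as e2-small for dev (f1-micro does not have enough memory for GKE nodes)
  2. Enable autoscaling: Scale down during off-hours
  3. Use Spot VMs (the successor to Preemptible VMs): For non-critical, interruption-tolerant workloads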
  4. Cloud SQL optimization: Use smallest instance that meets performance needs
  5. Storage lifecycle: Auto-delete old documents after 90 days
  6. CDN: Enable Cloud CDN for static assets

Security Checklist

  • Enable VPC Service Controls
  • Configure firewall rules
  • Enable audit logging
  • Use Workload Identity
  • Encrypt secrets at rest
  • Enable Binary Authorization
  • Setup DDoS protection
  • Configure CORS properly
  • Enable HTTPS only
  • Regular security scans

Support


Last Updated: 2025-11-02
Copyright: © 2025 AZ1.AI Inc. / Coditect.AI