Deployment Guide - AI-Powered PDF Analysis Platform
This guide covers deploying the PDF Analysis Platform to Google Cloud Platform (GCP) using Google Kubernetes Engine (GKE) and Cloud Build.
Table of Contents
- Prerequisites
- Local Development
- GCP Setup
- Database Setup
- Container Registry
- Kubernetes Deployment
- CI/CD with Cloud Build
- Environment Variables
- Monitoring & Logging
- Troubleshooting
- Backup & Disaster Recovery
- Cost Optimization
- Security Checklist
- Support
Prerequisites
Required Tools
- Google Cloud SDK (gcloud CLI)
- Docker Desktop
- kubectl
- Node.js (v18+)
- Python (3.11+)
GCP Services
- Google Kubernetes Engine (GKE)
- Cloud SQL (PostgreSQL)
- Cloud Build
- Cloud Storage
- Artifact Registry (successor to the deprecated Container Registry)
- Cloud Memorystore (Redis)
Required Permissions
- Kubernetes Engine Admin
- Cloud SQL Admin
- Cloud Build Editor
- Storage Admin
- Service Account Admin
Local Development
1. Clone Repository
git clone https://github.com/coditect-ai/coditect-pdf-convertor.git
cd coditect-pdf-convertor
2. Start Services
# Start PostgreSQL and Redis
docker-compose up -d postgres redis
# Initialize database
docker-compose run --rm backend python scripts/init_database.py
# Start backend
docker-compose up -d backend
# Start frontend
cd frontend
npm install
npm run dev
3. Access Application
- Frontend: http://localhost:5173
- Backend API: http://localhost:8000
- API Docs: http://localhost:8000/docs
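The backend container may need a few seconds before it accepts connections. A minimal sketch of a retry helper for waiting on it follows; the function name retry and the timing values are illustrative and not part of the repository.

```shell
# retry ATTEMPTS DELAY_SECONDS COMMAND...
# Runs COMMAND until it succeeds or ATTEMPTS tries have been made.
# The body runs in a subshell, so its variables do not leak.
retry() (
  attempts="$1"
  delay="$2"
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      exit 0
    fi
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  exit 1
)

# Example (not run here): wait up to about a minute for the API docs page.
# retry 30 2 curl -fsS -o /dev/null http://localhost:8000/docs
```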
GCP Setup
1. Create GCP Project
# Set project ID
export PROJECT_ID="pdf-analysis-prod"
export REGION="us-central1"
export ZONE="us-central1-a"
# Create project
gcloud projects create $PROJECT_ID --name="PDF Analysis Platform"
# Set as default
gcloud config set project $PROJECT_ID
# Enable billing (replace BILLING_ACCOUNT_ID)
gcloud beta billing projects link $PROJECT_ID \
--billing-account=BILLING_ACCOUNT_ID
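Most commands later in this guide rely on $PROJECT_ID, $REGION, and $ZONE, which are lost when a new terminal is opened. As a hedged sketch, a small guard like the one below could be sourced at the top of any setup script; require_vars is a hypothetical helper name.

```shell
# require_vars NAME...
# Reports every unset or empty variable on stderr and returns
# non-zero if at least one is missing.
require_vars() (
  missing=0
  for name in "$@"; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing required variable: $name" >&2
      missing=1
    fi
  done
  exit "$missing"
)

# Example (not run here):
# require_vars PROJECT_ID REGION ZONE || exit 1
```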
2. Enable Required APIs
gcloud services enable \
container.googleapis.com \
cloudbuild.googleapis.com \
sql-component.googleapis.com \
sqladmin.googleapis.com \
storage-api.googleapis.com \
redis.googleapis.com \
compute.googleapis.com \
artifactregistry.googleapis.com
3. Create Service Accounts
# Backend service account
gcloud iam service-accounts create pdf-backend \
--display-name="PDF Backend Service"
# Cloud Build service account
gcloud iam service-accounts create pdf-cloudbuild \
--display-name="PDF Cloud Build"
# Grant permissions
gcloud projects add-iam-policy-binding $PROJECT_ID \
--member="serviceAccount:pdf-backend@$PROJECT_ID.iam.gserviceaccount.com" \
--role="roles/cloudsql.client"
gcloud projects add-iam-policy-binding $PROJECT_ID \
--member="serviceAccount:pdf-backend@$PROJECT_ID.iam.gserviceaccount.com" \
--role="roles/storage.objectAdmin"
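The pdf-cloudbuild account is created above but receives no roles. If builds are meant to run as that account, it would likely need to push images, deploy to GKE, and write logs. The sketch below shows one way to grant that; the choice of roles is an assumption and should be checked against your own security requirements.

```shell
# Allow overriding the CLI, for example GCLOUD=echo for a dry run.
GCLOUD="${GCLOUD:-gcloud}"

# grant_cloudbuild_roles PROJECT_ID
# Binds each role below to the pdf-cloudbuild service account.
grant_cloudbuild_roles() (
  project="$1"
  member="serviceAccount:pdf-cloudbuild@${project}.iam.gserviceaccount.com"
  for role in \
    roles/artifactregistry.writer \
    roles/container.developer \
    roles/logging.logWriter
  do
    "$GCLOUD" projects add-iam-policy-binding "$project" \
      --member="$member" --role="$role" || exit 1
  done
)

# Example (not run here):
# grant_cloudbuild_roles "$PROJECT_ID"
```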
Database Setup
1. Create Cloud SQL Instance
# Create PostgreSQL instance
gcloud sql instances create pdf-postgres \
--database-version=POSTGRES_15 \
--tier=db-custom-2-7680 \
--region=$REGION \
--network=default \
--no-assign-ip \
--enable-point-in-time-recovery \
--backup-start-time=03:00
# Create database
gcloud sql databases create pdfanalysis \
--instance=pdf-postgres
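# Create user (keep the generated password; the Kubernetes secret needs it later)
export DB_PASSWORD="$(openssl rand -base64 32)"
gcloud sql users create pdfuser \
--instance=pdf-postgres \
--password="$DB_PASSWORD"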
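The instance above is created with --no-assign-ip, so it only receives a private address. That generally requires private services access on the VPC network before the instance is created. A sketch of that one-time setup follows; the range name google-managed-services-default and the /16 prefix length are assumptions.

```shell
# Allow overriding the CLI, for example GCLOUD=echo for a dry run.
GCLOUD="${GCLOUD:-gcloud}"

# setup_private_services_access NETWORK RANGE_NAME
# Enables Service Networking, reserves a peering range, and connects it.
setup_private_services_access() (
  network="$1"
  range="$2"
  "$GCLOUD" services enable servicenetworking.googleapis.com || exit 1
  "$GCLOUD" compute addresses create "$range" \
    --global --purpose=VPC_PEERING --prefix-length=16 \
    --network="$network" || exit 1
  "$GCLOUD" services vpc-peerings connect \
    --service=servicenetworking.googleapis.com \
    --ranges="$range" --network="$network"
)

# Example (not run here):
# setup_private_services_access default google-managed-services-default
```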
2. Create Redis Instance
# Create Redis instance
gcloud redis instances create pdf-redis \
--size=1 \
--region=$REGION \
--redis-version=redis_6_x \
--tier=basic
3. Create Cloud Storage Bucket
# Create bucket for document storage
gsutil mb -p $PROJECT_ID -c STANDARD -l $REGION gs://$PROJECT_ID-documents
# Set lifecycle policy (deletes every object in the bucket once it is older than 90 days)
cat > lifecycle.json <<EOF
{
  "lifecycle": {
    "rule": [
      {
        "action": {"type": "Delete"},
        "condition": {"age": 90}
      }
    ]
  }
}
EOF
gsutil lifecycle set lifecycle.json gs://$PROJECT_ID-documents
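The rule above has no prefix condition, so it applies to all objects in the bucket. If only soft-deleted documents should expire, the rule can be scoped with matchesPrefix. The sketch below assumes the application moves deleted documents under a deleted/ prefix, which is hypothetical, and that the installed gsutil is recent enough to accept matchesPrefix.

```shell
# lifecycle_json PREFIX AGE_DAYS
# Prints a lifecycle policy that deletes only objects under PREFIX
# once they are older than AGE_DAYS.
lifecycle_json() {
  cat <<EOF
{
  "lifecycle": {
    "rule": [
      {
        "action": {"type": "Delete"},
        "condition": {"age": $2, "matchesPrefix": ["$1"]}
      }
    ]
  }
}
EOF
}

# Example (not run here):
# lifecycle_json deleted/ 90 > lifecycle.json
# gsutil lifecycle set lifecycle.json gs://$PROJECT_ID-documents
```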
Container Registry
1. Setup Artifact Registry
# Create repository
gcloud artifacts repositories create pdf-analysis \
--repository-format=docker \
--location=$REGION \
--description="PDF Analysis Platform containers"
# Configure Docker authentication
gcloud auth configure-docker $REGION-docker.pkg.dev
2. Build and Push Images
# Build backend
docker build -t $REGION-docker.pkg.dev/$PROJECT_ID/pdf-analysis/backend:latest \
-f backend/Dockerfile backend/
# Build frontend
docker build -t $REGION-docker.pkg.dev/$PROJECT_ID/pdf-analysis/frontend:latest \
-f frontend/Dockerfile frontend/
# Push images
docker push $REGION-docker.pkg.dev/$PROJECT_ID/pdf-analysis/backend:latest
docker push $REGION-docker.pkg.dev/$PROJECT_ID/pdf-analysis/frontend:latest
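The :latest tag is mutable, which makes it hard to tell which build is running. The Cloud Build configuration later in this guide tags images with the commit SHA, and the same idea can be applied to manual builds. A minimal sketch, with image_uri and build_and_push as hypothetical helper names:

```shell
# image_uri REGION PROJECT_ID IMAGE TAG
# Prints the Artifact Registry path used throughout this guide.
image_uri() {
  printf '%s-docker.pkg.dev/%s/pdf-analysis/%s:%s\n' "$1" "$2" "$3" "$4"
}

# build_and_push IMAGE
# Builds IMAGE/Dockerfile and pushes it, tagged with the current commit.
# Expects REGION and PROJECT_ID to be set and to run from the repo root.
build_and_push() (
  tag="$(git rev-parse --short HEAD)" || exit 1
  uri="$(image_uri "$REGION" "$PROJECT_ID" "$1" "$tag")"
  docker build -t "$uri" -f "$1/Dockerfile" "$1/" && docker push "$uri"
)

# Example (not run here):
# build_and_push backend
# build_and_push frontend
```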
Kubernetes Deployment
1. Create GKE Cluster
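# Create cluster (with --region the cluster is regional, so --num-nodes, --min-nodes, and --max-nodes apply per zone)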
gcloud container clusters create pdf-analysis \
--region=$REGION \
--num-nodes=2 \
--machine-type=n1-standard-2 \
--disk-size=50 \
--enable-autoscaling \
--min-nodes=1 \
--max-nodes=5 \
--enable-autorepair \
--enable-autoupgrade \
--workload-pool=$PROJECT_ID.svc.id.goog
# Get credentials
gcloud container clusters get-credentials pdf-analysis --region=$REGION
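The cluster is created with --workload-pool, which enables Workload Identity, but pods only receive GCP credentials once a Kubernetes service account is linked to a Google service account. A sketch of that link follows. It assumes a Kubernetes service account named backend in the default namespace, which is hypothetical and must already exist when the function runs.

```shell
# Allow overriding the CLIs, for example GCLOUD=echo for a dry run.
GCLOUD="${GCLOUD:-gcloud}"
KUBECTL="${KUBECTL:-kubectl}"

# wi_member PROJECT_ID NAMESPACE KSA_NAME
# Prints the IAM member string for a Kubernetes service account.
wi_member() {
  printf 'serviceAccount:%s.svc.id.goog[%s/%s]\n' "$1" "$2" "$3"
}

# bind_workload_identity PROJECT_ID NAMESPACE KSA_NAME GSA_NAME
# Lets the Kubernetes service account impersonate the Google one,
# then annotates the Kubernetes service account to match.
bind_workload_identity() (
  gsa="$4@$1.iam.gserviceaccount.com"
  "$GCLOUD" iam service-accounts add-iam-policy-binding "$gsa" \
    --role=roles/iam.workloadIdentityUser \
    --member="$(wi_member "$1" "$2" "$3")" || exit 1
  "$KUBECTL" annotate serviceaccount "$3" --namespace="$2" \
    "iam.gke.io/gcp-service-account=$gsa"
)

# Example (not run here):
# bind_workload_identity "$PROJECT_ID" default backend pdf-backend
```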
2. Create Secrets
# Database credentials (replace YOUR_PASSWORD with the password set for pdfuser in Database Setup)
kubectl create secret generic postgres-secrets \
--from-literal=username=pdfuser \
--from-literal=password=YOUR_PASSWORD \
--from-literal=database=pdfanalysis
# JWT secrets
kubectl create secret generic jwt-secrets \
--from-literal=secret-key=$(openssl rand -base64 32) \
--from-literal=refresh-secret-key=$(openssl rand -base64 32)
# Anthropic API key (optional)
kubectl create secret generic anthropic-secret \
--from-literal=api-key=YOUR_ANTHROPIC_API_KEY
3. Deploy Application
# Apply Kubernetes manifests
kubectl apply -f k8s/
# Verify deployments
kubectl get pods
kubectl get services
kubectl get ingress
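kubectl get pods shows a snapshot only. To block until the deployments are actually ready, kubectl rollout status can be used. A small sketch, assuming the deployments are named backend and frontend as in the rest of this guide; the 180s timeout is an arbitrary choice.

```shell
# Allow overriding the CLI, for example KUBECTL=echo for a dry run.
KUBECTL="${KUBECTL:-kubectl}"

# wait_for_rollouts TIMEOUT DEPLOYMENT...
# Waits for each deployment in turn and stops at the first failure.
wait_for_rollouts() (
  timeout="$1"
  shift
  for name in "$@"; do
    "$KUBECTL" rollout status "deployment/$name" --timeout="$timeout" || exit 1
  done
)

# Example (not run here):
# wait_for_rollouts 180s backend frontend
```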
4. Configure Ingress & SSL
# Reserve static IP
gcloud compute addresses create pdf-analysis-ip \
--global
# Get IP address
gcloud compute addresses describe pdf-analysis-ip --global
# Update DNS records to point to this IP
# Create Google-managed SSL certificate (list every hostname the ingress serves, including the API host)
gcloud compute ssl-certificates create pdf-analysis-cert \
--domains=yourdomain.com,www.yourdomain.com,api.yourdomain.com \
--global
# Apply ingress with SSL
kubectl apply -f k8s/ingress-ssl.yaml
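The contents of k8s/ingress-ssl.yaml are not shown in this guide. As a rough sketch of what such a manifest might contain, the function below prints an Ingress that references the static IP and certificate created above. The service names, ports, and host layout are assumptions, not the repository's actual manifest.

```shell
# render_ingress APP_HOST API_HOST
# Prints an Ingress manifest for the GKE ingress controller.
render_ingress() {
  cat <<EOF
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: pdf-analysis
  annotations:
    kubernetes.io/ingress.class: "gce"
    kubernetes.io/ingress.global-static-ip-name: "pdf-analysis-ip"
    ingress.gcp.kubernetes.io/pre-shared-cert: "pdf-analysis-cert"
    kubernetes.io/ingress.allow-http: "false"
spec:
  rules:
    - host: $1
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: frontend   # assumed service name
                port:
                  number: 80     # assumed port
    - host: $2
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: backend    # assumed service name
                port:
                  number: 8000   # matches the local backend port
EOF
}

# Example (not run here):
# render_ingress yourdomain.com api.yourdomain.com | kubectl apply -f -
```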
CI/CD with Cloud Build
1. Create Cloud Build Trigger
# Create build trigger for main branch
gcloud builds triggers create github \
--repo-name=coditect-pdf-convertor \
--repo-owner=coditect-ai \
--branch-pattern="^main$" \
--build-config=cloudbuild.yaml
2. Cloud Build Configuration
Create cloudbuild.yaml:
steps:
  # Build backend
  - name: 'gcr.io/cloud-builders/docker'
    args:
      - 'build'
      - '-t'
      - '${_REGION}-docker.pkg.dev/$PROJECT_ID/pdf-analysis/backend:$COMMIT_SHA'
      - '-t'
      - '${_REGION}-docker.pkg.dev/$PROJECT_ID/pdf-analysis/backend:latest'
      - '-f'
      - 'backend/Dockerfile'
      - 'backend/'
  # Build frontend
  - name: 'gcr.io/cloud-builders/docker'
    args:
      - 'build'
      - '-t'
      - '${_REGION}-docker.pkg.dev/$PROJECT_ID/pdf-analysis/frontend:$COMMIT_SHA'
      - '-t'
      - '${_REGION}-docker.pkg.dev/$PROJECT_ID/pdf-analysis/frontend:latest'
      - '-f'
      - 'frontend/Dockerfile'
      - 'frontend/'
  # Push backend
  - name: 'gcr.io/cloud-builders/docker'
    args: ['push', '${_REGION}-docker.pkg.dev/$PROJECT_ID/pdf-analysis/backend:$COMMIT_SHA']
  # Push frontend
  - name: 'gcr.io/cloud-builders/docker'
    args: ['push', '${_REGION}-docker.pkg.dev/$PROJECT_ID/pdf-analysis/frontend:$COMMIT_SHA']
  # Deploy to GKE
  - name: 'gcr.io/cloud-builders/kubectl'
    args:
      - 'set'
      - 'image'
      - 'deployment/backend'
      - 'backend=${_REGION}-docker.pkg.dev/$PROJECT_ID/pdf-analysis/backend:$COMMIT_SHA'
    env:
      - 'CLOUDSDK_COMPUTE_REGION=${_REGION}'
      - 'CLOUDSDK_CONTAINER_CLUSTER=pdf-analysis'
  - name: 'gcr.io/cloud-builders/kubectl'
    args:
      - 'set'
      - 'image'
      - 'deployment/frontend'
      - 'frontend=${_REGION}-docker.pkg.dev/$PROJECT_ID/pdf-analysis/frontend:$COMMIT_SHA'
    env:
      - 'CLOUDSDK_COMPUTE_REGION=${_REGION}'
      - 'CLOUDSDK_CONTAINER_CLUSTER=pdf-analysis'
# Push the :latest tags once all steps have succeeded
images:
  - '${_REGION}-docker.pkg.dev/$PROJECT_ID/pdf-analysis/backend:latest'
  - '${_REGION}-docker.pkg.dev/$PROJECT_ID/pdf-analysis/frontend:latest'
substitutions:
  _REGION: us-central1
timeout: 1800s
options:
  machineType: 'N1_HIGHCPU_8'
Environment Variables
Backend (.env.production)
# Database
DATABASE_URL=postgresql+asyncpg://pdfuser:PASSWORD@CLOUD_SQL_IP:5432/pdfanalysis
# Redis
REDIS_URL=redis://REDIS_IP:6379
# JWT
JWT_SECRET_KEY=your-secret-key
JWT_REFRESH_SECRET_KEY=your-refresh-secret
# GCP
GCS_BUCKET=pdf-analysis-prod-documents
GOOGLE_APPLICATION_CREDENTIALS=/secrets/gcp-key.json
# Anthropic
ANTHROPIC_API_KEY=sk-ant-api03-xxxxx
# Upload
UPLOAD_DIR=/app/uploads
MAX_FILE_SIZE=52428800
# Environment
ENVIRONMENT=production
DEBUG=false
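Passwords generated with openssl rand -base64 can contain +, /, and =, which are not safe inside the user-info part of a URL. A sketch of percent-encoding the password before building DATABASE_URL follows. The helper names are hypothetical, and the encoder handles ASCII input only.

```shell
# urlencode STRING
# Percent-encodes every character except unreserved ones (ASCII only).
urlencode() (
  s="$1"
  out=""
  while [ -n "$s" ]; do
    rest="${s#?}"          # everything after the first character
    c="${s%"$rest"}"       # the first character
    case "$c" in
      [A-Za-z0-9._~-]) out="$out$c" ;;
      *) out="$out$(printf '%%%02X' "'$c")" ;;
    esac
    s="$rest"
  done
  printf '%s\n' "$out"
)

# database_url USER PASSWORD HOST DATABASE
# Prints a DATABASE_URL in the format used above, with the password encoded.
database_url() {
  printf 'postgresql+asyncpg://%s:%s@%s:5432/%s\n' \
    "$1" "$(urlencode "$2")" "$3" "$4"
}

# Example (not run here):
# database_url pdfuser "$DB_PASSWORD" CLOUD_SQL_IP pdfanalysis
```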
Frontend (.env.production)
VITE_API_URL=https://api.yourdomain.com
VITE_APP_NAME="PDF Analysis Platform"
VITE_ENVIRONMENT=production
Monitoring & Logging
1. Enable Cloud Logging
# View logs
gcloud logging read "resource.type=k8s_container" --limit 50
# Create log-based metrics
gcloud logging metrics create error_count \
--description="Count of application errors" \
--log-filter='severity="ERROR"'
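The filter severity="ERROR" matches error entries from every resource in the project, not only this application. A narrower filter can be built from the cluster and namespace labels. A sketch, assuming the workloads run in the default namespace:

```shell
# error_filter CLUSTER_NAME [NAMESPACE]
# Prints a Cloud Logging filter for container errors in one cluster.
error_filter() {
  printf 'resource.type="k8s_container" AND resource.labels.cluster_name="%s" AND resource.labels.namespace_name="%s" AND severity>=ERROR\n' \
    "$1" "${2:-default}"
}

# Example (not run here):
# gcloud logging read "$(error_filter pdf-analysis)" --limit 50 --freshness=1h
```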
2. Setup Cloud Monitoring
# Create uptime check
gcloud monitoring uptime create pdf-backend \
--resource-type=uptime-url \
--resource-labels=host=api.yourdomain.com,project_id=$PROJECT_ID \
--protocol=https \
--path=/health
3. Application Logs
# Backend logs
kubectl logs -f deployment/backend
# Frontend logs
kubectl logs -f deployment/frontend
# All pods
kubectl logs -f -l app=pdf-analysis
Troubleshooting
Common Issues
Pod not starting:
kubectl describe pod <pod-name>
kubectl logs <pod-name>
Database connection failed:
# Test Cloud SQL connectivity
kubectl run -it --rm debug --image=postgres:15 --restart=Never -- \
psql -h CLOUD_SQL_IP -U pdfuser -d pdfanalysis
Image pull errors:
# Check service account permissions
gcloud projects get-iam-policy $PROJECT_ID
# Re-authenticate Docker
gcloud auth configure-docker $REGION-docker.pkg.dev
SSL certificate issues:
# Check certificate status
gcloud compute ssl-certificates describe pdf-analysis-cert --global
# Check ingress
kubectl describe ingress pdf-analysis
Rollback
# Rollback deployment
kubectl rollout undo deployment/backend
kubectl rollout undo deployment/frontend
# Check rollout status
kubectl rollout status deployment/backend
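kubectl rollout undo without arguments returns to the previous revision only. To go back further, list the history with kubectl rollout history deployment/backend and pass a revision number. A minimal sketch, with rollback_to as a hypothetical helper name and an arbitrary 180s timeout:

```shell
# Allow overriding the CLI, for example KUBECTL=echo for a dry run.
KUBECTL="${KUBECTL:-kubectl}"

# rollback_to DEPLOYMENT REVISION
# Rolls the deployment back to REVISION and waits for it to settle.
rollback_to() (
  "$KUBECTL" rollout undo "deployment/$1" --to-revision="$2" || exit 1
  "$KUBECTL" rollout status "deployment/$1" --timeout=180s
)

# Example (not run here):
# rollback_to backend 3
```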
Backup & Disaster Recovery
Database Backups
# Create on-demand backup
gcloud sql backups create \
--instance=pdf-postgres \
--description="Pre-deployment backup"
# List backups
gcloud sql backups list --instance=pdf-postgres
# Restore from backup (overwrites the current data on the target instance)
gcloud sql backups restore BACKUP_ID \
--restore-instance=pdf-postgres \
--backup-instance=pdf-postgres
Application Backups
# Backup Kubernetes resources
kubectl get all --all-namespaces -o yaml > backup.yaml
# Backup secrets (values are only base64-encoded, not encrypted; store this file securely)
kubectl get secrets --all-namespaces -o yaml > secrets-backup.yaml
Cost Optimization
- Right-size GKE nodes: Use n1-standard-2 for prod, f1-micro for dev
- Enable autoscaling: Scale down during off-hours
- Use Preemptible VMs: For non-critical workloads
- Cloud SQL optimization: Use smallest instance that meets performance needs
- Storage lifecycle: Auto-delete old documents after 90 days
- CDN: Enable Cloud CDN for static assets
Security Checklist
- Enable VPC Service Controls
- Configure firewall rules
- Enable audit logging
- Use Workload Identity
- Encrypt secrets at rest
- Enable Binary Authorization
- Setup DDoS protection
- Configure CORS properly
- Enable HTTPS only
- Regular security scans
Support
- Documentation: readme.md
- API Docs: api-documentation.md
- Issues: https://github.com/coditect-ai/coditect-pdf-convertor/issues
- Email: 1@az1.ai
Last Updated: 2025-11-02
Copyright: © 2025 AZ1.AI Inc. / Coditect.AI