C3: Networking Components - VPC and Network Security Architecture

Level: Component (C4 Model Level 3)
Scope: VPC Networking, Firewall Rules, Cloud NAT, Load Balancing
Primary Audience: Network Engineers, Security Engineers, Platform Architects
Last Updated: November 23, 2025


Overview

This diagram shows the detailed networking architecture for CODITECT cloud infrastructure, including VPC configuration, subnets, firewall rules, Cloud NAT, and private service connections.

Key Components:

  • Custom VPC with regional subnets
  • Multi-zone GKE cluster with alias IPs
  • Cloud NAT for egress traffic
  • VPC peering for Cloud SQL and Redis
  • Cloud Armor for DDoS protection
  • Firewall rules for security

Networking Component Diagram


Component Details

1. VPC Network Configuration

VPC Specification:

resource "google_compute_network" "vpc" {
name = "coditect-dev-vpc"
auto_create_subnetworks = false
routing_mode = "REGIONAL"
mtu = 1460

project = "coditect-citus-prod"
}

Network Characteristics:

  • Routing Mode: Regional (Cloud Routers advertise and learn routes only within their own region)
  • MTU: 1460 bytes (GCP default)
  • Auto-create Subnets: Disabled (custom subnet control)
  • IPv6: Not enabled (IPv4 only)

IP Allocation Strategy:

Planning range: 10.0.0.0/8 (reserved for future growth; a custom-mode VPC has no CIDR of its own, only its subnets do)

Current Allocation:
- Primary Subnet (nodes): 10.0.0.0/20 (4,096 IPs)
- Pods (alias IPs): 10.1.0.0/16 (65,536 IPs)
- Services (alias IPs): 10.2.0.0/16 (65,536 IPs)
- Cloud SQL (peering): 10.67.0.0/16 (65,536 IPs - Google managed)
- Redis (peering): 10.121.0.0/16 (65,536 IPs - Google managed)

Future Expansion:
- Staging environment: 10.10.0.0/16
- Production environment: 10.20.0.0/16
- Reserved: 10.30.0.0/16 - 10.255.0.0/16 (excluding the Cloud SQL and Redis peering ranges listed above)
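
The allocation plan above can be sanity-checked with Python's standard `ipaddress` module. The following is a minimal sketch, not part of the original tooling: the dictionary keys and the helper name `find_overlaps` are illustrative, and the CIDR values are copied from the lists above.

```python
import ipaddress
from itertools import combinations

# CIDR blocks copied from the allocation plan; the keys are illustrative labels.
allocations = {
    "nodes": "10.0.0.0/20",
    "pods": "10.1.0.0/16",
    "services": "10.2.0.0/16",
    "cloud-sql": "10.67.0.0/16",
    "redis": "10.121.0.0/16",
    "staging": "10.10.0.0/16",
    "production": "10.20.0.0/16",
}

parent = ipaddress.ip_network("10.0.0.0/8")
networks = {name: ipaddress.ip_network(cidr) for name, cidr in allocations.items()}


def find_overlaps(nets):
    """Return the pairs of names whose CIDR blocks overlap."""
    return [
        (a, b)
        for (a, net_a), (b, net_b) in combinations(nets.items(), 2)
        if net_a.overlaps(net_b)
    ]


for name, net in networks.items():
    print(f"{name:<12}{str(net):<16}{net.num_addresses:>7} addresses")
print("overlaps:", find_overlaps(networks))
```

Running a check like this before adding a new range is a cheap way to catch collisions with the Google-managed peering blocks.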

2. Primary Subnet (GKE Nodes)

Subnet Configuration:

resource "google_compute_subnetwork" "primary" {
name = "coditect-dev-vpc-us-central1"
region = "us-central1"
network = google_compute_network.vpc.id
ip_cidr_range = "10.0.0.0/20" # 4,096 IPs

private_ip_google_access = true

secondary_ip_range {
range_name = "pods"
ip_cidr_range = "10.1.0.0/16" # 65,536 IPs for pods
}

secondary_ip_range {
range_name = "services"
ip_cidr_range = "10.2.0.0/16" # 65,536 IPs for services
}

log_config {
aggregation_interval = "INTERVAL_5_SEC"
flow_sampling = 1.0 # 100% sampling (dev)
metadata = "INCLUDE_ALL_METADATA"
}

project = "coditect-citus-prod"
}

Subnet Features:

  • Private Google Access: Enabled (nodes can reach GCP APIs without public IPs)
  • Flow Logs: Enabled (5-second interval, 100% sampling for dev)
  • Multi-Zone: Spans us-central1-a, us-central1-b, us-central1-c
  • Metadata: Includes all flow log metadata for deep analysis

Node IP Assignment:

us-central1-a nodes: 10.0.0.0/22 (1,024 IPs)
us-central1-b nodes: 10.0.4.0/22 (1,024 IPs)
us-central1-c nodes: 10.0.8.0/22 (1,024 IPs)
Reserved: 10.0.12.0/22 (1,024 IPs for future zones)
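
The four /22 blocks above are simply the /20 primary subnet split into equal quarters. A short sketch of that split follows. Note that GCP subnets are regional, so this per-zone layout is best read as a planning convention rather than something the platform enforces; the `plan` mapping is an illustrative name.

```python
import ipaddress

primary = ipaddress.ip_network("10.0.0.0/20")
labels = ["us-central1-a", "us-central1-b", "us-central1-c", "reserved"]

# subnets(new_prefix=22) yields the four consecutive /22 blocks inside the /20.
plan = dict(zip(labels, primary.subnets(new_prefix=22)))

for label, block in plan.items():
    print(f"{label:<14} {block}  ({block.num_addresses} addresses)")
```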

3. Alias IP Ranges (GKE Pods & Services)

Pod IP Allocation:

# Managed by GKE (automatic allocation)
# Each node reserves a /24 block (256 addresses; GKE's default limit is 110 pods per node)

Node 1 (us-central1-a): 10.1.0.0/24 (256 pod IPs)
Node 2 (us-central1-a): 10.1.1.0/24 (256 pod IPs)
Node 3 (us-central1-b): 10.1.2.0/24 (256 pod IPs)
Node 4 (us-central1-c): 10.1.3.0/24 (256 pod IPs)
...
Maximum nodes: 256 (limited by /24 allocation per node)
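
The 256-node ceiling follows directly from carving /24 blocks out of the /16 pod range. The sketch below reproduces that arithmetic. It assumes blocks are handed out sequentially, as in the listing above; GKE manages the real assignment, so `pod_block` is a hypothetical helper for illustration only.

```python
import ipaddress

pod_range = ipaddress.ip_network("10.1.0.0/16")
PER_NODE_PREFIX = 24  # each node receives one /24

# Number of /24 blocks that fit in the /16: 2^(24 - 16).
max_nodes = 2 ** (PER_NODE_PREFIX - pod_range.prefixlen)


def pod_block(node_index):
    """Return the /24 for the node at node_index (0-based), assuming sequential assignment."""
    if not 0 <= node_index < max_nodes:
        raise ValueError(f"node_index must be in [0, {max_nodes})")
    block_size = 2 ** (32 - PER_NODE_PREFIX)
    base = int(pod_range.network_address) + node_index * block_size
    return ipaddress.ip_network((base, PER_NODE_PREFIX))


print(max_nodes)      # 256
print(pod_block(3))   # 10.1.3.0/24
```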

Service IP Allocation:

ClusterIP Services: 10.2.0.0/24 (256 services)
- kube-dns: 10.2.0.10
- license-api: 10.2.0.50
- prometheus: 10.2.0.100
- grafana: 10.2.0.101

LoadBalancer Services: 10.2.1.0/24 (256 services)
- ingress-nginx: 10.2.1.10

Reserved for future: 10.2.2.0/24 - 10.2.255.0/24

Why Alias IPs?

  • VPC-native: Pods get real VPC IPs (not NAT-ed)
  • Performance: Pod traffic is routed natively by the VPC, with no per-node custom routes
  • Firewall: Can apply VPC firewall rules to pods
  • Peering: Pods can communicate with peered VPCs (Cloud SQL, Redis)

4. Cloud Router & Cloud NAT

Cloud Router Configuration:

resource "google_compute_router" "router" {
name = "coditect-dev-router"
region = "us-central1"
network = google_compute_network.vpc.id

bgp {
asn = 64512 # Private ASN
}

project = "coditect-citus-prod"
}

Cloud NAT Configuration:

resource "google_compute_router_nat" "nat" {
name = "coditect-dev-nat"
router = google_compute_router.router.name
region = "us-central1"

nat_ip_allocate_option = "AUTO_ONLY"
source_subnetwork_ip_ranges_to_nat = "ALL_SUBNETWORKS_ALL_IP_RANGES"

log_config {
enable = true
filter = "ALL" # Log all NAT translations
}

min_ports_per_vm = 64
enable_dynamic_port_allocation = true
tcp_established_idle_timeout_sec = 1200 # 20 minutes
tcp_transitory_idle_timeout_sec = 30 # 30 seconds
tcp_time_wait_timeout_sec = 120 # 2 minutes
udp_idle_timeout_sec = 30 # 30 seconds
icmp_idle_timeout_sec = 30 # 30 seconds

project = "coditect-citus-prod"
}

NAT IP Addresses:

Production: 2 static NAT IPs (manually reserved)
- NAT IP 1: 34.72.XX.XX (primary)
- NAT IP 2: 34.72.YY.YY (backup)

Development: Auto-allocated ephemeral IPs
- Auto-assigned by Google (1-2 IPs based on load)

NAT Use Cases:

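  1. Container Image Pulls: GKE nodes pull images from external registries (with Private Google Access enabled on the subnet, pulls from gcr.io reach Google directly rather than through NAT)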
  2. External API Calls: License API calls Stripe, SendGrid
  3. Software Updates: Nodes install security patches
  4. DNS Queries: External DNS resolution (8.8.8.8)

Port Allocation:

Min ports per VM: 64
Max ports per VM: Dynamic (based on connections)

Example:
Node with 10 pods × 20 concurrent connections/pod = 200 connections
Required ports: 200 ports
Allocated: 256 ports (next power of 2)
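
The "next power of 2" step in the example can be written out as a small function. This is a sketch of the doubling model the example describes, starting from the configured minimum of 64 ports. The exact thresholds Cloud NAT uses when scaling a VM's allocation are not covered in this document, so treat `ports_to_allocate` as an illustrative approximation.

```python
def ports_to_allocate(required, min_ports=64, max_ports=65536):
    """Double from min_ports until the requirement is covered, capped at max_ports."""
    allocated = min_ports
    while allocated < required and allocated < max_ports:
        allocated *= 2
    return allocated


pods, connections_per_pod = 10, 20
required = pods * connections_per_pod
allocated = ports_to_allocate(required)
print(f"required={required}, allocated={allocated}")
```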

5. Firewall Rules (Defense in Depth)

Rule Priority:

  • Lower number = higher priority
  • Rules evaluated in priority order
  • First match wins (subsequent rules ignored)
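
The evaluation order can be modelled in a few lines. The sketch below is deliberately simplified: it matches only on source range and port, ignores protocols and target tags, and uses three of the rule names defined in this section. It also encodes the GCP tie-break that a deny rule beats an allow rule at the same priority. The `Rule` class and `evaluate` function are illustrative names.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    priority: int
    action: str        # "allow" or "deny"
    source: str        # source CIDR
    ports: frozenset   # empty set means all ports


RULES = [
    Rule("deny-all-ingress", 65535, "deny", "0.0.0.0/0", frozenset()),
    Rule("allow-internal-vpc", 1000, "allow", "10.0.0.0/8", frozenset()),
    Rule("allow-ingress-controller", 1000, "allow", "0.0.0.0/0", frozenset({80, 443})),
]


def evaluate(src_ip, port, rules=RULES):
    """Return (action, rule name) of the first rule that matches, lowest number first."""
    addr = ipaddress.ip_address(src_ip)
    # Sort by priority; at equal priority, deny rules are considered before allow rules.
    ordered = sorted(rules, key=lambda r: (r.priority, r.action != "deny"))
    for rule in ordered:
        in_range = addr in ipaddress.ip_network(rule.source)
        port_ok = not rule.ports or port in rule.ports
        if in_range and port_ok:
            return rule.action, rule.name
    return "deny", "implied-deny"


print(evaluate("10.1.0.10", 8000))      # internal traffic
print(evaluate("198.51.100.7", 443))    # public HTTPS
print(evaluate("198.51.100.7", 8000))   # public traffic to an internal port
```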

Firewall Rule Hierarchy:

1. Allow Health Checks (Priority 1000)

resource "google_compute_firewall" "allow_health_checks" {
name = "allow-health-checks"
network = google_compute_network.vpc.name
priority = 1000

allow {
protocol = "tcp"
ports = ["8000", "8080"]
}

source_ranges = [
"35.191.0.0/16", # Google Cloud load balancer health check probes
"130.211.0.0/22", # Google Cloud load balancer health check probes
]

target_tags = ["gke-node", "allow-health-check"]
}

2. Allow Ingress Controller (Priority 1000)

resource "google_compute_firewall" "allow_ingress" {
name = "allow-ingress-controller"
network = google_compute_network.vpc.name
priority = 1000

allow {
protocol = "tcp"
ports = ["80", "443"]
}

source_ranges = ["0.0.0.0/0"] # Public internet

target_tags = ["ingress-nginx"]
}

3. Allow Internal VPC Traffic (Priority 1000)

resource "google_compute_firewall" "allow_internal" {
name = "allow-internal-vpc"
network = google_compute_network.vpc.name
priority = 1000

allow {
protocol = "tcp"
ports = ["0-65535"]
}

allow {
protocol = "udp"
ports = ["0-65535"]
}

allow {
protocol = "icmp"
}

source_ranges = [
"10.0.0.0/8", # All private VPC ranges
]
}

4. Allow SSH (DEV ONLY - Priority 1000)

resource "google_compute_firewall" "allow_ssh" {
name = "allow-ssh-bastion"
network = google_compute_network.vpc.name
priority = 1000

allow {
protocol = "tcp"
ports = ["22"]
}

source_ranges = ["0.0.0.0/0"] # INSECURE - DEV ONLY

target_tags = ["bastion"]

# TODO: Production should restrict to corporate IP ranges
}

5. Deny All Ingress (Priority 65535 - LOWEST)

resource "google_compute_firewall" "deny_all" {
name = "deny-all-ingress"
network = google_compute_network.vpc.name
priority = 65535 # Lowest priority (last resort)

deny {
protocol = "all"
}

source_ranges = ["0.0.0.0/0"]

# No target tags = applies to all instances
}

Firewall Testing:

# Test from outside VPC (should DENY)
curl -v https://10.0.0.10:8000 # Timeout (blocked by firewall)

# Test from inside VPC (should ALLOW)
gcloud compute ssh gke-node-1 -- curl http://10.2.0.50:8000/health # 200 OK

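# Test the health endpoint through the load balancer
# (firewall rules match on source IP, not User-Agent, so this checks the endpoint rather than the allow-health-checks rule)
curl -v -H "User-Agent: GoogleHC/1.0" http://<external-ip>/health # 200 OK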

6. VPC Peering (Private Service Connections)

Cloud SQL Private Service Connection:

resource "google_compute_global_address" "private_ip_address" {
name = "coditect-dev-vpc-private-ip-range"
purpose = "VPC_PEERING"
address_type = "INTERNAL"
prefix_length = 16 # /16 = 65,536 IPs
network = google_compute_network.vpc.id

project = "coditect-citus-prod"
}

resource "google_service_networking_connection" "private_vpc_connection" {
network = google_compute_network.vpc.id
service = "servicenetworking.googleapis.com"
reserved_peering_ranges = [google_compute_global_address.private_ip_address.name]
}

Peering Details:

Cloud SQL Instance:
- Instance Name: coditect-dev
- Private IP: 10.67.0.3
- Port: 5432 (PostgreSQL)
- SSL: Required
- Connection: Direct VPC peering (no NAT, no public IP)

Redis Instance:
- Instance Name: coditect-dev-redis
- Private IP: 10.121.42.67
- Port: 6378
- Auth: Enabled
- Connection: Direct VPC peering

Benefits of VPC Peering:

  • Security: No public IP exposure
  • Performance: Lower latency (no NAT overhead)
  • Cost: No NAT processing fees
  • Simplicity: Direct IP connectivity

7. Cloud Armor (WAF & DDoS Protection)

Security Policy:

resource "google_compute_security_policy" "policy" {
name = "coditect-cloud-armor-policy"

# Rule 1: Rate limiting
rule {
action = "rate_based_ban"
priority = 1000
match {
versioned_expr = "SRC_IPS_V1"
config {
src_ip_ranges = ["*"]
}
}
rate_limit_options {
conform_action = "allow"
exceed_action = "deny(429)"
enforce_on_key = "IP"
ban_duration_sec = 600 # 10-minute ban

rate_limit_threshold {
count = 100
interval_sec = 60 # 100 requests per minute
}
}
}

# Rule 2: Geo-blocking (optional)
rule {
action = "deny(403)"
priority = 2000
match {
expr {
expression = "origin.region_code in ['CN', 'RU', 'KP']"
}
}
description = "Block high-risk countries"
}

# Rule 3: SQL injection prevention
rule {
action = "deny(403)"
priority = 3000
match {
expr {
expression = "evaluatePreconfiguredExpr('sqli-stable')"
}
}
description = "Block SQL injection attempts"
}

# Rule 4: XSS prevention
rule {
action = "deny(403)"
priority = 4000
match {
expr {
expression = "evaluatePreconfiguredExpr('xss-stable')"
}
}
description = "Block XSS attempts"
}

# Default rule: Allow
rule {
action = "allow"
priority = 2147483647 # Lowest priority (highest number, evaluated last)
match {
versioned_expr = "SRC_IPS_V1"
config {
src_ip_ranges = ["*"]
}
}
}
}

Attack Mitigation:

DDoS Protection:
- Layer 3/4: Google's global infrastructure (automatic)
- Layer 7: Cloud Armor rate limiting (100 req/min per IP)
- Mitigation: 10-minute IP ban after rate limit exceeded

SQL Injection:
- Detection: Preconfigured WAF rules (sqli-stable)
- Action: Deny with 403 Forbidden
- Logging: All blocked requests logged

XSS (Cross-Site Scripting):
- Detection: Preconfigured WAF rules (xss-stable)
- Action: Deny with 403 Forbidden
- Logging: All blocked requests logged

Geo-Blocking:
- Countries: China, Russia, North Korea (optional)
- Action: Deny with 403 Forbidden
- Override: Whitelist specific IPs if needed
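
To make the rate-limit and ban behaviour concrete, here is a small simulation using the thresholds from the policy (100 requests per 60 seconds, 600-second ban). It assumes a fixed-window counter per client key; how Cloud Armor counts requests internally is not described here, so the `RateBasedBan` class is a hypothetical model, not the service's algorithm.

```python
class RateBasedBan:
    """Fixed-window request counter with a temporary ban once the limit is exceeded."""

    def __init__(self, count=100, interval_sec=60, ban_duration_sec=600):
        self.count = count
        self.interval_sec = interval_sec
        self.ban_duration_sec = ban_duration_sec
        self._windows = {}       # key -> (window index, hits in that window)
        self._banned_until = {}  # key -> timestamp when the ban ends

    def check(self, key, now):
        """Return the HTTP status for a request from `key` at time `now` (seconds)."""
        if now < self._banned_until.get(key, 0):
            return 429
        window = int(now // self.interval_sec)
        start, hits = self._windows.get(key, (window, 0))
        if start != window:
            hits = 0  # a new window has started, so reset the counter
        hits += 1
        self._windows[key] = (window, hits)
        if hits > self.count:
            self._banned_until[key] = now + self.ban_duration_sec
            return 429
        return 200


limiter = RateBasedBan()
# 101 requests within 10 seconds from one (documentation-range) client IP.
statuses = [limiter.check("203.0.113.9", now=i * 0.1) for i in range(101)]
print(statuses.count(200), statuses.count(429))
```

In this model the 101st request triggers the ban, and further requests from the same key keep receiving 429 until the ban expires, even after the counting window has rolled over.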

8. Load Balancer Configuration

HTTPS Load Balancer:

resource "google_compute_global_forwarding_rule" "https" {
name = "coditect-https-lb"
target = google_compute_target_https_proxy.default.id
port_range = "443"
ip_address = google_compute_global_address.default.address
}

resource "google_compute_target_https_proxy" "default" {
name = "coditect-https-proxy"
url_map = google_compute_url_map.default.id
ssl_certificates = [google_compute_managed_ssl_certificate.default.id]
ssl_policy = google_compute_ssl_policy.modern.id
}

resource "google_compute_ssl_policy" "modern" {
name = "modern-ssl-policy"
profile = "MODERN"
min_tls_version = "TLS_1_3" # Only TLS 1.3
}

resource "google_compute_managed_ssl_certificate" "default" {
name = "coditect-ssl-cert"

managed {
domains = ["api.coditect.ai"]
}
}

Backend Service:

resource "google_compute_backend_service" "default" {
name = "coditect-backend"
port_name = "http" # Named port on the instance group; backends receive plain HTTP
protocol = "HTTP"
timeout_sec = 30

health_checks = [google_compute_health_check.default.id]

backend {
group = google_compute_instance_group.nginx_ingress.id
balancing_mode = "RATE"
max_rate_per_instance = 100
}

security_policy = google_compute_security_policy.policy.id
enable_cdn = false # API responses not cacheable

log_config {
enable = true
sample_rate = 1.0 # 100% sampling
}
}

Health Checks:

resource "google_compute_health_check" "default" {
name = "coditect-health-check"
check_interval_sec = 10
timeout_sec = 5
healthy_threshold = 2
unhealthy_threshold = 3

http_health_check {
port = 80
request_path = "/health"
}
}
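
The thresholds above determine how quickly a backend changes state: three consecutive failures at a 10-second interval take roughly 30 seconds to mark it unhealthy, and two consecutive successes take roughly 20 seconds to restore it. The sketch below models that state machine for a single prober. Real load balancers use several probers and also account for the timeout, so the figures are approximations and `track_health` is an illustrative helper.

```python
CHECK_INTERVAL_SEC = 10
HEALTHY_THRESHOLD = 2
UNHEALTHY_THRESHOLD = 3


def track_health(probe_results, healthy=True):
    """Yield the backend state (True = healthy) after each probe result."""
    streak = 0  # consecutive probes that disagree with the current state
    for ok in probe_results:
        if ok == healthy:
            streak = 0
        else:
            streak += 1
            needed = UNHEALTHY_THRESHOLD if healthy else HEALTHY_THRESHOLD
            if streak >= needed:
                healthy, streak = not healthy, 0
        yield healthy


seconds_to_unhealthy = CHECK_INTERVAL_SEC * UNHEALTHY_THRESHOLD
seconds_to_healthy = CHECK_INTERVAL_SEC * HEALTHY_THRESHOLD
states = list(track_health([True, False, False, False, True, True]))
print(seconds_to_unhealthy, seconds_to_healthy, states)
```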

Network Traffic Flows

1. Inbound HTTPS Request (License Acquisition)

1. User (Internet) → HTTPS (443) → HTTPS Load Balancer
- SSL termination (TLS 1.3)
- Serves the managed certificate for api.coditect.ai

2. HTTPS Load Balancer → Cloud Armor policy (enforced at the load balancer edge, after TLS termination)
- Rate limit check: 100 req/min per IP
- Geo-blocking check
- SQL injection and XSS checks

3. Load Balancer → Backend Service
- Health check (every 10 seconds)
- Balancing mode: RATE (up to 100 requests/second per instance)

4. Backend Service → NGINX Ingress Controller (GKE)
- Target: Ingress controller pods
- Port: 80 (HTTP inside VPC)

5. NGINX Ingress → License API Service (ClusterIP)
- Host: api.coditect.ai
- Path: /api/v1/licenses/acquire
- Target: 10.2.0.50:8000

6. License API Service → License API Pods
- Round-robin to 3+ pods
- Pod IPs: 10.1.0.10, 10.1.0.11, 10.1.0.12

7. License API Pod → Cloud SQL (Private IP)
- Direct VPC peering (no NAT)
- Connection: 10.67.0.3:5432
- SSL: Required

8. License API Pod → Redis (Private IP)
- Direct VPC peering (no NAT)
- Connection: 10.121.42.67:6378
- Auth: Required

Total Latency Breakdown:

Cloud Armor: ~5ms
Load Balancer: ~10ms
NGINX Ingress: ~5ms
License API: ~20ms (application logic)
Cloud SQL: ~5ms (query)
Redis: ~2ms (set TTL)
Total: ~50ms (p95)

2. Outbound API Call (Stripe Payment)

1. License API Pod → Cloud NAT
- Source: Pod IP (10.1.0.10)
- Destination: api.stripe.com (104.16.XX.XX)
- Port: HTTPS (443)

2. Cloud NAT → Internet
- NAT translation: 10.1.0.10 → 34.72.XX.XX (NAT IP)
- Port allocation: 10.1.0.10:12345 → 34.72.XX.XX:54321
- Logging: ALL (for troubleshooting)

3. Stripe API Response → Cloud NAT
- Return path: 34.72.XX.XX:54321 → 10.1.0.10:12345
- NAT reverse translation

4. Cloud NAT → License API Pod
- Delivered to pod (10.1.0.10)
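
The translation steps above amount to a lookup table kept by the NAT gateway. The following is a toy model of that table. The NAT address `203.0.113.10` is a documentation-range placeholder (the real NAT IPs are elided in this document), ports are handed out sequentially for simplicity, and the `NatTable` class is a hypothetical illustration rather than Cloud NAT's implementation.

```python
class NatTable:
    """Toy source-NAT table mapping (pod IP, pod port) to a port on one NAT IP."""

    def __init__(self, nat_ip, first_port=1024, last_port=65535):
        self.nat_ip = nat_ip
        self._next_port = first_port
        self._last_port = last_port
        self._outbound = {}  # (src_ip, src_port) -> nat_port
        self._inbound = {}   # nat_port -> (src_ip, src_port)

    def translate_outbound(self, src_ip, src_port):
        """Return the (NAT IP, NAT port) used for this source, allocating if needed."""
        key = (src_ip, src_port)
        if key not in self._outbound:
            if self._next_port > self._last_port:
                raise RuntimeError("NAT ports exhausted")
            self._outbound[key] = self._next_port
            self._inbound[self._next_port] = key
            self._next_port += 1
        return self.nat_ip, self._outbound[key]

    def translate_inbound(self, nat_port):
        """Return the original (pod IP, pod port), or None if no mapping exists."""
        return self._inbound.get(nat_port)


nat = NatTable("203.0.113.10")  # placeholder NAT IP
nat_ip, nat_port = nat.translate_outbound("10.1.0.10", 12345)
print(nat_ip, nat_port, nat.translate_inbound(nat_port))
```

A return packet that arrives on a port with no mapping has nowhere to go, which is why unsolicited inbound traffic never reaches the pods through NAT.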

NAT Port Exhaustion Prevention:

Min ports per VM: 64
Dynamic allocation: Up to 65,536 ports per VM
Timeouts:
- TCP established: 1200s (20 minutes)
- TCP transitory: 30s
- Time wait: 120s

Example:
VM with 100 concurrent connections × 3 turnovers/hour (20 min avg duration) = 300 connections/hour
Required ports: ~100 concurrent ports
Allocated: 256 ports (dynamic allocation)
Utilization: 39% (healthy)
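
Concurrency, connection duration, and turnover are related by Little's law (concurrency = arrival rate × average duration). Assuming the example means 100 connections open at once, each lasting about 20 minutes, the sketch below derives the hourly turnover and the port utilization. The function names are illustrative.

```python
def connections_per_hour(concurrent, avg_duration_min):
    """Little's law rearranged: arrival rate = concurrency / average duration."""
    return concurrent * 60 / avg_duration_min


def port_utilization(ports_in_use, ports_allocated):
    """Fraction of the allocated NAT ports currently in use."""
    return ports_in_use / ports_allocated


rate = connections_per_hour(concurrent=100, avg_duration_min=20)
util = port_utilization(ports_in_use=100, ports_allocated=256)
print(f"{rate:.0f} connections/hour, {util:.0%} port utilization")
# prints: 300 connections/hour, 39% port utilization
```

Port demand is driven by the number of concurrent connections, not by the hourly turnover, which is why roughly 100 ports are enough in this scenario.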

3. Internal Pod-to-Pod Communication

1. License API Pod A (10.1.0.10) → License API Pod B (10.1.0.11)
- Direct VPC routing (no NAT)
- Latency: <1ms (same zone), <5ms (cross-zone)

2. License API Pod → kube-dns (10.2.0.10)
- DNS query for service name
- Query: license-api.default.svc.cluster.local
- Response: 10.2.0.50 (ClusterIP)

3. Prometheus (10.2.0.100) → License API Pod (10.1.0.10)
- Metrics scraping (Prometheus pull model)
- Endpoint: http://10.1.0.10:8000/metrics
- Frequency: Every 15 seconds

Security Best Practices

1. Least Privilege Firewall Rules

Production Recommendations:

# Restrict SSH to corporate VPN only
resource "google_compute_firewall" "allow_ssh_vpn" {
name = "allow-ssh-corporate-vpn"
network = google_compute_network.vpc.name
priority = 1000

allow {
protocol = "tcp"
ports = ["22"]
}

source_ranges = [
"203.0.113.0/24", # Corporate VPN CIDR
]

target_tags = ["bastion"]
}

# Remove allow-ssh-bastion rule (0.0.0.0/0 access)

2. Private GKE Cluster

Recommended Configuration:

resource "google_container_cluster" "primary" {
name = "coditect-prod"
location = "us-central1"

private_cluster_config {
enable_private_nodes = true # No public IPs on nodes
enable_private_endpoint = true # Private Kubernetes API endpoint
master_ipv4_cidr_block = "172.16.0.0/28"
}

master_authorized_networks_config {
cidr_blocks {
cidr_block = "203.0.113.0/24" # Corporate VPN
display_name = "Corporate VPN"
}
}
}

3. Cloud Armor Rate Limiting

Adaptive Rate Limiting:

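# Per-IP rate limiting with a temporary ban
# (belongs inside a rule with action = "rate_based_ban")
rate_limit_options {
conform_action = "allow"
exceed_action = "deny(429)"
enforce_on_key = "IP"
ban_duration_sec = 600 # 10 minutes

rate_limit_threshold {
count = 100
interval_sec = 60 # 100 req/min
}
}

# Per-session alternative: key on a cookie instead of the client IP
# enforce_on_key = "HTTP_COOKIE"
# enforce_on_key_name = "session_id"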

Cost Optimization

Network Egress Costs

Pricing (as of 2025):

Egress to Internet:
- First 1GB/month: Free
- 1GB - 10TB/month: $0.12/GB
- 10TB - 150TB/month: $0.11/GB
- 150TB+/month: $0.08/GB

Egress to GCP services (same region): Free
Egress to GCP services (cross-region): $0.01/GB
Egress via Cloud NAT: Additional $0.045/GB

Optimization Strategies:

  1. Minimize External API Calls: Cache Stripe responses, batch SendGrid emails
  2. Use GCP-Native Services: Cloud SQL, Redis in same region (free egress)
  3. CDN for Static Assets: Cache images, JS, CSS at edge (not applicable for API)
  4. Compression: gzip responses (reduces bandwidth by ~70%)

Estimated Egress Costs:

License API responses: ~1KB/request × 100K requests/day = 100MB/day = 3GB/month
Stripe API calls: ~500 bytes × 1K/day = 500KB/day = 15MB/month
SendGrid emails: ~10KB × 1K/day = 10MB/day = 300MB/month
Container image pulls: ~500MB/week = 2GB/month

Total egress: ~5GB/month
Cost: ~$0.60/month at $0.12/GB, before Cloud NAT processing fees (negligible)
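
The estimate can be reproduced with a few lines of arithmetic. The sketch below uses the per-request sizes and volumes listed above, a 30-day month, decimal units (1 GB = 10^9 bytes), and the $0.12/GB rate from the pricing table. The unrounded total comes to about 5.3 GB, which the estimate rounds down to 5 GB before pricing.

```python
KB, GB = 10**3, 10**9
DAYS_PER_MONTH = 30
PRICE_PER_GB = 0.12  # internet egress rate quoted in the pricing table

monthly_bytes = {
    "license_api": 1 * KB * 100_000 * DAYS_PER_MONTH,  # ~1 KB x 100K requests/day
    "stripe": 500 * 1_000 * DAYS_PER_MONTH,            # ~500 bytes x 1K calls/day
    "sendgrid": 10 * KB * 1_000 * DAYS_PER_MONTH,      # ~10 KB x 1K emails/day
    "image_pulls": 2 * GB,                             # ~500 MB/week
}

total_gb = sum(monthly_bytes.values()) / GB
cost = total_gb * PRICE_PER_GB
print(f"total: {total_gb:.3f} GB/month, cost: ${cost:.2f}/month")
```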

Load Balancer Costs

Pricing:

Forwarding rules: $0.025/hour per rule
Data processing: $0.008/GB processed

Estimated monthly cost:
- 2 forwarding rules (HTTP, HTTPS): $36/month
- 100GB data processed: $0.80/month
- Total: ~$37/month
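
The monthly figure follows from the hourly rate once a month is taken as 720 hours (30 days). A minimal sketch of that calculation, using only the prices quoted above:

```python
HOURS_PER_MONTH = 24 * 30          # 720 hours, assuming a 30-day month
FORWARDING_RULE_PER_HOUR = 0.025   # USD per rule per hour
PROCESSING_PER_GB = 0.008          # USD per GB processed

forwarding_cost = 2 * FORWARDING_RULE_PER_HOUR * HOURS_PER_MONTH
processing_cost = 100 * PROCESSING_PER_GB
total_cost = forwarding_cost + processing_cost
print(f"${forwarding_cost:.2f} + ${processing_cost:.2f} = ${total_cost:.2f}/month")
```

The fixed forwarding-rule charge dominates at this traffic level, so the bill barely moves until data volume grows by a couple of orders of magnitude.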

Disaster Recovery

Multi-Region Failover (Future)

Architecture:

Primary Region: us-central1
DR Region: us-east1

Failover Strategy:
1. Cloud SQL: Cross-region read replica (promoted manually during failover)
2. Redis: Manual snapshot → restore in DR region
3. GKE: Standby cluster in DR region (minimal nodes)
4. Load Balancer: Multi-region backend (automatic failover)

Recovery Time Objective (RTO): 1 hour
Recovery Point Objective (RPO): 15 minutes (PITR)

Failover Procedure:

# 1. Promote Cloud SQL read replica (the replica already lives in us-east1)
gcloud sql instances promote-replica coditect-prod-replica

# 2. Restore Redis from snapshot (create the DR instance, then import the RDB file)
gcloud redis instances create coditect-prod-redis-dr \
--region=us-east1
gcloud redis instances import gs://coditect-backups/redis-snapshot.rdb \
coditect-prod-redis-dr --region=us-east1

# 3. Scale up DR GKE cluster
gcloud container clusters resize coditect-prod-dr --num-nodes=10 --region=us-east1

# 4. Update DNS (manual or automated)
gcloud dns record-sets transaction start --zone=coditect-prod
gcloud dns record-sets transaction remove --zone=coditect-prod --name=api.coditect.ai. --type=A --ttl=300 34.72.XX.XX
gcloud dns record-sets transaction add --zone=coditect-prod --name=api.coditect.ai. --type=A --ttl=300 35.190.YY.YY
gcloud dns record-sets transaction execute --zone=coditect-prod

Monitoring & Troubleshooting

VPC Flow Logs

Query Examples (BigQuery, assuming logs are exported from Cloud Logging through a log sink):

-- Top 10 sources by bytes sent
SELECT
jsonPayload.connection.src_ip AS src_ip,
SUM(CAST(jsonPayload.bytes_sent AS INT64)) AS total_bytes
FROM `coditect-citus-prod.logs.compute_googleapis_com_vpc_flows`
WHERE timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 HOUR)
GROUP BY src_ip
ORDER BY total_bytes DESC
LIMIT 10

-- Blocked connections (firewall denied)
-- Requires Firewall Rules Logging: the disposition field comes from firewall logs, not VPC flow logs
SELECT
jsonPayload.connection.src_ip,
jsonPayload.connection.dest_ip,
jsonPayload.connection.dest_port,
jsonPayload.disposition
FROM `coditect-citus-prod.logs.compute_googleapis_com_firewall`
WHERE jsonPayload.disposition = "DENIED"
AND timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 HOUR)

Cloud NAT Logs

Common Issues:

# Port exhaustion (NAT logs record allocation_status as OK or DROPPED)
gcloud logging read "resource.type=nat_gateway AND \
jsonPayload.allocation_status=DROPPED" \
--limit=10 --format=json

# Connection timeouts
gcloud logging read "resource.type=nat_gateway AND \
jsonPayload.disposition=TIMEOUT" \
--limit=10 --format=json


Document History

Version | Date | Author | Changes
1.0 | 2025-11-23 | SDD Architect | Initial networking component diagram

Document Classification: Internal - Architecture Documentation
Review Cycle: Quarterly
Next Review Date: 2026-02-23