C3: Networking Components - VPC and Network Security Architecture
Level: Component (C4 Model Level 3)
Scope: VPC Networking, Firewall Rules, Cloud NAT, Load Balancing
Primary Audience: Network Engineers, Security Engineers, Platform Architects
Last Updated: November 23, 2025
Overview
This diagram shows the detailed networking architecture for CODITECT cloud infrastructure, including VPC configuration, subnets, firewall rules, Cloud NAT, and private service connections.
Key Components:
- Custom VPC with regional subnets
- Multi-zone GKE cluster with alias IPs
- Cloud NAT for egress traffic
- VPC peering for Cloud SQL and Redis
- Cloud Armor for DDoS protection
- Firewall rules for security
Networking Component Diagram
Component Details
1. VPC Network Configuration
VPC Specification:
resource "google_compute_network" "vpc" {
name = "coditect-dev-vpc"
auto_create_subnetworks = false
routing_mode = "REGIONAL"
mtu = 1460
project = "coditect-citus-prod"
}
Network Characteristics:
- Routing Mode: Regional (Cloud Routers advertise and learn dynamic routes only within their own region)
- MTU: 1460 bytes (standard for GCP)
- Auto-create Subnets: Disabled (custom subnet control)
- IPv6: Not enabled (IPv4 only)
IP Allocation Strategy:
Total VPC CIDR: 10.0.0.0/8 (reserved for future growth)
Current Allocation:
- Primary Subnet (nodes): 10.0.0.0/20 (4,096 IPs)
- Pods (alias IPs): 10.1.0.0/16 (65,536 IPs)
- Services (alias IPs): 10.2.0.0/16 (65,536 IPs)
- Cloud SQL (peering): 10.67.0.0/16 (65,536 IPs - Google managed)
- Redis (peering): 10.121.0.0/16 (65,536 IPs - Google managed)
Future Expansion:
- Staging environment: 10.10.0.0/16
- Production environment: 10.20.0.0/16
- Reserved: 10.30.0.0/16 - 10.255.0.0/16 (excluding the 10.67.0.0/16 and 10.121.0.0/16 peering ranges listed above)
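The allocation above can be derived from the 10.0.0.0/8 planning block rather than typed by hand. The following is a minimal sketch using Terraform's built-in cidrsubnet function; the local and output names are illustrative and are not part of the CODITECT configuration.

```hcl
locals {
  planning_block = "10.0.0.0/8"

  # cidrsubnet(prefix, newbits, netnum):
  #   /8 + 12 new bits = /20, /8 + 8 new bits = /16
  nodes_cidr    = cidrsubnet(local.planning_block, 12, 0) # 10.0.0.0/20
  pods_cidr     = cidrsubnet(local.planning_block, 8, 1)  # 10.1.0.0/16
  services_cidr = cidrsubnet(local.planning_block, 8, 2)  # 10.2.0.0/16
  staging_cidr  = cidrsubnet(local.planning_block, 8, 10) # 10.10.0.0/16
  prod_cidr     = cidrsubnet(local.planning_block, 8, 20) # 10.20.0.0/16
}

# Hypothetical output for inspecting the plan with `terraform output`
output "ip_plan" {
  value = {
    nodes    = local.nodes_cidr
    pods     = local.pods_cidr
    services = local.services_cidr
    staging  = local.staging_cidr
    prod     = local.prod_cidr
  }
}
```

Deriving the ranges this way keeps the subnet resources and the written plan from drifting apart when a new environment is added.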
2. Primary Subnet (GKE Nodes)
Subnet Configuration:
resource "google_compute_subnetwork" "primary" {
name = "coditect-dev-vpc-us-central1"
region = "us-central1"
network = google_compute_network.vpc.id
ip_cidr_range = "10.0.0.0/20" # 4,096 IPs
private_ip_google_access = true
secondary_ip_range {
range_name = "pods"
ip_cidr_range = "10.1.0.0/16" # 65,536 IPs for pods
}
secondary_ip_range {
range_name = "services"
ip_cidr_range = "10.2.0.0/16" # 65,536 IPs for services
}
log_config {
aggregation_interval = "INTERVAL_5_SEC"
flow_sampling = 1.0 # 100% sampling (dev)
metadata = "INCLUDE_ALL_METADATA"
}
project = "coditect-citus-prod"
}
Subnet Features:
- Private Google Access: Enabled (nodes can reach GCP APIs without public IPs)
- Flow Logs: Enabled (5-second interval, 100% sampling for dev)
- Multi-Zone: Spans us-central1-a, us-central1-b, us-central1-c
- Metadata: Includes all flow log metadata for deep analysis
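Node IP Assignment (planning convention only; the subnet is regional, so GCP may assign a node any free address in 10.0.0.0/20 regardless of zone):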
us-central1-a nodes: 10.0.0.0/22 (1,024 IPs)
us-central1-b nodes: 10.0.4.0/22 (1,024 IPs)
us-central1-c nodes: 10.0.8.0/22 (1,024 IPs)
Reserved: 10.0.12.0/22 (1,024 IPs for future zones)
3. Alias IP Ranges (GKE Pods & Services)
Pod IP Allocation:
# Managed by GKE (automatic allocation)
# Each node reserves a /24 block (256 addresses, sized for GKE's default maximum of 110 pods per node)
Node 1 (us-central1-a): 10.1.0.0/24 (256 pod IPs)
Node 2 (us-central1-a): 10.1.1.0/24 (256 pod IPs)
Node 3 (us-central1-b): 10.1.2.0/24 (256 pod IPs)
Node 4 (us-central1-c): 10.1.3.0/24 (256 pod IPs)
...
Maximum nodes: 256 (limited by /24 allocation per node)
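The 256-node ceiling follows from dividing the pod range into per-node blocks. A minimal sketch of that arithmetic in Terraform, runnable on its own; the local and output names are illustrative, and the /25 case assumes nodes capped at 64 pods or fewer.

```hcl
locals {
  pod_range_prefix = 16 # pods secondary range: 10.1.0.0/16
  per_node_prefix  = 24 # block reserved per node at the default pod limit

  # Number of /24 blocks that fit in the /16 pod range
  max_nodes = pow(2, local.per_node_prefix - local.pod_range_prefix) # 256

  # Same pod range if each node only needed a /25 (at most 64 pods per node)
  max_nodes_with_25 = pow(2, 25 - local.pod_range_prefix) # 512
}

output "max_nodes" {
  value = local.max_nodes
}
```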
Service IP Allocation:
ClusterIP Services: 10.2.0.0/24 (256 services)
- kube-dns: 10.2.0.10
- license-api: 10.2.0.50
- prometheus: 10.2.0.100
- grafana: 10.2.0.101
LoadBalancer Services: 10.2.1.0/24 (256 services)
- ingress-nginx: 10.2.1.10
Reserved for future: 10.2.2.0/24 - 10.2.255.0/24
Why Alias IPs?
- VPC-native: Pods get real VPC IPs (not NAT-ed)
- Performance: Pod traffic is routed natively by the VPC (no per-node custom routes or overlay encapsulation)
- Firewall: Can apply VPC firewall rules to pods
- Peering: Pods can communicate with peered VPCs (Cloud SQL, Redis)
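For reference, a cluster consumes the two secondary ranges by name through its IP allocation policy. The following is a sketch rather than the actual CODITECT cluster definition: the cluster name and node count are assumptions, and it relies on the google_compute_network.vpc and google_compute_subnetwork.primary resources shown earlier.

```hcl
resource "google_container_cluster" "dev" {
  name     = "coditect-dev" # hypothetical cluster name
  location = "us-central1"  # regional cluster spread across the region's zones
  project  = "coditect-citus-prod"

  network    = google_compute_network.vpc.id
  subnetwork = google_compute_subnetwork.primary.id

  # VPC-native mode: pods and services take alias IPs from the named ranges
  networking_mode = "VPC_NATIVE"
  ip_allocation_policy {
    cluster_secondary_range_name  = "pods"     # 10.1.0.0/16
    services_secondary_range_name = "services" # 10.2.0.0/16
  }

  # 110 pods per node is why each node is given a /24 from the pod range
  default_max_pods_per_node = 110

  initial_node_count = 1 # assumed; one node per zone for a regional cluster
}
```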
4. Cloud Router & Cloud NAT
Cloud Router Configuration:
resource "google_compute_router" "router" {
name = "coditect-dev-router"
region = "us-central1"
network = google_compute_network.vpc.id
bgp {
asn = 64512 # Private ASN
}
project = "coditect-citus-prod"
}
Cloud NAT Configuration:
resource "google_compute_router_nat" "nat" {
name = "coditect-dev-nat"
router = google_compute_router.router.name
region = "us-central1"
nat_ip_allocate_option = "AUTO_ONLY"
source_subnetwork_ip_ranges_to_nat = "ALL_SUBNETWORKS_ALL_IP_RANGES"
log_config {
enable = true
filter = "ALL" # Log both NAT translations and errors
}
min_ports_per_vm = 64
enable_dynamic_port_allocation = true
tcp_established_idle_timeout_sec = 1200 # 20 minutes
tcp_transitory_idle_timeout_sec = 30 # 30 seconds
tcp_time_wait_timeout_sec = 120 # 2 minutes
udp_idle_timeout_sec = 30 # 30 seconds
icmp_idle_timeout_sec = 30 # 30 seconds
project = "coditect-citus-prod"
}
NAT IP Addresses:
Production: 2 static NAT IPs (manually reserved)
- NAT IP 1: 34.72.XX.XX (primary)
- NAT IP 2: 34.72.YY.YY (backup)
Development: Auto-allocated ephemeral IPs
- Auto-assigned by Google (1-2 IPs based on load)
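The production variant with reserved addresses could look like the sketch below. The resource and address names are hypothetical. It reuses the router defined above purely for illustration; a production deployment would presumably have its own router.

```hcl
# Reserve two regional external addresses for NAT
resource "google_compute_address" "nat" {
  count   = 2
  name    = "coditect-prod-nat-ip-${count.index + 1}" # hypothetical name
  region  = "us-central1"
  project = "coditect-citus-prod"
}

resource "google_compute_router_nat" "nat_static" {
  name    = "coditect-prod-nat" # hypothetical name
  router  = google_compute_router.router.name
  region  = "us-central1"
  project = "coditect-citus-prod"

  # MANUAL_ONLY: use only the reserved addresses, never auto-allocated ones
  nat_ip_allocate_option = "MANUAL_ONLY"
  nat_ips                = google_compute_address.nat[*].self_link

  source_subnetwork_ip_ranges_to_nat = "ALL_SUBNETWORKS_ALL_IP_RANGES"
}
```

Stable egress addresses are mainly useful when a third party needs to allowlist the source IPs.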
NAT Use Cases:
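- Container Image Pulls: GKE nodes pull images from external registries such as Docker Hub (gcr.io and Artifact Registry are reached through Private Google Access, not NAT)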
- External API Calls: License API calls Stripe, SendGrid
- Software Updates: Nodes install security patches
- DNS Queries: External DNS resolution (8.8.8.8)
Port Allocation:
Min ports per VM: 64
Max ports per VM: Dynamic (based on connections)
Example:
Node with 10 pods × 20 concurrent connections/pod = 200 connections
Required ports: 200 ports
Allocated: 256 ports (next power of 2)
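The example can be reproduced with Terraform's numeric functions. This sketch models dynamic port allocation as rounding up to the next power of two, never below the configured minimum. That is a simplification of how Cloud NAT grows a VM's allocation, and all local names are illustrative.

```hcl
locals {
  pods_per_node       = 10
  connections_per_pod = 20
  min_ports_per_vm    = 64

  required_ports = local.pods_per_node * local.connections_per_pod # 200

  # log(200, 2) is about 7.64, so ceil gives 8 and pow(2, 8) = 256
  next_power_of_two = pow(2, ceil(log(local.required_ports, 2)))

  allocated_ports = max(local.min_ports_per_vm, local.next_power_of_two) # 256
}

output "allocated_ports" {
  value = local.allocated_ports
}
```

Treat the result as an upper bound, since a NAT port can be reused for connections to different destinations.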
5. Firewall Rules (Defense in Depth)
Rule Priority:
- Lower number = higher priority
- Rules evaluated in priority order
- First match wins (subsequent rules ignored)
Firewall Rule Hierarchy:
1. Allow Health Checks (Priority 1000)
resource "google_compute_firewall" "allow_health_checks" {
name = "allow-health-checks"
network = google_compute_network.vpc.name
priority = 1000
allow {
protocol = "tcp"
ports = ["8000", "8080"]
}
source_ranges = [
"35.191.0.0/16", # Google Cloud health check probes
"130.211.0.0/22", # Google Cloud health check probes (also a current range, not legacy)
]
target_tags = ["gke-node", "allow-health-check"]
}
2. Allow Ingress Controller (Priority 1000)
resource "google_compute_firewall" "allow_ingress" {
name = "allow-ingress-controller"
network = google_compute_network.vpc.name
priority = 1000
allow {
protocol = "tcp"
ports = ["80", "443"]
}
source_ranges = ["0.0.0.0/0"] # Public internet
target_tags = ["ingress-nginx"]
}
3. Allow Internal VPC Traffic (Priority 1000)
resource "google_compute_firewall" "allow_internal" {
name = "allow-internal-vpc"
network = google_compute_network.vpc.name
priority = 1000
allow {
protocol = "tcp"
ports = ["0-65535"]
}
allow {
protocol = "udp"
ports = ["0-65535"]
}
allow {
protocol = "icmp"
}
source_ranges = [
"10.0.0.0/8", # All private VPC ranges
]
}
4. Allow SSH (DEV ONLY - Priority 1000)
resource "google_compute_firewall" "allow_ssh" {
name = "allow-ssh-bastion"
network = google_compute_network.vpc.name
priority = 1000
allow {
protocol = "tcp"
ports = ["22"]
}
source_ranges = ["0.0.0.0/0"] # INSECURE - DEV ONLY
target_tags = ["bastion"]
# TODO: Production should restrict to corporate IP ranges
}
5. Deny All Ingress (Priority 65535 - LOWEST)
resource "google_compute_firewall" "deny_all" {
name = "deny-all-ingress"
network = google_compute_network.vpc.name
priority = 65535 # Lowest priority (last resort)
deny {
protocol = "all"
}
source_ranges = ["0.0.0.0/0"]
# No target tags = applies to all instances
}
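GCP already applies an implied deny-all ingress rule at priority 65535, and implied rules cannot be logged. If denied connections need to be visible in Firewall Rules Logging, one option is an explicit deny rule just ahead of it with logging enabled. The sketch below uses a hypothetical rule name and priority.

```hcl
resource "google_compute_firewall" "deny_all_logged" {
  name      = "deny-all-ingress-logged" # hypothetical name
  network   = google_compute_network.vpc.name
  direction = "INGRESS"
  priority  = 65534 # evaluated just before the implied deny at 65535

  deny {
    protocol = "all"
  }

  source_ranges = ["0.0.0.0/0"]

  # Without log_config, matches on this rule are not recorded
  log_config {
    metadata = "INCLUDE_ALL_METADATA"
  }
}
```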
Firewall Testing:
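# Test from outside VPC (should fail)
curl -v --max-time 10 https://10.0.0.10:8000 # Times out: 10.0.0.10 is a private address with no route from the internet
# Test from inside VPC (should ALLOW)
gcloud compute ssh gke-node-1 -- curl http://10.2.0.50:8000/health # 200 OK
# Test the public ingress path (ports 80/443 are open to 0.0.0.0/0)
curl -v http://<external-ip>/health # 200 OK
# Note: the health-check rule matches source IP ranges, not the User-Agent header,
# so it cannot be exercised by spoofing "GoogleHC/1.0" from an arbitrary client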
6. VPC Peering (Private Service Connections)
Cloud SQL Private Service Connection:
resource "google_compute_global_address" "private_ip_address" {
name = "coditect-dev-vpc-private-ip-range"
purpose = "VPC_PEERING"
address_type = "INTERNAL"
prefix_length = 16 # /16 = 65,536 IPs
network = google_compute_network.vpc.id
project = "coditect-citus-prod"
}
resource "google_service_networking_connection" "private_vpc_connection" {
network = google_compute_network.vpc.id
service = "servicenetworking.googleapis.com"
reserved_peering_ranges = [google_compute_global_address.private_ip_address.name]
}
Peering Details:
Cloud SQL Instance:
- Instance Name: coditect-dev
- Private IP: 10.67.0.3
- Port: 5432 (PostgreSQL)
- SSL: Required
- Connection: Direct VPC peering (no NAT, no public IP)
Redis Instance:
- Instance Name: coditect-dev-redis
- Private IP: 10.121.42.67
- Port: 6378 (Memorystore's port when in-transit encryption is enabled; 6379 otherwise)
- Auth: Enabled
- Connection: Direct VPC peering
Benefits of VPC Peering:
- Security: No public IP exposure
- Performance: Lower latency (no NAT overhead)
- Cost: No NAT processing fees
- Simplicity: Direct IP connectivity
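A Cloud SQL instance attaches to the peered range through its IP configuration. The sketch below is not the actual instance definition: the database version and machine tier are assumptions, and ssl_mode requires a recent Google provider version (older versions used require_ssl instead).

```hcl
resource "google_sql_database_instance" "dev" {
  name             = "coditect-dev"
  region           = "us-central1"
  database_version = "POSTGRES_15" # assumed version
  project          = "coditect-citus-prod"

  # The service networking connection must exist before a private-IP
  # instance can be created
  depends_on = [google_service_networking_connection.private_vpc_connection]

  settings {
    tier = "db-custom-2-7680" # assumed machine tier

    ip_configuration {
      ipv4_enabled    = false # no public IP
      private_network = google_compute_network.vpc.id
      ssl_mode        = "ENCRYPTED_ONLY" # reject unencrypted connections
    }
  }
}
```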
7. Cloud Armor (WAF & DDoS Protection)
Security Policy:
resource "google_compute_security_policy" "policy" {
name = "coditect-cloud-armor-policy"
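# Rule 1: Rate limiting
# Given a larger priority number than the deny rules below: a rate-limit rule
# that matches every source IP allows conforming requests as soon as it is
# evaluated, so at priority 1000 it would keep the geo-blocking and WAF rules
# from ever being reached.
rule {
action = "rate_based_ban"
priority = 5000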
match {
versioned_expr = "SRC_IPS_V1"
config {
src_ip_ranges = ["*"]
}
}
rate_limit_options {
conform_action = "allow"
exceed_action = "deny(429)"
enforce_on_key = "IP"
ban_duration_sec = 600 # 10-minute ban
rate_limit_threshold {
count = 100
interval_sec = 60 # 100 requests per minute
}
}
}
# Rule 2: Geo-blocking (optional)
rule {
action = "deny(403)"
priority = 2000
match {
expr {
expression = "origin.region_code in ['CN', 'RU', 'KP']"
}
}
description = "Block high-risk countries"
}
# Rule 3: SQL injection prevention
rule {
action = "deny(403)"
priority = 3000
match {
expr {
expression = "evaluatePreconfiguredExpr('sqli-stable')"
}
}
description = "Block SQL injection attempts"
}
# Rule 4: XSS prevention
rule {
action = "deny(403)"
priority = 4000
match {
expr {
expression = "evaluatePreconfiguredExpr('xss-stable')"
}
}
description = "Block XSS attempts"
}
# Default rule: Allow
rule {
action = "allow"
priority = 2147483647 # Largest priority number = lowest priority (evaluated last)
match {
versioned_expr = "SRC_IPS_V1"
config {
src_ip_ranges = ["*"]
}
}
}
}
Attack Mitigation:
DDoS Protection:
- Layer 3/4: Google's global infrastructure (automatic)
- Layer 7: Cloud Armor rate limiting (100 req/min per IP)
- Mitigation: 10-minute IP ban after rate limit exceeded
SQL Injection:
- Detection: Preconfigured WAF rules (sqli-stable)
- Action: Deny with 403 Forbidden
- Logging: All blocked requests logged
XSS (Cross-Site Scripting):
- Detection: Preconfigured WAF rules (xss-stable)
- Action: Deny with 403 Forbidden
- Logging: All blocked requests logged
Geo-Blocking:
- Countries: China, Russia, North Korea (optional)
- Action: Deny with 403 Forbidden
- Override: Whitelist specific IPs if needed
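The allowlist override can be expressed as one more rule block inside google_compute_security_policy.policy, with a priority number lower than the geo-blocking rule's 2000. This is a sketch; the priority and the address (taken from a documentation-only range) are placeholders.

```hcl
# Fragment: place inside resource "google_compute_security_policy" "policy"
rule {
  action   = "allow"
  priority = 1900 # assumed slot; must be a lower number than the geo-blocking rule
  match {
    versioned_expr = "SRC_IPS_V1"
    config {
      src_ip_ranges = ["198.51.100.7/32"] # placeholder partner address
    }
  }
  description = "Allowlist a specific IP ahead of geo-blocking"
}
```

Because the first matching rule decides the outcome, an allowlisted address also skips every rule with a larger priority number, including the WAF checks.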
8. Load Balancer Configuration
HTTPS Load Balancer:
resource "google_compute_global_forwarding_rule" "https" {
name = "coditect-https-lb"
target = google_compute_target_https_proxy.default.id
port_range = "443"
ip_address = google_compute_global_address.default.address
}
resource "google_compute_target_https_proxy" "default" {
name = "coditect-https-proxy"
url_map = google_compute_url_map.default.id
ssl_certificates = [google_compute_managed_ssl_certificate.default.id]
ssl_policy = google_compute_ssl_policy.modern.id
}
resource "google_compute_ssl_policy" "modern" {
name = "modern-ssl-policy"
profile = "MODERN"
min_tls_version = "TLS_1_3" # Only TLS 1.3
}
resource "google_compute_managed_ssl_certificate" "default" {
name = "coditect-ssl-cert"
managed {
domains = ["api.coditect.ai"]
}
}
Backend Service:
resource "google_compute_backend_service" "default" {
name = "coditect-backend"
port_name = "http" # Named port on the instance group; backend traffic inside the VPC is plain HTTP on port 80
protocol = "HTTP"
timeout_sec = 30
health_checks = [google_compute_health_check.default.id]
backend {
group = google_compute_instance_group.nginx_ingress.id
balancing_mode = "RATE"
max_rate_per_instance = 100
}
security_policy = google_compute_security_policy.policy.id
enable_cdn = false # API responses not cacheable
log_config {
enable = true
sample_rate = 1.0 # 100% sampling
}
}
Health Checks:
resource "google_compute_health_check" "default" {
name = "coditect-health-check"
check_interval_sec = 10
timeout_sec = 5
healthy_threshold = 2
unhealthy_threshold = 3
http_health_check {
port = 80
request_path = "/health"
}
}
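The cost estimate later in this document counts two forwarding rules (HTTP and HTTPS), but only the HTTPS rule is defined above. Assuming the HTTP rule exists to redirect clients to HTTPS, it could be sketched as follows; all three resource names are hypothetical.

```hcl
# URL map whose only job is to redirect every request to HTTPS
resource "google_compute_url_map" "http_redirect" {
  name = "coditect-http-redirect" # hypothetical name

  default_url_redirect {
    https_redirect         = true
    redirect_response_code = "MOVED_PERMANENTLY_DEFAULT" # HTTP 301
    strip_query            = false
  }
}

resource "google_compute_target_http_proxy" "redirect" {
  name    = "coditect-http-proxy" # hypothetical name
  url_map = google_compute_url_map.http_redirect.id
}

# Listens on port 80 of the same global address as the HTTPS rule
resource "google_compute_global_forwarding_rule" "http" {
  name       = "coditect-http-lb" # hypothetical name
  target     = google_compute_target_http_proxy.redirect.id
  port_range = "80"
  ip_address = google_compute_global_address.default.address
}
```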
Network Traffic Flows
1. Inbound HTTPS Request (License Acquisition)
1. User (Internet) → HTTPS (443) → Google Cloud Armor
- Rate limit check: 100 req/min per IP
- SQL injection check
- Geo-blocking check
2. Cloud Armor → HTTPS Load Balancer
- SSL termination (TLS 1.3)
- Certificate validation
3. Load Balancer → Backend Service
- Health check (every 10 seconds)
- Load balancing algorithm (round-robin)
4. Backend Service → NGINX Ingress Controller (GKE)
- Target: Ingress controller pods
- Port: 80 (HTTP inside VPC)
5. NGINX Ingress → License API Service (ClusterIP)
- Host: api.coditect.ai
- Path: /api/v1/licenses/acquire
- Target: 10.2.0.50:8000
6. License API Service → License API Pods
- Round-robin to 3+ pods
- Pod IPs: 10.1.0.10, 10.1.0.11, 10.1.0.12
7. License API Pod → Cloud SQL (Private IP)
- Direct VPC peering (no NAT)
- Connection: 10.67.0.3:5432
- SSL: Required
8. License API Pod → Redis (Private IP)
- Direct VPC peering (no NAT)
- Connection: 10.121.42.67:6378
- Auth: Required
Total Latency Breakdown:
Cloud Armor: ~5ms
Load Balancer: ~10ms
NGINX Ingress: ~5ms
License API: ~20ms (application logic)
Cloud SQL: ~5ms (query)
Redis: ~2ms (set TTL)
Total: ~50ms (p95)
2. Outbound API Call (Stripe Payment)
1. License API Pod → Cloud NAT
- Source: Pod IP (10.1.0.10)
- Destination: api.stripe.com (104.16.XX.XX)
- Port: HTTPS (443)
2. Cloud NAT → Internet
- NAT translation: 10.1.0.10 → 34.72.XX.XX (NAT IP)
- Port allocation: 10.1.0.10:12345 → 34.72.XX.XX:54321
- Logging: ALL (for troubleshooting)
3. Stripe API Response → Cloud NAT
- Return path: 34.72.XX.XX:54321 → 10.1.0.10:12345
- NAT reverse translation
4. Cloud NAT → License API Pod
- Delivered to pod (10.1.0.10)
NAT Port Exhaustion Prevention:
Min ports per VM: 64
Dynamic allocation: Up to 65,536 ports per VM
Timeouts:
- TCP established: 1200s (20 minutes)
- TCP transitory: 30s
- Time wait: 120s
Example:
VM with 100 concurrent connections × 20 min avg duration = ~300 connections/hour (each port is reused about 3 times per hour)
Required ports: ~100 concurrent ports
Allocated: 256 ports (dynamic allocation)
Utilization: 39% (healthy)
3. Internal Pod-to-Pod Communication
1. License API Pod A (10.1.0.10) → License API Pod B (10.1.0.11)
- Direct VPC routing (no NAT)
- Latency: <1ms (same zone), <5ms (cross-zone)
2. License API Pod → CoreDNS (10.2.0.10)
- DNS query for service name
- Query: license-api.default.svc.cluster.local
- Response: 10.2.0.50 (ClusterIP)
3. Prometheus (10.2.0.100) → License API Pod (10.1.0.10)
- Metrics scraping (Prometheus pull model: Prometheus initiates the request)
- Endpoint: http://10.1.0.10:8000/metrics
- Frequency: Every 15 seconds
Security Best Practices
1. Least Privilege Firewall Rules
Production Recommendations:
# Restrict SSH to corporate VPN only
resource "google_compute_firewall" "allow_ssh_vpn" {
name = "allow-ssh-corporate-vpn"
network = google_compute_network.vpc.name
priority = 1000
allow {
protocol = "tcp"
ports = ["22"]
}
source_ranges = [
"203.0.113.0/24", # Corporate VPN CIDR
]
target_tags = ["bastion"]
}
# Remove allow-ssh-bastion rule (0.0.0.0/0 access)
2. Private GKE Cluster
Recommended Configuration:
resource "google_container_cluster" "primary" {
name = "coditect-prod"
location = "us-central1"
private_cluster_config {
enable_private_nodes = true # No public IPs on nodes
enable_private_endpoint = true # Private Kubernetes API endpoint
master_ipv4_cidr_block = "172.16.0.0/28"
}
master_authorized_networks_config {
cidr_blocks {
cidr_block = "203.0.113.0/24" # Corporate VPN
display_name = "Corporate VPN"
}
}
}
3. Cloud Armor Rate Limiting
Adaptive Rate Limiting:
# Per-IP rate limiting
rate_limit_threshold {
count = 100
interval_sec = 60 # 100 req/min
}
# Per-session rate limiting (using cookie)
rate_limit_options {
enforce_on_key = "HTTP_COOKIE"
enforce_on_key_name = "session_id"
}
# Ban duration
ban_duration_sec = 600 # 10 minutes
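Assembled into a complete rule, the per-session variant might look like the sketch below. It uses the throttle action, which limits without banning, so ban_duration_sec does not apply. The priority is an assumption, and session_id is the cookie name from the fragment above.

```hcl
# Fragment: place inside resource "google_compute_security_policy" "policy"
rule {
  action   = "throttle"
  priority = 5100 # assumed slot; keep it after the deny rules so they still run
  match {
    versioned_expr = "SRC_IPS_V1"
    config {
      src_ip_ranges = ["*"]
    }
  }
  rate_limit_options {
    conform_action = "allow"
    exceed_action  = "deny(429)"

    # Count requests per session cookie instead of per client IP
    enforce_on_key      = "HTTP_COOKIE"
    enforce_on_key_name = "session_id"

    rate_limit_threshold {
      count        = 100
      interval_sec = 60 # 100 requests per minute per session
    }
  }
  description = "Per-session rate limiting"
}
```

Keying on a cookie avoids penalising many users behind one shared IP, but clients that omit or rotate the cookie need a separate per-IP rule.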
Cost Optimization
Network Egress Costs
Pricing (as of 2025):
Egress to Internet:
- First 1GB/month: Free
- 1GB - 10TB/month: $0.12/GB
- 10TB - 150TB/month: $0.11/GB
- 150TB+/month: $0.08/GB
Egress to GCP services (same region): Free
Egress to GCP services (cross-region): $0.01/GB
Egress via Cloud NAT: Additional $0.045/GB
Optimization Strategies:
- Minimize External API Calls: Cache Stripe responses, batch SendGrid emails
- Use GCP-Native Services: Cloud SQL, Redis in same region (free egress)
- CDN for Static Assets: Cache images, JS, CSS at edge (not applicable for API)
- Compression: gzip responses (reduces bandwidth by ~70%)
Estimated Egress Costs:
License API responses: ~1KB/request × 100K requests/day = 100MB/day = 3GB/month
Stripe API calls: ~500 bytes × 1K/day = 500KB/day = 15MB/month
SendGrid emails: ~10KB × 1K/day = 10MB/day = 300MB/month
Container image pulls: ~500MB/week = 2GB/month
Total egress: ~5GB/month
Cost: ~5GB × $0.12/GB = $0.60/month (negligible, before the 1GB free tier)
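The estimate can be kept as a small Terraform calculation so it is easy to redo when traffic assumptions change. This sketch uses the figures above without rounding, which is why it lands slightly above the rounded numbers in the text; the local names are illustrative.

```hcl
locals {
  # Monthly volumes in GB, taken from the estimate above
  license_api_gb = 3
  stripe_gb      = 0.015
  sendgrid_gb    = 0.3
  image_pulls_gb = 2

  total_gb = sum([
    local.license_api_gb,
    local.stripe_gb,
    local.sendgrid_gb,
    local.image_pulls_gb,
  ]) # about 5.3 GB

  internet_rate_per_gb = 0.12 # 1GB - 10TB tier

  monthly_cost_usd = local.total_gb * local.internet_rate_per_gb # about 0.64
}

output "monthly_egress_cost_usd" {
  value = local.monthly_cost_usd
}
```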
Load Balancer Costs
Pricing:
Forwarding rules: $0.025/hour per rule
Data processing: $0.008/GB processed
Estimated monthly cost:
- 2 forwarding rules (HTTP, HTTPS): $36/month
- 100GB data processed: $0.80/month
- Total: ~$37/month
Disaster Recovery
Multi-Region Failover (Future)
Architecture:
Primary Region: us-central1
DR Region: us-east1
Failover Strategy:
1. Cloud SQL: Cross-region read replica (promoted manually, see Failover Procedure)
2. Redis: Manual snapshot → restore in DR region
3. GKE: Standby cluster in DR region (minimal nodes)
4. Load Balancer: Multi-region backend (automatic failover)
Recovery Time Objective (RTO): 1 hour
Recovery Point Objective (RPO): 15 minutes (PITR)
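Since this failover design is still marked as future work, the following is only a sketch of how the cross-region read replica might be declared. The primary instance name, database version and tier are assumptions; only the replica name comes from the failover procedure.

```hcl
resource "google_sql_database_instance" "dr_replica" {
  name             = "coditect-prod-replica"
  region           = "us-east1"    # DR region
  database_version = "POSTGRES_15" # assumed; must match the primary
  project          = "coditect-citus-prod"

  # Declares this instance as a read replica of the primary
  master_instance_name = "coditect-prod" # hypothetical primary instance name

  settings {
    tier = "db-custom-2-7680" # assumed machine tier

    ip_configuration {
      ipv4_enabled    = false
      private_network = google_compute_network.vpc.id
    }
  }
}
```

Once promoted, the replica becomes a standalone primary and replication from the old primary cannot be resumed.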
Failover Procedure:
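# 1. Promote Cloud SQL read replica (the replica name identifies the instance; the command takes no --region flag)
gcloud sql instances promote-replica coditect-prod-replica
# 2. Restore Redis from snapshot (create the instance, then import the RDB file)
gcloud redis instances create coditect-prod-redis-dr --region=us-east1
gcloud redis instances import gs://coditect-backups/redis-snapshot.rdb \
coditect-prod-redis-dr --region=us-east1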
# 3. Scale up DR GKE cluster
gcloud container clusters resize coditect-prod-dr --num-nodes=10 --region=us-east1
# 4. Update DNS (manual or automated)
gcloud dns record-sets transaction start --zone=coditect-prod
gcloud dns record-sets transaction remove --zone=coditect-prod --name=api.coditect.ai. --type=A --ttl=300 34.72.XX.XX
gcloud dns record-sets transaction add --zone=coditect-prod --name=api.coditect.ai. --type=A --ttl=300 35.190.YY.YY
gcloud dns record-sets transaction execute --zone=coditect-prod
Monitoring & Troubleshooting
VPC Flow Logs
Query Examples (BigQuery, assuming a log sink exports the logs to the `logs` dataset):
-- Top 10 sources by bytes sent
SELECT
jsonPayload.connection.src_ip,
SUM(CAST(jsonPayload.bytes_sent AS INT64)) as total_bytes
FROM `coditect-citus-prod.logs.compute_googleapis_com_vpc_flows`
WHERE timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 HOUR)
GROUP BY src_ip
ORDER BY total_bytes DESC
LIMIT 10
-- Blocked connections (firewall denied)
-- Reads the Firewall Rules Logging table: VPC flow logs carry no disposition
-- field, and a rule only produces log entries when its log_config is enabled
SELECT
jsonPayload.connection.src_ip,
jsonPayload.connection.dest_ip,
jsonPayload.connection.dest_port,
jsonPayload.disposition
FROM `coditect-citus-prod.logs.compute_googleapis_com_firewall`
WHERE jsonPayload.disposition = "DENIED"
AND timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 HOUR)
Cloud NAT Logs
Common Issues:
# Port exhaustion (Cloud NAT logs report allocation_status as OK or DROPPED)
gcloud logging read "resource.type=nat_gateway AND \
jsonPayload.allocation_status=DROPPED" \
--limit=10 --format=json
# Connection timeouts
gcloud logging read "resource.type=nat_gateway AND \
jsonPayload.disposition=TIMEOUT" \
--limit=10 --format=json
Related Diagrams
- C1: System Context - External system view
- C2: Container Diagram - High-level containers
- C3: GKE Components - Kubernetes internals
- C3: Security Components - Security architecture
Document History
| Version | Date | Author | Changes |
|---|---|---|---|
| 1.0 | 2025-11-23 | SDD Architect | Initial networking component diagram |
Document Classification: Internal - Architecture Documentation
Review Cycle: Quarterly
Next Review Date: 2026-02-23