C4 Architecture Diagram - Level 2: Container
CODITECT Context Intelligence Platform
Diagram Level: 2 (Container)
Abstraction: Shows high-level technology containers (applications, databases, services)
Audience: Technical stakeholders (architects, developers, DevOps)
Purpose: Understand the system's runtime components and how they communicate
Container Overview
The Context Intelligence Platform consists of 8 primary containers and can be deployed in two modes (standalone SaaS or CODITECT integration). Containers communicate via HTTPS, internal APIs, and message queues.
C4 Level 2 Diagram (Mermaid) - Standalone Mode
C4 Level 2 Diagram (Mermaid) - CODITECT Integration Mode
Container Descriptions
1. Web Application (Standalone Mode Only)
Technology: React 18 + TypeScript + Vite
Purpose: User interface for conversation management and search
Deployment: Static files served via CDN (CloudFlare)
Key Features:
- Conversation list with infinite scroll
- Hybrid search interface (keyword + semantic + alpha slider)
- Conversation-commit timeline visualization
- Team analytics dashboard
- Real-time updates via WebSocket
Dependencies:
- API Layer (HTTPS/WebSocket)
- CloudFlare CDN (static asset delivery)
Scaling:
- Horizontally scalable (CDN distribution)
- No server-side rendering (pure SPA)
2. API Layer
Standalone: FastAPI 0.104
Technology: FastAPI + Pydantic + Uvicorn (ASGI)
Purpose: RESTful API + WebSocket server
Deployment: Kubernetes (3+ pods behind load balancer)
Endpoints: 40+ REST endpoints
- POST /conversations - Create conversation
- GET /conversations/search?q={query} - Hybrid search
- GET /analytics/team/velocity - Team metrics
- WS /ws/conversations/{id} - Real-time updates
Authentication: JWT (HS256 or RS256)
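HS256 issuing and verification can be sketched with the standard library alone; this is purely illustrative (production code would use a maintained library such as PyJWT, and RS256 additionally requires an asymmetric key pair):

```python
import base64
import hashlib
import hmac
import json
import time

def _b64(data: bytes) -> str:
    # JWT uses unpadded URL-safe base64
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def _b64decode(data: str) -> bytes:
    # Restore padding stripped during encoding
    return base64.urlsafe_b64decode(data + "=" * (-len(data) % 4))

def encode_hs256(payload: dict, secret: str) -> str:
    header = _b64(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = _b64(json.dumps(payload).encode())
    signing_input = f"{header}.{body}".encode()
    sig = hmac.new(secret.encode(), signing_input, hashlib.sha256).digest()
    return f"{header}.{body}.{_b64(sig)}"

def verify_hs256(token: str, secret: str) -> dict:
    header, body, sig = token.split(".")
    signing_input = f"{header}.{body}".encode()
    expected = _b64(hmac.new(secret.encode(), signing_input, hashlib.sha256).digest())
    if not hmac.compare_digest(sig, expected):
        raise ValueError("bad signature")
    claims = json.loads(_b64decode(body))
    if claims.get("exp", float("inf")) < time.time():
        raise ValueError("token expired")
    return claims
```

Note the constant-time comparison (`hmac.compare_digest`) and the expiry check, both of which any real verifier must perform.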
Performance:
- Async request handling (10K concurrent connections)
- Connection pooling (PostgreSQL: 20 connections/pod)
- Response caching (Redis: 60s TTL for list endpoints)
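The 60-second response cache for list endpoints can be sketched with an in-process TTL map standing in for Redis (the key name and `cached_list` helper below are illustrative, not the platform's actual API):

```python
import time
from typing import Any, Callable, Optional

class TTLCache:
    """Minimal in-process stand-in for the Redis response cache."""

    def __init__(self, ttl_seconds: float = 60.0):
        self.ttl = ttl_seconds
        self._store: dict[str, tuple[float, Any]] = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # expired: evict and report a miss
            return None
        return value

    def set(self, key: str, value: Any) -> None:
        self._store[key] = (time.monotonic() + self.ttl, value)

def cached_list(cache: TTLCache, key: str, fetch: Callable[[], Any]) -> Any:
    """Return the cached value, or fetch and cache it on a miss."""
    value = cache.get(key)
    if value is None:
        value = fetch()
        cache.set(key, value)
    return value
```

The same read-through pattern applies with Redis `GET`/`SETEX` in place of the dictionary.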
CODITECT Integration: Django 4.2 + DRF
Technology: Django + Django REST Framework + Gunicorn (WSGI)
Purpose: Integrated Django app within CODITECT platform
Deployment: GCP Cloud Run (auto-scaling 0-100 instances)
Endpoints: Same 40+ REST endpoints via DRF ViewSets
Authentication: Django session middleware (shared with CODITECT)
Performance:
- Sync request handling (100 concurrent requests/instance)
- ORM query optimization (select_related, prefetch_related)
- Django cache framework (Redis backend)
3. Business Logic Layer
Technology: Python 3.11 with clean architecture patterns
Purpose: Core domain logic (search, correlation, analytics)
Structure: Shared between both deployment modes (85% code reuse)
Key Services:
- SearchService - Hybrid search (RRF fusion)
- CorrelationService - Conversation-commit matching
- AnalyticsService - Team productivity insights
- AuthenticationService - JWT or Django session auth
- AuthorizationService - RBAC + feature gating + quotas
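The RBAC-plus-quota gating done by the AuthorizationService can be sketched as follows (the role names, permission strings, and `check` method are illustrative assumptions, not the platform's actual API):

```python
class QuotaExceeded(Exception):
    """Raised when an organization has exhausted its write quota."""

class AuthorizationService:
    """Sketch: a role grants a permission set, and write operations
    additionally consume from a per-organization quota."""

    ROLE_PERMISSIONS = {
        "admin": {"conversations:read", "conversations:write", "analytics:read"},
        "member": {"conversations:read", "conversations:write"},
        "viewer": {"conversations:read"},
    }

    def __init__(self, quotas: dict[str, int]):
        self._quotas = quotas  # org_id -> remaining write quota

    def check(self, role: str, permission: str, org_id: str) -> None:
        # RBAC gate: the role must grant the permission
        if permission not in self.ROLE_PERMISSIONS.get(role, set()):
            raise PermissionError(f"{role} lacks {permission}")
        # Quota gate: writes decrement the organization's allowance
        if permission.endswith(":write"):
            remaining = self._quotas.get(org_id, 0)
            if remaining <= 0:
                raise QuotaExceeded(org_id)
            self._quotas[org_id] = remaining - 1
```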
Design Patterns:
- Repository pattern (abstract data access)
- Service layer (business logic)
- Dependency injection (constructor-based)
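The repository pattern and constructor-based injection listed above can be sketched together (the `Conversation` model and in-memory repository are illustrative stand-ins for the real persistence layer):

```python
import itertools
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Optional

@dataclass
class Conversation:
    id: int
    title: str

class ConversationRepository(ABC):
    """Repository pattern: data access hidden behind an interface."""

    @abstractmethod
    def add(self, title: str) -> Conversation: ...

    @abstractmethod
    def get(self, conversation_id: int) -> Optional[Conversation]: ...

class InMemoryConversationRepository(ConversationRepository):
    """Test double; production would back this with PostgreSQL."""

    def __init__(self):
        self._rows: dict[int, Conversation] = {}
        self._ids = itertools.count(1)

    def add(self, title):
        conv = Conversation(next(self._ids), title)
        self._rows[conv.id] = conv
        return conv

    def get(self, conversation_id):
        return self._rows.get(conversation_id)

class ConversationService:
    """Service layer: business rules, with the repository injected
    through the constructor so tests can swap implementations."""

    def __init__(self, repo: ConversationRepository):
        self._repo = repo

    def create(self, title: str) -> Conversation:
        if not title.strip():
            raise ValueError("title must not be empty")
        return self._repo.add(title)
```

Swapping the in-memory repository for a PostgreSQL-backed one requires no change to the service, which is the point of the abstraction.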
Testing:
- 80%+ unit test coverage
- Integration tests with testcontainers
- Mocked external dependencies
4. Background Workers (Celery)
Technology: Celery 5.3 + Redis (broker + result backend)
Purpose: Asynchronous job processing
Deployment:
- Standalone: Kubernetes (5 worker pods)
- CODITECT: Cloud Run Jobs (on-demand)
Job Types:
| Job | Priority | Frequency | Duration | Retry |
|---|---|---|---|---|
| Generate Embeddings | High | Per message | 500ms | 3x |
| Process GitHub Webhook | High | Per commit | 200ms | 3x |
| Correlate Conversations | Medium | Daily batch | 5 min | 1x |
| Generate Analytics | Low | Hourly | 10 min | 1x |
| Cleanup Old Data | Low | Weekly | 30 min | 1x |
Configuration:
# celery_config.py
CELERY_BROKER_URL = 'redis://redis:6379/0'
CELERY_RESULT_BACKEND = 'redis://redis:6379/1'
CELERY_TASK_ROUTES = {
'generate_embeddings': {'queue': 'high_priority'},
'process_webhook': {'queue': 'high_priority'},
'correlate_conversations': {'queue': 'default'},
'generate_analytics': {'queue': 'low_priority'},
}
CELERY_TASK_TIME_LIMIT = 300 # 5 minutes
CELERY_TASK_SOFT_TIME_LIMIT = 270 # 4.5 minutes
Monitoring:
- Flower dashboard (task monitoring)
- Prometheus metrics (queue length, task latency)
- Dead letter queue for failed tasks
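The retry-then-dead-letter behavior in the table above can be sketched without Celery (in production, Celery's retry options and a dedicated dead-letter queue handle this; `run_with_retry` is an illustrative stand-in):

```python
from typing import Any, Callable

def run_with_retry(task: Callable[[], Any], max_retries: int,
                   dead_letter: list) -> Any:
    """Run a job, retrying up to max_retries times on failure.
    Jobs that exhaust their retries land on a dead-letter list
    for later inspection instead of being silently dropped."""
    last_error = None
    for _attempt in range(1 + max_retries):  # initial try + retries
        try:
            return task()
        except Exception as exc:
            last_error = exc
    dead_letter.append((task, last_error))
    return None
```

A real worker would also apply the time limits shown in the configuration (hard kill at 300s, soft warning at 270s) and back off between attempts.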
5. PostgreSQL Database
Technology: PostgreSQL 15 + TimescaleDB extension
Purpose: Primary relational database for structured data
Deployment:
- Standalone: Managed PostgreSQL (GCP Cloud SQL, 3 replicas)
- CODITECT: Shared database with dedicated schema
Schema:
- 7 core tables (organizations, users, conversations, messages, commits, conversation_commit_links, usage_quotas)
- 12 indexes (optimized for common queries)
- Row-Level Security (RLS) policies for multi-tenancy
- TimescaleDB hypertables for time-series analytics
Size Estimates:
- 1M conversations = ~50GB data
- 50M messages = ~500GB data
- 10M commits = ~25GB data
Backup Strategy:
- Automated daily backups (7-day retention)
- Point-in-time recovery (PITR) enabled
- Cross-region replication for disaster recovery
Connection Pooling:
- PgBouncer (1000 connections → 100 active)
- Application pools: 20 connections per API pod
6. Weaviate Vector Database
Technology: Weaviate 1.23 (cloud-managed)
Purpose: Semantic search with vector embeddings
Deployment:
- Standalone: Dedicated Weaviate Cloud cluster
- CODITECT: Shared cluster with namespace isolation
Schema:
{
"class": "Conversation",
"properties": [
{"name": "organization_id", "dataType": ["text"], "indexFilterable": True},
{"name": "title", "dataType": ["text"]},
{"name": "content", "dataType": ["text"]}, # Full conversation text
{"name": "created_at", "dataType": ["date"]},
{"name": "message_count", "dataType": ["int"]},
],
"vectorizer": "none", # Manual embeddings from OpenAI
"vectorIndexConfig": {
"distance": "cosine",
"efConstruction": 128,
"maxConnections": 64,
},
"multiTenancyConfig": {"enabled": True} # Native multi-tenancy
}
Multi-Tenancy:
- Each organization = separate tenant
- Automatic tenant isolation in queries
- No cross-tenant data leakage
Performance:
- <50ms p95 for semantic search (10K vectors)
- <100ms p95 for hybrid search (100K vectors)
- Scales to 10M+ vectors
Embedding Model:
- OpenAI text-embedding-3-large (3072 → 1536 dimensions)
- Fallback: all-MiniLM-L6-v2 (384 dimensions, local)
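These embeddings are compared with cosine distance, per the `distance: cosine` setting in the vector index configuration above. A minimal reference implementation of that metric:

```python
import math

def cosine_distance(a: list[float], b: list[float]) -> float:
    """Cosine distance as used by the vector index: 1 - cos(a, b).
    0 means identical direction, 1 means orthogonal, 2 means opposite."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)
```

Because cosine distance ignores vector magnitude, embeddings of different lengths of text remain comparable, which is why it is the usual choice for semantic search.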
7. Redis Cache & Queue
Technology: Redis 7 (cluster mode for standalone, shared for CODITECT)
Purpose: Multi-purpose in-memory data store
Deployment:
- Standalone: Managed Redis (GCP Memorystore, 3 replicas)
- CODITECT: Shared Redis with namespace prefixes
Use Cases:
| Use Case | Key Pattern | TTL | Eviction |
|---|---|---|---|
| Session Cache | session:{user_id} | 24 hours | LRU |
| API Response Cache | api:conversations:{org_id}:{page} | 60 seconds | TTL |
| Rate Limiting | ratelimit:{ip}:{endpoint} | 1 minute | TTL |
| Celery Queue | celery:queue:{priority} | N/A | None |
| Celery Results | celery:result:{task_id} | 1 hour | TTL |
Configuration:
maxmemory 4gb
maxmemory-policy allkeys-lru # Evict least recently used
save 900 1 # Snapshot every 15 min if ≥1 key changed
save 300 10 # Snapshot every 5 min if ≥10 keys changed
appendonly yes # Enable AOF for durability
High Availability:
- 3-node cluster (1 primary, 2 replicas)
- Automatic failover (Sentinel)
- Read replicas for scaling
8. Observability Stack (Monitoring)
Technology: Prometheus + Grafana + Loki + Jaeger (standalone only)
Purpose: Metrics, logs, and distributed tracing
Deployment: Kubernetes (Helm charts)
Components:
Prometheus (Metrics)
- Scrapes metrics from API, Celery, PostgreSQL, Redis
- 15-second scrape interval
- 30-day retention
- Alertmanager for notifications
Key Metrics:
# Request metrics
http_requests_total{method, endpoint, status}
http_request_duration_seconds{method, endpoint, quantile}
# Database metrics
postgres_connections_active
postgres_query_duration_seconds{query_type, quantile}
# Celery metrics
celery_tasks_total{task_name, status}
celery_task_duration_seconds{task_name, quantile}
# Business metrics
conversations_created_total{organization_id}
searches_performed_total{search_type}
Grafana (Dashboards)
- 10 pre-built dashboards
- API performance, database health, Celery queue depth
- Real-time alerting (PagerDuty integration)
Loki (Logs)
- Centralized log aggregation
- 7-day retention (compressed)
- Query language: LogQL
Jaeger (Distributed Tracing)
- End-to-end request tracing
- Identifies performance bottlenecks
- 1% sampling rate in production
Communication Patterns
Synchronous Communication (Request-Response)
| From | To | Protocol | Port | Use Case |
|---|---|---|---|---|
| Web App | API Layer | HTTPS | 443 | User actions |
| API Layer | Business Logic | Python function call | N/A | Service invocation |
| Business Logic | PostgreSQL | PostgreSQL Protocol | 5432 | Data queries |
| Business Logic | Weaviate | GraphQL over HTTP | 8080 | Semantic search |
| Business Logic | Redis | Redis Protocol | 6379 | Cache lookup |
Asynchronous Communication (Message Queue)
| From | To | Queue | Message Type |
|---|---|---|---|
| API Layer | Celery Workers | Redis (high_priority) | Generate embeddings |
| GitHub | API Layer | Webhook (POST) | Commit pushed |
| Business Logic | Celery Workers | Redis (default) | Correlate conversations |
Real-Time Communication (WebSocket)
| From | To | Protocol | Use Case |
|---|---|---|---|
| Web App | API Layer | WebSocket (wss://) | Live conversation updates |
| API Layer | Web App | WebSocket (server push) | Notify: new message added |
Data Flow Examples
Example 1: Create Conversation
1. User → Web App: Clicks "Save Conversation"
2. Web App → API Layer: POST /conversations {title, messages}
3. API Layer → Business Logic: conversation_service.create()
4. Business Logic → PostgreSQL: INSERT INTO conversations (...)
5. PostgreSQL → Business Logic: Returns conversation ID
6. Business Logic → Celery: Queue "generate_embeddings" job
7. Business Logic → API Layer: Returns conversation object
8. API Layer → Web App: 201 Created {id, title, created_at}
9. Celery Worker → OpenAI API: POST /embeddings {text}
10. OpenAI API → Celery Worker: Returns vector [1536 dimensions]
11. Celery Worker → Weaviate: PUT /Conversation/{id} {vector}
12. Celery Worker → PostgreSQL: UPDATE conversations SET embedding_status = 'completed'
Total Latency: ~150ms (user sees confirmation before embedding completes)
Example 2: Hybrid Search
1. User → Web App: Enters "authentication bug" in search
2. Web App → API Layer: GET /conversations/search?q=authentication+bug&alpha=0.5
3. API Layer → Redis: Check cache key "search:org123:authentication+bug:0.5"
4. Redis → API Layer: MISS (no cache)
5. API Layer → Business Logic: search_service.hybrid_search()
6. Business Logic → [PARALLEL]:
a. PostgreSQL: Keyword search (full-text index)
b. Weaviate: Semantic search (vector similarity)
7. Business Logic: RRF fusion (merge results)
8. Business Logic → API Layer: Returns fused results [10 conversations]
9. API Layer → Redis: Cache results (TTL: 60s)
10. API Layer → Web App: 200 OK {items: [...], total: 45}
Total Latency: ~80ms (p50), ~120ms (p95)
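Step 7's RRF fusion can be sketched as follows. This is the unweighted form with the conventional constant k=60; the alpha slider would weight the keyword and semantic contributions before merging:

```python
def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Reciprocal Rank Fusion: each input ranking contributes
    1 / (k + rank) per document; documents are merged and
    re-ordered by their total score."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

RRF needs only ranks, not scores, so it merges the PostgreSQL full-text results and the Weaviate similarity results without any score normalization between the two systems.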
Example 3: GitHub Webhook Processing
1. Developer → GitHub: git push origin main
2. GitHub → API Layer: POST /webhooks/github {commits: [...]}
3. API Layer: Verify HMAC signature
4. API Layer → Celery: Queue "process_webhook" job
5. API Layer → GitHub: 200 OK (acknowledge receipt)
6. Celery Worker → Business Logic: webhook_service.process_github_push()
7. Business Logic → PostgreSQL: INSERT INTO commits (...)
8. Business Logic → correlation_service: find_related_conversations()
9. correlation_service → PostgreSQL: Query conversations by timestamp
10. correlation_service → Weaviate: Semantic similarity (commit message vs conversations)
11. correlation_service → PostgreSQL: INSERT INTO conversation_commit_links
12. Business Logic → WebSocket: Notify connected users (new commit linked)
Total Latency: ~500ms (async, user not blocked)
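Step 3's signature check follows GitHub's documented scheme: the `X-Hub-Signature-256` header carries `sha256=` plus an HMAC-SHA256 hex digest of the raw request body, compared in constant time:

```python
import hashlib
import hmac

def verify_github_signature(secret: bytes, payload: bytes,
                            signature_header: str) -> bool:
    """Validate GitHub's X-Hub-Signature-256 header against the
    raw request body using the shared webhook secret."""
    expected = "sha256=" + hmac.new(secret, payload, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking the match position via timing
    return hmac.compare_digest(expected, signature_header)
```

The comparison must run over the raw bytes as received, before any JSON parsing, or re-serialization differences will break verification.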
Container Deployment Specifications
Standalone Mode (Kubernetes)
# kubernetes/deployments.yaml
# API Layer
apiVersion: apps/v1
kind: Deployment
metadata:
name: context-api
spec:
replicas: 3
selector:
matchLabels:
app: context-api
template:
metadata:
labels:
app: context-api
spec:
containers:
- name: api
image: gcr.io/coditect/context-api:latest
ports:
- containerPort: 8000
resources:
requests:
memory: "512Mi"
cpu: "500m"
limits:
memory: "1Gi"
cpu: "1000m"
env:
- name: DATABASE_URL
valueFrom:
secretKeyRef:
name: postgres-credentials
key: url
livenessProbe:
httpGet:
path: /health
port: 8000
initialDelaySeconds: 10
periodSeconds: 30
readinessProbe:
httpGet:
path: /ready
port: 8000
initialDelaySeconds: 5
periodSeconds: 10
# Celery Workers
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: context-celery
spec:
replicas: 5
selector:
matchLabels:
app: context-celery
template:
metadata:
labels:
app: context-celery
spec:
containers:
- name: celery
image: gcr.io/coditect/context-celery:latest
command: ["celery", "-A", "core.celery_app", "worker", "--loglevel=info"]
resources:
requests:
memory: "256Mi"
cpu: "250m"
limits:
memory: "512Mi"
cpu: "500m"
CODITECT Integration Mode (Cloud Run)
# cloudrun/service.yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
name: context-intelligence
annotations:
run.googleapis.com/ingress: internal # Internal only
spec:
template:
metadata:
annotations:
autoscaling.knative.dev/minScale: "1"
autoscaling.knative.dev/maxScale: "100"
spec:
containers:
- image: gcr.io/coditect/context-django:latest
ports:
- containerPort: 8080
resources:
limits:
memory: 512Mi
cpu: 1000m
env:
- name: DJANGO_SETTINGS_MODULE
value: "core.settings.coditect"
- name: DATABASE_URL
valueFrom:
secretKeyRef:
name: postgres-credentials
key: url
Container Resource Requirements
| Container | CPU (Request) | CPU (Limit) | Memory (Request) | Memory (Limit) | Replicas/Pods |
|---|---|---|---|---|---|
| API Layer | 500m | 1000m | 512Mi | 1Gi | 3-10 (HPA) |
| Business Logic | N/A | N/A | N/A | N/A | Embedded in API |
| Celery Workers | 250m | 500m | 256Mi | 512Mi | 5-20 (HPA) |
| PostgreSQL | 2000m | 4000m | 4Gi | 8Gi | 1 primary + 2 replicas |
| Weaviate | 1000m | 2000m | 2Gi | 4Gi | 1 (managed) |
| Redis | 500m | 1000m | 1Gi | 2Gi | 1 primary + 2 replicas |
| Monitoring | 1000m | 2000m | 2Gi | 4Gi | 1 (Prometheus + Grafana) |
Total Minimum: ~7 CPUs, ~14GB RAM
Total Production: ~20 CPUs, ~35GB RAM
Disaster Recovery & High Availability
Database Backups
| Database | Backup Frequency | Retention | Recovery Time |
|---|---|---|---|
| PostgreSQL | Daily (automated) | 30 days | <1 hour (PITR) |
| Weaviate | Weekly (snapshot) | 4 weeks | <2 hours |
| Redis | Hourly (AOF) | 7 days | <15 minutes |
Failover Strategies
API Layer:
- Load balancer health checks (every 10s)
- Automatic pod replacement (K8s ReplicaSets)
- Zero-downtime deployments (rolling updates)
Databases:
- PostgreSQL: Automatic failover to replica (<30s)
- Redis: Sentinel-managed failover (<10s)
- Weaviate: Cloud-managed HA (99.9% SLA)
Next Level: Component Diagram (C4 Level 3)
The Container diagram shows how the system is deployed (technology choices, containers, databases). The next level (Component Diagram) will show internal components within each container:
- API Layer components (endpoints, middleware, authentication)
- Business Logic components (services, repositories, domain models)
- Background worker components (task handlers, schedulers)
See: c4-l3-component.md
Diagram Maintained By: Architecture Team
Last Updated: 2025-11-26
Review Cycle: Quarterly
Related Documents: