
C4 Architecture Diagram - Level 2: Container

CODITECT Context Intelligence Platform

Diagram Level: 2 (Container)
Abstraction: Shows high-level technology containers (applications, databases, services)
Audience: Technical stakeholders (architects, developers, DevOps)
Purpose: Understand the system's runtime components and how they communicate


Container Overview

The Context Intelligence Platform consists of 8 primary containers deployed across two modes (standalone SaaS vs. CODITECT integration). Containers communicate via HTTPS, internal APIs, and message queues.


C4 Level 2 Diagram (Mermaid) - Standalone Mode


C4 Level 2 Diagram (Mermaid) - CODITECT Integration Mode


Container Descriptions

1. Web Application (Standalone Mode Only)

Technology: React 18 + TypeScript + Vite
Purpose: User interface for conversation management and search
Deployment: Static files served via CDN (CloudFlare)

Key Features:

  • Conversation list with infinite scroll
  • Hybrid search interface (keyword + semantic + alpha slider)
  • Conversation-commit timeline visualization
  • Team analytics dashboard
  • Real-time updates via WebSocket

Dependencies:

  • API Layer (HTTPS/WebSocket)
  • CloudFlare CDN (static asset delivery)

Scaling:

  • Horizontally scalable (CDN distribution)
  • No server-side rendering (pure SPA)

2. API Layer

Standalone: FastAPI 0.104

Technology: FastAPI + Pydantic + Uvicorn (ASGI)
Purpose: RESTful API + WebSocket server
Deployment: Kubernetes (3+ pods behind load balancer)

Endpoints: 40+ REST endpoints

  • POST /conversations - Create conversation
  • GET /conversations/search?q={query} - Hybrid search
  • GET /analytics/team/velocity - Team metrics
  • WS /ws/conversations/{id} - Real-time updates

Authentication: JWT (HS256 or RS256)
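HS256 verification can be sketched with the standard library alone. This is a hypothetical helper, not the platform's actual code; a production service would use a JWT library and also validate `exp`, `iss`, and audience claims:

```python
import base64
import hashlib
import hmac
import json

def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()

def _b64url_decode(segment: str) -> bytes:
    # Restore the padding that JWT encoding strips.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))

def sign_hs256(claims: dict, secret: bytes) -> str:
    """Build a compact HS256 JWT: header.payload.signature."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}, separators=(",", ":")).encode())
    payload = _b64url_encode(json.dumps(claims, separators=(",", ":")).encode())
    sig = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url_encode(sig)}"

def verify_hs256(token: str, secret: bytes) -> dict:
    """Check the HMAC-SHA256 signature and return the payload claims."""
    header, payload, sig = token.split(".")
    expected = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    if not hmac.compare_digest(expected, _b64url_decode(sig)):
        raise ValueError("invalid signature")
    return json.loads(_b64url_decode(payload))
```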

Performance:

  • Async request handling (10K concurrent connections)
  • Connection pooling (PostgreSQL: 20 connections/pod)
  • Response caching (Redis: 60s TTL for list endpoints)
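The 60-second response cache can be illustrated with an in-memory stand-in for Redis SETEX/GET. The class and key names here are hypothetical; the real deployment caches in Redis as described in the Redis section below:

```python
import time

class TTLCache:
    """Minimal TTL cache, standing in for Redis SETEX/GET."""

    def __init__(self):
        self._store = {}  # key -> (expires_at, value)

    def set(self, key, value, ttl_seconds=60):
        self._store[key] = (time.monotonic() + ttl_seconds, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # expired: evict and report a miss
            return None
        return value

cache = TTLCache()
cache.set("api:conversations:org123:1", ["conv-1", "conv-2"], ttl_seconds=60)
```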

CODITECT Integration: Django 4.2 + DRF

Technology: Django + Django REST Framework + Gunicorn (WSGI)
Purpose: Integrated Django app within CODITECT platform
Deployment: GCP Cloud Run (auto-scaling 0-100 instances)

Endpoints: Same 40+ REST endpoints via DRF ViewSets

Authentication: Django session middleware (shared with CODITECT)

Performance:

  • Sync request handling (100 concurrent requests/instance)
  • ORM query optimization (select_related, prefetch_related)
  • Django cache framework (Redis backend)

3. Business Logic Layer

Technology: Python 3.11 with clean architecture patterns
Purpose: Core domain logic (search, correlation, analytics)
Structure: Shared between both deployment modes (85% code reuse)

Key Services:

  • SearchService - Hybrid search (RRF fusion)
  • CorrelationService - Conversation-commit matching
  • AnalyticsService - Team productivity insights
  • AuthenticationService - JWT or Django session auth
  • AuthorizationService - RBAC + feature gating + quotas

Design Patterns:

  • Repository pattern (abstract data access)
  • Service layer (business logic)
  • Dependency injection (constructor-based)
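A minimal sketch of how the three patterns compose (all class names are hypothetical, not the platform's actual code):

```python
from typing import Optional, Protocol

class ConversationRepository(Protocol):
    """Repository pattern: abstract data access behind an interface."""
    def save(self, conversation: dict) -> str: ...
    def get(self, conversation_id: str) -> Optional[dict]: ...

class InMemoryConversationRepository:
    """Test double; production would back this with PostgreSQL."""

    def __init__(self):
        self._rows = {}

    def save(self, conversation: dict) -> str:
        conversation_id = f"conv-{len(self._rows) + 1}"
        self._rows[conversation_id] = conversation
        return conversation_id

    def get(self, conversation_id: str) -> Optional[dict]:
        return self._rows.get(conversation_id)

class ConversationService:
    """Service layer: business logic with a constructor-injected repository."""

    def __init__(self, repository: ConversationRepository):
        self._repository = repository

    def create(self, title: str, messages: list) -> str:
        if not title:
            raise ValueError("title is required")
        return self._repository.save({"title": title, "messages": messages})
```

Swapping the in-memory repository for a PostgreSQL-backed one changes nothing in the service, which is what makes the mocked-dependency testing below cheap.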

Testing:

  • 80%+ unit test coverage
  • Integration tests with testcontainers
  • Mocked external dependencies

4. Background Workers (Celery)

Technology: Celery 5.3 + Redis (broker + result backend)
Purpose: Asynchronous job processing
Deployment:

  • Standalone: Kubernetes (5 worker pods)
  • CODITECT: Cloud Run Jobs (on-demand)

Job Types:

| Job | Priority | Frequency | Duration | Retry |
|---|---|---|---|---|
| Generate Embeddings | High | Per message | 500ms | 3x |
| Process GitHub Webhook | High | Per commit | 200ms | 3x |
| Correlate Conversations | Medium | Daily batch | 5 min | 1x |
| Generate Analytics | Low | Hourly | 10 min | 1x |
| Cleanup Old Data | Low | Weekly | 30 min | 1x |

Configuration:

```python
# celery_config.py
CELERY_BROKER_URL = 'redis://redis:6379/0'
CELERY_RESULT_BACKEND = 'redis://redis:6379/1'
CELERY_TASK_ROUTES = {
    'generate_embeddings': {'queue': 'high_priority'},
    'process_webhook': {'queue': 'high_priority'},
    'correlate_conversations': {'queue': 'default'},
    'generate_analytics': {'queue': 'low_priority'},
}
CELERY_TASK_TIME_LIMIT = 300  # 5 minutes
CELERY_TASK_SOFT_TIME_LIMIT = 270  # 4.5 minutes
```

Monitoring:

  • Flower dashboard (task monitoring)
  • Prometheus metrics (queue length, task latency)
  • Dead letter queue for failed tasks
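The retry counts in the job table and the dead letter queue for exhausted tasks can be sketched as follows. This is a simplified synchronous stand-in for Celery's retry mechanics, with hypothetical names:

```python
def run_with_retries(task, payload, max_retries=3, dead_letter=None):
    """Run a task, retrying up to max_retries attempts total.

    On final failure the payload is routed to the dead letter queue
    instead of being silently dropped, so operators can inspect it.
    """
    if dead_letter is None:
        dead_letter = []
    for attempt in range(1, max_retries + 1):
        try:
            return task(payload)
        except Exception as exc:
            if attempt == max_retries:
                dead_letter.append({"payload": payload, "error": str(exc)})
                return None
```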

5. PostgreSQL Database

Technology: PostgreSQL 15 + TimescaleDB extension
Purpose: Primary relational database for structured data
Deployment:

  • Standalone: Managed PostgreSQL (GCP Cloud SQL, 3 replicas)
  • CODITECT: Shared database with dedicated schema

Schema:

  • 7 core tables (organizations, users, conversations, messages, commits, conversation_commit_links, usage_quotas)
  • 12 indexes (optimized for common queries)
  • Row-Level Security (RLS) policies for multi-tenancy
  • TimescaleDB hypertables for time-series analytics

Size Estimates:

  • 1M conversations = ~50GB data
  • 50M messages = ~500GB data
  • 10M commits = ~25GB data

Backup Strategy:

  • Automated daily backups (7-day retention)
  • Point-in-time recovery (PITR) enabled
  • Cross-region replication for disaster recovery

Connection Pooling:

  • PgBouncer (1000 connections → 100 active)
  • Application pools: 20 connections per API pod
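The pooling behavior can be sketched with a blocking queue. This is an illustrative stand-in only; the real layers are PgBouncer plus the database driver's built-in pool:

```python
import queue

class ConnectionPool:
    """Tiny blocking pool, mirroring the 20-connections-per-pod setup."""

    def __init__(self, factory, size=20):
        self._pool = queue.Queue(maxsize=size)
        for _ in range(size):
            self._pool.put(factory())  # pre-open all connections

    def acquire(self, timeout=5.0):
        # Blocks (up to timeout) when all connections are checked out,
        # which is how the pool caps concurrent database load.
        return self._pool.get(timeout=timeout)

    def release(self, conn):
        self._pool.put(conn)

pool = ConnectionPool(factory=object, size=20)
conn = pool.acquire()
pool.release(conn)
```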

6. Weaviate Vector Database

Technology: Weaviate 1.23 (cloud-managed)
Purpose: Semantic search with vector embeddings
Deployment:

  • Standalone: Dedicated Weaviate Cloud cluster
  • CODITECT: Shared cluster with namespace isolation

Schema:

```python
{
    "class": "Conversation",
    "properties": [
        {"name": "organization_id", "dataType": ["text"], "indexFilterable": True},
        {"name": "title", "dataType": ["text"]},
        {"name": "content", "dataType": ["text"]},  # Full conversation text
        {"name": "created_at", "dataType": ["date"]},
        {"name": "message_count", "dataType": ["int"]},
    ],
    "vectorizer": "none",  # Manual embeddings from OpenAI
    "vectorIndexConfig": {
        "distance": "cosine",
        "efConstruction": 128,
        "maxConnections": 64,
    },
    "multiTenancyConfig": {"enabled": True},  # Native multi-tenancy
}
```

Multi-Tenancy:

  • Each organization = separate tenant
  • Automatic tenant isolation in queries
  • No cross-tenant data leakage

Performance:

  • <50ms p95 for semantic search (10K vectors)
  • <100ms p95 for hybrid search (100K vectors)
  • Scales to 10M+ vectors

Embedding Model:

  • OpenAI text-embedding-3-large (3072 → 1536 dimensions)
  • Fallback: all-MiniLM-L6-v2 (384 dimensions, local)
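The cosine distance configured in the vector index above is 1 minus cosine similarity; for intuition it can be computed directly (illustrative only — Weaviate evaluates this internally over the HNSW index):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def cosine_distance(a, b):
    # Weaviate's "cosine" distance metric: 0 = identical direction, 2 = opposite.
    return 1.0 - cosine_similarity(a, b)
```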

7. Redis Cache & Queue

Technology: Redis 7 (cluster mode for standalone, shared for CODITECT)
Purpose: Multi-purpose in-memory data store
Deployment:

  • Standalone: Managed Redis (GCP Memorystore, 3 replicas)
  • CODITECT: Shared Redis with namespace prefixes

Use Cases:

| Use Case | Key Pattern | TTL | Eviction |
|---|---|---|---|
| Session Cache | session:{user_id} | 24 hours | LRU |
| API Response Cache | api:conversations:{org_id}:{page} | 60 seconds | TTL |
| Rate Limiting | ratelimit:{ip}:{endpoint} | 1 minute | TTL |
| Celery Queue | celery:queue:{priority} | N/A | None |
| Celery Results | celery:result:{task_id} | 1 hour | TTL |
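The rate-limiting row can be illustrated with a fixed-window counter keyed the same way. This is an in-memory stand-in for Redis INCR/EXPIRE, and the limit value is a hypothetical example:

```python
import time

class FixedWindowRateLimiter:
    """Fixed 1-minute window per ratelimit:{ip}:{endpoint} key."""

    def __init__(self, limit=100, window_seconds=60):
        self.limit = limit
        self.window_seconds = window_seconds
        self._counters = {}  # key -> (window_start, count)

    def allow(self, ip, endpoint):
        key = f"ratelimit:{ip}:{endpoint}"
        now = time.monotonic()
        window_start, count = self._counters.get(key, (now, 0))
        if now - window_start >= self.window_seconds:
            window_start, count = now, 0  # window expired: reset the counter
        if count >= self.limit:
            return False  # over budget for this window
        self._counters[key] = (window_start, count + 1)
        return True
```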

Configuration:

```
maxmemory 4gb
maxmemory-policy allkeys-lru  # Evict least recently used
save 900 1                    # Snapshot every 15 min if ≥1 key changed
save 300 10                   # Snapshot every 5 min if ≥10 keys changed
appendonly yes                # Enable AOF for durability
```

High Availability:

  • 3-node cluster (1 primary, 2 replicas)
  • Automatic failover (Sentinel)
  • Read replicas for scaling

8. Observability Stack (Monitoring)

Technology: Prometheus + Grafana + Loki + Jaeger (standalone only)
Purpose: Metrics, logs, and distributed tracing
Deployment: Kubernetes (Helm charts)

Components:

Prometheus (Metrics)

  • Scrapes metrics from API, Celery, PostgreSQL, Redis
  • 15-second scrape interval
  • 30-day retention
  • Alertmanager for notifications

Key Metrics:

```
# Request metrics
http_requests_total{method, endpoint, status}
http_request_duration_seconds{method, endpoint, quantile}

# Database metrics
postgres_connections_active
postgres_query_duration_seconds{query_type, quantile}

# Celery metrics
celery_tasks_total{task_name, status}
celery_task_duration_seconds{task_name, quantile}

# Business metrics
conversations_created_total{organization_id}
searches_performed_total{search_type}
```

Grafana (Dashboards)

  • 10 pre-built dashboards
  • API performance, database health, Celery queue depth
  • Real-time alerting (PagerDuty integration)

Loki (Logs)

  • Centralized log aggregation
  • 7-day retention (compressed)
  • Query language: LogQL

Jaeger (Distributed Tracing)

  • End-to-end request tracing
  • Identifies performance bottlenecks
  • 1% sampling rate in production

Communication Patterns

Synchronous Communication (Request-Response)

| From | To | Protocol | Port | Use Case |
|---|---|---|---|---|
| Web App | API Layer | HTTPS | 443 | User actions |
| API Layer | Business Logic | Python function call | N/A | Service invocation |
| Business Logic | PostgreSQL | PostgreSQL Protocol | 5432 | Data queries |
| Business Logic | Weaviate | GraphQL over HTTP | 8080 | Semantic search |
| Business Logic | Redis | Redis Protocol | 6379 | Cache lookup |

Asynchronous Communication (Message Queue)

| From | To | Queue | Message Type |
|---|---|---|---|
| API Layer | Celery Workers | Redis (high_priority) | Generate embeddings |
| GitHub | API Layer | Webhook (POST) | Commit pushed |
| Business Logic | Celery Workers | Redis (default) | Correlate conversations |

Real-Time Communication (WebSocket)

| From | To | Protocol | Use Case |
|---|---|---|---|
| Web App | API Layer | WebSocket (wss://) | Live conversation updates |
| API Layer | Web App | WebSocket (server push) | Notify: new message added |

Data Flow Examples

Example 1: Create Conversation

1. User → Web App: Clicks "Save Conversation"
2. Web App → API Layer: POST /conversations {title, messages}
3. API Layer → Business Logic: conversation_service.create()
4. Business Logic → PostgreSQL: INSERT INTO conversations (...)
5. PostgreSQL → Business Logic: Returns conversation ID
6. Business Logic → Celery: Queue "generate_embeddings" job
7. Business Logic → API Layer: Returns conversation object
8. API Layer → Web App: 201 Created {id, title, created_at}
9. Celery Worker → OpenAI API: POST /embeddings {text}
10. OpenAI API → Celery Worker: Returns vector [1536 dimensions]
11. Celery Worker → Weaviate: PUT /Conversation/{id} {vector}
12. Celery Worker → PostgreSQL: UPDATE conversations SET embedding_status = 'completed'

Total Latency: ~150ms (user sees confirmation before embedding completes)
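The split between the synchronous path (steps 1-8) and the asynchronous embedding path (steps 9-12) can be sketched with in-memory stand-ins for PostgreSQL and the Celery queue (all names are hypothetical):

```python
from collections import deque

jobs = deque()        # stand-in for the Redis-backed Celery queue
conversations = {}    # stand-in for the PostgreSQL conversations table

def create_conversation(title, messages):
    """Synchronous path: persist, enqueue the embedding job, return immediately."""
    conversation_id = f"conv-{len(conversations) + 1}"
    conversations[conversation_id] = {
        "title": title,
        "messages": messages,
        "embedding_status": "pending",
    }
    jobs.append(("generate_embeddings", conversation_id))
    return conversation_id  # the user sees 201 Created before embeddings exist

def drain_jobs():
    """Worker loop: compute embeddings (elided here) and mark completion."""
    while jobs:
        _task, conversation_id = jobs.popleft()
        conversations[conversation_id]["embedding_status"] = "completed"
```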

Example 2: Hybrid Search

1. User → Web App: Enters "authentication bug" in search
2. Web App → API Layer: GET /conversations/search?q=authentication+bug&alpha=0.5
3. API Layer → Redis: Check cache key "search:org123:authentication+bug:0.5"
4. Redis → API Layer: MISS (no cache)
5. API Layer → Business Logic: search_service.hybrid_search()
6. Business Logic → [PARALLEL]:
   a. PostgreSQL: Keyword search (full-text index)
   b. Weaviate: Semantic search (vector similarity)
7. Business Logic: RRF fusion (merge results)
8. Business Logic → API Layer: Returns fused results [10 conversations]
9. API Layer → Redis: Cache results (TTL: 60s)
10. API Layer → Web App: 200 OK {items: [...], total: 45}

Total Latency: ~80ms (p50), ~120ms (p95)
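The RRF fusion in step 7 can be sketched with the standard reciprocal-rank formula, score(d) = Σ 1/(k + rank), with the conventional k = 60. This is a minimal version; how the request's alpha parameter weights the two result lists is omitted here:

```python
def rrf_fuse(keyword_ranked, semantic_ranked, k=60):
    """Merge two ranked ID lists with Reciprocal Rank Fusion.

    Each list contributes 1 / (k + rank) per document; documents that
    rank well in both lists accumulate the highest fused score.
    """
    scores = {}
    for ranked in (keyword_ranked, semantic_ranked):
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```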

Example 3: GitHub Webhook Processing

1. Developer → GitHub: git push origin main
2. GitHub → API Layer: POST /webhooks/github {commits: [...]}
3. API Layer: Verify HMAC signature
4. API Layer → Celery: Queue "process_webhook" job
5. API Layer → GitHub: 200 OK (acknowledge receipt)
6. Celery Worker → Business Logic: webhook_service.process_github_push()
7. Business Logic → PostgreSQL: INSERT INTO commits (...)
8. Business Logic → correlation_service: find_related_conversations()
9. correlation_service → PostgreSQL: Query conversations by timestamp
10. correlation_service → Weaviate: Semantic similarity (commit message vs conversations)
11. correlation_service → PostgreSQL: INSERT INTO conversation_commit_links
12. Business Logic → WebSocket: Notify connected users (new commit linked)

Total Latency: ~500ms (async, user not blocked)
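Step 3's signature check follows GitHub's X-Hub-Signature-256 scheme: HMAC-SHA256 of the raw request body, hex-encoded with a "sha256=" prefix. A minimal sketch (the function name is hypothetical):

```python
import hashlib
import hmac

def verify_github_signature(secret, body, signature_header):
    """Constant-time check of the X-Hub-Signature-256 webhook header."""
    expected = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking where the strings first differ.
    return hmac.compare_digest(expected, signature_header)
```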


Container Deployment Specifications

Standalone Mode (Kubernetes)

```yaml
# kubernetes/deployments.yaml

# API Layer
apiVersion: apps/v1
kind: Deployment
metadata:
  name: context-api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: context-api
  template:
    metadata:
      labels:
        app: context-api
    spec:
      containers:
        - name: api
          image: gcr.io/coditect/context-api:latest
          ports:
            - containerPort: 8000
          resources:
            requests:
              memory: "512Mi"
              cpu: "500m"
            limits:
              memory: "1Gi"
              cpu: "1000m"
          env:
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: postgres-credentials
                  key: url
          livenessProbe:
            httpGet:
              path: /health
              port: 8000
            initialDelaySeconds: 10
            periodSeconds: 30
          readinessProbe:
            httpGet:
              path: /ready
              port: 8000
            initialDelaySeconds: 5
            periodSeconds: 10

# Celery Workers
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: context-celery
spec:
  replicas: 5
  selector:
    matchLabels:
      app: context-celery
  template:
    metadata:
      labels:
        app: context-celery
    spec:
      containers:
        - name: celery
          image: gcr.io/coditect/context-celery:latest
          command: ["celery", "-A", "core.celery_app", "worker", "--loglevel=info"]
          resources:
            requests:
              memory: "256Mi"
              cpu: "250m"
            limits:
              memory: "512Mi"
              cpu: "500m"
```

CODITECT Integration Mode (Cloud Run)

```yaml
# cloudrun/service.yaml

apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: context-intelligence
  annotations:
    run.googleapis.com/ingress: internal  # Internal only
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/minScale: "1"
        autoscaling.knative.dev/maxScale: "100"
    spec:
      containers:
        - image: gcr.io/coditect/context-django:latest
          ports:
            - containerPort: 8080
          resources:
            limits:
              memory: 512Mi
              cpu: 1000m
          env:
            - name: DJANGO_SETTINGS_MODULE
              value: "core.settings.coditect"
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: postgres-credentials
                  key: url
```

Container Resource Requirements

| Container | CPU (Request) | CPU (Limit) | Memory (Request) | Memory (Limit) | Replicas/Pods |
|---|---|---|---|---|---|
| API Layer | 500m | 1000m | 512Mi | 1Gi | 3-10 (HPA) |
| Business Logic | N/A | N/A | N/A | N/A | Embedded in API |
| Celery Workers | 250m | 500m | 256Mi | 512Mi | 5-20 (HPA) |
| PostgreSQL | 2000m | 4000m | 4Gi | 8Gi | 1 primary + 2 replicas |
| Weaviate | 1000m | 2000m | 2Gi | 4Gi | 1 (managed) |
| Redis | 500m | 1000m | 1Gi | 2Gi | 1 primary + 2 replicas |
| Monitoring | 1000m | 2000m | 2Gi | 4Gi | 1 (Prometheus + Grafana) |

Total Minimum: ~7 CPUs, ~14GB RAM
Total Production: ~20 CPUs, ~35GB RAM


Disaster Recovery & High Availability

Database Backups

| Database | Backup Frequency | Retention | Recovery Time |
|---|---|---|---|
| PostgreSQL | Daily (automated) | 30 days | <1 hour (PITR) |
| Weaviate | Weekly (snapshot) | 4 weeks | <2 hours |
| Redis | Hourly (AOF) | 7 days | <15 minutes |

Failover Strategies

API Layer:

  • Load balancer health checks (every 10s)
  • Automatic pod replacement (K8s ReplicaSets)
  • Zero-downtime deployments (rolling updates)

Databases:

  • PostgreSQL: Automatic failover to replica (<30s)
  • Redis: Sentinel-managed failover (<10s)
  • Weaviate: Cloud-managed HA (99.9% SLA)

Next Level: Component Diagram (C4 Level 3)

The Container diagram shows how the system is deployed (technology choices, containers, databases). The next level (Component Diagram) will show internal components within each container:

  • API Layer components (endpoints, middleware, authentication)
  • Business Logic components (services, repositories, domain models)
  • Background worker components (task handlers, schedulers)

See: c4-l3-component.md


Diagram Maintained By: Architecture Team
Last Updated: 2025-11-26
Review Cycle: Quarterly
Related Documents: