C4 Architecture Diagram - Level 1: System Context
CODITECT Context Intelligence Platform
Diagram Level: 1 (System Context)
Abstraction: Highest level - shows the system in its environment
Audience: All stakeholders (technical and non-technical)
Purpose: Understand who uses the system and what external systems it integrates with
System Context Overview
The Context Intelligence Platform sits at the center of a developer's workflow, connecting AI coding assistants, version control systems, and team collaboration tools. It serves three user personas (primary, secondary, and tertiary) and integrates with six external systems.
C4 Level 1 Diagram (Mermaid)
System Context Description
Central System
Context Intelligence Platform (this system):
- Purpose: Store, search, and analyze AI coding assistant conversations with git commit correlation
- Technology: Python-based web application (FastAPI standalone / Django integrated)
- Deployment: Kubernetes (standalone) or GCP Cloud Run (CODITECT integration)
- Scale: Supports 10,000+ organizations, 100,000+ users, 50M+ messages
User Personas
1. Software Developer (Primary User)
Role: Individual contributor writing code with AI assistance
Use Cases:
- Save AI conversations from Claude Code, GitHub Copilot, Cursor
- Search past conversations by keyword or semantic meaning
- View which conversations led to specific commits
- Export conversation history for compliance
Access Pattern:
- Daily usage (10-50 conversations/day)
- Search 5-10 times/day
- Browser + IDE integrations
Tier: Starter (free), Pro ($15/month), Enterprise (custom)
2. Engineering Manager (Secondary User)
Role: Team lead managing 5-20 developers
Use Cases:
- Review team conversation patterns
- Analyze productivity metrics (conversations → commits ratio)
- Identify knowledge gaps (frequent questions on same topics)
- Generate weekly team reports
Access Pattern:
- Weekly reviews (1-2 hours)
- Dashboard analytics
- Export team reports
Tier: Pro or Enterprise
3. CTO / VP Engineering (Tertiary User)
Role: Executive overseeing multiple teams (20-500+ developers)
Use Cases:
- Org-wide productivity analysis
- Compliance and audit reports (SOC 2, GDPR)
- ROI analysis on AI tooling
- Strategic planning (AI adoption trends)
Access Pattern:
- Monthly reviews
- Executive dashboards
- Compliance exports
Tier: Enterprise only
External Systems
1. GitHub (Code Repository)
Integration Type: Webhooks + REST API
Data Flow:
- Inbound: Push event webhooks when commits are made
- Outbound: Fetch repository metadata, commit details, diffs
Use Cases:
- Automatically link commits to conversations based on timing/content
- Display commit history alongside conversations
- Generate contribution graphs
Configuration:
- Webhook URL: https://api.context-intelligence.com/webhooks/github
- Authentication: GitHub App installation token
- Permissions: Read repository content, webhooks
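The timing-based linking of commits to conversations mentioned in the use cases above could be sketched as follows. This is a minimal illustration, not the platform's actual algorithm: the `Conversation` and `Commit` types, the `link_commits` function, and the two-hour window are all assumptions made for the example, and content-based matching is left out.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Conversation:
    id: str
    ended_at: datetime


@dataclass
class Commit:
    sha: str
    committed_at: datetime


def link_commits(conversations, commits, window=timedelta(hours=2)):
    """Map each commit SHA to the IDs of conversations that ended
    no more than `window` before the commit was made.

    The window length is a hypothetical default, not a documented value.
    """
    links = {}
    for commit in commits:
        links[commit.sha] = [
            conv.id
            for conv in conversations
            if timedelta(0) <= commit.committed_at - conv.ended_at <= window
        ]
    return links
```

A production version would likely combine this time window with a content signal (for example, overlap between conversation text and the commit diff) before treating a link as reliable.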
2. GitLab (Code Repository)
Integration Type: Webhooks + REST API
Data Flow:
- Inbound: Push event webhooks
- Outbound: Fetch commit metadata
Use Cases: Same as GitHub
Configuration:
- Webhook URL: https://api.context-intelligence.com/webhooks/gitlab
- Authentication: Personal access token or OAuth
- Permissions: Read repository, receive webhooks
3. OpenAI API (Embedding Service)
Integration Type: REST API (requests initiated by the platform only; vectors return in the response)
Data Flow:
- Outbound: Send conversation text for embedding generation
- Inbound: Receive 1536-dimensional vectors
Use Cases:
- Generate semantic embeddings for conversations
- Enable semantic search ("find discussions about authentication" matches "JWT security issues")
Configuration:
- Model: text-embedding-3-large (3072 dimensions, reduced to 1536)
- Rate limits: 1M tokens/minute
- Cost: $0.00013 per 1K tokens
Fallback: Local embedding models (sentence-transformers) for on-prem deployments
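The document does not say how the 3072-dimensional vectors are reduced to 1536. One common approach, shown here only as an assumption, is to keep the first 1536 components and rescale the result to unit length so that cosine similarity stays meaningful; requesting the shorter size directly from the embedding API is another option. The helper names `shorten_embedding` and `cosine_similarity` are illustrative.

```python
import math


def shorten_embedding(vector, dims=1536):
    """Keep the first `dims` components and rescale to unit length."""
    head = list(vector[:dims])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # nothing to normalize for an all-zero vector
    return [x / norm for x in head]


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)
```

Semantic search then amounts to ranking stored conversation vectors by their similarity to the query vector, which in this architecture would be delegated to the vector store rather than computed in application code.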
4. OAuth Providers (Authentication)
Integration Type: OAuth 2.0 authorization code flow
Providers Supported:
- Google Workspace: SSO for enterprises
- Microsoft Azure AD: SSO for enterprises
- GitHub: Developer authentication
Data Flow:
- Outbound: Redirect user to provider for authentication
- Inbound: Receive authorization code, exchange for access token
User Data Retrieved:
- Email address (primary identifier)
- Full name
- Profile photo
- Organization domain (for workspace mapping)
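The two data-flow steps above (redirect out, authorization code back) can be sketched with the standard library alone. This is an illustrative outline under stated assumptions: the function names are hypothetical, the authorize endpoint and client ID must come from the chosen provider's configuration, and the token exchange itself is omitted.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlparse


def build_authorization_url(authorize_endpoint, client_id, redirect_uri, scope):
    """Return (url, state) for the outbound redirect.

    `state` is a random value the caller stores in the user's session
    and checks again when the provider redirects back.
    """
    state = secrets.token_urlsafe(32)
    query = urlencode({
        "response_type": "code",  # authorization code flow
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": scope,
        "state": state,
    })
    return f"{authorize_endpoint}?{query}", state


def extract_code(callback_url, expected_state):
    """Return the authorization code from the callback URL,
    or None when the state value does not match."""
    params = parse_qs(urlparse(callback_url).query)
    received_state = params.get("state", [""])[0]
    if not secrets.compare_digest(expected_state.encode(), received_state.encode()):
        return None
    return params.get("code", [None])[0]
```

The returned code would then be exchanged for an access token in a server-to-server request, using the client secret held in the restricted secrets zone.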
5. CODITECT Platform (Parent Platform)
Integration Type: Internal Django application (integration mode only)
Data Flow:
- Shared: User sessions, authentication, organization models
- Inbound: User context (current org, permissions)
- Outbound: Conversation data for unified dashboard
Use Cases:
- Seamless navigation between CODITECT modules
- Unified billing and user management
- Shared analytics dashboard
Configuration:
- Deployment: Same Kubernetes cluster / Cloud Run project
- Database: Shared PostgreSQL database (separate schema)
- Authentication: Django session middleware
6. Payment Provider (Stripe)
Integration Type: REST API + webhooks (standalone mode only)
Data Flow:
- Outbound: Create checkout sessions, manage subscriptions
- Inbound: Subscription lifecycle webhooks (created, updated, canceled)
Use Cases:
- Process subscription payments
- Enforce tier-based quotas
- Handle upgrades/downgrades
Configuration:
- Webhook URL: https://api.context-intelligence.com/webhooks/stripe
- Products: Starter (free), Pro ($15/month), Enterprise (custom)
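As a rough sketch of how the subscription lifecycle webhooks might drive tier state, the function below maps an already-verified Stripe event to a tier name. The price IDs in `PRICE_TO_TIER` are placeholders, the function name is hypothetical, and the policy of dropping to Starter when a subscription is not active is an assumption rather than documented behavior.

```python
# Hypothetical price IDs; real values would come from the Stripe dashboard.
PRICE_TO_TIER = {
    "price_pro_monthly": "pro",
    "price_enterprise": "enterprise",
}


def tier_after_event(event, current_tier="starter"):
    """Return the tier an organization should have after a subscription event.

    Assumes the event's signature has already been verified.
    """
    event_type = event["type"]
    subscription = event["data"]["object"]

    if event_type == "customer.subscription.deleted":
        return "starter"  # canceled: fall back to the free tier

    if event_type in ("customer.subscription.created", "customer.subscription.updated"):
        # Assumed policy: only active or trialing subscriptions grant a paid tier.
        if subscription.get("status") not in ("active", "trialing"):
            return "starter"
        for item in subscription["items"]["data"]:
            tier = PRICE_TO_TIER.get(item["price"]["id"])
            if tier is not None:
                return tier

    return current_tier  # unrelated event or unknown price: leave tier unchanged
```

Quota enforcement would then read the stored tier on each request instead of calling the payment provider.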
System Boundaries
In Scope (This System)
✅ AI conversation storage and retrieval
✅ Semantic and keyword search
✅ Conversation-commit correlation
✅ Team analytics and insights
✅ Multi-tenant data isolation
✅ Tier-based feature gating
Out of Scope (External Systems)
❌ Git repository hosting (GitHub/GitLab)
❌ AI model inference (OpenAI)
❌ User identity management (OAuth providers)
❌ Payment processing (Stripe)
❌ IDE integrations (separate plugins)
Communication Protocols
| Integration | Protocol | Port | Authentication | Encryption |
|---|---|---|---|---|
| Users → Platform | HTTPS | 443 | JWT or Django Session | TLS 1.3 |
| GitHub → Platform | HTTPS (webhook) | 443 | HMAC signature | TLS 1.3 |
| GitLab → Platform | HTTPS (webhook) | 443 | Secret token | TLS 1.3 |
| Platform → OpenAI | HTTPS | 443 | API Key | TLS 1.3 |
| Platform → OAuth | HTTPS | 443 | OAuth 2.0 | TLS 1.3 |
| CODITECT ↔ Platform | Internal | N/A | Shared session | Internal network |
| Stripe → Platform | HTTPS (webhook) | 443 | Webhook signature | TLS 1.3 |
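The two repository webhook checks in the table differ in kind: GitHub signs the raw request body with HMAC-SHA256 and sends the result in the `X-Hub-Signature-256` header, while GitLab echoes a shared secret in the `X-Gitlab-Token` header. A minimal sketch of both checks follows; the function names are illustrative, and how the secret is stored and loaded is left as an assumption.

```python
import hashlib
import hmac


def verify_github_signature(secret: bytes, body: bytes, signature_header: str) -> bool:
    """Check an X-Hub-Signature-256 value of the form 'sha256=<hex digest>'
    against the HMAC-SHA256 of the raw request body."""
    expected = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), signature_header.encode())


def verify_gitlab_token(secret: str, token_header: str) -> bool:
    """Check an X-Gitlab-Token value against the configured secret token."""
    return hmac.compare_digest(secret.encode(), token_header.encode())
```

Both checks must run on the unparsed request bytes and before any payload processing, since these endpoints sit in the untrusted public zone.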
Deployment Modes
Standalone Mode (SaaS)
┌─────────────────────────────────────────────────────────────┐
│ Internet │
│ │
│ Users → CloudFlare CDN → Load Balancer → Kubernetes Cluster│
│ │
│ External Systems → API Gateway → Platform │
└─────────────────────────────────────────────────────────────┘
Infrastructure:
- Kubernetes cluster (3 nodes minimum)
- PostgreSQL (managed, HA)
- Weaviate Cloud (managed)
- Redis (managed, HA)
CODITECT Integration Mode
┌─────────────────────────────────────────────────────────────┐
│ CODITECT Platform (Django) │
│ │
│ Users → CODITECT Web → Context Intelligence Module │
│ │
│ External Systems → Shared API Gateway → Platform │
└─────────────────────────────────────────────────────────────┘
Infrastructure:
- GCP Cloud Run (serverless containers)
- Shared PostgreSQL database
- Weaviate Cloud (dedicated namespace)
- Shared Redis instance
Security Boundaries
Trust Zones
Zone 1: Public Internet (Untrusted)
- User browsers
- GitHub/GitLab webhook servers
- OAuth provider redirects
Zone 2: Application Layer (Authenticated)
- API endpoints (after authentication)
- WebSocket connections
Zone 3: Internal Services (Trusted)
- Database connections
- Internal service mesh (CODITECT mode)
- Background job workers
Zone 4: Secrets (Highly Restricted)
- OpenAI API keys
- Database credentials
- Encryption keys
- OAuth client secrets
Scalability & Reliability
Expected Load
| Metric | Starter Tier | Pro Tier | Enterprise Tier |
|---|---|---|---|
| Orgs | 1 | 1 | 1-1000 |
| Users/Org | 1-5 | 1-50 | 50-5000 |
| Conversations/User/Day | 10 | 30 | 50 |
| Searches/User/Day | 5 | 20 | 50 |
| Webhooks/Day | 50 | 500 | 5000 |
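To make the table concrete, the per-organization upper bounds can be multiplied out. The sketch below uses only the figures from the table; the dictionary layout and the `peak_daily_load` name are illustrative.

```python
# Upper bounds per organization, taken from the Expected Load table.
TIERS = {
    "starter": {"max_users": 5, "conversations": 10, "searches": 5},
    "pro": {"max_users": 50, "conversations": 30, "searches": 20},
    "enterprise": {"max_users": 5000, "conversations": 50, "searches": 50},
}


def peak_daily_load(tier):
    """Return the maximum conversations and searches per day for one organization."""
    t = TIERS[tier]
    return {
        "conversations": t["max_users"] * t["conversations"],
        "searches": t["max_users"] * t["searches"],
    }
```

For the largest Enterprise organization this gives 5000 × 50 = 250,000 conversations and 250,000 searches per day, which is the figure capacity planning would need to cover per tenant.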
Reliability Targets
| Metric | Target | Measurement |
|---|---|---|
| Uptime | 99.9% | Monthly (43 min downtime/month) |
| API Latency (p95) | <100ms | Continuous monitoring |
| Search Latency (p95) | <100ms | Continuous monitoring |
| Data Durability | 99.999999% | PostgreSQL + daily backups |
| RTO (Recovery Time) | <1 hour | Disaster recovery drills |
| RPO (Recovery Point) | <15 minutes | Continuous replication |
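The downtime figure in the uptime row follows from simple arithmetic, and the p95 latency targets depend on how the percentile is computed. The sketch below assumes a 30-day month and the nearest-rank percentile method; both are assumptions, since the document does not specify either.

```python
def downtime_budget_minutes(availability_percent, days=30):
    """Minutes of allowed downtime for a given availability over `days` days."""
    total_minutes = days * 24 * 60
    return total_minutes * (1 - availability_percent / 100)


def p95(samples):
    """95th percentile of latency samples using the nearest-rank method."""
    if not samples:
        raise ValueError("p95 requires at least one sample")
    ordered = sorted(samples)
    # Integer ceiling of 0.95 * n, avoiding floating-point rounding.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]
```

With these definitions, 99.9% availability over 30 days allows about 43.2 minutes of downtime, which matches the table's rounded value of 43 minutes.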
Compliance & Data Residency
Regulatory Compliance
- GDPR (Europe): Data export, right to be forgotten
- CCPA (California): Data disclosure, opt-out
- SOC 2 Type II: Security, availability, confidentiality
- ISO 27001: Information security management
Data Residency Options
- US East (default): us-east1 (South Carolina)
- US West: us-west1 (Oregon)
- Europe: europe-west1 (Belgium)
- Asia Pacific: asia-southeast1 (Singapore)
Configuration: Per-organization setting (Enterprise tier only)
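The residency rule above (a per-organization choice, available to Enterprise only, defaulting to US East) could be expressed as a small lookup. The option keys and the `resolve_region` name are hypothetical; only the region identifiers come from the list above.

```python
# Region identifiers from the Data Residency Options list; keys are illustrative.
RESIDENCY_REGIONS = {
    "us-east": "us-east1",
    "us-west": "us-west1",
    "europe": "europe-west1",
    "asia-pacific": "asia-southeast1",
}
DEFAULT_RESIDENCY = "us-east"


def resolve_region(tier, requested=None):
    """Return the region for an organization.

    Only Enterprise organizations may pick a region; all others,
    and Enterprise organizations with no preference, get the default.
    """
    if tier != "enterprise" or requested is None:
        return RESIDENCY_REGIONS[DEFAULT_RESIDENCY]
    if requested not in RESIDENCY_REGIONS:
        raise ValueError(f"unknown residency option: {requested}")
    return RESIDENCY_REGIONS[requested]
```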
Success Metrics (System Context Level)
| Metric | Target | Current | Trend |
|---|---|---|---|
| Monthly Active Users | 10,000 | TBD | - |
| Organizations | 1,000 | TBD | - |
| Conversations Stored | 10M | TBD | - |
| Searches/Day | 100K | TBD | - |
| API Uptime | 99.9% | TBD | - |
| Customer Satisfaction (CSAT) | >4.5/5 | TBD | - |
Next Level: Container Diagram (C4 Level 2)
The System Context diagram shows what the system does and who uses it. The next level (Container Diagram) will show how the system is structured internally with:
- API Layer (FastAPI or Django REST Framework)
- Business Logic (Python services)
- Databases (PostgreSQL, Weaviate, Redis)
- Background workers (Celery)
- Web frontend (React)
See: c4-l2-container.md
Diagram Maintained By: Architecture Team
Last Updated: 2025-11-26
Review Cycle: Quarterly
Related Documents: