Software Design Document: CODITECT Multi-Tenant SaaS Platform
| Metadata | Value |
|---|---|
| Document Version | 1.0.0 |
| Date | December 24, 2025 |
| Status | Draft for Review |
| Authors | CODITECT Architecture Team |
| Approvers | Hal Casteel (CEO/CTO) |
| Related ADRs | ADR-004 (Multi-Tenant Strategy), ADR-005 (Workstation Broker Pattern), ADR-006 (Gitea Multi-Tenant Git), ADR-008 (RBAC), ADR-009 (Multi-Tenant SaaS Architecture) |
Table of Contents
- Executive Summary
- System Context (C4 Level 1)
- Container Architecture (C4 Level 2)
- Component Diagrams (C4 Level 3)
- Data Model Overview
- API Design Summary
- Security Architecture
- Deployment Architecture
- Quality Attributes
- Implementation Roadmap
- Appendices
1. Executive Summary
1.1 Purpose and Scope
This Software Design Document (SDD) defines the architecture of the CODITECT Multi-Tenant SaaS Platform, a cloud-native development environment platform that provides:
- User Registration and Authentication via Firebase Identity Platform
- Payment Processing via Stripe with subscription management
- Cloud Workstations in dedicated, shared, and pool modes, with auto-hibernation for cost optimization
- Self-Hosted Git via Gitea with multi-tenant organization isolation and GitHub mirroring
- Multi-Tenant Isolation using PostgreSQL Row-Level Security and GCP IAM
- Enterprise-Grade Security supporting SOC 2, GDPR, and HIPAA compliance
1.2 System Overview
CODITECT provides fully managed cloud development workstations with IDE integration, enabling developers to:
- Access production-ready development environments instantly
- Collaborate across organizations, teams, and projects
- Scale compute resources based on workload requirements
- Pay only for actual usage with auto-hibernation
Target Market:
- Individual developers (Free/Starter tiers)
- Small businesses (Professional tier)
- Enterprises (Business/Enterprise tiers)
- Contractors with time-limited access
- Third-party auditors with read-only access
1.3 Design Approach and Standards
Architectural Principles:
- Serverless-First: Minimize idle infrastructure costs using GCP Cloud Run
- Multi-Tenant by Default: Strong isolation via PostgreSQL RLS and GCP IAM
- API-First Design: All functionality exposed via RESTful APIs
- Security in Depth: Multiple layers of authentication, authorization, and isolation
- Cost Optimization: Auto-hibernation and usage-based billing
Standards and Compliance:
- C4 Model: Architectural diagrams using Context, Container, Component, Code levels
- OpenAPI 3.0: API specification and documentation
- OAuth 2.0 / OIDC: Authentication and authorization flows
- SOC 2 Type II: Security and availability controls
- GDPR: Data privacy and residency requirements
- PCI DSS: Payment processing (delegated to Stripe)
2. System Context (C4 Level 1)
2.1 Context Diagram
2.2 External Systems
| System | Type | Purpose | Interface |
|---|---|---|---|
| Firebase Auth | Authentication | User identity management, email/password, OAuth providers | REST API |
| Stripe | Payment Processing | Subscription billing, payment methods, invoicing | REST API + Webhooks |
| GCP Cloud Workstations | Development Environment | Managed cloud-based development workstations | gRPC API |
| Gitea | Git Hosting | Self-hosted multi-tenant Git with organization isolation | REST API + Git Protocol |
| GitHub | OAuth + Mirroring | Social login and bidirectional repository mirroring | OAuth 2.0 + Git |
| GitLab | OAuth Provider | Social login integration | OAuth 2.0 |
| GCP Cloud SQL | Database | PostgreSQL with Row-Level Security | SQL Protocol |
| GCP Cloud Storage | Object Storage | User files, backups, artifacts | GCS API |
| GCP Pub/Sub | Message Queue | Async workstation provisioning, events | Pub/Sub API |
| GCP Memorystore | Cache | Redis for sessions, rate limiting | Redis Protocol |
2.3 User Personas
| Persona | Role | Primary Goals | Access Level |
|---|---|---|---|
| Individual Developer | Owner | Quick setup, learning projects | Full (own org) |
| Organization Admin | Admin | Manage team, billing, access control | Full (org scope) |
| Team Lead | Developer | Coordinate projects, review code | Team scope |
| Team Member | Developer | Build features, access workstation | Project scope |
| Contractor | Contractor | Time-limited project access | Scoped access |
| Auditor | Auditor | Read-only compliance review | Read-only |
3. Container Architecture (C4 Level 2)
3.1 Container Diagram
3.2 Container Descriptions
Frontend Containers
| Container | Technology | Purpose | Scalability |
|---|---|---|---|
| Web Application | React, TypeScript, Next.js | Primary user interface with SSR | CDN + Cloud Run (0-100 instances) |
| Mobile App | Flutter | Mobile access (iOS/Android) | Native app distribution |
| CLI Tool | Node.js | Automation and scripting | Client-side execution |
API Layer Containers
| Container | Technology | Purpose | Scalability |
|---|---|---|---|
| API Gateway | Actix-web (Rust), Cloud Run | Request routing, auth, rate limiting | 0-1000 instances |
| Auth Service | Python FastAPI, Cloud Run | Authentication, JWT management | 0-100 instances |
| User Service | Python FastAPI, Cloud Run | Org/team/project CRUD | 0-200 instances |
| Billing Service | Python FastAPI, Cloud Run | Stripe integration, webhooks | 0-50 instances |
| Workstation Service | Python FastAPI, Cloud Run | Shared workstation management, session lifecycle | 0-100 instances |
| Repository Service | Python FastAPI, Cloud Run | Gitea repository management, GitHub mirroring | 0-100 instances |
Data Layer Containers
| Container | Technology | Purpose | Scalability |
|---|---|---|---|
| Cloud SQL | PostgreSQL 15 | Primary relational data store | Vertical scaling (2-96 vCPU) |
| Memorystore | Redis 7 | Session cache, rate limits | Vertical scaling (1-300GB) |
| Cloud Storage | GCS | Object storage for files | Unlimited |
| Pub/Sub | GCP Pub/Sub | Async event queue | Auto-scaling |
4. Component Diagrams (C4 Level 3)
4.1 Auth Service Components
Component Responsibilities:
| Component | Responsibility | Key Methods |
|---|---|---|
| Auth Controller | HTTP request handling | POST /auth/login, POST /auth/logout, POST /auth/refresh |
| Token Manager | JWT token lifecycle | generate_token(), validate_token(), refresh_token() |
| Session Manager | Session state management | create_session(), get_session(), invalidate_session() |
| RBAC Enforcer | Permission validation | check_permission(), has_role(), filter_by_access() |
| Firebase Client | External auth integration | verify_firebase_token(), get_user_info() |
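The Token Manager's generate_token() and validate_token() responsibilities can be illustrated with a minimal HS256 JWT sketch built only on the Python standard library. The signing algorithm, the 15-minute default lifetime, the claim names, and the function signatures are assumptions made for illustration; the document does not specify them, and a production service would more likely rely on a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Decode base64url, restoring the padding that JWTs strip."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def generate_token(claims: dict, secret: bytes, ttl_seconds: int = 900,
                   now: Optional[float] = None) -> str:
    """Sign the claims as an HS256 JWT with iat/exp added (TTL is an assumption)."""
    issued = int(time.time() if now is None else now)
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {**claims, "iat": issued, "exp": issued + ttl_seconds}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(signature)}"


def validate_token(token: str, secret: bytes, now: Optional[float] = None) -> dict:
    """Return the claims if signature and expiry check out, else raise ValueError."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("malformed token")
    header_b64, payload_b64, signature_b64 = parts
    expected = hmac.new(secret, f"{header_b64}.{payload_b64}".encode("ascii"),
                        hashlib.sha256).digest()
    # Constant-time comparison avoids leaking signature bytes through timing.
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    payload = json.loads(_b64url_decode(payload_b64))
    current = time.time() if now is None else now
    if payload["exp"] <= current:
        raise ValueError("token expired")
    return payload
```

The optional `now` parameter exists only so that expiry can be exercised deterministically in tests.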
4.2 Billing Service Components
Component Responsibilities:
| Component | Responsibility | Key Methods |
|---|---|---|
| Billing Controller | HTTP request handling | POST /billing/checkout, GET /billing/usage, POST /billing/portal |
| Subscription Manager | Subscription state machine | create_subscription(), update_subscription(), cancel_subscription() |
| Usage Tracker | Metering and quotas | track_workstation_usage(), calculate_monthly_cost(), check_quota() |
| Invoice Generator | Invoice creation | generate_invoice(), send_invoice_email() |
| Stripe Client | External payment integration | create_checkout_session(), create_portal_session() |
| Webhook Handler | Event processing | handle_checkout_completed(), handle_payment_failed() |
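Before handle_checkout_completed() or handle_payment_failed() acts on an event, the Webhook Handler needs to confirm the request really came from Stripe. The sketch below reproduces Stripe's documented scheme (an HMAC-SHA256 over `{timestamp}.{payload}`, carried in the `Stripe-Signature` header as `t=...,v1=...`) using only the standard library. The function name and the 300-second tolerance default are illustrative choices; the official Stripe SDK provides its own verification helper, which would normally be preferred.

```python
import hashlib
import hmac
import time
from typing import Optional


def verify_stripe_signature(payload: bytes, sig_header: str, secret: str,
                            tolerance: int = 300,
                            now: Optional[float] = None) -> bool:
    """Check a Stripe-Signature header against the raw request body."""
    timestamp = None
    candidates = []
    for part in sig_header.split(","):
        key, _, value = part.strip().partition("=")
        if key == "t":
            timestamp = value
        elif key == "v1":
            candidates.append(value)
    if not timestamp or not (timestamp.isascii() and timestamp.isdigit()):
        return False
    if not candidates:
        return False
    # Reject stale events to limit replay attacks.
    current = time.time() if now is None else now
    if abs(current - int(timestamp)) > tolerance:
        return False
    signed_payload = timestamp.encode("ascii") + b"." + payload
    expected = hmac.new(secret.encode("utf-8"), signed_payload,
                        hashlib.sha256).hexdigest()
    return any(
        c.isascii() and hmac.compare_digest(expected, c) for c in candidates
    )
```

Verification must run on the raw request bytes, before any JSON parsing or re-serialization, because the signature covers the exact body Stripe sent.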
4.2.1 Commerce Service Components (Product Catalog & Shopping Cart)
Component Responsibilities:
| Component | Responsibility | Key Methods |
|---|---|---|
| Catalog Controller | HTTP request handling | GET /products, POST /cart/items, POST /checkout |
| Cart Manager | Shopping cart state | add_item(), remove_item(), get_cart(), clear_cart() |
| Checkout Orchestrator | Multi-product checkout | create_order(), process_payment(), fulfill_order() |
| Product Catalog | Product definitions | get_products(), get_product_by_slug(), check_eligibility() |
| Google Pay Client | Google Pay integration | create_payment_request(), process_payment_data() |
| Entitlement Manager | Access provisioning | grant_entitlement(), check_entitlement(), revoke_entitlement() |
CODITECT Product Catalog
| Product | Slug | Type | Price (Monthly) | Requires | Provisions |
|---|---|---|---|---|---|
| CODITECT Core | core | base | $49/mo | - | GCP Workstation + coditect-core framework |
| CODITECT DMS | dms | addon | $29/mo | core | Access to dms.coditect.ai |
| Workflow Analyzer | workflow | addon | $19/mo | core | Access to workflow.coditect.ai |
| Enterprise Bundle | enterprise | bundle | $149/mo | - | All products + priority support |
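The "Requires" column implies an eligibility rule that check_eligibility() has to enforce at checkout. A minimal sketch of that rule follows, using the slugs and prices from the table above. The `Product` dataclass, the `cart_total` helper, and the treatment of the Enterprise Bundle as satisfying the `core` prerequisite are assumptions for illustration, not part of the documented design.

```python
from dataclasses import dataclass
from typing import AbstractSet, Iterable, Optional


@dataclass(frozen=True)
class Product:
    slug: str
    kind: str                        # "base", "addon", or "bundle"
    monthly_price_usd: int
    requires: Optional[str] = None   # slug of the prerequisite product, if any


CATALOG = {
    p.slug: p
    for p in (
        Product("core", "base", 49),
        Product("dms", "addon", 29, requires="core"),
        Product("workflow", "addon", 19, requires="core"),
        Product("enterprise", "bundle", 149),
    )
}


def check_eligibility(slug: str, owned: AbstractSet[str]) -> bool:
    """True if the product's prerequisite is already owned."""
    product = CATALOG[slug]
    if product.requires is None:
        return True
    # Assumption: the bundle ("All products") satisfies any prerequisite.
    return product.requires in owned or "enterprise" in owned


def cart_total(slugs: Iterable[str], owned: AbstractSet[str] = frozenset()) -> int:
    """Monthly total in USD; prerequisites may be owned or in the same cart."""
    cart = list(slugs)
    available = set(owned) | set(cart)
    for slug in cart:
        if not check_eligibility(slug, available):
            raise ValueError(f"{slug} requires {CATALOG[slug].requires}")
    return sum(CATALOG[s].monthly_price_usd for s in cart)
```

For example, `cart_total(["core", "dms"])` evaluates to 78, while `cart_total(["dms"])` raises because the prerequisite is missing.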
Checkout Flow Sequence
Payment Methods
| Method | Integration | Use Case |
|---|---|---|
| Stripe Checkout | Redirect flow | Full-featured checkout, international |
| Google Pay | In-app payment | One-tap mobile/web checkout |
| Apple Pay | Coming soon | iOS/Safari users |
4.2.2 Context Sync Service Components (Multi-Level Context Sync)
Architecture Reference: ADR-044 (Custom REST Sync), ADR-045 (Team/Project Context Sync)
Context Sync API Endpoints
| Endpoint | Method | Purpose | Auth |
|---|---|---|---|
| /api/v1/context/push | POST | Push user context from device | JWT (user) |
| /api/v1/context/pull | GET | Pull user context to device | JWT (user) |
| /api/v1/context/status | GET | User sync status | JWT (user) |
| /api/v1/teams/{id}/context/push | POST | Push team context | JWT (team member) |
| /api/v1/teams/{id}/context/pull | GET | Pull team context | JWT (team member) |
| /api/v1/teams/{id}/context/backup | POST | Create team backup | JWT (team admin) |
| /api/v1/teams/{id}/context/restore | POST | Restore team backup | JWT (team admin) |
| /api/v1/projects/{id}/context/push | POST | Push project context | JWT (project member) |
| /api/v1/projects/{id}/context/pull | GET | Pull project context | JWT (project member) |
| /api/v1/projects/{id}/context/backup | POST | Create project backup | JWT (project owner) |
| /api/v1/projects/{id}/context/restore | POST | Restore project backup | JWT (project owner) |
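The endpoint table follows a regular pattern: user-level routes have no identifier, team and project routes embed one, and only some actions exist at each level. A client could capture that pattern in a small helper like the one below. The function and constant names are hypothetical; only the paths and methods come from the table.

```python
from typing import Optional, Tuple

METHODS = {"push": "POST", "pull": "GET", "status": "GET",
           "backup": "POST", "restore": "POST"}

# Which actions the table defines at each level.
SCOPE_ACTIONS = {
    "user": {"push", "pull", "status"},
    "team": {"push", "pull", "backup", "restore"},
    "project": {"push", "pull", "backup", "restore"},
}

COLLECTIONS = {"team": "teams", "project": "projects"}


def context_endpoint(scope: str, action: str,
                     scope_id: Optional[str] = None) -> Tuple[str, str]:
    """Return (HTTP method, path) for a context sync call."""
    if action not in SCOPE_ACTIONS.get(scope, set()):
        raise ValueError(f"{action!r} is not defined for scope {scope!r}")
    if scope == "user":
        return METHODS[action], f"/api/v1/context/{action}"
    if not scope_id:
        raise ValueError(f"scope {scope!r} needs an id")
    return METHODS[action], f"/api/v1/{COLLECTIONS[scope]}/{scope_id}/context/{action}"
```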
Sync Architecture
   User (Device A)  User (Device B)  Team Member
          │                │              │
          ▼                ▼              ▼
┌─────────────────────────────────────────────────┐
│                CODITECT Sync API                │
│                                                 │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────┐  │
│  │ User Context│  │Team Context │  │ Project │  │
│  │ (per-user)  │  │ (per-team)  │  │ Context │  │
│  └─────────────┘  └─────────────┘  └─────────┘  │
└─────────────────────────────────────────────────┘
                         │
                         ▼
                    PostgreSQL
                    (Cloud SQL)
                         │
                         ▼
                    GCS Backups
                    ├── users/{user_id}/
                    ├── teams/{team_id}/
                    └── projects/{project_id}/
Backup Hierarchy
| Level | Scope | Backup Schedule | Retention |
|---|---|---|---|
| User | Personal context per device | Hourly incremental, Daily full | 7d / 90d |
| Team | Shared team knowledge | Hourly incremental, Daily full | 7d / 90d |
| Project | Project-scoped insights | Hourly incremental, Daily full | 7d / 90d |
| Organization | All team + project contexts | Weekly full | 365d |
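The backup layout and retention windows can be expressed as a small lookup plus an expiry check. This sketch assumes that "7d / 90d" means 7 days for incremental backups and 90 days for full backups, which the table suggests but does not state outright. Function names are hypothetical, and no prefix is given for organization-level backups because the diagram does not show one.

```python
from datetime import datetime, timedelta, timezone

# GCS prefixes taken from the sync architecture diagram.
PREFIXES = {"user": "users", "team": "teams", "project": "projects"}

# Retention in days per (level, backup kind); the 7/90 split is an assumption.
RETENTION_DAYS = {
    ("user", "incremental"): 7, ("user", "full"): 90,
    ("team", "incremental"): 7, ("team", "full"): 90,
    ("project", "incremental"): 7, ("project", "full"): 90,
    ("organization", "full"): 365,
}


def backup_prefix(level: str, entity_id: str) -> str:
    """Object prefix under which a user's, team's, or project's backups live."""
    return f"{PREFIXES[level]}/{entity_id}/"


def is_expired(level: str, kind: str, created_at: datetime, now: datetime) -> bool:
    """True once a backup is older than its retention window."""
    return now - created_at > timedelta(days=RETENTION_DAYS[(level, kind)])
```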
4.3 Workstation Service Components
Component Responsibilities:
| Component | Responsibility | Key Methods |
|---|---|---|
| Workstation Controller | HTTP request handling | GET /me/available, POST /{id}/connect, POST /sessions/{id}/disconnect |
| Session Manager | Multi-user session lifecycle | create_session(), disconnect_session(), get_active_sessions() |
| Access Manager | Workstation access control | get_available_workstations(), set_default_workstation(), grant_access() |
| Workstation Provisioner | Async workstation creation | provision_workstation(), configure_environment() |
| Lifecycle Manager | State transitions | start_workstation(), stop_workstation(), delete_workstation() |
| Hibernation Scheduler | Cost optimization | check_idle_workstations(), schedule_hibernation() |
| Usage Monitor | Metering and reporting | track_runtime(), record_activity(), calculate_monthly_usage() |
| GCP Workstations Client | External API integration | create_workstation(), start_workstation(), get_workstation_status() |
Shared Workstation Modes
| Mode | Description | Max Users | Use Case |
|---|---|---|---|
| dedicated | One workstation per user | 1 | High isolation, premium tier |
| shared | Multiple users on one workstation | 5 | Cost-conscious teams |
| pool | On-demand assignment from pool | N/A | Large teams, variable demand |
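The "Max Users" column translates directly into a capacity check that create_session() could run before admitting a new connection. The sketch below is illustrative: the function names and the tuple layout for candidate workstations are assumptions, and a pool is treated as having no per-workstation cap because the table lists N/A.

```python
from typing import Iterable, Optional, Tuple

# Per-workstation session caps from the table; None means no cap is listed.
MAX_USERS = {"dedicated": 1, "shared": 5, "pool": None}


def can_connect(mode: str, active_sessions: int) -> bool:
    """True if one more session fits on a workstation of this mode."""
    limit = MAX_USERS[mode]
    return limit is None or active_sessions < limit


def pick_workstation(candidates: Iterable[Tuple[str, str, int]]) -> Optional[str]:
    """Return the id of the first (id, mode, active_sessions) entry with room."""
    for ws_id, mode, active in candidates:
        if can_connect(mode, active):
            return ws_id
    return None
```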
User Isolation in Shared Workstations
- Each user gets a unique Linux account (UID 1000+)
- Separate home directories with permission isolation
- SSH keys managed per-user
- Concurrent sessions tracked independently
4.4 Repository Service Components
Component Responsibilities:
| Component | Responsibility | Key Methods |
|---|---|---|
| Repository Controller | HTTP request handling | POST /repositories, GET /repositories/{id}, DELETE /repositories/{id} |
| Gitea Client | Gitea API integration | create_org_repo(), add_collaborator(), get_repo_info() |
| Mirror Manager | GitHub mirroring | setup_mirror(), sync_to_github(), sync_from_github() |
| Access Manager | Permission control | check_repo_access(), grant_access(), revoke_access() |
| Webhook Handler | Event processing | handle_push(), handle_pr(), handle_mirror_sync() |
Gitea Multi-Tenant Isolation
- Each organization maps to a Gitea organization
- Repositories created under gitea.coditect.io/{org}/{repo}
- Organization admins have owner access in Gitea
- Team-based repository access mirrors CODITECT RBAC
- See ADR-006 for complete Gitea architecture
5. Data Model Overview
5.1 Entity-Relationship Diagram
5.2 Multi-Tenant Hierarchy
Organization (Tenant Root)
├── Subscription (Billing)
│   └── Invoices (Payment History)
├── Members (Users with Roles)
│   ├── Owner (1 required)
│   ├── Admin (0-N)
│   ├── Developer (0-N)
│   ├── Contractor (time-limited)
│   └── Auditor (read-only)
├── Teams (Logical Groups)
│   ├── Team Members
│   └── Projects
│       └── Repositories
└── Workstations (Per-User)
    ├── Running (billable)
    ├── Hibernated (no charge)
    └── Stopped (no charge)
5.3 Row-Level Security (RLS) Policies
| Table | Policy | Rule |
|---|---|---|
| organizations | SELECT | User is an active member |
| organizations | UPDATE | User has admin or owner role |
| teams | SELECT | User is member of organization |
| teams | INSERT/UPDATE | User has admin or owner role |
| projects | SELECT | User is member of organization |
| projects | INSERT/UPDATE | User has developer role or higher |
| workstations | SELECT | User owns workstation OR user is org admin |
| workstations | UPDATE | User owns workstation OR user is org admin |
| subscriptions | SELECT | User is org member |
| subscriptions | UPDATE | User has owner role |
Security Functions:
- current_org_id() - Returns organization ID from session variable
- user_has_org_access(org_id, min_role) - Checks if user has required role in organization
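The logic behind user_has_org_access() can be modeled outside the database to make the "minimum role" comparison concrete. This Python sketch is only a model of the check, not the PostgreSQL function itself. It takes the user and a membership map as explicit arguments (the real function presumably reads them from the session), and it uses the role ordering given in Section 7.1.

```python
from typing import Dict, Tuple

# Lowest to highest, following the role hierarchy in Section 7.1.
ROLE_RANK = {
    role: rank
    for rank, role in enumerate(
        ["viewer", "auditor", "contractor", "developer", "admin", "owner"]
    )
}

# Hypothetical in-memory stand-in for the membership table: (user, org) -> role.
Memberships = Dict[Tuple[str, str], str]


def user_has_org_access(memberships: Memberships, user_id: str,
                        org_id: str, min_role: str) -> bool:
    """True if the user belongs to the org with at least min_role."""
    role = memberships.get((user_id, org_id))
    return role is not None and ROLE_RANK[role] >= ROLE_RANK[min_role]
```

A policy such as "projects INSERT/UPDATE requires developer role or higher" then corresponds to calling the check with `min_role="developer"`.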
6. API Design Summary
6.1 RESTful API Endpoints
Authentication Endpoints
| Endpoint | Method | Description | Auth Required |
|---|---|---|---|
| /api/v1/auth/login | POST | Login with email/password | No |
| /api/v1/auth/login/google | POST | Login with Google OAuth | No |
| /api/v1/auth/login/github | POST | Login with GitHub OAuth | No |
| /api/v1/auth/logout | POST | Invalidate session | Yes |
| /api/v1/auth/refresh | POST | Refresh JWT token | Yes |
| /api/v1/auth/verify | GET | Verify current token | Yes |
Organization Endpoints
| Endpoint | Method | Description | Required Role |
|---|---|---|---|
| /api/v1/organizations | POST | Create organization | Authenticated User |
| /api/v1/organizations/{org_id} | GET | Get organization details | Member |
| /api/v1/organizations/{org_id} | PATCH | Update organization | Admin |
| /api/v1/organizations/{org_id}/members | GET | List members | Member |
| /api/v1/organizations/{org_id}/members | POST | Invite member | Admin |
| /api/v1/organizations/{org_id}/members/{user_id} | DELETE | Remove member | Admin |
Team Endpoints
| Endpoint | Method | Description | Required Role |
|---|---|---|---|
| /api/v1/organizations/{org_id}/teams | GET | List teams | Member |
| /api/v1/organizations/{org_id}/teams | POST | Create team | Developer |
| /api/v1/organizations/{org_id}/teams/{team_id} | GET | Get team details | Member |
| /api/v1/organizations/{org_id}/teams/{team_id} | PATCH | Update team | Admin |
| /api/v1/organizations/{org_id}/teams/{team_id}/members | POST | Add team member | Admin |
Project Endpoints
| Endpoint | Method | Description | Required Role |
|---|---|---|---|
| /api/v1/organizations/{org_id}/projects | GET | List projects | Member |
| /api/v1/organizations/{org_id}/projects | POST | Create project | Developer |
| /api/v1/organizations/{org_id}/projects/{project_id} | GET | Get project details | Member |
| /api/v1/organizations/{org_id}/projects/{project_id} | PATCH | Update project | Developer |
| /api/v1/organizations/{org_id}/projects/{project_id} | DELETE | Archive project | Admin |
Workstation Endpoints
| Endpoint | Method | Description | Required Role |
|---|---|---|---|
| /api/v1/workstations/me/available | GET | List available workstations for current user | Member |
| /api/v1/workstations/me/default | POST | Set default workstation for login | Member |
| /api/v1/workstations/{ws_id}/connect | POST | Connect to workstation (creates session) | Member |
| /api/v1/workstations/sessions | GET | List active sessions for current user | Member |
| /api/v1/workstations/sessions/{session_id}/disconnect | POST | Disconnect from workstation | Member |
| /api/v1/workstations/{ws_id} | GET | Get workstation details | Owner or Admin |
| /api/v1/workstations/{ws_id}/start | POST | Start workstation | Owner or Admin |
| /api/v1/workstations/{ws_id}/stop | POST | Stop workstation | Owner or Admin |
| /api/v1/workstations/{ws_id}/status | GET | Get workstation status | Owner or Admin |
| /api/v1/workstations/{ws_id}/usage | GET | Get usage statistics | Owner or Admin |
| /api/v1/workstations | POST | Provision new shared workstation | Admin |
Repository Endpoints
| Endpoint | Method | Description | Required Role |
|---|---|---|---|
| /api/v1/repositories | GET | List organization repositories | Member |
| /api/v1/repositories | POST | Create new repository in Gitea | Developer |
| /api/v1/repositories/{repo_id} | GET | Get repository details | Member |
| /api/v1/repositories/{repo_id} | DELETE | Delete repository | Admin |
| /api/v1/repositories/{repo_id}/mirror | POST | Setup GitHub mirror | Developer |
| /api/v1/repositories/{repo_id}/mirror/sync | POST | Trigger manual sync | Developer |
| /api/v1/repositories/webhooks/gitea | POST | Gitea webhook handler | N/A (Webhook) |
| /api/v1/repositories/webhooks/github | POST | GitHub webhook handler | N/A (Webhook) |
Billing Endpoints
| Endpoint | Method | Description | Required Role |
|---|---|---|---|
| /api/v1/billing/checkout | POST | Create Stripe checkout session | Owner |
| /api/v1/billing/portal | POST | Create customer portal session | Owner |
| /api/v1/billing/usage | GET | Get current billing period usage | Owner |
| /api/v1/billing/invoices | GET | List invoices | Owner |
| /api/v1/billing/webhooks/stripe | POST | Stripe webhook handler | N/A (Webhook) |
6.2 API Authentication Flow
6.3 Workstation Provisioning Flow
7. Security Architecture
7.1 Authentication and Authorization
Authentication Methods:
- Email/Password - Firebase Auth with email verification
- Google OAuth - Social login via Firebase
- GitHub OAuth - Developer-focused social login
- GitLab OAuth - Enterprise developer login
Authorization Model:
- Role-Based Access Control (RBAC) with hierarchical roles
- PostgreSQL Row-Level Security for data isolation
- JWT Tokens with embedded permissions
- Session Management via Redis with TTL
Role Hierarchy:
Owner > Admin > Developer > Contractor > Auditor > Viewer
7.2 RBAC Permission Matrix
| Resource | Owner | Admin | Developer | Contractor | Auditor | Viewer |
|---|---|---|---|---|---|---|
| View org settings | Yes | Yes | No | No | Yes | No |
| Modify org settings | Yes | Yes | No | No | No | No |
| Manage billing | Yes | Yes | No | No | No | No |
| Invite members | Yes | Yes | No | No | No | No |
| Remove members | Yes | Yes | No | No | No | No |
| Create teams | Yes | Yes | Yes | No | No | No |
| Create projects | Yes | Yes | Yes | Scoped | No | No |
| View projects | Yes | Yes | Yes | Scoped | Yes | Yes |
| Manage own workstation | Yes | Yes | Yes | Yes | No | No |
| View all workstations | Yes | Yes | No | No | No | No |
| View audit logs | Yes | Yes | No | No | Yes | No |
| Export data | Yes | Yes | No | No | Yes | No |
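Because the matrix is not strictly hierarchical (an Auditor can view org settings while a Developer cannot), a direct lookup models it more faithfully than a rank comparison. The sketch below encodes the table as sets of roles per action. The action keys, the `in_scope` flag used for the "Scoped" cells, and the function signature are illustrative assumptions.

```python
# Roles with an unconditional "Yes" for each action in the matrix.
ALLOWED = {
    "view_org_settings": {"owner", "admin", "auditor"},
    "modify_org_settings": {"owner", "admin"},
    "manage_billing": {"owner", "admin"},
    "invite_members": {"owner", "admin"},
    "remove_members": {"owner", "admin"},
    "create_teams": {"owner", "admin", "developer"},
    "create_projects": {"owner", "admin", "developer"},
    "view_projects": {"owner", "admin", "developer", "auditor", "viewer"},
    "manage_own_workstation": {"owner", "admin", "developer", "contractor"},
    "view_all_workstations": {"owner", "admin"},
    "view_audit_logs": {"owner", "admin", "auditor"},
    "export_data": {"owner", "admin", "auditor"},
}

# Roles marked "Scoped": allowed only for resources inside their granted scope.
SCOPED = {
    "create_projects": {"contractor"},
    "view_projects": {"contractor"},
}


def check_permission(role: str, action: str, in_scope: bool = False) -> bool:
    """Look up a role/action pair; in_scope says the resource is within a scoped grant."""
    if role in ALLOWED[action]:
        return True
    return in_scope and role in SCOPED.get(action, set())
```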
7.3 Multi-Tenant Isolation Layers
7.4 Security Controls
| Control Type | Implementation | Purpose |
|---|---|---|
| Authentication | Firebase Auth + JWT | Verify user identity |
| Authorization | RBAC + PostgreSQL RLS | Enforce access control |
| Data Encryption (Rest) | GCP-managed AES-256 | Protect stored data |
| Data Encryption (Transit) | TLS 1.3 | Protect data in motion |
| Network Isolation | VPC, Private IPs | Segment infrastructure |
| Rate Limiting | Redis + API Gateway | Prevent abuse |
| Audit Logging | PostgreSQL + Cloud Logging | Compliance and forensics |
| Secret Management | GCP Secret Manager | Protect credentials |
| IAM | GCP IAM + Service Accounts | Least privilege access |
| Vulnerability Scanning | Container Analysis | Detect CVEs |
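The rate limiting control (Redis + API Gateway) can be illustrated with a fixed-window counter. In this sketch an in-memory dictionary stands in for Redis, where the equivalent pattern would be an atomic increment on a key that expires with the window. The class name, the key format, and the limit and window values are hypothetical; the document does not specify the algorithm or quotas.

```python
from typing import Dict, Tuple


class FixedWindowLimiter:
    """Allow at most `limit` requests per key in each window of `window_seconds`."""

    def __init__(self, limit: int, window_seconds: int) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        # (key, window index) -> request count; a dict stands in for Redis here.
        self._counts: Dict[Tuple[str, int], int] = {}

    def allow(self, key: str, now: float) -> bool:
        window = int(now // self.window_seconds)
        bucket = (key, window)
        count = self._counts.get(bucket, 0) + 1
        self._counts[bucket] = count
        return count <= self.limit
```

A fixed window is the simplest variant; it permits short bursts across a window boundary, which sliding-window or token-bucket schemes smooth out at the cost of more state.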
7.5 Compliance Mapping
| Standard | Status | Key Controls |
|---|---|---|
| SOC 2 Type II | Ready | GCP inherited controls, audit logging, access controls |
| GDPR | Compliant | Data residency (EU region), right to erasure, data portability |
| HIPAA | Ready | BAA available, encryption at rest/transit, audit logging |
| PCI DSS | Delegated | Stripe handles all card data (no card data in CODITECT) |
8. Deployment Architecture
8.1 GCP Infrastructure Topology
8.2 Cloud Run Configuration
| Service | Min Instances | Max Instances | CPU | Memory | Timeout |
|---|---|---|---|---|---|
| API Gateway | 1 | 1000 | 2 vCPU | 4 GiB | 60s |
| Auth Service | 1 | 100 | 1 vCPU | 2 GiB | 30s |
| User Service | 1 | 200 | 1 vCPU | 2 GiB | 30s |
| Billing Service | 0 | 50 | 1 vCPU | 2 GiB | 60s |
| Workstation Service | 1 | 100 | 2 vCPU | 4 GiB | 300s |
Auto-scaling Triggers:
- Request rate > 80 requests/instance
- CPU utilization > 70%
- Memory utilization > 80%
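The three triggers combine as a logical OR, and the result is bounded by each service's instance limits from the table above. The sketch below models only the thresholds listed in this section; it is not a description of how Cloud Run's own autoscaler decides, and the one-instance step size and function names are assumptions.

```python
# (min instances, max instances) per service, from the Cloud Run configuration table.
LIMITS = {
    "API Gateway": (1, 1000),
    "Auth Service": (1, 100),
    "User Service": (1, 200),
    "Billing Service": (0, 50),
    "Workstation Service": (1, 100),
}


def should_scale_out(requests_per_instance: float, cpu_utilization: float,
                     memory_utilization: float) -> bool:
    """True if any documented trigger fires (utilizations are fractions, 0 to 1)."""
    return (
        requests_per_instance > 80
        or cpu_utilization > 0.70
        or memory_utilization > 0.80
    )


def next_instance_count(service: str, current: int, requests_per_instance: float,
                        cpu_utilization: float, memory_utilization: float) -> int:
    """Add one instance when a trigger fires, clamped to the service's limits."""
    low, high = LIMITS[service]
    fired = should_scale_out(requests_per_instance, cpu_utilization, memory_utilization)
    desired = current + 1 if fired else current
    return max(low, min(high, desired))
```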
8.3 Database Configuration
Cloud SQL (PostgreSQL 15):
- Instance: db-n1-standard-4 (4 vCPU, 15 GB RAM)
- Storage: 100 GB SSD (auto-resize enabled)
- High Availability: Regional (automatic failover)
- Backup: Daily automated backups (7-day retention)
- Point-in-time recovery: 7 days
Memorystore (Redis 7):
- Tier: Standard (HA with automatic failover)
- Memory: 5 GB (expandable to 300 GB)
- Persistence: RDB snapshots every 6 hours
8.4 Disaster Recovery
| Scenario | RTO | RPO | Recovery Strategy |
|---|---|---|---|
| Cloud SQL Failure | 2 minutes | 0 seconds | Automatic failover to standby replica |
| Region Outage | 4 hours | 5 minutes | Manual failover to backup region |
| Data Corruption | 1 hour | 1 hour | Point-in-time recovery from backups |
| Accidental Deletion | 30 minutes | 24 hours | Restore from daily backup |
Backup Strategy:
- Automated SQL Backups: Daily at 2 AM UTC, 7-day retention
- Transaction Logs: Continuous archival for PITR
- GCS Snapshots: Weekly full snapshots, 30-day retention
- Disaster Recovery Region: us-east1 (passive standby)
9. Quality Attributes
9.1 Scalability
Horizontal Scaling:
- API Services: Cloud Run auto-scales each service between its configured minimum and maximum instance counts (up to 1,000 for the API Gateway; see Section 8.2)
- Database Connections: PgBouncer connection pooling (1000 max connections)
- Workstation Cluster: Supports 10,000+ concurrent workstations per region
Vertical Scaling:
- Cloud SQL: Upgrade to db-n1-highmem-16 (16 vCPU, 104 GB RAM)
- Memorystore: Expand to 300 GB memory
- Workstation Machine Types: e2-medium → n2-highmem-16
Performance Targets:
| Metric | Target | Current |
|---|---|---|
| API Response Time (p95) | < 200ms | 150ms |
| Workstation Provisioning | < 5 minutes | 3 minutes |
| Database Query Time (p95) | < 50ms | 30ms |
| Concurrent Users | 10,000+ | Tested to 1,000 |
9.2 Reliability
Availability Targets:
- SLA: 99.9% uptime (8.76 hours downtime/year)
- API Services: 99.95% (leveraging Cloud Run SLA)
- Cloud SQL: 99.95% (HA configuration)
- Workstations: 99.5% (dependent on GCP Cloud Workstations SLA)
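The downtime figure follows from simple arithmetic, and the same arithmetic shows how component targets compose. The sketch below computes the yearly downtime budget for a given availability and the combined availability of components that must all be up at once. Treating the API services and Cloud SQL as independent serial dependencies is a simplifying assumption.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours in a non-leap year


def allowed_downtime_hours(availability_percent: float) -> float:
    """Yearly downtime budget implied by an availability target."""
    return (100 - availability_percent) / 100 * HOURS_PER_YEAR


def serial_availability(*availability_percents: float) -> float:
    """Availability of a chain in which every component must be up (assumes independence)."""
    result = 1.0
    for percent in availability_percents:
        result *= percent / 100
    return result * 100


print(round(allowed_downtime_hours(99.9), 2))       # 8.76
print(round(serial_availability(99.95, 99.95), 4))  # 99.9
```

Under these assumptions, two 99.95% components in series yield roughly 99.90%, which is consistent with the 99.9% platform SLA but leaves little headroom for further dependencies.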
Fault Tolerance:
- Automatic Retries: Exponential backoff for transient failures
- Circuit Breaker: Prevent cascade failures (open after 5 consecutive failures)
- Health Checks:
/healthendpoint on all services (10s interval) - Graceful Degradation: Read-only mode when database replica fails
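The retry and circuit breaker behavior can be sketched as follows. The threshold of 5 consecutive failures comes from the list above; the 30-second reset period, the backoff base, factor, and cap, and all names are assumptions chosen for illustration.

```python
from typing import List, Optional


class CircuitBreaker:
    """Opens after `failure_threshold` consecutive failures; allows a probe after `reset_after` seconds."""

    def __init__(self, failure_threshold: int = 5, reset_after: float = 30.0) -> None:
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after          # assumed value, not from the document
        self.consecutive_failures = 0
        self.opened_at: Optional[float] = None

    def allow(self, now: float) -> bool:
        """Closed: always allow. Open: allow only once the reset period has passed."""
        if self.opened_at is None:
            return True
        return now - self.opened_at >= self.reset_after

    def record_success(self) -> None:
        self.consecutive_failures = 0
        self.opened_at = None

    def record_failure(self, now: float) -> None:
        self.consecutive_failures += 1
        if self.consecutive_failures >= self.failure_threshold:
            self.opened_at = now


def backoff_delays(attempts: int = 5, base: float = 0.5,
                   factor: float = 2.0, cap: float = 30.0) -> List[float]:
    """Exponential backoff schedule in seconds, capped at `cap`."""
    return [min(cap, base * factor ** i) for i in range(attempts)]
```

In practice a random jitter is usually added to each delay so that many clients retrying at once do not synchronize.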
Monitoring and Alerting:
- Uptime Checks: External synthetic monitoring (1-minute interval)
- Error Rate Alerts: > 5% error rate triggers PagerDuty
- Latency Alerts: p95 > 500ms triggers warning
- Database Alerts: Connection pool > 80% triggers scaling
9.3 Security
Security Posture:
- Zero Trust Architecture: All services require authentication
- Least Privilege: Service accounts with minimal permissions
- Defense in Depth: Multiple layers of isolation (app, DB, infra)
- Audit Trail: All mutations logged to Cloud Logging
Vulnerability Management:
- Container Scanning: Automated CVE detection on every build
- Dependency Scanning: Snyk integration for npm/pip packages
- SAST: Static analysis with Semgrep
- Penetration Testing: Annual third-party security audit
Incident Response:
- Detection: Real-time security alerts via Cloud Security Command Center
- Containment: Automated IP blocking for detected attacks
- Recovery: Rollback to last known good deployment (< 5 minutes)
- Post-Mortem: Required for all security incidents
9.4 Maintainability
Code Quality:
- Test Coverage: > 80% unit test coverage
- E2E Tests: Critical user flows automated with Playwright
- Code Review: Required for all changes (2 approvals)
- Linting: Automated with Ruff (Python), ESLint (TypeScript)
Observability:
- Structured Logging: JSON logs with trace context
- Distributed Tracing: OpenTelemetry + Cloud Trace
- Metrics: Prometheus-compatible metrics exposed on /metrics
- Dashboards: Grafana dashboards for all services
Documentation:
- API Documentation: Auto-generated OpenAPI spec
- Architecture Diagrams: C4 diagrams (this document)
- Runbooks: Operational procedures for common incidents
- ADRs: Architecture decision records for major choices
9.5 Cost Optimization
Strategies:
- Serverless-First: Pay only for actual compute usage
- Auto-Hibernation: Workstations hibernate after 30-60 minutes idle (60-70% savings)
- Connection Pooling: Reduce database connections (PgBouncer)
- Caching: Redis for session and query caching (reduce DB load)
- CDN: Cloud CDN for static assets (reduce egress costs)
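Auto-hibernation depends on detecting workstations that have been idle longer than a threshold. The sketch below shows one way check_idle_workstations() might select candidates from last-activity timestamps. The signature, the dictionary input, and the 30-minute default (the lower end of the 30-60 minute range above) are assumptions.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List


def check_idle_workstations(last_activity: Dict[str, datetime], now: datetime,
                            idle_after: timedelta = timedelta(minutes=30)) -> List[str]:
    """Return ids of workstations with no recorded activity for at least `idle_after`."""
    return sorted(
        ws_id for ws_id, seen in last_activity.items() if now - seen >= idle_after
    )
```

A scheduler would pass the result to schedule_hibernation(); using timezone-aware UTC timestamps avoids errors around daylight saving changes.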
Cost Breakdown (1,000 users):
| Component | Monthly Cost | % of Total |
|---|---|---|
| Cloud Workstations | $10,000 | 91% |
| Cloud SQL | $200 | 2% |
| Cloud Run | $150 | 1% |
| Memorystore | $200 | 2% |
| Cloud Storage | $100 | 1% |
| Other (CDN, Pub/Sub, Monitoring) | $300 | 3% |
| Total | $10,950 | 100% |
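The total and the percentage column can be reproduced from the monthly figures, which is a convenient check whenever a line item changes. The dictionary keys mirror the table; the per-user figure simply divides the total by the 1,000 users in the scenario.

```python
MONTHLY_COST_USD = {
    "Cloud Workstations": 10_000,
    "Cloud SQL": 200,
    "Cloud Run": 150,
    "Memorystore": 200,
    "Cloud Storage": 100,
    "Other (CDN, Pub/Sub, Monitoring)": 300,
}

total = sum(MONTHLY_COST_USD.values())
shares = {name: round(cost / total * 100) for name, cost in MONTHLY_COST_USD.items()}
per_user = total / 1_000

print(total)                         # 10950
print(shares["Cloud Workstations"])  # 91
print(per_user)                      # 10.95
```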
Revenue Model:
- Gross Margin: 45% overall
- Per-User Margin (Business Plan): 86% ($99/user - $13.99 cost)
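The per-user margin is the share of revenue left after cost. A one-line helper, with a hypothetical name, makes the Business Plan figure reproducible:

```python
def margin_percent(price: float, cost: float) -> float:
    """Margin as a percentage of price."""
    return (price - cost) / price * 100


print(round(margin_percent(99, 13.99)))  # 86
```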
10. Implementation Roadmap
10.1 Phase 1: Foundation (Weeks 1-4)
Milestone: Core authentication, database, and API scaffolding
Tasks:
- Week 1: PostgreSQL schema deployment with RLS policies
- Week 1: Firebase Auth integration (email/password, Google OAuth)
- Week 2: Cloud Run API Gateway with Actix-web
- Week 2: Auth Service implementation (JWT, session management)
- Week 3: User Service implementation (org/team/project CRUD)
- Week 3: Stripe integration (checkout, webhooks, customer portal)
- Week 4: Integration testing and security hardening
Deliverables:
- Users can register, login, create organizations
- Stripe checkout flow working end-to-end
- API documentation (OpenAPI spec)
- CI/CD pipeline for Cloud Run deployments
10.2 Phase 2: Workstations (Weeks 5-8)
Milestone: Cloud Workstation provisioning and lifecycle management
Tasks:
- Week 5: GCP Cloud Workstations cluster setup (us-central1)
- Week 5: Workstation configs (Starter, Pro, Max, Enterprise tiers)
- Week 6: Workstation Service implementation (provisioning API)
- Week 6: Pub/Sub async provisioning queue
- Week 7: Workstation Controller implementation (start/stop/hibernate)
- Week 7: Auto-hibernation scheduler (idle detection)
- Week 8: Usage tracking and metering integration
- Week 8: End-to-end workstation flow testing
Deliverables:
- Users can provision and access cloud workstations
- Auto-hibernation working (saves 60-70% cost)
- Usage tracking integrated with billing
- Workstation lifecycle fully automated
10.3 Phase 3: Multi-Tenancy (Weeks 9-12)
Milestone: Full multi-tenant organization management
Tasks:
- Week 9: Organization management UI (settings, branding)
- Week 9: Team CRUD implementation
- Week 10: Project CRUD implementation
- Week 10: RBAC enforcement across all endpoints
- Week 11: Member invitation flow (email invites)
- Week 11: Contractor role with time-limited access
- Week 12: Auditor role with read-only access
- Week 12: Multi-tenant integration testing
Deliverables:
- Organizations can create teams and projects
- Admins can invite members with different roles
- Contractors get auto-revoked access after expiry
- Auditors have read-only compliance access
10.4 Phase 4: Polish (Weeks 13-16)
Milestone: Production-ready platform with full observability
Tasks:
- Week 13: Stripe customer portal integration
- Week 13: Usage dashboards (Grafana)
- Week 14: Audit logging implementation
- Week 14: Admin portal for organization management
- Week 15: Production hardening (rate limiting, DDoS protection)
- Week 15: Security audit and penetration testing
- Week 16: Performance optimization (caching, query tuning)
- Week 16: Beta user testing and feedback iteration
Deliverables:
- Customers can self-manage billing via Stripe portal
- Admins have visibility into usage and costs
- All mutations audited and logged
- Production environment hardened and tested
- Beta launch ready (limited user onboarding)
Target Launch: January 21, 2026 (16 weeks from kickoff)
11. Appendices
11.1 Glossary
| Term | Definition |
|---|---|
| ADR | Architecture Decision Record - documents significant architectural decisions |
| C4 Model | Context, Container, Component, Code - hierarchical architecture diagramming method |
| Cloud Run | GCP serverless container platform with auto-scaling |
| Cloud Workstations | GCP managed cloud development environments with IDE integration |
| Firebase Auth | Google's authentication service supporting email/password and OAuth providers |
| gRPC | High-performance RPC framework using HTTP/2 and Protocol Buffers |
| IAM | Identity and Access Management - GCP service for permissions |
| JWT | JSON Web Token - compact token format for authentication |
| Memorystore | GCP managed Redis service for caching |
| Multi-Tenancy | Architecture pattern where single application serves multiple isolated customers |
| RBAC | Role-Based Access Control - permission model based on user roles |
| RLS | Row-Level Security - PostgreSQL feature for data isolation |
| SSR | Server-Side Rendering - rendering HTML on server for better SEO |
| Stripe | Payment processing platform for subscription billing |
| VPC | Virtual Private Cloud - isolated network in GCP |
11.2 Abbreviations
| Abbreviation | Full Form |
|---|---|
| API | Application Programming Interface |
| CDN | Content Delivery Network |
| CRUD | Create, Read, Update, Delete |
| DB | Database |
| E2E | End-to-End |
| GCP | Google Cloud Platform |
| GCS | Google Cloud Storage |
| HA | High Availability |
| HTTPS | Hypertext Transfer Protocol Secure |
| IDE | Integrated Development Environment |
| JSON | JavaScript Object Notation |
| OIDC | OpenID Connect |
| OAuth | Open Authorization |
| PII | Personally Identifiable Information |
| REST | Representational State Transfer |
| RPO | Recovery Point Objective |
| RTO | Recovery Time Objective |
| SaaS | Software as a Service |
| SLA | Service Level Agreement |
| SQL | Structured Query Language |
| TLS | Transport Layer Security |
| UUID | Universally Unique Identifier |
| vCPU | Virtual CPU |
11.3 References
- ADR-004: Multi-Tenant Strategy
- ADR-005: Workstation Broker Pattern - Shared workstation architecture
- ADR-006: Gitea Multi-Tenant Git - Self-hosted Git with GitHub mirroring
- ADR-008: Role-Based Access Control
- ADR-009: Multi-Tenant SaaS Architecture
- C4 Model: https://c4model.com/
- GCP Cloud Workstations: https://cloud.google.com/workstations/docs
- Gitea API: https://docs.gitea.io/en-us/api-usage/
- Stripe Subscriptions: https://stripe.com/docs/billing/subscriptions/overview
- PostgreSQL RLS: https://www.postgresql.org/docs/current/ddl-rowsecurity.html
- Firebase Auth: https://firebase.google.com/docs/auth
- Cloud Run: https://cloud.google.com/run/docs
- OpenAPI Specification: https://spec.openapis.org/oas/v3.0.0
11.4 Revision History
| Version | Date | Author | Changes |
|---|---|---|---|
| 1.0.0 | 2025-12-24 | CODITECT Architecture Team | Initial version with C4 diagrams and complete architecture |
Document Status: Draft for Review
Next Review Date: January 7, 2026
Approval Required From: Hal Casteel (CEO/CTO), Architecture Team, Security Team