📚 Coditect Sandbox Platform: Full Technical Architecture (L1 → L7)


🧭 Level 1: Problem Statement & Context

Coditect addresses the need for secure, autosaving, multi-runtime sandbox environments that support AI agents, ephemeral developer environments, and dynamic runtime workloads in a zero-trust cloud-native architecture.

Context:

  • Increasing demand for ephemeral compute across AI and software engineering workflows
  • Multi-tenant environments require strong isolation guarantees
  • Git-centric workflows require automatic state capture, snapshots, and traceability
  • Executable environments must span containers (gVisor/Kata), microVMs (Firecracker), and WASM runtimes

🧠 Level 2: High-Level Architecture

🔹 Key Components:

  • Frontend UI (React): sandbox explorer, logs, creation
  • API (FastAPI): JWT-authenticated entrypoint for sandbox lifecycle, autosave, and quota
  • Controller (Go): Kubernetes CRD controller for Sandbox resources
  • Agent (Python or Rust): GCP Workstation-local gRPC server that launches containers in gVisor, Kata, or Wasmtime
  • Infrastructure (OpenTofu): GCP project, GKE, WorkstationConfig, Secret Manager, IAM
  • Autosave Engine: Git worktree commit/push daemon that pushes to GitHub
  • Monitoring: Prometheus, Grafana, Cloud Logging

🔁 System Flow:


🔐 Level 3: Security Architecture

Identity:

  • OIDC Login (Firebase/Auth0)
  • JWT Access Tokens (5m, RS256-signed via KMS)
  • Refresh Tokens (7d, Firebase Secure Storage)
  • Agent identity via Workload Identity Federation (WIF)
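
The token lifetimes listed above can be expressed as a small freshness check. This is a minimal sketch, assuming the access token carries an issued-at time in epoch seconds; `is_access_token_fresh` is a hypothetical helper, and signature verification (RS256 via KMS) is deliberately left out.

```python
from datetime import timedelta

# Lifetimes taken from the Identity list above.
ACCESS_TOKEN_TTL = timedelta(minutes=5)
REFRESH_TOKEN_TTL = timedelta(days=7)


def is_access_token_fresh(issued_at: float, now: float) -> bool:
    """Return True while an access token is within its 5-minute lifetime.

    Both arguments are epoch seconds. Tokens that claim to be issued in
    the future are rejected as well (hypothetical policy, no skew tolerance).
    """
    age = now - issued_at
    return 0 <= age < ACCESS_TOKEN_TTL.total_seconds()
```

A short TTL limits how long a leaked access token stays useful, at the cost of more frequent refreshes.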

Network:

  • All agent<->API traffic secured via mTLS with GKE-managed certs
  • GKE Ingress uses HTTPS, with Cloud Armor IP allowlists
  • No Pod ever exposed directly to users

Secrets:

  • GitHub tokens stored in Secret Manager with restricted IAM access
  • mTLS root CA pinned in agents and API pods

Authorization:

  • JWT payload includes these claims:
    {
      "tenant_id": "t-xyz",
      "user_id": "u-abc",
      "sandbox_quota": 1800,
      "roles": ["sandboxer"]
    }
  • Enforced at:
    • API
    • Agent (metadata validation)
    • Controller (via webhook)
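
The claim checks described above might look like the following at any of the three enforcement points. This is a sketch under stated assumptions: the payload has already passed signature verification, and the rule that the `sandboxer` role grants sandbox creation within the caller's own tenant is an illustration, not something the source states. `Claims`, `parse_claims`, and `may_create_sandbox` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Claims:
    tenant_id: str
    user_id: str
    sandbox_quota: int
    roles: tuple[str, ...]


def parse_claims(payload: dict) -> Claims:
    """Build Claims from an already signature-verified JWT payload."""
    missing = {"tenant_id", "user_id", "sandbox_quota", "roles"} - payload.keys()
    if missing:
        raise ValueError(f"missing claims: {sorted(missing)}")
    return Claims(
        tenant_id=str(payload["tenant_id"]),
        user_id=str(payload["user_id"]),
        sandbox_quota=int(payload["sandbox_quota"]),
        roles=tuple(payload["roles"]),
    )


def may_create_sandbox(claims: Claims, target_tenant: str) -> bool:
    """Assumed policy: own tenant only, and the 'sandboxer' role is required."""
    return claims.tenant_id == target_tenant and "sandboxer" in claims.roles
```

Repeating the same check in the API, the agent, and the admission webhook means a bypass of one layer does not grant cross-tenant access on its own.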

🧩 Level 4: Sandbox Lifecycle & CRD Reconciliation

Sandbox CRD (Expanded)

apiVersion: coditect.io/v1alpha1
kind: Sandbox
metadata:
  name: sbx-abc123
spec:
  runtime: gvisor
  tenantID: t-xyz
  userID: u-abc
  projectID: p-123
  image: python:3.11
  command: ["python", "main.py"]
  timeoutSeconds: 900
  limits:
    cpu: 1
    memoryMiB: 512
  networkPolicy:
    blockNetwork: true
    cidrAllowlist: ["10.0.0.0/8"]
status:
  phase: Running
  logsURL: https://...
  autosaveURL: https://github.com/org/repo/tree/autosave/...
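
Before a Sandbox resource is admitted, its spec could be checked along these lines. This is a hedged sketch: `validate_spec` and `ALLOWED_RUNTIMES` are hypothetical, the rules (positive limits, parseable CIDRs) are assumptions rather than the platform's documented validation, and the runtime names come from the RuntimeClasses section.

```python
import ipaddress

# Runtime names as listed in the RuntimeClasses section.
ALLOWED_RUNTIMES = {"gvisor", "kata-fc", "wasmtime"}


def validate_spec(spec: dict) -> list[str]:
    """Return a list of problems found in a Sandbox spec; empty means it looks valid."""
    errors = []
    if spec.get("runtime") not in ALLOWED_RUNTIMES:
        errors.append(f"unknown runtime: {spec.get('runtime')!r}")
    timeout = spec.get("timeoutSeconds")
    if not isinstance(timeout, int) or timeout <= 0:
        errors.append("timeoutSeconds must be a positive integer")
    limits = spec.get("limits", {})
    for key in ("cpu", "memoryMiB"):
        value = limits.get(key)
        if not isinstance(value, (int, float)) or value <= 0:
            errors.append(f"limits.{key} must be positive")
    for cidr in spec.get("networkPolicy", {}).get("cidrAllowlist", []):
        try:
            ipaddress.ip_network(cidr)  # strict parsing: host bits must be zero
        except ValueError:
            errors.append(f"invalid CIDR: {cidr!r}")
    return errors
```

Returning every problem at once, rather than failing on the first, gives users a complete picture in a single rejected request.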

Controller Logic:

  • Finalizer: sandbox.coditect.io/finalize
  • Pod annotations:
    • sandbox_id, runtime, project_id
  • Each Sandbox resource is reconciled into a Pod or an agent call
  • On delete:
    • Trigger the final snapshot
    • Call the agent's TerminateSandbox()
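
The finalizer flow above can be sketched as a single reconcile pass. The controller is described as a Go component; Python is used here only to illustrate the ordering. `reconcile`, its return strings, and the injected `take_snapshot` and `terminate` callables are hypothetical stand-ins for the autosave engine and the agent's TerminateSandbox() call.

```python
FINALIZER = "sandbox.coditect.io/finalize"


def reconcile(sandbox: dict, take_snapshot, terminate) -> str:
    """Run one reconcile pass over a Sandbox object held as a plain dict."""
    meta = sandbox.setdefault("metadata", {})
    finalizers = meta.setdefault("finalizers", [])

    if meta.get("deletionTimestamp"):
        if FINALIZER in finalizers:
            take_snapshot(meta["name"])   # final snapshot first
            terminate(meta["name"])       # then stop the workload
            finalizers.remove(FINALIZER)  # allows Kubernetes to delete the object
        return "finalized"

    if FINALIZER not in finalizers:
        finalizers.append(FINALIZER)
        return "finalizer-added"
    return "ensure-workload"  # create the Pod or call the agent here
```

Removing the finalizer only after both calls succeed is what guarantees the last snapshot is taken before the object disappears.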

⚙️ Level 5: Runtime Isolation & Quota Enforcement

RuntimeClasses:

| Name | Isolation Model | Use Case |
|---|---|---|
| gvisor | user-space kernel that intercepts syscalls (runsc) | medium-trust agents |
| kata-fc | KVM-based microVM | untrusted workloads |
| wasmtime | wasm runtime sandbox | wasm toolchains |
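
Choosing a RuntimeClass from the workload's trust level could be sketched as a lookup. `Trust`, `RUNTIME_FOR_TRUST`, and `pod_runtime_patch` are hypothetical names; `runtimeClassName` is the standard Kubernetes Pod spec field, and the mapping simply mirrors the use cases listed above.

```python
from enum import Enum


class Trust(Enum):
    MEDIUM = "medium"
    UNTRUSTED = "untrusted"
    WASM = "wasm"


# Mirrors the use cases listed for each RuntimeClass above.
RUNTIME_FOR_TRUST = {
    Trust.MEDIUM: "gvisor",
    Trust.UNTRUSTED: "kata-fc",
    Trust.WASM: "wasmtime",
}


def pod_runtime_patch(trust: Trust) -> dict:
    """Return the Pod spec fragment that selects the matching RuntimeClass."""
    return {"spec": {"runtimeClassName": RUNTIME_FOR_TRUST[trust]}}
```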

Quota:

  • The limit is carried in the JWT field sandbox_quota
  • API tracks usage in Redis
  • Prometheus alerts when usage exceeds 90% of quota
  • Rejected requests return HTTP 429 with a Retry-After header
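
The quota check can be illustrated with an in-memory stand-in for the Redis counters. Several things here are assumptions: that `sandbox_quota` is a budget of sandbox-seconds per fixed window, that the window is one hour, and that Retry-After is the time left until the window resets. `QuotaTracker` and its methods are hypothetical.

```python
import math


class QuotaTracker:
    """In-memory stand-in for Redis-backed usage counters (illustration only)."""

    def __init__(self, window_seconds: int = 3600):
        self.window_seconds = window_seconds
        self._used: dict[str, int] = {}

    def try_consume(self, tenant_id: str, quota: int, seconds: int, window_elapsed: float):
        """Return (status, headers): 200 if the request fits, else 429 with Retry-After."""
        used = self._used.get(tenant_id, 0)
        if used + seconds > quota:
            # Seconds remaining until the assumed fixed window resets.
            retry_after = max(1, math.ceil(self.window_seconds - window_elapsed))
            return 429, {"Retry-After": str(retry_after)}
        self._used[tenant_id] = used + seconds
        return 200, {}

    def usage_ratio(self, tenant_id: str, quota: int) -> float:
        """Fraction of quota consumed; a value above 0.9 would trip the alert."""
        return self._used.get(tenant_id, 0) / quota
```

In a real deployment the read-check-increment step would need to be atomic in Redis so that concurrent requests cannot both slip under the limit.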

🔁 Level 6: Git Autosave, Push, and Recovery

Git Flow:

  • git worktree add ../_autosave autosave/<ticket>/<tenant>/<user>
  • Daemon commits every 30s:
    git add . && git commit -m 'autosave checkpoint' && git push
  • Final snapshot on destroy
  • Each push records:
    • sandbox_id
    • UTC timestamp
  • The resulting commitURL is written back to the Sandbox resource
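
One autosave cycle could be assembled as follows. This is a sketch, not the daemon's actual code: the commit message layout carrying `sandbox_id` and a UTC timestamp is an assumed format, and `autosave_branch`, `commit_message`, and `autosave_commands` are hypothetical helpers that only build argument lists without running git.

```python
from datetime import datetime, timezone


def autosave_branch(ticket: str, tenant: str, user: str) -> str:
    """Branch name following the autosave/<ticket>/<tenant>/<user> pattern."""
    return f"autosave/{ticket}/{tenant}/{user}"


def commit_message(sandbox_id: str, when: datetime) -> str:
    """Assumed message layout embedding the sandbox ID and a UTC timestamp."""
    stamp = when.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return f"autosave checkpoint sandbox_id={sandbox_id} at={stamp}"


def autosave_commands(sandbox_id: str, when: datetime) -> list[list[str]]:
    """Argument vectors for one cycle, suitable for subprocess.run(cmd, cwd=worktree)."""
    return [
        ["git", "add", "."],
        ["git", "commit", "-m", commit_message(sandbox_id, when)],
        ["git", "push"],
    ]
```

Note that `git commit` exits non-zero when nothing is staged, so a caller running these every 30 seconds should treat that case as a no-op rather than a failure.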

Failure Modes:

| Scenario | Resolution |
|---|---|
| Token expired | refresh via Firebase |
| Branch conflict | retry after rebasing the worktree |
| API timeout | queue the push for an asynchronous retry |
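
The resolutions listed above map naturally onto exception handling around the push. The exception classes, the `push_with_recovery` function, the injected callables, and the attempt limit of three are all hypothetical; the sketch only shows how each failure mode could route to its resolution.

```python
class TokenExpired(Exception):
    """The credential used for the push is no longer valid."""


class BranchConflict(Exception):
    """The remote autosave branch has diverged from the local worktree."""


class ApiTimeout(Exception):
    """The push did not complete in time."""


def push_with_recovery(push, refresh_token, rebase, enqueue, max_attempts: int = 3) -> str:
    """Try `push`, applying the matching resolution to each failure type."""
    for _ in range(max_attempts):
        try:
            push()
            return "pushed"
        except TokenExpired:
            refresh_token()  # then retry with a fresh token
        except BranchConflict:
            rebase()         # then retry on top of the remote branch
        except ApiTimeout:
            enqueue()        # hand off to the async queue and stop retrying inline
            return "queued"
    return "gave-up"
```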

🔍 Level 7: Observability, Audit, Threat Modeling

Prometheus Metrics:

  • sandbox_create_latency_seconds
  • agent_launch_failures_total
  • quota_usage_ratio{tenant}
  • sandbox_active_total{runtime}
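
Two of the labelled metrics above could be exposed in the Prometheus text format as shown below. This sketch renders the exposition text by hand to stay dependency-free; a real service would more likely use a client library. Treating both metrics as gauges is an assumption, and `render_metrics` and `tenants_over_threshold` are hypothetical helpers.

```python
def render_metrics(active_by_runtime: dict[str, int],
                   quota_ratio_by_tenant: dict[str, float]) -> str:
    """Render two metrics in the Prometheus text exposition format."""
    lines = ["# TYPE sandbox_active_total gauge"]
    for runtime, count in sorted(active_by_runtime.items()):
        lines.append(f'sandbox_active_total{{runtime="{runtime}"}} {count}')
    lines.append("# TYPE quota_usage_ratio gauge")
    for tenant, ratio in sorted(quota_ratio_by_tenant.items()):
        lines.append(f'quota_usage_ratio{{tenant="{tenant}"}} {ratio}')
    return "\n".join(lines) + "\n"


def tenants_over_threshold(quota_ratio_by_tenant: dict[str, float],
                           threshold: float = 0.9) -> list[str]:
    """Tenants whose usage ratio exceeds the alert threshold (90% by default)."""
    return sorted(t for t, r in quota_ratio_by_tenant.items() if r > threshold)
```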

Grafana Dashboards:

  • CPU + mem usage per sandbox
  • Quota % per user/project
  • Idle sandbox heatmap

Cloud Logs:

  • API, Agent, and Controller log entries each include:
    • sandbox_id
    • user_id
    • trace_id
  • Stored in the GCS bucket coditect-audit-logs
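
Emitting the three correlation fields on every log line could be done with a JSON formatter on Python's standard `logging` module. This is a sketch, assuming one JSON object per line is acceptable to the log pipeline; `JsonFormatter` and `make_logger` are hypothetical names, and the `severity` and `message` keys are an assumed layout.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object carrying the three correlation IDs."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "severity": record.levelname,
            "message": record.getMessage(),
            # Fields passed via `extra=` become attributes on the record.
            "sandbox_id": getattr(record, "sandbox_id", None),
            "user_id": getattr(record, "user_id", None),
            "trace_id": getattr(record, "trace_id", None),
        })


def make_logger(name: str = "coditect") -> logging.Logger:
    """Return a logger that writes JSON lines to stderr."""
    logger = logging.getLogger(name)
    if not logger.handlers:
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger
```

Usage would look like `make_logger().info("sandbox started", extra={"sandbox_id": "sbx-abc123", "user_id": "u-abc", "trace_id": "trace-1"})`, which keeps the IDs queryable as separate fields rather than buried in the message text.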

Threat Map:

| Threat | Mitigation |
|---|---|
| Sandbox breakout | RuntimeClass, seccomp, read-only FS |
| JWT forgery | RS256 KMS signing, short TTL |
| GitHub token exfiltration | Secret Manager + token TTL + scoping |
| Workstation impersonation | mTLS, WIF, pinned root CA |
| Excess sandbox abuse | Token quota + Prometheus + Redis sync |