
target architecture (e.g., Coditect‑style multi‑tenant agents, GPU needs, persistence model), a next step can be a concrete prompt scaffold for: “Design a Modal‑backed sandbox executor service for untrusted agent code (API + orchestration + limits + IAM).”

ARE MODAL SANDBOXES open source?

Modal Sandboxes themselves (the backend runtime and orchestration) are not open source; only the client SDKs (the Python modal package and the JS/TS/Go libmodal SDKs) are open source under Apache‑2.0.

What is open vs closed

  • The Modal Python client (modal on PyPI, modal-client on GitHub) is Apache‑2.0 licensed open source.231
  • The JS/TS and Go SDKs (libmodal on GitHub) are also open source SDKs for interacting with Modal.54
  • The Modal Sandboxes product (gVisor‑based isolation, orchestration, scheduling, infra) is a proprietary managed service, not released as open source. Public materials describe it as a product you use via their API, with no backend repo exposed.678

Suggested next-step prompt scaffold

Design a Modal-backed sandbox executor microservice for Coditect that executes untrusted agent code using Modal Sandboxes. Define: (1) a multi-tenant API surface (create_sandbox, exec, snapshot, destroy) with per-tenant isolation and quotas, (2) orchestration flow for routing agent jobs to Modal, handling stdout/stderr/log streaming, and enforcing CPU/GPU, time, and memory limits, (3) an IAM model where Coditect issues short-lived tokens/roles that map tenants and projects to specific Modal apps, volumes, and network policies, and (4) a persistence model for ephemeral vs project-scoped workspaces (Modal volumes, object storage), including audit logging of all executions and artifacts. Use OpenAPI 3.1 for the external API, describe internal sequence diagrams, and propose a minimal RBAC schema for tenants, projects, and sandboxes.

If you want, this can be further specialized for “GPU-heavy codegen and test runs” vs “CPU-only quick tools,” or aligned with your existing Coditect RLM orchestrator API.


Design a Modal-like sandbox executor microservice for Coditect that executes untrusted agent code using Modal-like Sandboxes. Define: (1) a multi-tenant API surface (create_sandbox, exec, snapshot, destroy) with per-tenant isolation and quotas, (2) orchestration flow for routing agent jobs to Modal, handling stdout/stderr/log streaming, and enforcing CPU/GPU, time, and memory limits, (3) an IAM model where Coditect issues short-lived tokens/roles that map tenants and projects to specific Modal-like apps, volumes, and network policies, and (4) a persistence model for ephemeral vs project-scoped workspaces (Modal-like volumes, object storage), including audit logging of all executions and artifacts. Use OpenAPI 3.1 for the external API, describe internal sequence diagrams, and propose a minimal RBAC schema for tenants, projects, and sandboxes.

Below is a comprehensive design document for a Modal-like sandbox executor service for Coditect. This architecture leverages patterns from Modal's Sandbox API while adding multi-tenant IAM, quota enforcement, persistence, and audit logging.


OpenAPI 3.1 Specification (Core Endpoints)

openapi: 3.1.0
info:
  title: Coditect Sandbox Executor API
  version: 1.0.0
  description: Multi-tenant sandbox executor for untrusted agent code

servers:
  - url: https://api.coditect.io/v1

security:
  - BearerAuth: []

paths:
  /sandboxes:
    post:
      operationId: createSandbox
      summary: Create a new sandbox
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/CreateSandboxRequest'
      responses:
        '201':
          description: Sandbox created
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Sandbox'
        '403':
          description: Quota exceeded or permission denied

  /sandboxes/{sandbox_id}/exec:
    post:
      operationId: execCommand
      summary: Execute command in sandbox
      parameters:
        - $ref: '#/components/parameters/SandboxId'
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/ExecRequest'
      responses:
        '200':
          description: Execution started
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ExecHandle'

  /sandboxes/{sandbox_id}/snapshot:
    post:
      operationId: snapshotFilesystem
      summary: Snapshot sandbox filesystem to image
      parameters:
        - $ref: '#/components/parameters/SandboxId'
      responses:
        '201':
          description: Snapshot created
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/SnapshotResult'

  /sandboxes/{sandbox_id}:
    delete:
      operationId: destroySandbox
      summary: Terminate and destroy sandbox
      parameters:
        - $ref: '#/components/parameters/SandboxId'
      responses:
        '204':
          description: Sandbox terminated

  /sandboxes/{sandbox_id}/streams:
    get:
      operationId: streamLogs
      summary: SSE stream of stdout/stderr
      parameters:
        - $ref: '#/components/parameters/SandboxId'
      responses:
        '200':
          description: Event stream
          content:
            text/event-stream:
              schema:
                $ref: '#/components/schemas/LogEvent'

components:
  securitySchemes:
    BearerAuth:
      type: http
      scheme: bearer
      bearerFormat: JWT

  parameters:
    SandboxId:
      name: sandbox_id
      in: path
      required: true
      schema:
        type: string
        format: uuid

  schemas:
    CreateSandboxRequest:
      type: object
      required: [project_id]
      properties:
        project_id:
          type: string
          format: uuid
        name:
          type: string
          maxLength: 64
          pattern: '^[a-zA-Z0-9._-]+$'
        image:
          type: string
          description: Image reference or snapshot ID
        workdir:
          type: string
        env:
          type: object
          additionalProperties:
            type: string
        secrets:
          type: array
          items:
            type: string
          description: Secret references to inject
        volumes:
          type: object
          additionalProperties:
            type: string
          description: Mount path -> volume ID mapping
        limits:
          $ref: '#/components/schemas/ResourceLimits'
        network_policy:
          $ref: '#/components/schemas/NetworkPolicy'
        timeout:
          type: integer
          minimum: 1
          maximum: 86400
          default: 300
        idle_timeout:
          type: integer
          minimum: 30
          maximum: 3600

    ResourceLimits:
      type: object
      properties:
        cpu:
          oneOf:
            - type: number
            - type: array
              items:
                type: number
              minItems: 2
              maxItems: 2
        memory_mib:
          oneOf:
            - type: integer
            - type: array
              items:
                type: integer
              minItems: 2
              maxItems: 2
        gpu:
          type: string
          enum: [none, t4, a10g, a100, h100]

    NetworkPolicy:
      type: object
      properties:
        block_network:
          type: boolean
          default: false
        cidr_allowlist:
          type: array
          items:
            type: string
            format: cidr

    ExecRequest:
      type: object
      required: [command]
      properties:
        command:
          type: array
          items:
            type: string
          minItems: 1
        workdir:
          type: string
        env:
          type: object
          additionalProperties:
            type: string
        timeout:
          type: integer
        pty:
          type: boolean
          default: false

    Sandbox:
      type: object
      properties:
        id:
          type: string
          format: uuid
        tenant_id:
          type: string
        project_id:
          type: string
        name:
          type: string
        status:
          type: string
          enum: [creating, running, terminated, failed]
        created_at:
          type: string
          format: date-time
        tunnels:
          type: object
          additionalProperties:
            $ref: '#/components/schemas/Tunnel'

    ExecHandle:
      type: object
      properties:
        exec_id:
          type: string
        sandbox_id:
          type: string
        stream_url:
          type: string
          format: uri

    SnapshotResult:
      type: object
      properties:
        snapshot_id:
          type: string
        image_ref:
          type: string
        size_bytes:
          type: integer

    LogEvent:
      type: object
      properties:
        stream:
          type: string
          enum: [stdout, stderr, system]
        data:
          type: string
        timestamp:
          type: string
          format: date-time

    Tunnel:
      type: object
      properties:
        port:
          type: integer
        url:
          type: string
          format: uri

Orchestration Flow (Sequence Diagram)

Participants: Agent (client) -> API Gateway (IAM) -> Orchestrator Service -> Sandbox Pool Manager -> Modal-like Backend

Create:
  Agent        -> API Gateway  : POST /sandboxes
  API Gateway                  : validate JWT; extract tenant/project/roles
  API Gateway  -> Orchestrator : CreateSandboxCmd (with quotas)
  Orchestrator -> Pool Manager : check tenant quota, reserve
  Pool Manager -> Backend      : Sandbox.create(image, limits, network_policy)
  Backend      -> Pool Manager : sandbox_id
  Pool Manager -> Orchestrator : sandbox_id, status
  API Gateway  -> Agent        : 201 {sandbox}

Exec:
  Agent        -> API Gateway  : POST /sandboxes/{id}/exec
  Orchestrator -> Pool Manager : ExecCmd
  Pool Manager -> Backend      : sb.exec(cmd)
  Backend      -> Pool Manager : exec_handle
  Orchestrator -> API Gateway  : exec_id, stream_url
  API Gateway  -> Agent        : 200 {exec_handle}

Stream:
  Agent        -> API Gateway  : GET /sandboxes/{id}/streams
  Orchestrator -> Pool Manager : subscribe to log stream
  Backend      -> Agent        : SSE stdout/stderr lines (relayed via gateway)

Destroy:
  Agent        -> API Gateway  : DELETE /sandboxes/{id}
  Orchestrator -> Pool Manager : TerminateCmd
  Pool Manager -> Backend      : sb.terminate()
  Backend      -> Pool Manager : ack
  Orchestrator                 : release quota
  API Gateway  -> Agent        : 204

Key orchestration points from Modal's patterns:

  • Sandbox.create() allocates a container with specified image, volumes, CPU/GPU/memory limits, and network policy (block_network, cidr_allowlist).
  • Sandbox.exec() runs commands inside the sandbox, returning a handle for streaming stdout/stderr.
  • Sandbox.snapshot_filesystem() persists the current filesystem state as a reusable image.
  • Sandbox.terminate() cleans up resources.
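The four lifecycle calls above can be sketched as one orchestrator routine against a pluggable backend. This is an illustrative, stdlib-only sketch: `SandboxBackend`, `InMemoryBackend`, and `run_job` are hypothetical names, and the toy backend stands in for a real Modal (or Modal-like) client.

```python
from dataclasses import dataclass, field
from typing import Protocol


class SandboxBackend(Protocol):
    """Abstract backend; a real implementation would wrap the Modal SDK."""
    def create(self, image: str, limits: dict) -> str: ...
    def exec(self, sandbox_id: str, command: list[str]) -> str: ...
    def snapshot(self, sandbox_id: str) -> str: ...
    def terminate(self, sandbox_id: str) -> None: ...


@dataclass
class InMemoryBackend:
    """Toy stand-in so the flow is runnable without any external service."""
    sandboxes: dict = field(default_factory=dict)
    _next: int = 0

    def create(self, image, limits):
        self._next += 1
        sid = f"sb-{self._next}"
        self.sandboxes[sid] = {"image": image, "limits": limits, "status": "running"}
        return sid

    def exec(self, sandbox_id, command):
        assert self.sandboxes[sandbox_id]["status"] == "running"
        return f"exec-{sandbox_id}-{len(command)}"

    def snapshot(self, sandbox_id):
        return f"snap-{sandbox_id}"

    def terminate(self, sandbox_id):
        self.sandboxes[sandbox_id]["status"] = "terminated"


def run_job(backend: SandboxBackend, image: str, command: list[str]) -> str:
    """create -> exec -> snapshot -> terminate, mirroring the flow above."""
    sid = backend.create(image, {"cpu": 2.0, "memory_mib": 1024})
    backend.exec(sid, command)
    snap = backend.snapshot(sid)
    backend.terminate(sid)
    return snap
```

Swapping `InMemoryBackend` for a Modal-backed implementation is then a pure control-plane change; the orchestrator logic stays identical.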

IAM Model

Token Structure (JWT Claims)

{
  "sub": "user:uuid",
  "tenant_id": "tenant:uuid",
  "project_ids": ["proj:uuid1", "proj:uuid2"],
  "roles": ["sandbox:execute", "sandbox:snapshot", "volume:read"],
  "quota_tier": "standard",
  "network_policy_override": null,
  "exp": 1705212000,
  "iss": "coditect-iam"
}
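For illustration, HS256 signing and verification of such a token can be done with the standard library alone. `sign_token` and `verify_token` are hypothetical helpers; a production gateway would normally use a vetted JWT library and asymmetric keys.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def sign_token(claims: dict, secret: bytes) -> str:
    """Issue a compact HS256 JWT carrying claims like those above."""
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url(json.dumps(claims).encode())
    sig = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url(sig)}"


def verify_token(token: str, secret: bytes, issuer: str = "coditect-iam") -> dict:
    """Check signature, issuer, and expiry; return the decoded claims."""
    header, payload, sig = token.split(".")
    expected = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    if not hmac.compare_digest(_b64url(expected), sig):
        raise ValueError("bad signature")
    padded = payload + "=" * (-len(payload) % 4)
    claims = json.loads(base64.urlsafe_b64decode(padded))
    if claims.get("iss") != issuer:
        raise ValueError("wrong issuer")
    if claims.get("exp", 0) < time.time():
        raise ValueError("token expired")
    return claims
```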

Policy Enforcement Points

| Check             | Enforced At  | Description                                       |
|-------------------|--------------|---------------------------------------------------|
| Token validity    | API Gateway  | JWT signature, expiry, issuer                     |
| Tenant isolation  | Orchestrator | Sandbox tagged with tenant_id, query filters      |
| Project scope     | Orchestrator | project_id must be in token's project_ids         |
| Role permission   | Orchestrator | Action mapped to required role (see RBAC below)   |
| Quota enforcement | Pool Manager | Concurrent sandboxes, GPU allocation per tenant   |
| Network policy    | Backend      | cidr_allowlist, block_network applied at create   |
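The orchestrator-side checks in this table can be composed into a single guard function. A minimal sketch, assuming already-decoded claims are passed in; `authorize` and its parameters are hypothetical names.

```python
def authorize(claims: dict, action: str, project_id: str,
              required_role: str, live_sandboxes: int, quota: int) -> None:
    """Raise PermissionError unless every enforcement check passes.

    Token validity is assumed to have been verified at the gateway already.
    """
    if project_id not in claims.get("project_ids", []):
        raise PermissionError("project not in token scope")
    if required_role not in claims.get("roles", []):
        raise PermissionError(f"missing role: {required_role}")
    if action == "create_sandbox" and live_sandboxes >= quota:
        raise PermissionError("concurrent sandbox quota exceeded")
```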

Secret Injection

Secrets are referenced by name (e.g., secrets: ["hf-token", "wandb-key"]) and resolved server-side using tenant-scoped secret stores. The orchestrator injects them as environment variables via Modal-style Secret.from_dict() pattern.
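A sketch of that server-side resolution, assuming a simple in-memory tenant-scoped store. `SECRET_STORE`, `resolve_secrets`, and the env-var naming convention are all illustrative; a real deployment would back this with a secret manager.

```python
# Hypothetical store layout: tenant_id -> secret name -> value
SECRET_STORE = {
    "tenant:acme": {"hf-token": "hf_abc", "wandb-key": "wb_xyz"},
}


def resolve_secrets(tenant_id: str, refs: list[str]) -> dict[str, str]:
    """Map secret references to env vars, scoped to the calling tenant."""
    store = SECRET_STORE.get(tenant_id, {})
    missing = [r for r in refs if r not in store]
    if missing:
        raise KeyError(f"unknown secrets for {tenant_id}: {missing}")
    # Env var names like HF_TOKEN, WANDB_KEY (illustrative convention)
    return {r.upper().replace("-", "_"): store[r] for r in refs}
```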


RBAC Schema

Roles

| Role                 | Permissions                                        |
|----------------------|----------------------------------------------------|
| sandbox:create       | Create sandboxes in assigned projects              |
| sandbox:execute      | Run exec commands in owned sandboxes               |
| sandbox:snapshot     | Create filesystem snapshots                        |
| sandbox:destroy      | Terminate sandboxes                                |
| sandbox:admin        | All sandbox operations + list all tenant sandboxes |
| volume:read          | Mount volumes as read-only                         |
| volume:write         | Mount volumes with read-write access               |
| secret:use           | Reference secrets for injection                    |
| network:unrestricted | Override default network policies                  |
| gpu:request          | Request GPU resources                              |

Role Hierarchy

tenant_admin
└── project_admin
    ├── sandbox:admin
    ├── volume:write
    ├── secret:use
    ├── gpu:request
    └── sandbox:create
        ├── sandbox:execute
        ├── sandbox:snapshot
        └── sandbox:destroy

Resource Bindings

# Example: User role binding
bindings:
  - principal: user:alice-uuid
    tenant: tenant:acme-uuid
    projects: [proj:agent-runner]
    roles:
      - sandbox:create
      - sandbox:execute
      - sandbox:snapshot
      - volume:read
      - secret:use
    quotas:
      max_concurrent_sandboxes: 5
      max_gpu_hours_per_day: 10
      allowed_gpu_types: [t4, a10g]
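The role hierarchy above can be evaluated by transitively expanding implied roles. A sketch with a hypothetical `ROLE_IMPLIES` table mirroring the tree:

```python
# Parent role -> directly implied roles (mirrors the hierarchy diagram)
ROLE_IMPLIES = {
    "tenant_admin": {"project_admin"},
    "project_admin": {"sandbox:admin", "volume:write", "secret:use",
                      "gpu:request", "sandbox:create"},
    "sandbox:admin": {"sandbox:create", "sandbox:execute",
                      "sandbox:snapshot", "sandbox:destroy"},
    "sandbox:create": {"sandbox:execute", "sandbox:snapshot", "sandbox:destroy"},
    "volume:write": {"volume:read"},
}


def expand_roles(roles: list[str]) -> set[str]:
    """Transitively expand assigned roles into effective permissions."""
    seen, stack = set(), list(roles)
    while stack:
        r = stack.pop()
        if r not in seen:
            seen.add(r)
            stack.extend(ROLE_IMPLIES.get(r, ()))
    return seen
```

A binding check is then simply `required_role in expand_roles(binding_roles)`.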

Persistence Model

Workspace Types

| Type           | Lifecycle        | Backend                  | Use Case                             |
|----------------|------------------|--------------------------|--------------------------------------|
| Ephemeral      | Sandbox lifetime | tmpfs / overlay          | Untrusted temp work, scratch         |
| Project Volume | Persistent       | Modal Volume / GCS FUSE  | Shared project artifacts, datasets   |
| Snapshot Image | Immutable        | Container registry       | Checkpoint/restore, reproducibility  |

Volume Mounting (from Modal pattern)

# Orchestrator maps tenant request to Modal Volume
volumes = {
    "/workspace": modal.Volume.from_name(f"{tenant_id}/{project_id}/workspace"),
    "/data": modal.CloudBucketMount(f"gs://coditect-{tenant_id}-data", read_only=True),
}
sb = modal.Sandbox.create(app=tenant_app, volumes=volumes, ...)

Snapshot Flow

1. Agent requests POST /sandboxes/{id}/snapshot
2. Orchestrator calls sb.snapshot_filesystem()
3. Backend captures overlay diff → creates Image artifact
4. Snapshot metadata stored: {snapshot_id, tenant_id, project_id, parent_image, size, created_at}
5. Snapshot can be used as `image` in subsequent CreateSandboxRequest

Audit Logging Schema

Every sandbox operation emits an audit event to a write-ahead log (e.g., Pub/Sub → BigQuery or FoundationDB):

{
  "event_id": "uuid",
  "timestamp": "2026-01-14T06:47:00Z",
  "event_type": "sandbox.exec",
  "tenant_id": "tenant:acme",
  "project_id": "proj:agent-runner",
  "actor": {
    "type": "user|service_account|agent",
    "id": "uuid",
    "ip": "203.0.113.42"
  },
  "resource": {
    "type": "sandbox",
    "id": "sb-xyz",
    "name": "codegen-worker-1"
  },
  "action": {
    "command": ["python", "run_tests.py"],
    "workdir": "/workspace",
    "timeout": 120
  },
  "result": {
    "status": "success|failure|timeout",
    "exit_code": 0,
    "duration_ms": 4523
  },
  "resources_consumed": {
    "cpu_seconds": 8.2,
    "memory_mib_peak": 512,
    "gpu_seconds": 0
  }
}

Audit events support compliance queries: "Show all executions by agent X in project Y over the last 7 days with GPU usage."
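That example query can be expressed directly against the event schema. A sketch in Python over in-memory events (`gpu_executions` is a hypothetical helper; in production this would typically be a SQL query against the audit store):

```python
from datetime import datetime, timedelta, timezone


def gpu_executions(events: list[dict], actor_id: str,
                   project_id: str, days: int = 7) -> list[dict]:
    """sandbox.exec events by one actor in one project, last N days, GPU > 0."""
    cutoff = datetime.now(timezone.utc) - timedelta(days=days)
    return [
        e for e in events
        if e["event_type"] == "sandbox.exec"
        and e["actor"]["id"] == actor_id
        and e["project_id"] == project_id
        and e["resources_consumed"].get("gpu_seconds", 0) > 0
        and datetime.fromisoformat(e["timestamp"]) >= cutoff
    ]
```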


Resource Limits Enforcement

| Limit                | Enforcement Mechanism                            | Default         |
|----------------------|--------------------------------------------------|-----------------|
| Timeout              | Sandbox auto-terminate after N seconds           | 300s (max 24h)  |
| Idle timeout         | Terminate if no exec/stdin/tunnel activity       | 60s             |
| CPU                  | (request, limit) tuple → throttling              | (0.5, 2.0)      |
| Memory               | (request, limit) MiB → OOM kill                  | (256, 1024)     |
| GPU                  | Type selection + quota check                     | none            |
| Concurrent sandboxes | Per-tenant counter in Pool Manager               | 10              |
| Network              | block_network / cidr_allowlist                   | allowlist: []   |
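The (request, limit) convention in the CPU and memory rows can be normalized once at the API boundary. A sketch with hypothetical defaults matching the table:

```python
# Defaults from the limits table above (illustrative)
DEFAULTS = {"cpu": (0.5, 2.0), "memory_mib": (256, 1024)}


def normalize_limit(name: str, value) -> tuple:
    """Accept None (defaults), a scalar (request == limit), or a (request, limit) pair."""
    if value is None:
        return DEFAULTS[name]
    if isinstance(value, (int, float)):
        return (value, value)
    request, limit = value
    if request > limit:
        raise ValueError(f"{name}: request {request} exceeds limit {limit}")
    return (request, limit)
```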

Suggested Follow-Up Prompts

  1. gVisor Sentry syscall policy tuning: "Design a custom gVisor seccomp profile for Coditect sandboxes that allows common ML workloads (PyTorch, JAX) while blocking high-risk syscalls. Include audit logging for denied calls."
  2. Quota billing integration: "Extend the Coditect sandbox API with usage metering endpoints and a Stripe-compatible billing webhook for CPU/GPU/storage consumption per tenant."
  3. Warm pool pre-provisioning: "Design a sandbox warm pool strategy for Coditect that pre-provisions N sandboxes per (tenant, image) tuple to reduce cold-start latency, including eviction policies and cost optimization."

Would you like me to drill into any of these areas or generate implementation stubs (Rust/Python/Go) for specific components?


WHAT MODAL-LIKE technologies already exist which can provide the technical foundation for the MODAL-LIKE CODITECT sandbox technology platform.

Several “Modal-like” foundations already exist that you can either build on directly or crib heavily from for a Coditect self-hosted sandbox layer.313233

Managed sandbox platforms (API-level inspirations)

These give you patterns for API, orchestration, and DX even if you don’t consume them directly.

  • Northflank Sandboxes / workloads – MicroVM (Kata or cloud-hypervisor) and gVisor isolation, any OCI image, BYOC deployments in your own cloud, and persistent volumes; marketed explicitly as a Modal alternative for secure AI code execution.31
  • E2B.dev – Firecracker-based microVM sandboxes focused on AI agents, with per-session workspaces, SDKs, and 24‑hour max lifetimes. Good reference for “agent-native” sandbox ergonomics.3431
  • Daytona – Fast-provisioning Docker/Kata sandboxes for AI workflows, sub‑90ms spin-up, limited persistence but strong “ephemeral dev env / AI run” mental model.353431
  • Cloudflare Workers + AI Sandbox SDK – V8 isolate boundary (no full Linux) but very strong example of capability-based APIs, no direct filesystem, and extremely fast cold starts.3431
  • Vercel Sandbox (beta) – Firecracker-based ephemeral sandboxes tightly integrated with their platform; 45‑minute limits and “preview” semantics are a useful pattern for time-bounded untrusted code.3631
  • InstaVM, Koyeb, RunPod et al. – Several GPU-oriented platforms now expose “code execution” sandboxes and serverless containers; Koyeb and RunPod are both called out as Modal alternatives for AI workloads.373834

Universal sandbox API libraries

These are especially relevant if you want Coditect to orchestrate multiple backends (Modal, E2B, Daytona, self-hosted, etc.) behind one interface.

  • Cased sandboxes – Open source Python library + CLI that provides a universal API for multiple cloud sandbox providers (Modal, E2B, Daytona, Cloudflare, etc.), with provider selection, failover, sandbox reuse, labels, image selection, and streaming output.3239
    • Example API: async with Sandbox.create(provider="modal") as sandbox: await sandbox.execute("python analyze.py").32
    • This is very close to the Coditect “multi-provider executor” story; you could mirror its provider abstraction while swapping in your own control plane.32

Architectural primitives (self-hosted building blocks)

Luis Cardoso’s “Field guide to sandboxes for AI” lays out clear decision criteria and suggests concrete tech stacks for AI coding agents.4033

  • gVisor – User-space kernel interception for hardened containers; good middle ground if you already run Kubernetes and want better isolation than plain containers without going full microVM.33
  • Kata Containers / Firecracker / cloud-hypervisor – MicroVM isolation that Cardoso recommends explicitly for multi-tenant AI coding agents (hostile code, full Linux semantics).33
  • Kubernetes + CRI runtimes – Use containerd + Kata runtime class or Firecracker-based runtimes to create per-job microVM sandboxes managed via standard K8s Jobs/Pods API. Cardoso walks through how CRI talks to Kata to spin a microVM per pod.33

For a Coditect “Modal-like but self-hosted” design, you can:

  • Use Kubernetes + Kata/Firecracker as the isolation substrate.33
  • Layer a sandbox control plane that exposes /sandboxes, /exec, /snapshot, /streams on top of K8s (like a focused PaaS).3133
  • Optionally integrate with or borrow patterns from Cased sandboxes to get multi-provider abstractions and fallback to external services where appropriate.3932

Decision heuristics you can reuse

The field guide offers a practical decision table you can almost lift into Coditect docs:

  • For multi-tenant AI agent execution with hostile code and full Linux needs, choose microVM (Firecracker / cloud-hypervisor) as the boundary.33
  • If you’re constrained to existing Kubernetes infra and semi-trusted workloads, gVisor-hardened containers are acceptable.33
  • Minimum viable policy: default-deny egress with allowlist, short-lived scoped credentials, workspace-only filesystem, hard CPU/mem/timeout limits, and detailed telemetry (process tree + network).33
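The “default-deny egress with allowlist” item maps directly onto a Kubernetes NetworkPolicy. A minimal sketch, assuming sandbox pods run in a dedicated namespace (all names are illustrative):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: sandbox-default-deny
  namespace: sandboxes        # hypothetical namespace for sandbox pods
spec:
  podSelector: {}             # applies to every pod in the namespace
  policyTypes: ["Egress"]
  egress: []                  # no egress allowed; add per-tenant allowlist rules
```

Per-tenant allowlists then become additional NetworkPolicy objects selecting that tenant's pods, which is additive over this default deny.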

Suggested next prompt

Design a self-hosted Coditect sandbox runtime on Kubernetes using Kata/Firecracker microVMs as the isolation boundary, taking inspiration from Cased sandboxes’ provider abstraction and Modal’s Sandbox API surface. Specify the CRDs, controller logic, and mapping from the Coditect /sandboxes HTTP API to K8s Jobs/Pods, including quota enforcement and network policies.


what open source can I leverage, I do not want proprietary 3rd party solutions as part of the stack

For a Coditect-owned, non-proprietary sandbox stack, you can assemble everything from open source components: isolation runtime, orchestrator, and developer-facing API.4647

Isolation runtimes (core “Modal-like” boundary)

These give you the secure execution boundary for untrusted agent code.

  • gVisor – User-space kernel implementing the Linux syscall interface, designed as a sandbox runtime between containers and the host kernel; used in several “untrusted code” runners.4849
  • Kata Containers – OCI-compatible runtime that runs each pod/container in a lightweight VM; effectively “Firecracker with Kubernetes support,” giving VM-grade isolation per sandbox.5048
  • Firecracker – MicroVM VMM originally from AWS, widely recommended as the default for multi-tenant AI coding agents that need full Linux, shell, and package managers.4748
  • Wasm runtimes (for capability-scoped tools) – Wasmtime, Wasmer, and WasmEdge can be used when you can constrain workloads to WASI, avoiding full Linux while gaining strong isolation and fast startup.47

These can all be wired under containerd/CRI on Kubernetes, so that each Coditect sandbox maps to a Pod using a specific runtime class (e.g., Kata for microVM, gVisor for hardened containers).5147

Orchestration and execution frameworks

These provide patterns or code for creating “submit code → run in sandbox → stream output” workflows.

  • Sandman (jakhax/sandman) – gVisor-based code execution service that runs and tests untrusted code inside a container sandbox; good reference for using gVisor as an isolation layer and discussing security tradeoffs.49
  • Awesome-sandbox list – Curated overview of modern sandboxing solutions, with entries for e2b, Daytona, and others that show patterns for AI-agent runtimes even if you don’t use their hosted offerings.46
  • Chris Hay’s Code Sandbox MCP server – Not a full infra runtime, but a concrete example of a code-execution service exposing a clean tool protocol (MCP) to LLM clients, which you can mirror with your own backend.52

Universal sandbox API (open source)

Even if you don’t call third-party providers, Cased’s library is a strong reference for the shape of a provider-agnostic sandbox API.

  • Cased sandboxes (MIT-licensed) – Open-source Python library + CLI that defines a universal interface for cloud code execution sandboxes (create, execute, reuse, labels, streaming output).5354
    • Shows how to abstract “provider” as a pluggable backend and implement a uniform Sandbox.create(...).execute(...) API that Coditect can replicate while routing to your own gVisor/Kata/Firecracker control plane.53
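Coditect can mirror that shape with a provider-pluggable facade. A stdlib-only sketch (synchronous for brevity, unlike Cased's async API; `Provider`, `PROVIDERS`, `Sandbox`, and `EchoProvider` are hypothetical names):

```python
from typing import Protocol


class Provider(Protocol):
    """Pluggable backend: Modal-like service, Kata pool, Firecracker pool, etc."""
    def create(self) -> str: ...
    def execute(self, sandbox_id: str, command: str) -> str: ...


PROVIDERS: dict[str, Provider] = {}  # registry keyed by provider name


class Sandbox:
    """Context-managed sandbox handle routed through a named provider."""
    def __init__(self, provider_name: str):
        self._provider = PROVIDERS[provider_name]
        self._id: str | None = None

    def __enter__(self) -> "Sandbox":
        self._id = self._provider.create()
        return self

    def __exit__(self, *exc) -> bool:
        return False  # a real implementation would terminate the sandbox here

    def execute(self, command: str) -> str:
        assert self._id is not None, "use inside a `with` block"
        return self._provider.execute(self._id, command)


class EchoProvider:
    """Toy provider so the facade is runnable without any backend."""
    def create(self) -> str:
        return "sb-echo-1"

    def execute(self, sandbox_id: str, command: str) -> str:
        return f"{sandbox_id}:{command}"


PROVIDERS["echo"] = EchoProvider()
```

Usage mirrors the Cased shape: `with Sandbox("echo") as sb: sb.execute("python analyze.py")`, with the provider swapped for a gVisor/Kata/Firecracker control plane in production.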

Design guidance from open references

For high-level architecture choices and tradeoffs, you can lean on recent sandbox guides.

  • Luis Cardoso’s “Field guide to sandboxes for AI” – Open article that compares containers, gVisor, microVMs, and Wasm, and recommends microVMs (Firecracker/cloud-hypervisor) as the default for multi-tenant AI coding agents, with gVisor as a compromise when you’re already on K8s.47
  • Simon Willison’s mirror / commentary – Reinforces the same taxonomy and serves as an additional reference for justifying your architecture and threat model to stakeholders.5556

Concrete stack suggestion for Coditect

All-open stack, no proprietary services:

  • Boundary: Firecracker or Kata microVMs (for multi-tenant hostile code) plus gVisor for lighter, semi-trusted workloads.4847
  • Orchestration: Kubernetes with runtime classes for Kata/gVisor, plus a custom Coditect sandbox controller and CRD (Sandbox, SandboxExec) that maps your HTTP API to K8s Jobs/Pods.51
  • API / DX layer: Coditect “sandbox executor” service modeled on Cased sandboxes API patterns (create, execute, reuse, labels, streaming) but backed solely by your own runtimes.5453

If you want, the next step can be: “Design a Kubernetes-based Coditect Sandbox CRD and controller that uses Kata Containers as the runtime, including the mapping from HTTP /sandboxes to CRD, and how to plug in gVisor/Firecracker as alternative runtime classes.”


RESEARCH EACH OF THESE in DEPTH: COMPARE AND CONTRAST, PROs AND CONs: Isolation runtimes (core “Modal-like” boundary)

These give you the secure execution boundary for untrusted agent code. gVisor – User-space kernel implementing the Linux syscall interface, designed as a sandbox runtime between containers and the host kernel; used in several “untrusted code” runners.​ Kata Containers – OCI-compatible runtime that runs each pod/container in a lightweight VM; effectively “Firecracker with Kubernetes support,” giving VM-grade isolation per sandbox.​ Firecracker – MicroVM VMM originally from AWS, widely recommended as the default for multi-tenant AI coding agents that need full Linux, shell, and package managers.​ Wasm runtimes (for capability-scoped tools) – Wasmtime, Wasmer, and WasmEdge can be used when you can constrain workloads to WASI, avoiding full Linux while gaining strong isolation and fast startup.​

All four can underpin a “Modal-like” sandbox layer, but they make different tradeoffs on isolation strength, compatibility, performance, and complexity.61626364

Snapshot: when to use what

| Runtime | Isolation strength | Linux compatibility | Startup / density | Best fit in Coditect |
|---|---|---|---|---|
| gVisor | Higher than containers, lower than VMs | Very high, but some syscalls slower/unsupported | Near-container startup, good density | Semi-trusted multi-tenant, “secure containers” on K8s |
| Kata | VM-grade, per-container microVM | Full Linux, OCI-compatible | Slower than containers; faster than full VMs | High-risk tenants; “secure pod” profile on K8s |
| Firecracker | Very strong, FaaS-level isolation | Full Linux inside guest, but custom integration | 100–125 ms spin-up, huge density | Your own Lambda/Modal-style pool for untrusted agents |
| Wasm runtimes (Wasmtime/Wasmer/WasmEdge) | Very strong per-module memory + capability isolation | Limited to WASI / host APIs; no full Linux | Microseconds-to-ms startup, extremely high density | Capability-scoped tools, sandboxes for constrained languages |

gVisor

What it is

  • A user-space kernel that implements the Linux syscall interface and sits between containers and the host kernel; it “implements Linux by way of Linux” by intercepting syscalls in a sentry process.666272
  • Deployed as a container runtime sandbox (e.g., runsc), including integration with Kubernetes and GKE Sandbox; often described as “seccomp on steroids.”6562

Pros

  • Better isolation than plain containers: host kernel surface exposed to the workload is drastically reduced; syscalls are handled by the user-space kernel rather than directly by the host.6265
  • Lightweight footprint vs VMs: no guest OS to boot, no per-VM kernel; starts fast and scales like containers while adding an isolation boundary.62
  • Works without hardware virtualization: no need for KVM support, so easier in nested virtualization environments or constrained clouds.62
  • Kubernetes-native: can be plugged in as a runtime class and selectively applied to pods that need extra isolation.62

Cons

  • Not VM-grade isolation: still shares the host kernel; a gVisor escape is less likely than a vanilla container escape, but the blast radius is larger than with Firecracker/Kata microVMs.7362
  • Performance overhead: syscall-heavy workloads pay a noticeable tax; each syscall goes through the user-space kernel.6562
  • Compatibility quirks: some low-level kernel features, /proc behavior, or exotic syscalls may be missing or behave differently, which can surprise deep Linux tooling.6665

When it shines for Coditect

  • Multi-tenant but semi-trusted agent code (e.g., internal teams, controlled languages) where you want better isolation than containers but don’t want to pay microVM costs.6162
  • You already have Kubernetes and want to opt-in sandboxing via a runtimeClass on selected workloads.62

Kata Containers

What it is

  • An open-source runtime that runs each “container” inside its own minimal VM, combining container UX with VM isolation.676863
  • Integrates with Docker/Kubernetes using OCI and CRI, with a runtime plus CRI-friendly shim/library.6367

Pros

  • VM-grade isolation: each pod/container gets its own guest kernel and VM boundary, significantly reducing cross-tenant risk compared to shared-kernel containers.6763
  • Kubernetes/OCI compatible: drop-in runtime that lets you run Kata and standard containers in the same cluster, choosing per-workload isolation.6367
  • Supports multiple VMMs: can use Firecracker or Cloud Hypervisor under the hood, so you get microVM characteristics with K8s integration.6863

Cons

  • Higher overhead than containers: you pay for a guest kernel and VM per sandbox; memory footprint per workload is larger.63
  • Slower cold starts than containers: still typically faster than traditional VMs, but slower than gVisor/container-only setups.63
  • Operational complexity: more moving parts (runtime, agent, hypervisor), guest kernel management, and debugging complexity vs plain containers.63

When it shines for Coditect

  • High-risk, multi-tenant untrusted code (public SaaS) where you want strong isolation but also Kubernetes-native control and scheduling.686763
  • You want a “secure pod” class: map Coditect “high-risk sandboxes” to a K8s runtimeClass that uses Kata, keeping lower-risk workloads on gVisor or runc.63

Firecracker

What it is

  • An open-source microVM VMM built by AWS, designed for secure, multi-tenant container and function workloads with minimal overhead.7064
  • Used under AWS Lambda and Fargate to start thousands of microVMs per second with ~100–125 ms cold-start times and as low as ~5 MB memory footprint per microVM.6964

Pros

  • Very strong isolation: each microVM has its own kernel and minimal device model, tailored for security and multi-tenancy.6469
  • Purpose-built for FaaS/serverless: start thousands of microVMs per second, with cold starts competitive with containers; ideal for short-lived, untrusted code.6964
  • Minimal footprint: small memory and device surface compared to general-purpose hypervisors.6469

Cons

  • Lower-level integration effort: unlike Kata, Firecracker doesn’t come with built-in Kubernetes integration; you must integrate via containerd plugins or build your own control plane.6864
  • Guest VM management: you must manage guest OS images, kernels, and per-VM boot config, similar to running VMs at scale.6964
  • More opinionated: limited device model and focus on network+block devices can complicate some advanced workloads (e.g., complex PCI passthrough).7069

When it shines for Coditect

  • A Modal-like / Lambda-like executor: Coditect runs each agent sandbox in a Firecracker microVM, with its own VM pool, warm instances, and very tight per-tenant isolation.6469
  • You’re willing to build a custom control plane (or K8s integration) and want direct control of microVM lifecycle, warm pools, and scheduling.7069

Wasm runtimes (Wasmtime, Wasmer, WasmEdge)

What they are

  • WebAssembly runtimes that execute Wasm modules with linear memory and no ambient access: all host interactions must be explicitly imported.61
  • Often support WASI (WebAssembly System Interface) for POSIX-like capabilities and provide resource metering (“fuel”) for deterministic preemption.61

Pros

  • Strong memory and capability isolation: modules can’t touch arbitrary host memory or the OS unless explicitly allowed; great fit for capability-based “tools.”61
  • Very fast startup and high density: no guest OS, no VM boot; instantiation is microseconds–milliseconds.7161
  • Fine-grained resource control: e.g., Wasmtime’s fuel mechanism for instruction metering, making runtime limits more deterministic than “CPU time + signals” alone.61
  • Language reach: multiple languages compile to Wasm (Rust, TinyGo, C/C++, Zig, some Python/JS subsets), which can be used for extensions and plugins.61

Cons

  • No full Linux: many agent workloads assume POSIX, /proc, apt, arbitrary shells, etc., which do not exist in pure WASI environments.61
  • Ecosystem gaps: not every language or library is readily portable; some ML stacks or system tools won’t work without heavy adaptation.61
  • Host API design burden: you must define a careful host surface (I/O, network, filesystem) and then maintain that ABI over time.61

When it shines for Coditect

  • Capability-scoped tools and plugins: “safe” tools where Coditect defines a Wasm ABI (e.g., limited file access, HTTP client) and agents can only call those.61
  • Single-tenant or internal deployments where you want maximum density and safety for specific languages (Rust, Go) without the full complexity of Linux sandboxing.7161

How to combine them in a Coditect architecture

A practical hybrid that lines up with current best practice for AI sandboxes:64626361

  • Tier 1 (light sandbox) – gVisor runtime class on Kubernetes:
    • For internal or semi-trusted code where container-level semantics are enough; used for “fast dev” sandboxes.
  • Tier 2 (strong sandbox) – Kata Containers (possibly backed by Firecracker):
    • For external tenants and highly untrusted code; Coditect maps “high-risk agent sessions” to Kata runtimeClass pods.
  • Tier 3 (serverless microVM) – Firecracker pool:
    • For Modal-style, bursty agent workloads: a separate executor service that manages Firecracker microVMs directly, with warm pools and hard quotas.
  • Tier 4 (tool sandbox) – Wasm runtime (Wasmtime/Wasmer/WasmEdge):
    • For Coditect’s “capability-scoped tools” where you control the ABI; used for plugin-style extensions and deterministically metered workloads.

If you want, the next step can be a detailed design prompt like:

Design a tiered Coditect sandbox architecture that uses gVisor, Kata Containers (with Firecracker), and Wasmtime in a single Kubernetes cluster. Specify how the /sandboxes API selects a runtime tier based on tenant, project, and risk level, and how to share common observability, quota, and audit logging across all runtimes. 747576


Design a tiered Coditect sandbox architecture that uses gVisor, Kata Containers (with Firecracker), and Wasmtime in a single Kubernetes cluster. Specify how the /sandboxes API selects a runtime tier based on tenant, project, and risk level, and how to share common observability, quota, and audit logging across all runtimes.

Design a 3‑tier sandbox architecture by treating the “runtime choice” as a policy decision at the /sandboxes control plane, and using Kubernetes RuntimeClasses to route pods to gVisor, Kata+Firecracker, or Wasm-backed workloads.7778798081


1. Runtime tiers and Kubernetes setup

Runtime classes

Define three RuntimeClass objects for Linux-based sandboxes, plus a Wasm integration (via CRI plugin or sidecar).8283808477

apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: gvisor
handler: runsc  # gVisor handler [web:73][web:65]
---
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: kata-fc
handler: kata-fc  # Kata Containers using Firecracker [web:66][web:78][web:84]
---
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: runc
handler: runc  # baseline container runtime

For Wasm, follow CNCF’s “Wasm on Kubernetes” pattern, using either:8581

  • Wasm containers: using a Wasm-aware runtime (e.g., cri-o/containerd shim) and an annotation like module.wasm.image/variant.81
  • Sidecar pattern: run a Wasmtime/Wasmer sidecar that executes Wasm modules on demand next to a thin HTTP/gRPC proxy container.8581

2. /sandboxes API and runtime selection

API surface

You keep a single tenant-facing HTTP API, with an explicit but optional risk_profile and runtime_hint that the control plane resolves to a runtime tier:

POST /sandboxes
{
  "project_id": "proj-uuid",
  "name": "agent-run-123",
  "image": "ghcr.io/coditect/agent-runner:latest",
  "risk_profile": "untrusted_public | semi_trusted | internal",
  "runtime_hint": "auto | gvisor | kata | wasm",
  "workload_type": "linux_full | wasm_tool",
  "limits": { "cpu": 1.0, "memory_mib": 1024, "gpu": "none" },
  "network_policy": { "block_network": true },
  "code": {
    "language": "python",
    "entrypoint": "main.py"
  }
}
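Before the policy engine runs, the controller should reject malformed requests. A minimal validation sketch for the request body above (field names and enum values are the ones listed in the example; `SandboxRequest` and `parse_request` are illustrative names, not an existing API):

```python
# Illustrative validation of the /sandboxes request body shown above.
# Field names mirror the example; the enum values are the ones it lists.
from dataclasses import dataclass, field

RISK_PROFILES = {"untrusted_public", "semi_trusted", "internal"}
RUNTIME_HINTS = {"auto", "gvisor", "kata", "wasm"}
WORKLOAD_TYPES = {"linux_full", "wasm_tool"}

@dataclass
class SandboxRequest:
    project_id: str
    image: str
    risk_profile: str = "untrusted_public"
    runtime_hint: str = "auto"
    workload_type: str = "linux_full"
    limits: dict = field(default_factory=lambda: {"cpu": 1.0, "memory_mib": 1024})

def parse_request(body: dict) -> SandboxRequest:
    req = SandboxRequest(
        project_id=body["project_id"],
        image=body["image"],
        risk_profile=body.get("risk_profile", "untrusted_public"),
        runtime_hint=body.get("runtime_hint", "auto"),
        workload_type=body.get("workload_type", "linux_full"),
        limits=body.get("limits", {"cpu": 1.0, "memory_mib": 1024}),
    )
    if req.risk_profile not in RISK_PROFILES:
        raise ValueError(f"unknown risk_profile: {req.risk_profile}")
    if req.runtime_hint not in RUNTIME_HINTS:
        raise ValueError(f"unknown runtime_hint: {req.runtime_hint}")
    if req.workload_type not in WORKLOAD_TYPES:
        raise ValueError(f"unknown workload_type: {req.workload_type}")
    return req
```

Rejecting unknown enum values here keeps the policy engine's input space closed, so the runtime-selection logic never sees a value it has no rule for.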

Policy engine

On POST /sandboxes, the Coditect sandbox controller:

  1. Authenticates the caller and loads tenant + project configuration (risk tier, allowed runtimes).
  2. Computes an effective runtime tier (gVisor / Kata+FC / Wasm) based on:
    • Tenant risk classification (e.g., “external SaaS”, “internal corp”).
    • Project tag (e.g., project.security_level = high).
    • Requested runtime_hint and workload_type.
  3. Maps tier to implementation: Kubernetes RuntimeClass for Linux workloads, or Wasm pipeline for capability-scoped tools.78797781

Example pseudo-logic:

def choose_runtime(tenant, project, req):
    # 1. Wasm tools get routed to Wasm
    if req.workload_type == "wasm_tool":
        return "wasm"

    # 2. Force Kata+Firecracker for high-risk external tenants
    if tenant.risk == "external" or project.flags.get("requires_vm_isolation"):
        return "kata-fc"

    # 3. Respect explicit hint if allowed
    if req.runtime_hint == "gvisor" and "gvisor" in tenant.allowed_runtimes:
        return "gvisor"
    if req.runtime_hint == "kata" and "kata-fc" in tenant.allowed_runtimes:
        return "kata-fc"

    # 4. Default policy
    if tenant.risk == "internal":
        return "gvisor"
    else:
        return "kata-fc"
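The same routing logic can be made self-contained and checked against the policy table. A sketch with minimal stand-ins (`Tenant`, `Project`, `Req` are illustrative placeholders for whatever the real control plane loads):

```python
# Self-contained version of the routing pseudo-logic above; Tenant, Project,
# and Req are minimal stand-ins, not the real control-plane models.
from dataclasses import dataclass, field

@dataclass
class Tenant:
    risk: str  # "external" | "internal"
    allowed_runtimes: set = field(default_factory=lambda: {"gvisor", "kata-fc", "wasm"})

@dataclass
class Project:
    flags: dict = field(default_factory=dict)

@dataclass
class Req:
    workload_type: str = "linux_full"
    runtime_hint: str = "auto"

def choose_runtime(tenant, project, req):
    if req.workload_type == "wasm_tool":
        return "wasm"                      # 1. Wasm tools always go to Wasm
    if tenant.risk == "external" or project.flags.get("requires_vm_isolation"):
        return "kata-fc"                   # 2. Force microVM isolation
    if req.runtime_hint == "gvisor" and "gvisor" in tenant.allowed_runtimes:
        return "gvisor"                    # 3. Respect allowed hints
    if req.runtime_hint == "kata" and "kata-fc" in tenant.allowed_runtimes:
        return "kata-fc"
    return "gvisor" if tenant.risk == "internal" else "kata-fc"  # 4. Default
```

Keeping this function pure (no I/O) makes the policy trivially unit-testable against the tenant/project matrix.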

3. Mapping to Kubernetes and Wasm

3.1 gVisor tier (semi-trusted)

For runtime = "gvisor", the controller creates a Pod with the gvisor runtimeClassName.838682

apiVersion: v1
kind: Pod
metadata:
  name: sb-123
  labels:
    coditect.sandbox/id: "sb-123"
    coditect.tenant/id: "tenant-abc"
spec:
  runtimeClassName: gvisor
  containers:
  - name: sandbox
    image: ghcr.io/coditect/agent-runner:latest
    command: ["sleep", "infinity"]
    resources:
      requests:
        cpu: "500m"
        memory: "512Mi"
      limits:
        cpu: "2"
        memory: "1Gi"

gVisor provides an extra boundary beyond runc while still running as fast, OCI-compliant containers.867883
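The controller can render such manifests programmatically before submitting them to the Kubernetes API. A minimal sketch (the tier-to-RuntimeClass mapping matches the YAML above; `render_sandbox_pod` is an illustrative helper, not an existing API):

```python
# Illustrative: render the Pod manifest the controller would submit for a
# Linux-tier sandbox. The tier-to-RuntimeClass mapping matches the YAML above.
TIER_TO_RUNTIME_CLASS = {"gvisor": "gvisor", "kata-fc": "kata-fc"}

def render_sandbox_pod(sandbox_id, tenant_id, image, tier, cpu="500m", memory="512Mi"):
    if tier not in TIER_TO_RUNTIME_CLASS:
        raise ValueError(f"tier {tier!r} is not a Linux pod tier")
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            "name": f"sb-{sandbox_id}",
            "labels": {
                "coditect.sandbox/id": f"sb-{sandbox_id}",
                "coditect.tenant/id": tenant_id,
            },
        },
        "spec": {
            "runtimeClassName": TIER_TO_RUNTIME_CLASS[tier],
            "containers": [{
                "name": "sandbox",
                "image": image,
                "command": ["sleep", "infinity"],
                "resources": {"requests": {"cpu": cpu, "memory": memory}},
            }],
        },
    }
```

Raising on the "wasm" tier is deliberate: that tier never produces a Pod and must be routed to the Wasm executor path instead.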

3.2 Kata + Firecracker tier (untrusted/public)

For runtime = "kata-fc", the controller creates Pods using the kata-fc RuntimeClass; Kata then uses Firecracker under the hood.79808788

apiVersion: v1
kind: Pod
metadata:
  name: sb-456
  labels:
    coditect.sandbox/id: "sb-456"
spec:
  runtimeClassName: kata-fc
  containers:
  - name: sandbox
    image: ghcr.io/coditect/agent-runner:latest
    command: ["sleep", "infinity"]
    resources:
      requests:
        cpu: "500m"
        memory: "512Mi"

This gives you VM-grade isolation and microVM characteristics (fast boot, low footprint) while still scheduling via Kubernetes.80878879

3.3 Wasm tier (capability tools)

For runtime = "wasm", Coditect does not spin up a full Linux sandbox. Instead it:

  • Deploys a Wasm executor service (Deployment + Service) with a Wasmtime/Wasmer runtime.
  • /sandboxes returns a logical sandbox ID, but the “exec” calls go to the Wasm executor’s HTTP/gRPC API.

You can either:8185

  • Run the executor as a sidecar next to a thin API container, or
  • Use a Wasm container runtime (e.g., annotation-based selection as shown in CNCF’s guide).81

4. Shared observability, quota, and audit logging

4.1 Central metadata and audit log

Maintain a cluster-agnostic metadata store (e.g., PostgreSQL, FoundationDB) with a sandboxes table recording tenant, project, runtime tier, and lifecycle state:

CREATE TABLE sandboxes (
  id uuid PRIMARY KEY,
  tenant_id uuid NOT NULL,
  project_id uuid NOT NULL,
  runtime_tier text NOT NULL,  -- gvisor | kata-fc | wasm
  k8s_namespace text,
  k8s_pod_name text,
  status text,
  created_at timestamptz,
  terminated_at timestamptz
);

Every API action (create, exec, snapshot, destroy) writes to an append-only audit_log table or event stream (Pub/Sub, Kafka).8986

CREATE TABLE sandbox_audit_log (
  event_id uuid PRIMARY KEY,
  sandbox_id uuid NOT NULL,
  tenant_id uuid NOT NULL,
  event_type text NOT NULL,    -- create | exec | destroy | snapshot
  runtime_tier text NOT NULL,
  actor jsonb,
  details jsonb,
  timestamp timestamptz NOT NULL
);

Because runtime tier is just metadata, you can query “all exec events across gVisor/Kata/Wasm for tenant X” uniformly.
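That uniform query can be demonstrated end to end with sqlite3 standing in for the production store (columns match the DDL above, with text ids instead of uuids for brevity):

```python
# Demonstrates the uniform cross-tier audit query, using sqlite3 as a
# stand-in for the production store (text ids instead of uuid for brevity).
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE sandbox_audit_log (
    event_id text PRIMARY KEY, sandbox_id text, tenant_id text,
    event_type text, runtime_tier text, timestamp text)""")
rows = [
    ("e1", "sb-1", "tenant-x", "exec", "gvisor",  "2025-01-01T00:00:00Z"),
    ("e2", "sb-2", "tenant-x", "exec", "kata-fc", "2025-01-01T00:01:00Z"),
    ("e3", "sb-3", "tenant-x", "exec", "wasm",    "2025-01-01T00:02:00Z"),
    ("e4", "sb-4", "tenant-y", "exec", "gvisor",  "2025-01-01T00:03:00Z"),
]
db.executemany("INSERT INTO sandbox_audit_log VALUES (?,?,?,?,?,?)", rows)

def exec_events_for_tenant(tenant_id):
    # One query spans all runtime tiers, because the tier is just a column.
    cur = db.execute(
        "SELECT sandbox_id, runtime_tier FROM sandbox_audit_log "
        "WHERE tenant_id = ? AND event_type = 'exec' ORDER BY timestamp",
        (tenant_id,))
    return cur.fetchall()
```

Because the tier is a plain column rather than a separate store per runtime, no fan-out or join across systems is needed.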

4.2 Common telemetry stack

Instrument all runtimes with the same observability layer:

  • Logs:
    • Use a cluster-wide log collector (Fluent Bit, Vector, OpenTelemetry Collector) to ship container stdout/stderr to a common log backend (e.g., Loki, Elasticsearch, GCP Logging).89
    • Use consistent labels: tenant_id, project_id, sandbox_id, runtime_tier.
  • Metrics:
    • Expose K8s metrics (CPU/mem usage per Pod) and Wasm executor metrics via Prometheus.
    • Implement per-sandbox metrics: CPU seconds, memory peak, exec duration.
  • Traces:
    • Instrument /sandboxes API, controller, and executor with OpenTelemetry spans, including attributes like coditect.runtime_tier.

4.3 Quota enforcement

Implement a quota service used by the API controller before creating or executing a sandbox, regardless of runtime:

  • Per-tenant limits:
    • max_concurrent_sandboxes[gvisor], max_concurrent_sandboxes[kata-fc], max_concurrent_sandboxes[wasm].
    • Daily CPU/GPU time per tier.

Flow on POST /sandboxes:

  1. Controller computes runtime_tier.
  2. Calls quota_service.reserve(tenant_id, runtime_tier, requested_limits) (atomic).
  3. If accepted, create Pod or logical sandbox; on failure, return 403/429.
  4. On sandbox termination, the controller calls quota_service.release(...).

Because this is tier-agnostic (only keyed by runtime_tier), quotas work similarly whether the sandbox is a Pod using gVisor/Kata or a Wasm module in the executor.


5. High-level sequence for /sandboxes with tiered runtimes

Client ──POST /sandboxes───────────────────────────► API Gateway
          (risk_profile, runtime_hint, workload_type)
Gateway ──JWT validate, tenant/project lookup──────► Sandbox Controller
Controller ──compute runtime tier──────────────────► Policy Engine
           ◄──runtime_tier─────────────────────────
Controller ──quota.reserve(tenant, tier, limits)───► Quota Service
           ◄──ok / reject──────────────────────────
[if rejected → 403/429]

[if tier == gvisor or kata-fc]
Controller ──create Pod with runtimeClassName──────► Kubernetes API
           ◄──Pod object───────────────────────────
Controller ──record in metadata & audit_log────────► DB/Event Bus

[if tier == wasm]
Controller ──create logical sandbox────────────────► Wasm Executor
           ◄──sandbox_id───────────────────────────
Controller ──record in metadata & audit_log────────► DB/Event Bus

Controller ──201 {sandbox_id, runtime_tier, ...}───► Client

Exec, snapshot, and destroy operations follow the same pattern but call either the Kubernetes API (for gVisor/Kata pods) or the Wasm executor’s API, while always writing to shared audit and using shared quota bookkeeping.

If you want, the next iteration can focus on:

  • A CRD-based design (Sandbox, SandboxExec) instead of direct Pod creation, or
  • A concrete OpenAPI 3.1 spec extension that exposes runtime_tier and policy decisions back to the client for debugging and observability. 9091

Define security and performance tradeoffs for each runtime tier

  • Example RuntimeClass YAML for gVisor, Kata, and Wasmtime tiers
  • How to configure node pools for Firecracker and Wasm workloads
  • Admission controller policy to enforce /sandboxes runtime selection
  • Pod security considerations when mixing runtimes in one cluster

Security and performance differ by tier; you’ll enforce them with RuntimeClasses, node pools, and admission policies that ensure each sandbox lands on the right infrastructure with consistent guardrails.929394959697


Security and performance tradeoffs by tier

gVisor tier

  • Security
    • Strength: Higher than plain containers (user-space kernel “Sentry” intercepts syscalls), reducing host-kernel attack surface.9893
    • Weakness: Still shares host kernel; not as strong as VM/microVM isolation if gVisor or host kernel are compromised.9598
    • Good use: Semi-trusted or internal multi-tenant workloads where container-level semantics are required.9395
  • Performance
    • Overhead: Syscall-heavy or I/O-heavy workloads see higher latency because every syscall passes through the user-space kernel.9993
    • Startup: Very close to container startup; negligible extra cold start vs runc.93

Kata Containers + Firecracker tier

  • Security
    • Strength: Each pod runs inside a lightweight VM with its own guest kernel, providing VM-grade, hardware-backed isolation.9610098
    • Good use: Untrusted/public code execution and strong tenant isolation scenarios.989596
  • Performance
    • Overhead: Higher memory and CPU overhead per sandbox vs gVisor; you pay for booting a microVM and guest kernel.10096
    • Startup: Faster than traditional VMs, but slower than containers; firecracker-containerd and pre-warmed VMs help mitigate cold start.101102

Wasmtime / Wasm tier

  • Security
    • Strength: Strong in-process isolation—linear memory and capability-based host APIs; no POSIX/host kernel surface unless explicitly exposed.10397
    • Weakness: Only covers code that can be compiled to Wasm/WASI; any unsafe host APIs you expose become the main attack surface.97103
  • Performance
    • Overhead: Extremely low cold start for simple functions (few–tens of ms vs tens–hundreds of ms for microVMs).103
    • Limitations: Complex workloads with large dependency graphs and heavy compute can see slower cold starts and execution than microVM-based setups.104103

Example RuntimeClass YAML for gVisor, Kata, and Wasm

gVisor RuntimeClass

Use runsc as handler; optionally restrict to sandboxes node pool via scheduling.nodeSelector.1051069295

apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: gvisor
handler: runsc
scheduling:
  nodeSelector:
    coditect.io/node-profile: gvisor
  tolerations:
  - key: "coditect.io/sandbox"
    operator: "Equal"
    value: "gvisor"
    effect: "NoSchedule"

Kata + Firecracker RuntimeClass

Use Kata handler configured to use Firecracker as VMM, and tie to a dedicated node pool.102107108105

apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: kata-fc
handler: kata-fc
scheduling:
  nodeSelector:
    coditect.io/node-profile: kata-firecracker
  tolerations:
  - key: "coditect.io/sandbox"
    operator: "Equal"
    value: "kata-fc"
    effect: "NoSchedule"

Wasm RuntimeClass (if using Wasm containers)

If you’re using a Wasm-aware containerd shim (per CNCF practice guide), define a RuntimeClass that points to the Wasm handler and node pool.10997

apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: wasm-wasmtime
handler: wasmtime
scheduling:
  nodeSelector:
    coditect.io/node-profile: wasm
  tolerations:
  - key: "coditect.io/sandbox"
    operator: "Equal"
    value: "wasm"
    effect: "NoSchedule"

If instead you use a Wasm executor Deployment (sidecar or service), you won’t need a RuntimeClass; the tier is enforced via your control plane.


Node pool configuration for Firecracker and Wasm workloads

Firecracker/Kata nodes

  • Label and taint nodes to ensure only Kata/Firecracker sandboxes land there:11092102
    • Labels: coditect.io/node-profile=kata-firecracker
    • Taints: coditect.io/sandbox=kata-fc:NoSchedule
  • Configure containerd on those nodes with kata-fc runtime pointing to Kata configured for Firecracker:107108102
    • containerd.toml with plugins."io.containerd.grpc.v1.cri".containerd.runtimes.kata-fc.
  • Capacity planning: fewer but larger nodes, since microVM overhead per sandbox is higher; account for guest OS memory and disk.

Wasm nodes

  • Option A – Wasm runtimeClass: nodes with Wasm-aware container runtime:97109
    • Labels: coditect.io/node-profile=wasm
    • Taints: coditect.io/sandbox=wasm:NoSchedule
    • containerd configured with a wasmtime/wasmedge runtime handler.
  • Option B – Wasm executor pool: generic nodes running Wasm executor pods.
    • Use node labels for CPU-optimized nodes (node.kubernetes.io/instance-type=c3-highcpu) and schedule Wasm executors there.97

For both tiers, you keep compute isolation by not mixing high-risk runtimes with general workloads on the same nodes.


Admission controller policy for enforcing /sandboxes runtime

Implement a ValidatingAdmissionWebhook (plus optional MutatingAdmissionWebhook) that:

  1. Only allows Coditect sandbox controller to set runtimeClassName.
  2. Enforces mapping between sandbox labels/annotations and RuntimeClass.94111112113

Mutating webhook (optional)

If devs create Pods directly (internal tools), you can auto-set RuntimeClass based on labels or namespace:

  • On Pod CREATE:
    • If label coditect.io/sandbox-runtime=gvisor, set .spec.runtimeClassName = "gvisor".
    • Same for kata-fc and wasm-wasmtime.

Validating webhook (strong enforcement)

  • Reject pods that:
    • Have runtimeClassName set but lack coditect.io/sandbox-id label (prevents arbitrary use of Kata/gVisor).113
    • Use a RuntimeClass that doesn’t match allowed values for the namespace/tenant.
  • Ensure nodeSelector compatibility: intersect node selectors with the chosen RuntimeClass’s scheduling nodeSelector, reject on conflict.9294

This guarantees only the Coditect control plane can allocate expensive, high-isolation runtimes and binds runtime selection to your /sandboxes policy engine.
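The validating rules reduce to a small decision function. A sketch of the decision logic only (no webhook HTTP server; `SANDBOX_RUNTIME_CLASSES` and the controller service-account name follow this document's conventions and are assumptions):

```python
# Decision logic only (no admission webhook server): given a Pod spec dict
# and the requesting identity, decide allow/deny per the rules above.
# The RuntimeClass names and label key follow this document's conventions.
SANDBOX_RUNTIME_CLASSES = {"gvisor", "kata-fc", "wasm-wasmtime"}
CONTROLLER_SA = "system:serviceaccount:coditect-system:sandbox-controller"

def validate_pod(pod, requester):
    rtc = pod.get("spec", {}).get("runtimeClassName")
    if rtc is None or rtc not in SANDBOX_RUNTIME_CLASSES:
        return True, "not a sandbox runtime, no restriction"
    if requester != CONTROLLER_SA:
        return False, "only the sandbox controller may set a sandbox RuntimeClass"
    labels = pod.get("metadata", {}).get("labels", {})
    if "coditect.io/sandbox-id" not in labels:
        return False, "sandbox pods must carry coditect.io/sandbox-id"
    return True, "ok"
```

In production this function would sit behind a ValidatingAdmissionWebhook handler that extracts `pod` and `requester` from the AdmissionReview payload.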


Pod security considerations when mixing runtimes

When multiple runtimes share a cluster, lock down Pod specs and namespaces so that invariants hold across all tiers.1149598

  • Pod Security / SecurityContext
    • Enforce non-root, drop all capabilities, no privileged or hostPath mounts, and no host network for all sandbox pods via Namespace-level Pod Security Standards or OPA/Gatekeeper/ValidatingAdmissionPolicy.114
    • Disallow hostPID, hostIPC, and hostNetwork for sandbox pods.95114
  • Namespace isolation
    • Use dedicated namespaces per tenant or per tier (e.g., coditect-gvisor, coditect-kata, coditect-wasm).
    • Apply tighter NetworkPolicies and ResourceQuotas per namespace.114
  • Runtime isolation
    • Do not allow regular workloads to use high-security RuntimeClasses; enforce via admission.113
    • Ensure runtime-specific node pools are not running cluster control plane components or shared databases to reduce blast radius.
  • Observability and secrets
    • Use sidecarless logging (node-level collectors) to avoid sharing sidecars between runtimes.115
    • Scope Kubernetes secrets to namespaces and bind them to specific service accounts used by sandbox pods; enforce via RBAC and admission.111114

If you want, the next iteration can be a concrete ValidatingAdmissionPolicy or Gatekeeper constraint set that encodes: “only Coditect controller service account may create pods with runtimeClassName in {gvisor, kata-fc, wasm-wasmtime}, and all such pods must have sandbox labels and non-privileged securityContext.” 116


WHICH OF THESE COULD RUN MULTI-USER MULTI-TENANT MULTI-TEAM MULTI-PROJECT WORKFLOWS on GOOGLE CLOUD WORKSTATIONS economically?

For multi-user / multi-tenant / multi-team workflows on Google Cloud Workstations, the most economical and operationally straightforward option from your list is gVisor-based containers, with Wasm as a niche add‑on for specific tools; Kata+Firecracker is possible but significantly more complex and costly on GCP.117118119120121122


gVisor on GCP (best economic fit)

  • Google already runs gVisor at scale for Cloud Run, Cloud Functions, App Engine, and GKE Sandbox, which shows it works well for arbitrary user workloads with good compatibility.118123119117
  • Cloud Workstations are just GCE VMs; you can run runsc there or on a companion GKE cluster, giving each user/team “nested” sandboxes without extra hypervisor layers.124122
  • No need for nested virtualization; gVisor works on VMs or bare metal and is explicitly documented as “cloud ready” for all major providers.122

This makes gVisor the most economical and least operationally risky choice for embedding many Coditect sandboxes into a shared pool of Google Cloud Workstations (per-team dev VMs), while still adding a strong isolation boundary around untrusted agent code.119117118122


Kata Containers + Firecracker on GCP (possible, but heavier)

  • Kata requires hardware virtualization (KVM) in the underlying node; on GKE/GCE, you must customize node images or enable nested virtualization to run Kata+Firecracker, which is not available by default.120121125
  • Northflank’s case study notes extra engineering effort to provision Kata on Google Cloud because standard GKE nodes don’t expose KVM; they built tooling for custom images with virtualization enabled.120
  • Firecracker on GKE has been demonstrated by running Kata Containers with Firecracker as the VMM, but again this relies on KVM and custom configuration on GCE nodes.121126127

So while Kata+Firecracker can give you strong isolation for some high-risk multi-tenant workflows, it is less economical and more complex than gVisor on standard Workstations/VMs, and you’d likely reserve it for a smaller pool of critical workloads rather than the general multi-user Workstations fleet.128125121120


Wasm runtimes on GCP (targeted, very cheap per-tool)

  • GCP already exposes Wasm at the edge via Service Extensions (GCPWasmPlugin) and Artifact Registry‑hosted Wasm modules for load balancers.129130
  • You can also run WasmEdge/Wasmtime inside containers on GCE/GKE/Workstations, which is lightweight and doesn’t require nested virtualization; good for per-tenant “tools” rather than full dev workflows.131132133

Economically, Wasm is ideal for capability-scoped tools (linting, small transforms) that Coditect calls from Workstations, but not for full multi-tenant Linux workflows (no full distro, limited POSIX).133134131


Practical recommendation for Coditect on Google Cloud Workstations

  • Use gVisor as the default “Modal-like” sandbox layer inside or alongside Workstations:
    • Run a gVisor-enabled container runtime (runsc) on shared GCE nodes or a GKE cluster attached to the same VPC as Workstations.117119124
    • Treat each user/team/project workflow as a gVisor sandbox, with multi-tenant isolation and quotas enforced at the Coditect layer.
  • Optionally add:
    • Wasm runtimes for scoped tools and plugins (extremely cheap, safe, and portable).130129133
    • A small, specialized Kata+Firecracker node pool if you later need VM-grade isolation for the most sensitive multi-tenant workloads and are willing to handle nested virtualization and custom images on GCE.125121120

If you want, next step could be: “Design a Coditect + Google Cloud Workstations reference architecture using gVisor for in-VM sandboxes (no Kubernetes), including how to provision per-team Workstations, run nested containers with runsc, and enforce tenant/project quotas.” 135136


Design a Coditect + Google Cloud Workstations reference architecture using gVisor for in-VM sandboxes (no Kubernetes), including how to provision per-team Workstations, run nested containers with runsc, and enforce tenant/project quotas. Practical recommendation for Coditect on Google Cloud Workstations

Use gVisor as the default “Modal-like” sandbox layer inside or alongside Workstations: Run a gVisor-enabled container runtime (runsc) on shared GCE nodes or a GKE cluster attached to the same VPC as Workstations.​ Treat each user/team/project workflow as a gVisor sandbox, with multi-tenant isolation and quotas enforced at the Coditect layer.

Use Google Cloud Workstations as “fat agent hosts” and run gVisor (runsc) inside each VM to provide a Modal‑like sandbox for multi-tenant workflows, with Coditect as the control plane enforcing policies and quotas.137138139140141


High-level architecture

  • Per-team Workstations
    • Each team gets one or more Google Cloud Workstations (backed by GCE VMs) in a shared “Coditect Workstations” project/VPC.
    • Workstations run Docker or containerd configured with runsc as an additional OCI runtime for sandboxed workloads.138139140
  • In-VM gVisor sandboxes
    • Agent code and tools run inside containers launched with the runsc runtime (not plain runc), giving a gVisor user-space kernel boundary inside the Workstation VM.141137138
    • Each Coditect sandbox = one gVisor container, with per-sandbox CPU/mem/time limits applied via cgroups and Coditect’s control plane.
  • Coditect control plane (central services)
    • Hosted on GCE or GKE in the same VPC; exposes /sandboxes API, manages Workstation registration, scheduling, quotas, and audit logs.
    • Workstations run a Coditect agent that pulls/receives sandbox tasks, launches runsc containers, streams logs, and reports resource usage.

This avoids Kubernetes entirely for the inner sandboxing, leveraging gVisor’s “runs anywhere existing container tooling does” property.139138141


Provisioning per-team Workstations with gVisor

  1. Base image / template
    • Start from a Linux Workstation image (e.g., Container‑Optimized OS or Ubuntu with Docker preinstalled).
    • Install gVisor runsc following the official installation guide: apt-get install -y runsc or by using the install script; then run runsc install to integrate with Docker/containerd.140139
  2. Docker/containerd config
    • Add runsc as a runtime in /etc/docker/daemon.json (or containerd config):138139140
{
  "runtimes": {
    "runsc": {
      "path": "/usr/bin/runsc"
    }
  },
  "default-runtime": "runc"
}
    • Restart Docker/containerd. After this, docker run --runtime=runsc ... will launch a gVisor sandbox container.[^9_3][^9_4][^9_2]

  3. Workstation registration with Coditect
    • On first boot, a Coditect agent on the Workstation:
      • Registers itself to the Coditect control plane with metadata (team, tenant, capabilities, vCPU/RAM).
      • Opens a secure gRPC/WebSocket connection for task dispatch and health reporting.
  4. Per-team isolation
    • Map Workstations to tenants/teams using labels and IAM (e.g., each Workstation has a Coditect “worker_id” and “team_id”).
    • Optionally run multiple tenants per Workstation, but rely on gVisor sandboxes + user-level ACLs for separation.141138


Running nested containers with runsc (Modal-like behavior)

On each Workstation, the Coditect agent:

  1. Receives a CreateSandbox RPC with: tenant, project, image, resources, and network policy.
  2. Executes a gVisor container:
docker run \
  --runtime=runsc \
  --cpus=1.0 \
  --memory=1g \
  --read-only \
  --network=none \
  --name coditect-sb-$SANDBOX_ID \
  -v /workspaces/$TEAM/$PROJECT:/workspace:rw \
  ghcr.io/coditect/agent-runtime:latest \
  sleep infinity
    • runsc enforces a user-space kernel boundary inside the VM.[^9_1][^9_2][^9_5]
    • Use per-sandbox volumes for project data; keep the container rootfs ephemeral.

  3. For exec operations, the agent uses docker exec against the running gVisor container to run commands and stream stdout/stderr back to the Coditect control plane.
  4. On destroy, the agent stops and removes the container and cleans up any ephemeral volumes or scratch space.

This is the same pattern Google documentation suggests for “run untrusted binaries with gVisor inside your own container infrastructure.”137138


Tenant/project quota enforcement

Coditect’s control plane maintains a quota service and metadata DB independent of Workstations:

  • Per-tenant and per-project quotas
    • Max concurrent sandboxes.
    • vCPU and memory budgets (e.g., vCPU‑seconds, GiB‑hours) per time window.
    • Optional GPU quotas (if Workstations have GPUs and gVisor GPU access is configured).142143
  • Lifecycle flow
  1. Client calls POST /sandboxes.
  2. Control plane checks quotas in central DB; if OK, allocates a sandbox ID and assigns it to a Workstation with available capacity.
  3. Workstation agent launches the runsc container and periodically reports usage (CPU, mem, wall time).
  4. On completion or timeout, agent reports final metrics; control plane decrements quota and writes an audit event.
  • Enforcement
    • Control plane will refuse new sandboxes when quotas are exceeded.
    • Agents enforce per-container CPU/memory limits (cgroups) and kill long‑running sandboxes after their allowed time.

This keeps quotas consistent across all Workstations and decouples enforcement from individual VMs.
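The agent-side wall-time enforcement from the last bullet can be sketched as a deadline table swept periodically (`TimeoutEnforcer` is an illustrative name; the real agent would kill the container for each expired id):

```python
# Sketch of agent-side wall-time enforcement: track an absolute deadline per
# sandbox, and on each sweep report which sandboxes exceeded their allowed
# time (the real agent would then kill the corresponding container).
import time

class TimeoutEnforcer:
    def __init__(self):
        self._deadlines = {}   # sandbox_id -> absolute deadline (monotonic)

    def track(self, sandbox_id, allowed_seconds, now=None):
        now = time.monotonic() if now is None else now
        self._deadlines[sandbox_id] = now + allowed_seconds

    def sweep(self, now=None):
        now = time.monotonic() if now is None else now
        expired = [sid for sid, dl in self._deadlines.items() if now >= dl]
        for sid in expired:
            del self._deadlines[sid]
        return expired
```

Using a monotonic clock avoids false expirations when the VM's wall clock is adjusted; the `now` parameter exists only to make the logic testable.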


Security and isolation notes

  • gVisor boundary
    • gVisor intercepts syscalls via runsc, providing an extra kernel between sandboxed workloads and the Workstation’s host kernel.144138141
    • Appropriate for running arbitrary code from multiple tenants and teams with better protection than plain containers, but not as strong as full microVM isolation.143144138
  • VM boundary (Workstation)
    • Each Workstation is itself a GCE VM, so you also have a hypervisor boundary separating teams/workflows across different Workstations.
    • For especially sensitive tenants, run them on dedicated Workstations instead of mixing them on shared ones.
  • OS hardening
    • Run minimal OS images, restrict SSH/users, and treat Workstations as “managed worker nodes” controlled by Coditect automation.

Practical Coditect recommendation on Google Cloud Workstations

  • Default: Use gVisor (runsc) inside Workstations as the default sandbox runtime for agent workflows:
    • Good enough isolation for multi-user/multi-project use, with much lower cost and complexity than microVM solutions, and fully supported by Google’s stack.144138141
  • Control plane: Central Coditect service manages /sandboxes, quotas, routing to Workstations, and logs/metrics.
  • Workstation agents: Lightweight daemon per VM that:
    • Launches runsc containers.
    • Streams outputs and metrics.
    • Enforces local CPU/mem/time limits and cleanup.

If you want, next step can be:

“Draft a minimal gRPC API between Coditect control plane and the Workstation agent for creating, executing, and destroying gVisor sandboxes, including streaming stdout/stderr and resource usage metrics.” 145146147148149150151


Draft a minimal gRPC API between Coditect control plane and the Workstation agent for creating, executing, and destroying gVisor sandboxes, including streaming stdout/stderr and resource usage metrics DESIGN multi-user individual user data persistence, individual projects backed up to github in near real time as development proceeds in a project management CODITECT environment running on top of Anthropic CLAUDE CODE and GOOGLE GEMINIE

A minimal control-plane↔Workstation gRPC API plus project-scoped Git workflows lets Coditect run multi-user sandboxes on Workstations while persisting each user’s work to GitHub in near–real time.152153154155156


1. Minimal gRPC API (control plane ↔ agent)

Service definitions (proto sketch)

syntax = "proto3";

package coditect.sandbox.v1;

service SandboxAgent {
  rpc CreateSandbox(CreateSandboxRequest) returns (CreateSandboxResponse);
  rpc ExecSandbox(stream ExecRequest) returns (stream ExecEvent);
  rpc DestroySandbox(DestroySandboxRequest) returns (DestroySandboxResponse);
  rpc ReportMetrics(stream MetricsReport) returns (MetricsAcknowledge);
}

// --- Core messages ---

message CreateSandboxRequest {
  string sandbox_id = 1;
  string tenant_id = 2;
  string project_id = 3;
  string image = 4;
  repeated string command = 5;        // optional initial command
  double cpu_limit = 6;               // cores
  int64 memory_limit_mib = 7;
  int64 idle_timeout_seconds = 8;
  string workspace_path = 9;          // e.g. /workspaces/tenant/project/user
  map<string, string> env = 10;
}

message CreateSandboxResponse {
  string sandbox_id = 1;
  string container_id = 2;
  string status = 3;                  // CREATED | FAILED
  string error_message = 4;
}

message ExecRequest {
  string sandbox_id = 1;
  oneof payload {
    ExecStart start = 2;
    ExecStdin stdin = 3;
    ExecControl control = 4;
  }
}

message ExecStart {
  repeated string command = 1;
  string workdir = 2;
  int64 timeout_seconds = 3;
  bool pty = 4;
}

message ExecStdin {
  bytes data = 1;
}

message ExecControl {
  bool terminate = 1;
}

message ExecEvent {
  string sandbox_id = 1;
  oneof event {
    ExecStdout stdout = 2;
    ExecStderr stderr = 3;
    ExecExit exit = 4;
    ExecStatus status = 5;
  }
}

message ExecStdout { bytes data = 1; }
message ExecStderr { bytes data = 1; }

message ExecExit {
  int32 exit_code = 1;
  string reason = 2;                  // NORMAL | TIMEOUT | KILLED
}

message ExecStatus {
  string phase = 1;                   // STARTING | RUNNING | DONE | ERROR
  string message = 2;
}

message DestroySandboxRequest {
  string sandbox_id = 1;
  bool force = 2;
}

message DestroySandboxResponse {
  string sandbox_id = 1;
  string status = 2;                  // TERMINATED | NOT_FOUND | ERROR
}

// --- Metrics streaming ---

message MetricsReport {
  string sandbox_id = 1;
  string tenant_id = 2;
  string project_id = 3;
  int64 timestamp_unix_ms = 4;
  double cpu_seconds = 5;
  int64 memory_peak_mib = 6;
  int64 bytes_read = 7;
  int64 bytes_written = 8;
}

message MetricsAcknowledge {
  bool ok = 1;
}
  • CreateSandbox → agent does docker run --runtime=runsc ... on the Workstation.154155156
  • ExecSandbox is bidirectional streaming: control plane sends start/stdin, agent streams stdout/stderr/exit/status.153157152
  • ReportMetrics is a long-lived stream where the agent periodically pushes resource usage (derived from cgroups and container stats) for quota/billing.158159160
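The ExecSandbox bidirectional stream can be simulated in pure Python to pin down its event ordering, without generated gRPC stubs (dicts stand in for the proto messages; the stdout echo is a placeholder for real process output):

```python
# Pure-Python simulation of the ExecSandbox bidirectional stream: the agent
# consumes ExecRequest-like dicts and yields ExecEvent-like dicts. Generated
# gRPC stubs are omitted; a real agent would wire this to docker exec.
def exec_stream(requests):
    """requests: iterable of dicts mirroring ExecRequest's oneof payload."""
    yield {"status": {"phase": "STARTING"}}
    for req in requests:
        if "start" in req:
            yield {"status": {"phase": "RUNNING"}}
        elif "stdin" in req:
            # Echo stdin back as stdout, standing in for real process output.
            yield {"stdout": req["stdin"]}
        elif "control" in req and req["control"].get("terminate"):
            yield {"exit": {"exit_code": -1, "reason": "KILLED"}}
            return
    yield {"exit": {"exit_code": 0, "reason": "NORMAL"}}
```

The invariant worth preserving in the real agent is the one visible here: exactly one terminal `exit` event per stream, after which no further events are emitted.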

2. Agent behavior on Workstation (gVisor inside VM)

  • CreateSandbox:
    • docker run --runtime=runsc with CPU/mem limits and mounted user workspace.156161154
    • Enforce network=none or egress-locked config for untrusted code.
  • ExecSandbox (stream):
    • On ExecStart, run docker exec (optionally with a PTY) and hook process stdout/stderr to the gRPC stream back to the control plane.157152153
    • On ExecStdin, write to process stdin.
    • On timeout or ExecControl.terminate, kill the process and report ExecExit with reason.
  • Metrics:
    • Use docker stats/cgroup FS to sample per-container CPU/bytes/memory and send MetricsReport every N seconds.159160158
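The metrics sampling above can be sketched as a small parser over cgroup v2 accounting files; `parseCPUSeconds` is a hypothetical helper (the `usage_usec` field is the standard cgroup v2 `cpu.stat` layout, but the surrounding wiring, including the container's cgroup path, is assumed):

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseCPUSeconds extracts usage_usec from a cgroup v2 cpu.stat file body
// (e.g. read from /sys/fs/cgroup/<container-scope>/cpu.stat) and converts
// it to seconds, matching the MetricsReport.cpu_seconds field.
func parseCPUSeconds(cpuStat string) (float64, error) {
	for _, line := range strings.Split(cpuStat, "\n") {
		fields := strings.Fields(line)
		if len(fields) == 2 && fields[0] == "usage_usec" {
			usec, err := strconv.ParseInt(fields[1], 10, 64)
			if err != nil {
				return 0, err
			}
			return float64(usec) / 1e6, nil
		}
	}
	return 0, fmt.Errorf("usage_usec not found")
}

func main() {
	sample := "usage_usec 2500000\nuser_usec 2000000\nsystem_usec 500000"
	secs, _ := parseCPUSeconds(sample)
	fmt.Printf("cpu_seconds=%.2f\n", secs) // cpu_seconds=2.50
}
```

The agent would sample this every N seconds, diff against the previous reading, and stream the delta in a MetricsReport.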

3. Multi-user data model and persistence

Workspace layout (per user / project)

On each Workstation VM:

  • Root: `/workspaces/<tenant>/<user>/<project>/`
  • src/ – working tree checked out from GitHub.
  • .coditect/ – agent metadata, run logs, temp artifacts.
  • venv/ or envs/ – optional per-project deps.

A gVisor sandbox mounts this path into the container:

docker run --runtime=runsc \
-v /workspaces/$TENANT/$USER/$PROJECT:/workspace \
-w /workspace/src \
ghcr.io/coditect/agent-runtime:latest \
sleep infinity

Each sandbox therefore operates directly on the user’s Git checkout, so Git becomes the persistence boundary.

Git + GitHub near–real time

In each /workspace/src:

  • Initialize Git with origin GitHub repo (per project).
  • Run a small Coditect sync daemon (inside Workstation or as part of the Coditect agent) that:
    • Watches for filesystem changes (inotify) and triggers:
    - `git add` + `git commit` with machine-generated messages on a shadow branch (`coditect/autosave/<user>/<date>`).
    - `git push` to GitHub over a PAT or GitHub App installation token.
    - Batches frequent changes (e.g., commit/push every 15–30 seconds when there are changes).

This yields “near–real time” backups:

  • If a user’s Workstation dies, their work since last commit exists in GitHub on the autosave branch.
  • Human commits (PRs) remain under user control; the autosave branch can be squashed or used for recovery.

You can optionally run a centralized backup job that mirrors GitHub org repos to object storage or another Git remote for compliance.162163164


4. Coditect project management + agents (Claude/Gemini)

Project state

In the Coditect control plane:

  • Tenant → Team → User → Project hierarchy.
  • For each project:
    • GitHub repo URL + default branch.
    • Mapping to one or more Workstations and gVisor sandboxes.
    • Current “session” state: active sandboxes, last Claude/Gemini runs, metrics, etc.

AI-assisted workflows

  • Claude Code and Google Gemini operate on project state stored in GitHub and metadata from Coditect:
    • When a user requests “refactor module X”, Coditect:
      • Ensures project workspace is up to date (Git pull).
      • Calls Claude Code / Gemini with repo snapshot + context.
      • Writes generated changes directly into the workspace.
      • gVisor sandbox runs tests / linters; outputs streamed via ExecSandbox.
      • Autosave daemon captures changes and pushes to GitHub autosave branch.
  • Coditect stores run metadata (prompt, models, files touched, test results) in .coditect/ and central DB, linking to Git commit hashes so all AI-driven changes are auditable.

5. Quotas, multi-tenancy, and economics

  • Quotas in control plane
    • Per tenant/project: max concurrent sandboxes, max vCPU/memory, and daily CPU‑seconds.
    • Per user: sandbox count and total runtime to prevent abuse.
  • Enforcement
    • Before sending CreateSandbox over gRPC, Coditect checks quotas and only dispatches if under limit.
    • MetricsReport streams feed usage accounting; Coditect updates counters and can proactively kill/deny new sandboxes if limits exceeded.160158159
  • Economics
    • On GCP, you pay primarily for Workstation VMs; gVisor just adds software isolation inside those VMs, which is cheaper than microVM stacks while still significantly improving security for untrusted code.155165166154
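The quota gate described above can be factored into a pure control-plane check; the `Quota`/`Usage` shapes here are hypothetical (real counters would be fed by MetricsReport accounting):

```go
package main

import "fmt"

// Hypothetical quota model, evaluated in the control plane before a
// CreateSandbox RPC is ever dispatched to a workstation agent.
type Quota struct {
	MaxConcurrentSandboxes int
	MaxDailyCPUSeconds     float64
}

type Usage struct {
	ActiveSandboxes int
	CPUSecondsToday float64
}

// allowCreate returns (false, reason) when a tenant is over a limit.
func allowCreate(q Quota, u Usage) (bool, string) {
	if u.ActiveSandboxes >= q.MaxConcurrentSandboxes {
		return false, "concurrent sandbox limit reached"
	}
	if u.CPUSecondsToday >= q.MaxDailyCPUSeconds {
		return false, "daily CPU budget exhausted"
	}
	return true, ""
}

func main() {
	q := Quota{MaxConcurrentSandboxes: 5, MaxDailyCPUSeconds: 3600}
	ok, reason := allowCreate(q, Usage{ActiveSandboxes: 5, CPUSecondsToday: 120})
	fmt.Println(ok, reason) // false concurrent sandbox limit reached
}
```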

If you want next, a focused proto + Go server stub for SandboxAgent (with docker/runsc integration) or a concrete Git autosave strategy (branch naming, commit messages, conflict handling) can be sketched. 167168169170171172


I want the next step: a focused proto + Go server stub for SandboxAgent (with docker/runsc integration) and a concrete Git autosave strategy (branch naming, commit messages, conflict handling). Show a protobuf definition for the control plane API messages and services.

  • Example Go server and agent implementation for the gRPC API
  • Authentication and authorization patterns for this control plane
  • How to stream stdout/stderr reliably over unstable networks
  • Best practices for gVisor sandbox lifecycle management

A minimal, production-friendly design is: a protobuf-defined control-plane API, a Go agent that wraps docker run --runtime=runsc, mTLS for authn/z, streaming with backpressure and resume semantics, and disciplined sandbox lifecycle policies.173174175176177178


1. Protobuf: control plane ↔ agent API

syntax = "proto3";

package coditect.sandbox.v1;

option go_package = "github.com/coditect/platform/sandboxpb";

// Control plane <-> workstation agent
service SandboxAgent {
  // One-shot create
  rpc CreateSandbox(CreateSandboxRequest) returns (CreateSandboxResponse);

  // Bidirectional exec stream: control plane sends start/stdin/control;
  // agent streams stdout/stderr/status/exit events.
  rpc ExecSandbox(stream ExecRequest) returns (stream ExecEvent);

  // Destroy sandbox container + cleanup
  rpc DestroySandbox(DestroySandboxRequest) returns (DestroySandboxResponse);

  // Telemetry stream from agent to control plane
  rpc ReportMetrics(stream MetricsReport) returns (MetricsAcknowledge);
}

// ---------- Create / Destroy ----------

message CreateSandboxRequest {
  string sandbox_id = 1;
  string tenant_id = 2;
  string project_id = 3;
  string image = 4;
  string workspace_path = 5; // host path in workstation VM
  double cpu_limit = 6;      // cores
  int64 memory_limit_mib = 7;
  int64 idle_timeout_seconds = 8;
  map<string, string> env = 9;
}

message CreateSandboxResponse {
  string sandbox_id = 1;
  string container_id = 2;
  string status = 3; // CREATED | FAILED
  string error = 4;
}

message DestroySandboxRequest {
  string sandbox_id = 1;
  bool force = 2;
}

message DestroySandboxResponse {
  string sandbox_id = 1;
  string status = 2; // TERMINATED | NOT_FOUND | ERROR
  string error = 3;
}

// ---------- Exec streaming ----------

message ExecRequest {
  string sandbox_id = 1;
  oneof payload {
    ExecStart start = 2;
    ExecStdin stdin = 3;
    ExecControl control = 4;
  }
}

message ExecStart {
  repeated string command = 1;
  string workdir = 2;
  int64 timeout_seconds = 3;
  bool pty = 4;
}

message ExecStdin {
  bytes data = 1;
}

message ExecControl {
  bool terminate = 1;
}

message ExecEvent {
  string sandbox_id = 1;
  oneof event {
    ExecStdout stdout = 2;
    ExecStderr stderr = 3;
    ExecExit exit = 4;
    ExecStatus status = 5;
  }
}

message ExecStdout { bytes data = 1; }
message ExecStderr { bytes data = 1; }

message ExecExit {
  int32 exit_code = 1;
  string reason = 2; // NORMAL | TIMEOUT | KILLED | INTERNAL_ERROR
}

message ExecStatus {
  string phase = 1; // STARTING | RUNNING | DONE | ERROR
  string message = 2;
}

// ---------- Metrics streaming ----------

message MetricsReport {
  string sandbox_id = 1;
  string tenant_id = 2;
  string project_id = 3;
  int64 timestamp_unix_ms = 4;
  double cpu_seconds = 5;
  int64 memory_peak_mib = 6;
  int64 bytes_read = 7;
  int64 bytes_written = 8;
}

message MetricsAcknowledge {
  bool ok = 1;
  string error = 2;
}
  • Shapes match gRPC streaming best practices (bidirectional Exec, client-stream metrics).179180181173

2. Go agent skeleton with docker + runsc

Below is a condensed agent implementation sketch using Go gRPC; it assumes Docker is configured with a runsc runtime.182175178183

//go:build workstation-agent

package main

import (
    "context"
    "fmt"
    "io"
    "log"
    "net"
    "os/exec"
    "sync"
    "time"

    "google.golang.org/grpc"
    "google.golang.org/grpc/credentials"
    "google.golang.org/grpc/peer"

    pb "github.com/coditect/platform/sandboxpb"
)

type agentServer struct {
    pb.UnimplementedSandboxAgentServer
}

func (s *agentServer) CreateSandbox(ctx context.Context, req *pb.CreateSandboxRequest) (*pb.CreateSandboxResponse, error) {
    containerName := "coditect-sb-" + req.SandboxId

    args := []string{
        "run", "-d",
        "--runtime=runsc", // gVisor runtime
        "--cpus", formatCPU(req.CpuLimit),
        "--memory", formatMem(req.MemoryLimitMib),
        "--name", containerName,
        "--network", "none",
        "--read-only",
        "-v", req.WorkspacePath + ":/workspace",
    }
    for k, v := range req.Env {
        args = append(args, "-e", k+"="+v)
    }
    args = append(args, req.Image, "sleep", "infinity")

    cmd := exec.CommandContext(ctx, "docker", args...)
    out, err := cmd.CombinedOutput()
    if err != nil {
        return &pb.CreateSandboxResponse{
            SandboxId: req.SandboxId,
            Status:    "FAILED",
            Error:     string(out),
        }, nil
    }

    return &pb.CreateSandboxResponse{
        SandboxId:   req.SandboxId,
        ContainerId: containerName,
        Status:      "CREATED",
    }, nil
}

func (s *agentServer) ExecSandbox(stream pb.SandboxAgent_ExecSandboxServer) error {
    // The first message on the stream must carry ExecStart.
    first, err := stream.Recv()
    if err != nil {
        return err
    }
    start := first.GetStart()
    if start == nil {
        return io.EOF
    }
    sandboxID := first.SandboxId
    containerName := "coditect-sb-" + sandboxID

    // Enforce the requested wall-clock timeout via the stream context.
    ctx := stream.Context()
    if start.TimeoutSeconds > 0 {
        var cancel context.CancelFunc
        ctx, cancel = context.WithTimeout(ctx, time.Duration(start.TimeoutSeconds)*time.Second)
        defer cancel()
    }

    // Flags must precede the container name in `docker exec`.
    cmdArgs := []string{"exec", "-i"}
    if start.Pty {
        cmdArgs = append(cmdArgs, "-t")
    }
    if start.Workdir != "" {
        cmdArgs = append(cmdArgs, "-w", start.Workdir)
    }
    cmdArgs = append(cmdArgs, containerName)
    cmdArgs = append(cmdArgs, start.Command...)

    cmd := exec.CommandContext(ctx, "docker", cmdArgs...)
    stdin, _ := cmd.StdinPipe()   // errors only occur once the command started
    stdout, _ := cmd.StdoutPipe()
    stderr, _ := cmd.StderrPipe()

    // gRPC server streams are not safe for concurrent Send calls;
    // serialize all writers through a mutex.
    var sendMu sync.Mutex
    send := func(ev *pb.ExecEvent) error {
        sendMu.Lock()
        defer sendMu.Unlock()
        return stream.Send(ev)
    }

    if err := cmd.Start(); err != nil {
        _ = send(&pb.ExecEvent{
            SandboxId: sandboxID,
            Event: &pb.ExecEvent_Status{
                Status: &pb.ExecStatus{
                    Phase:   "ERROR",
                    Message: err.Error(),
                },
            },
        })
        return nil
    }

    // Goroutine: forward stdout in bounded chunks.
    go func() {
        buf := make([]byte, 32*1024)
        for {
            n, readErr := stdout.Read(buf)
            if n > 0 {
                _ = send(&pb.ExecEvent{
                    SandboxId: sandboxID,
                    Event: &pb.ExecEvent_Stdout{
                        Stdout: &pb.ExecStdout{Data: buf[:n]},
                    },
                })
            }
            if readErr != nil {
                return
            }
        }
    }()

    // Goroutine: forward stderr in bounded chunks.
    go func() {
        buf := make([]byte, 32*1024)
        for {
            n, readErr := stderr.Read(buf)
            if n > 0 {
                _ = send(&pb.ExecEvent{
                    SandboxId: sandboxID,
                    Event: &pb.ExecEvent_Stderr{
                        Stderr: &pb.ExecStderr{Data: buf[:n]},
                    },
                })
            }
            if readErr != nil {
                return
            }
        }
    }()

    // Receive stdin/control until the client closes its side of the stream.
    killed := false
    for {
        msg, recvErr := stream.Recv()
        if recvErr != nil { // io.EOF or transport error
            break
        }
        switch p := msg.Payload.(type) {
        case *pb.ExecRequest_Stdin:
            _, _ = stdin.Write(p.Stdin.Data) // ignore: process may be done
        case *pb.ExecRequest_Control:
            if p.Control.Terminate {
                killed = true
                _ = cmd.Process.Kill()
            }
        }
    }
    _ = stdin.Close() // let the child see EOF on stdin

    waitErr := cmd.Wait()
    exitCode := 0
    reason := "NORMAL"
    if waitErr != nil {
        if ee, ok := waitErr.(*exec.ExitError); ok {
            exitCode = ee.ExitCode()
        } else {
            exitCode = -1
            reason = "INTERNAL_ERROR"
        }
    }
    if killed {
        reason = "KILLED"
    } else if ctx.Err() == context.DeadlineExceeded {
        reason = "TIMEOUT"
    }
    return send(&pb.ExecEvent{
        SandboxId: sandboxID,
        Event: &pb.ExecEvent_Exit{
            Exit: &pb.ExecExit{
                ExitCode: int32(exitCode),
                Reason:   reason,
            },
        },
    })
}

func (s *agentServer) DestroySandbox(ctx context.Context, req *pb.DestroySandboxRequest) (*pb.DestroySandboxResponse, error) {
    containerName := "coditect-sb-" + req.SandboxId
    cmd := exec.CommandContext(ctx, "docker", "rm", "-f", containerName)
    out, err := cmd.CombinedOutput()
    if err != nil {
        return &pb.DestroySandboxResponse{
            SandboxId: req.SandboxId,
            Status:    "ERROR",
            Error:     string(out),
        }, nil
    }
    return &pb.DestroySandboxResponse{
        SandboxId: req.SandboxId,
        Status:    "TERMINATED",
    }, nil
}

func (s *agentServer) ReportMetrics(stream pb.SandboxAgent_ReportMetricsServer) error {
    for {
        report, err := stream.Recv()
        if err == io.EOF {
            return stream.SendAndClose(&pb.MetricsAcknowledge{Ok: true})
        }
        if err != nil {
            return err
        }
        // Write metrics to a local exporter / forward to the control plane, etc.
        _ = report
    }
}

func main() {
    creds, err := credentials.NewServerTLSFromFile("agent-cert.pem", "agent-key.pem")
    if err != nil {
        log.Fatal(err)
    }
    // Note: a UnaryInterceptor only covers CreateSandbox/DestroySandbox;
    // add a matching grpc.StreamInterceptor for ExecSandbox/ReportMetrics.
    server := grpc.NewServer(grpc.Creds(creds), authInterceptor())
    pb.RegisterSandboxAgentServer(server, &agentServer{})

    lis, err := net.Listen("tcp", ":8443")
    if err != nil {
        log.Fatal(err)
    }
    log.Println("Sandbox agent listening on :8443")
    if err := server.Serve(lis); err != nil {
        log.Fatal(err)
    }
}

func authInterceptor() grpc.ServerOption {
    return grpc.UnaryInterceptor(func(
        ctx context.Context,
        req interface{},
        info *grpc.UnaryServerInfo,
        handler grpc.UnaryHandler,
    ) (interface{}, error) {
        if p, ok := peer.FromContext(ctx); ok && p.AuthInfo != nil {
            // Check mTLS cert SANs / SPIFFE IDs here.
        }
        return handler(ctx, req)
    })
}

// docker expects plain values here: --cpus "1.50", --memory "512m".
func formatCPU(cpu float64) string { return fmt.Sprintf("%.2f", cpu) }
func formatMem(mib int64) string   { return fmt.Sprintf("%dm", mib) }

3. Authentication and authorization patterns

  • mTLS for service-to-service
    • Use gRPC with TLS and mutual TLS; control-plane and agents each present X.509 certs.185174
    • Encode identity in certificate SAN / SPIFFE ID (e.g., spiffe://coditect/workstation/<id>).
    • Agent only accepts connections from certs signed by Coditect CA with appropriate SAN prefix; control plane similarly validates agents.174186185
  • Per-call metadata (RBAC)
    • Control plane includes tenant_id, project_id, and sandbox_id in gRPC metadata.
    • Agent uses cert identity + metadata to validate that the caller is allowed to manage that sandbox on that workstation.
  • Least privilege
    • Only Coditect control plane uses SandboxAgent API; user sessions never talk directly to Workstations.
    • Per-tenant quotas and allowed operations enforced at control-plane before calling agent.
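The SAN check can be reduced to a pure helper; `peerAllowed` and the `spiffe://coditect/control-plane/` prefix are assumptions, and in the real interceptor the URI SANs would come from `peer.AuthInfo` via `credentials.TLSInfo.State.PeerCertificates`:

```go
package main

import (
	"fmt"
	"strings"
)

// Hypothetical trust rule: the agent only accepts peers whose certificate
// carries a SPIFFE URI SAN under the Coditect control-plane prefix.
const allowedPrefix = "spiffe://coditect/control-plane/"

// peerAllowed checks the URI SANs extracted from the peer certificate.
func peerAllowed(uriSANs []string) bool {
	for _, u := range uriSANs {
		if strings.HasPrefix(u, allowedPrefix) {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(peerAllowed([]string{"spiffe://coditect/control-plane/api-0"})) // true
	fmt.Println(peerAllowed([]string{"spiffe://other/origin"}))                 // false
}
```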

4. Streaming stdout/stderr over unstable networks

Patterns from gRPC streaming best practices:181184187173180

  • Chunked messages
    • Send stdout/stderr as small chunks (e.g., 32 KiB) in ExecStdout/ExecStderr messages to avoid head-of-line blocking and huge messages.
  • Application-level sequence numbers
    • Optionally add int64 seq to ExecStdout/ExecStderr and ExecRequest to detect gaps or duplicate messages if the control plane reconnects or needs to reconstruct logs.
  • Backpressure
    • Use gRPC flow-control; avoid buffering unbounded data server-side. If client is slow, writes will block, naturally throttling output.184181
  • Reconnect + resume
    • Treat Exec streams as long-lived but restartable sessions:
      • If connection drops, control plane can reconnect and reattach to the same sandbox in “logs-only” mode, starting from last-seen sequence number.
      • For reliability, also persist logs on disk (/workspace/.coditect/logs/<sandbox>.log) and expose a “download logs” API as a fallback.
  • Heartbeat messages
    • Periodic ExecStatus messages (e.g., every 5–10 seconds) let the control plane detect broken connections and mark runs as stale.
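One way to make resume concrete, assuming the optional `seq` field above were added: keep a bounded ring of recent chunks and replay everything newer than the client's last-seen sequence on reconnect (all names here are hypothetical):

```go
package main

import "fmt"

// chunk is a sketch of a sequence-numbered output fragment; in practice
// Seq would be the proposed int64 seq field on ExecStdout/ExecStderr.
type chunk struct {
	Seq  int64
	Data string
}

// replayAfter returns the chunks a reconnecting client still needs,
// given the last sequence number it acknowledged.
func replayAfter(ring []chunk, lastSeen int64) []chunk {
	out := []chunk{}
	for _, c := range ring {
		if c.Seq > lastSeen {
			out = append(out, c)
		}
	}
	return out
}

func main() {
	ring := []chunk{{1, "a"}, {2, "b"}, {3, "c"}}
	for _, c := range replayAfter(ring, 1) {
		fmt.Println(c.Seq, c.Data)
	}
	// 2 b
	// 3 c
}
```

If the ring has already evicted chunks older than `lastSeen`, the control plane falls back to the on-disk log file described above.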

5. gVisor sandbox lifecycle best practices

Based on gVisor docs and production usage patterns:188176177189190191

  • Immutable, minimal images
    • Build small base images for agent workloads; avoid writing to container root, use workspace volume for project data.190188
  • Strict resource limits
    • Always set CPU and memory limits on docker run --runtime=runsc; gVisor handles resource isolation via underlying cgroups.192193190
    • Apply wall-clock and idle timeouts; agent enforces timeouts and cleans up containers.
  • Network minimization
    • Use --network=none or minimal egress (proxy) for untrusted code, since many LLM-generated code exploits involve network exfiltration.194195
  • Lifecycle hooks
    • Define states: CREATING -> RUNNING -> IDLE -> TERMINATING -> TERMINATED, and enforce transitions centrally.
    • Periodically reconcile: list Docker containers with coditect-sb-* prefix, and clean up orphaned or long-idle sandboxes.
  • Observability
    • Expose per-sandbox metrics (CPU, memory, I/O) via MetricsReport, and integrate with a global metrics backend for alerts (e.g., runaway sandboxes, OOM loops).193192
  • Compatibility testing
    • Some syscalls and kernel features behave differently under gVisor; test typical workloads and frameworks (Python, Node, build tools) under runsc early.191196188190
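The reconcile step can be sketched as a pure diff between running container names (as listed by `docker ps --format '{{.Names}}'`) and the sandbox set the control plane believes should exist; `orphanedSandboxes` is a hypothetical helper:

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// orphanedSandboxes returns managed containers (coditect-sb-* prefix)
// that are running but no longer desired, i.e. candidates for cleanup.
func orphanedSandboxes(running []string, desired map[string]bool) []string {
	var orphans []string
	for _, name := range running {
		if !strings.HasPrefix(name, "coditect-sb-") {
			continue // not managed by the agent
		}
		id := strings.TrimPrefix(name, "coditect-sb-")
		if !desired[id] {
			orphans = append(orphans, name)
		}
	}
	sort.Strings(orphans)
	return orphans
}

func main() {
	running := []string{"coditect-sb-a1", "coditect-sb-b2", "unrelated"}
	desired := map[string]bool{"a1": true}
	fmt.Println(orphanedSandboxes(running, desired)) // [coditect-sb-b2]
}
```

A periodic loop would feed this result into `docker rm -f` for each orphan, with idle-timeout checks layered on top.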

If you want, next prompt could be:

“Sketch the Git autosave strategy: exact branch naming scheme, commit-message conventions, and conflict-resolution flow for Coditect autosave branches pushed from Workstations to GitHub.” 197198199200


Sketch the Git autosave strategy: exact branch naming scheme, commit-message conventions, and conflict-resolution flow for Coditect autosave branches pushed from Workstations to GitHub

Use autosave as a separate, machine-owned history that never collides with human feature branches, and treat it as an append-only stream that can be squashed into clean commits later.201202203


Branch naming scheme

Goal: globally unique, machine-owned, easy to query by tenant/user/project and date.

  • Pattern
    • autosave/<tenant-slug>/<project-slug>/<user-slug>/<yyyymmdd>
  • Examples
    • autosave/acme/payments/api-user123/20260114
    • autosave/corp-ml/llm-router/alice/20260114

Properties:

  • Uses slash-separated hierarchy per best practices for descriptive branch names.204205206207
  • Rotates daily so branches don’t grow unbounded; you can archive old days or squash them.
  • Automatically indicates ownership (tenant, project, user, date).

Implementation details:

  • Slugify: lowercase, replace characters outside [a-z0-9-] with -, and truncate to length bounds to avoid absurd branch names.205206208
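The slugify rule can be sketched as follows (the 64-character cap is an assumed bound, not a Git limit):

```go
package main

import (
	"fmt"
	"strings"
)

// slugify lowercases, maps anything outside [a-z0-9-] to '-', collapses
// repeated dashes, trims edge dashes, and caps the length at 64.
func slugify(s string) string {
	s = strings.ToLower(s)
	var b strings.Builder
	prevDash := false
	for _, r := range s {
		ok := (r >= 'a' && r <= 'z') || (r >= '0' && r <= '9') || r == '-'
		if !ok {
			r = '-'
		}
		if r == '-' && prevDash {
			continue // collapse runs of dashes
		}
		prevDash = r == '-'
		b.WriteRune(r)
	}
	out := strings.Trim(b.String(), "-")
	if len(out) > 64 {
		out = out[:64]
	}
	return out
}

func main() {
	fmt.Println(slugify("Payments API (v2)")) // payments-api-v2
}
```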

Commit-message conventions

Autosave commits should be clearly machine-generated, compact, and self-describing.

  • Subject line pattern
    - `autosave: <short summary> [<user>@<timestamp>]`
  • Short summary
    • “workspace snapshot”, or
    • “edited (+N, -M lines)” if cheap to compute.

Examples:

  • autosave: workspace snapshot [alice@2026-01-14T04:05:12Z]
  • autosave: edited payment_service.py (+32 -8) [bob@2026-01-14T04:06:45Z]

Guidelines:

  • Keep subject ≤ 72 chars for readability.203209201
  • No body text unless you want to embed a JSON summary blob (file list, tools used), which can be parsed by Coditect later.
  • Consider a fixed prefix autosave: so automation can distinguish these commits from human ones.210203

Autosave workflow and frequency

Inside the Workstation (per user/project):

  1. Tracking changes
    • Watch /workspace/src for file changes (inotify) or poll git status --porcelain every N seconds.
  2. Autosave trigger
    • If there are uncommitted changes and no Git operation in progress (.git/index.lock absent), and last autosave > 15–30s ago, then autosave.
  3. Autosave algorithm
# Pseudocode
git status --porcelain
if dirty:
    branch="autosave/<tenant>/<project>/<user>/<yyyymmdd>"
    git fetch origin
    # -B creates or resets the branch; fall back to a fresh local branch
    # when the remote branch does not exist yet
    git checkout -B "$branch" "origin/$branch" || git checkout -B "$branch"
    git add -A
    git commit -m "autosave: workspace snapshot [user@timestamp]"
    git push origin "$branch"
- This keeps autosave isolated while allowing rebasing/merging from the main feature branch into autosave when needed.

  4. User feature branches
    • Users develop on normal branches like feature/PROJ-123-new-api.207205
    • A Coditect “promote” action later merges/squashes autosave changes into the user’s feature branch via PR or manual cherry-picking.


Conflict-resolution flow

Key principle: never rewrite autosave public history; treat it as an append-only log to avoid breaking in-flight Workstation pushes.202211

1. Push conflicts (non-fast-forward)

When autosave git push origin autosave/... fails due to non-FF:

  • The autosave daemon:
    • Runs git fetch origin autosave/....
    • Rebases local autosave branch onto the remote:
git fetch origin autosave/...
git rebase origin/autosave/... # safe if only automation writes here
git push origin autosave/...
  • Because only automation writes these branches, controlling rebase via tooling is acceptable; no human consumers rely on autosave’s exact history.211202

If rebase fails (e.g., due to manual edits):

  • Mark the autosave branch as dirty in Coditect metadata and stop autosaving until a resolver job or a human cleans it up.

2. Divergence from main branches

Autosave branches will drift from main / feature/*:

  • Periodically (or when user wants to “promote” changes), Coditect:
    • Checks out the user’s feature branch.
    • Merges or cherry-picks from the autosave branch:
      • Squash merge to produce one coherent commit:
git checkout feature/PROJ-123
git merge --squash autosave/acme/payments/api-user123/20260114
git commit -m "feat: integrate autosave work from 2026-01-14"
    - This avoids dragging noisy machine-generated history into the main branch.
  • After promotion, autosave continues accumulating; the next day rotates into a new .../20260115 branch.

3. Handling local uncommitted changes when switching branches

Autosave logic should not change the current checked-out branch in a way that disturbs the user:

  • Preferred pattern:
    • Keep user working on their feature branch.
    • Use separate clone or worktree for autosave if you need to track autosave branches explicitly, or:
    • Autosave directly on the current branch to a private remote (mirrored) and only push consolidated snapshots to GitHub branches.211

A practical compromise:

  • Treat autosave as branchless local commits in a hidden remote:
    • Local autosave repository pointing to a “shadow” remote or ref namespace (refs/autosave/...) and mirroring that to GitHub.211

But for simplicity in first iteration, using standard branches with automation allowed to re-checkout inside a detached worktree is usually sufficient.


Summary shape Coditect can adopt

  • Branch naming:
    • autosave/<tenant>/<project>/<user>/<yyyymmdd>
  • Commit messages:
    • autosave: workspace snapshot [user@ISO8601]
  • Autosave frequency:
    • Every 15–30 seconds of idle time with dirty working tree, batched and pushed to GitHub.
  • Conflict handling:
    • Automated rebase of autosave branches on push conflicts.
    • “Promotion” into human branches via squash merges or cherry-picks.
  • Safety:
    • Never auto-rewrite user feature branches.
    • Treat autosave branches as machine-only, append-only logs that can be squashed/archived after promotion.

If you want, the next iteration can be a shell/Go daemon sketch that implements this autosave loop, including detection of ongoing rebase/merge/commit operations to avoid interfering with active Git commands. 212213214215


Conflict-resolution steps when autosave push fails due to divergence from main branch

When an autosave push fails because the autosave branch has diverged from its remote (usually because main/feature branches moved and autosave was rebased/merged against them), use a clean, automated, autosave-only reconciliation flow.


1. Detect and classify the failure

On git push origin autosave/<tenant>/<project>/<user>/<date>:

  • If error is non-fast-forward (remote has commits that local doesn’t), classify as autosave divergence, not a main-branch conflict.
  • Important: treat autosave/* as machine-owned only; humans don’t commit there, so you can safely rewrite it without affecting user branches.216217

2. Reconcile autosave branch with remote

Steps inside the Workstation’s autosave daemon:

  1. Fetch remote state
git fetch origin autosave/<...>
  2. Rebase local autosave on remote
git checkout autosave/<...>
git rebase origin/autosave/<...>
- Because both sides are machine-generated, rebase should normally be trivial (fast-forward or linear append).

3. Resolve rare conflicts (if any)

If rebase pauses with conflicts (e.g., a previous autosave bug or manual edits to the autosave branch):

  • Abort the automated rebase and mark the autosave branch as “needs manual intervention”:

git rebase --abort

  • Record a flag in Coditect metadata (e.g., `autosave_conflict=true`) and stop autosaving for that branch until a background “repair job” or a human resolves it.

4. Retry push

git push origin autosave/<...>

3. Keeping in sync with main / feature branches

Autosave divergence from main/feature is expected; resolution should be one-way, from autosave into user branches, never the reverse.

  • Periodically (or on user action “sync autosave”), Coditect:
  1. Checks out user’s feature branch:
git checkout feature/PROJ-123
git pull origin feature/PROJ-123
  2. Integrates autosave history via squash merge:
git merge --squash autosave/<tenant>/<project>/<user>/<date>
git commit -m "feat: integrate autosave work from 2026-01-14"
  3. Pushes the feature branch and opens a PR if desired.
  • This avoids ever forcing main/feature to incorporate noisy autosave commit history and keeps autosave conflicts decoupled from “real” Git history.218216

4. Fallback strategy for persistent autosave failures

If autosave cannot rebase/push after N attempts (e.g., due to repository corruption or manual edits):

  • Stop autosave for that project/user and mark in Coditect UI: “Autosave paused due to Git conflicts.”
  • Keep local working tree intact so user doesn’t lose work.
  • Offer one-click “repair” action that:
    • Creates a new autosave branch (e.g., autosave/.../<date>-repair),
    • Commits the current workspace snapshot there,
    • Pushes and resumes autosave from that new branch.

This minimizes risk of data loss while keeping autosave history clean and machine-controlled.


USE GIT WORKTREES and research

  • Which branch prefix should Workstations create for autosave changes
  • How to include ticket and workstation ID in branch names
  • Exact commit message template for autosave snapshots
  • How to tag autosave commits for easy cleanup

Use worktrees so autosave runs in an isolated working copy per user/ticket, and encode ticket + workstation identity directly into branch and commit metadata for traceability and cleanup.219220221222


Branch prefix and naming with worktrees

Prefix

Use a dedicated autosave prefix and ticket-first naming, so you get uniqueness + traceability:223224225

  • Prefix: autosave/
  • Full pattern:
    • autosave/<ticket-id>/<tenant>/<project>/<user>/<workstation>/<yyyymmdd>

Examples:

  • autosave/PROJ-1234/acme/payments/alice/ws-01/20260114
  • autosave/NOTICKET/acme/spike-service/bob/ws-nyc-02/20260114

Notes:

  • Ticket-first aligns with common branch-naming best practices and makes it easy to correlate autosave streams to work-items.226225223
  • NOTICKET (or similar) can be used when there is no tracker ID.224

Worktree usage

For each active autosave context:

git worktree add ../.wt-autosave-PROJ-1234 \
autosave/PROJ-1234/acme/payments/alice/ws-01/20260114
  • The autosave daemon operates in ../.wt-autosave-PROJ-1234, leaving the main working tree free for the developer’s feature branch.220221222219

Commit message template

Use a strict, machine-readable template:

Template

autosave: <ticket-id> <workstation-id> <user> <ISO8601-UTC>

Examples:

  • autosave: PROJ-1234 ws-01 alice 2026-01-14T04:07:15Z
  • autosave: NOTICKET ws-nyc-02 bob 2026-01-14T04:08:01Z

Rationale:

  • Starts with autosave: to differentiate from human commits and allow easy filtering.227228229
  • Contains ticket ID and workstation ID to diagnose which environment generated the snapshot.
  • ISO timestamp makes it sortable and avoids locale issues.

Optional body (for Coditect-only consumption):

files:
- payment_service.py
- handlers/webhook.go
tool:
- claude-code
- gemini
summary: "Refactor payment flow and add retry logic"

Tags for autosave commits (for cleanup and indexing)

Treat tags as lightweight markers for autosave epochs, not for every commit.230231

Tag naming

  • Prefix: autosave/
  • Pattern: autosave/<ticket-id>/<user>/<yyyymmdd>/<n>

Examples:

  • autosave/PROJ-1234/alice/20260114/001
  • autosave/PROJ-1234/alice/20260114/002

Usage:

  • Optionally tag every Nth autosave commit (e.g., every 20th) to create anchors for cleanup and recovery.
  • Store mapping in Coditect (ticket → last autosave tag).
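A sketch of the tag pattern as a formatter; the three-digit zero-padded counter (an assumed width) keeps tags sortable with plain lexicographic ordering:

```go
package main

import "fmt"

// autosaveTag renders "autosave/<ticket-id>/<user>/<yyyymmdd>/<nnn>".
func autosaveTag(ticket, user, yyyymmdd string, n int) string {
	return fmt.Sprintf("autosave/%s/%s/%s/%03d", ticket, user, yyyymmdd, n)
}

func main() {
	fmt.Println(autosaveTag("PROJ-1234", "alice", "20260114", 2))
	// autosave/PROJ-1234/alice/20260114/002
}
```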

Cleanup

  • To prune old autosave history for a ticket/date after promoting to a clean feature branch, Coditect can:
    • Enumerate tags matching autosave/PROJ-1234/alice/20260114/*.
    • Decide retention (keep last tag, delete older ones).
    • Optionally delete old autosave branches past an age threshold (e.g., > 30 days).

This keeps autosave history discoverable but bounded.


Conflict handling with worktrees

  • Autosave worktree operates solely on its autosave branch; developer worktree remains on feature/PROJ-1234-something.221222219220
  • On push non-FF: autosave process uses its worktree to git fetch origin autosave/... && git rebase origin/autosave/... and re-push, with no impact on the developer’s branch.
  • If rebase fails repeatedly, Coditect can:
    • Create a fresh autosave branch and worktree with a suffix, e.g., autosave/PROJ-1234/.../20260114-repair,
    • Commit the current state there and continue, leaving the old branch as an artifact for manual inspection.

If you want, next step can be a small shell or Go daemon loop that: creates the autosave worktree from the feature branch, watches for changes, and applies this branch + commit + tag scheme automatically. 232233234235236


Should autosave branches be local or pushed to remote

  • Recommended prefix for workstation autosave branches
  • How to format branch names with ticket and workstation ID
  • Examples of safe, unique workstation identifiers
  • A collision-handling strategy for autosave branch names

Autosave branches should be pushed to the remote (for recovery and central visibility), but treated as machine-owned, low-value history that can be rotated and squashed.


Local vs remote autosave branches

  • Local-only autosave:
    • Pros: No noise in the remote, simpler Git perms.
    • Cons: No protection against Workstation loss; Coditect cannot see or manage autosave state centrally.
  • Pushed autosave branches (recommended):
    • Pros: Survives VM failure, enables Coditect-wide audit, consolidation, and cleanup; fits your “near real-time backup” goal.
    • Cons: Extra refs in the remote, but mitigated by strict naming and retention policies.

For Coditect, autosave branches should be pushed to the remote, with clear prefixes and automated pruning.


Use a distinct, machine-only prefix at the root:

  • Recommended: autosave/

Examples with hierarchy:

  • autosave/PROJ-1234/acme/payments/alice/ws-01/20260114
  • autosave/NOTICKET/corp-ml/llm-router/bob/ws-nyc-02/20260114

This keeps autosave clearly separated from feature/, bugfix/, etc., and allows easy listing/cleanup (git branch --list 'autosave/*').


Branch name format with ticket and workstation ID

Template:

`autosave/<ticket-id>/<tenant>/<project>/<user>/<workstation-id>/<yyyymmdd>`
  • <ticket-id>: JIRA-style or tracker ID (e.g., PROJ-1234), or NOTICKET when absent.
  • <tenant> / <project>: short slugs, kebab-case.
  • <user>: user slug (GitHub handle or SSO username).
  • <workstation-id>: short, stable identifier (see below).
  • <yyyymmdd>: date (UTC) to rotate branches daily.

Example:

  • autosave/PROJ-1234/acme/payments/alice/ws-01/20260114

Slugify each segment (lowercase, replace characters outside [a-z0-9-] with -, cap the length) to keep branch names safe and Git-friendly.


Safe unique workstation identifiers

Properties: stable per Workstation, non-sensitive, short, and unique enough within the repo.

Good options:

  • Short host-based ID (preferred):
    • E.g., ws-01, ws-nyc-02, ws-br-cj-01.
    • Derived from hostname + region or team: ws-<region>-<seq>.
  • Hash-based ID:
    • E.g., ws-6f3a9b, first 6–8 chars of a hash of instance ID.
  • Cloud instance metadata (normalized):
    • GCE instance name slugged: ws-<instance-name> (but keep short).

Avoid:

  • Raw IP addresses, internal hostnames that leak topology, or long opaque UUIDs; they clutter branch names and can expose infra details.

Handling branch naming collisions

A collision happens if two autosave processes produce the same branch name (e.g., two Workstations with misconfigured IDs). Strategy:

  1. Prevent collisions by design
    • Make <workstation-id> unique per repo or per project (control this in Coditect config).
    • Enforce uniqueness check at Workstation registration time in Coditect’s control plane.
  2. If collision still occurs (non-fast-forward push)

When git push origin autosave/... fails with a non-fast-forward error:

  • Try automated reconciliation:
    • git fetch origin autosave/...
    • If the remote branch was also autosave history from the same workstation (detected via commit metadata), rebase and push.
  • If the remote branch appears to be from a different workstation (commit messages show different workstation IDs):
    • Treat it as a collision and fork a new branch name with a numeric suffix:
      • Original: autosave/PROJ-1234/acme/payments/alice/ws-01/20260114
      • New (forked): autosave/PROJ-1234/acme/payments/alice/ws-01b/20260114, or autosave/PROJ-1234/acme/payments/alice/ws-01/20260114-2
    • Update the Workstation’s autosave config to use the forked branch going forward.
  3. Central coordination (Coditect)
    • Maintain a mapping in Coditect: (tenant, project, user, workstation-id, date) → branch name.
    • On registration or first autosave, the control plane assigns the canonical autosave branch name and ensures uniqueness; Workstations never invent it on their own.
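The rebase-vs-fork decision can be isolated into a small pure function (a hypothetical sketch; reconcile_autosave_push is an illustrative name, and extracting the remote workstation ID from commit metadata is assumed to happen elsewhere):

```python
def reconcile_autosave_push(branch: str, push_ok: bool,
                            local_ws: str, remote_ws: str,
                            attempt: int = 2) -> tuple[str, str]:
    """Return (action, branch) after a push attempt.

    action is one of:
      'done'   - push succeeded, keep the current branch
      'rebase' - remote is this workstation's own autosave history:
                 fetch, rebase, and retry the push
      'fork'   - remote belongs to another workstation: switch to a
                 suffixed branch (.../<yyyymmdd>-2) and retry
    """
    if push_ok:
        return "done", branch
    if remote_ws == local_ws:
        return "rebase", branch
    return "fork", f"{branch}-{attempt}"
```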

This gives you:

  • Remote-backed, recoverable autosave.
  • A clear naming scheme carrying ticket + workstation identity.
  • A collision strategy that always preserves data and isolates conflicting producers into new branches.

SUMMARIZE OUR FINDINGS

Coditect’s sandbox + autosave design converged on a practical, mostly open stack with clear separation of concerns across runtimes, execution control, and Git workflows.


Sandbox runtimes and tiers

  • For a Modal-like execution boundary, gVisor, Kata+Firecracker, Firecracker alone, and Wasm runtimes all emerged as viable building blocks, each trading isolation for performance and complexity. gVisor sits between containers and the host kernel via a user-space kernel, giving better isolation than runc with near-container startup. Kata+Firecracker adds VM-grade isolation via microVMs at the cost of higher overhead and more complex node setup, while Wasm (Wasmtime/Wasmer/WasmEdge) is ideal for capability-scoped tools with very fast startup but no full Linux.237238239240241242
  • A tiered architecture on Kubernetes was outlined: gVisor for semi-trusted workloads via a gvisor RuntimeClass, Kata+Firecracker for untrusted/public workloads via a kata-fc RuntimeClass bound to special node pools, and a Wasm tier using either a Wasm-aware runtimeClass or a dedicated Wasm executor service. Runtime choice is policy-driven, based on tenant, project, and risk level, with shared observability, quotas, and audit logging across all tiers.242243244245246

gVisor on Google Cloud Workstations

  • For Coditect on Google Cloud Workstations (no Kubernetes), the recommended approach is to run gVisor (runsc) as an alternate container runtime inside each Workstation VM, treating each agent execution as a gVisor sandbox container. Google’s own use of gVisor for serverless (Cloud Run, GKE Sandbox) shows it is a good fit for untrusted multi-tenant workloads with acceptable performance.247248249250251
  • Each team gets one or more Workstations; a Coditect agent on each VM handles docker run --runtime=runsc ... with CPU/memory/time limits, mounts per-tenant project workspaces, streams stdout/stderr back to the control plane, and periodically reports resource usage for quotas and billing. gVisor provides an extra isolation boundary inside each VM without needing nested virtualization or microVM orchestration.250251252253

Control plane ↔ agent gRPC API

  • A minimal gRPC API was defined with a SandboxAgent service exposing CreateSandbox, ExecSandbox (bidirectional stream), DestroySandbox, and ReportMetrics (client stream). This shape matches gRPC streaming best practices for long-lived exec sessions and metrics telemetry.254255256257
  • A Go agent implementation sketch uses Docker with the runsc runtime to launch sandbox containers, wires docker exec stdout/stderr into the ExecSandbox stream, reads stdin and control messages from the control plane, and reports exit codes and resource usage. Authentication and authorization are handled via mTLS (X.509 certs with SAN/SPIFFE IDs) and gRPC metadata, with the control plane enforcing tenant/project quotas before calling the agent.252258259260261262

Streaming and lifecycle best practices

  • For stdout/stderr over unstable networks, the design uses chunked messages (e.g., 32 KiB), backpressure via gRPC’s flow control, optional application-level sequence numbers, heartbeat ExecStatus messages, and the ability to reconnect and reattach to running sandboxes, with local log files as a fallback. These patterns follow gRPC streaming guidance for robust long-running streams.256257262263254
  • gVisor lifecycle guidance includes: minimal immutable images; strict per-container CPU/memory/time limits; default-deny or tightly restricted network; periodic reconciliation to clean up orphan containers; and testing workloads under gVisor to catch syscall/compatibility quirks.251253264265266267250
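The chunked-streaming idea above (fixed-size chunks plus application-level sequence numbers) reduces to a few lines; this sketch assumes the payload is already buffered as bytes:

```python
def chunk_stream(data: bytes, chunk_size: int = 32 * 1024):
    # Yield (seq, chunk) pairs. The sequence number lets the receiver
    # detect gaps after a reconnect and re-request from its last
    # acknowledged chunk instead of restarting the whole stream.
    for seq, offset in enumerate(range(0, len(data), chunk_size)):
        yield seq, data[offset:offset + chunk_size]
```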

Git autosave with worktrees and remote branches

  • Autosave is treated as a machine-owned, remote-backed history to provide near–real time backup and central visibility. Dedicated autosave branches are pushed to the remote, using a prefix like autosave/ and a hierarchical name including ticket, tenant, project, user, workstation, and date, e.g.: autosave/PROJ-1234/acme/payments/alice/ws-01/20260114. This follows general branch-naming best practices while keeping autosave clearly segregated from human branches.268269270271272
  • Git worktrees are used to isolate autosave operations from the user’s main working copy: a separate worktree pointed at the autosave branch lets the autosave daemon commit and push snapshots without disturbing the developer’s current branch. Prior art and guides show this pattern is well-suited for multi-branch workflows and AI tooling.273274275276
  • Commit messages follow a strict template such as autosave: PROJ-1234 ws-01 alice 2026-01-14T04:07:15Z, making them easy to filter and audit. Autosave-only tags like autosave/PROJ-1234/alice/20260114/001 can mark epochs for cleanup. Conflict handling uses automated fetch+rebase on autosave branches (machine-only), with collision detection on workstation IDs and, if needed, creation of new suffixed branch names (e.g., .../20260114-2), while leaving user feature branches untouched. This aligns with Git guidance to keep main history clean while allowing frequent automated commits.277278279280281

CREATE A SDD TDD AND ADR describing the CODITECT GOOGLE CLOUD WORKSTATION IMPLEMENTATION as research, analyzed and recommended

Below is a concise SDD, TDD, and ADR set for the Coditect + Google Cloud Workstations + gVisor implementation as designed and researched.


1. SDD – System Design Document

1.1 Overview

Coditect runs multi-tenant, agentic development workflows on Google Cloud Workstations, using gVisor (runsc) inside each Workstation VM as a Modal-like sandbox runtime for untrusted agent code.282283284285

Core elements:

  • Google Cloud Workstations cluster (controller + gateway) providing per-team IDE VMs with persistent disks and VPC access.283284
  • Per-VM Coditect Agent exposing a gRPC SandboxAgent service and managing docker --runtime=runsc containers.286287
  • Coditect Control Plane (API + Orchestrator) that exposes /sandboxes to clients, enforces quotas and IAM, and orchestrates Workstations/agents.
  • Git-backed project workspaces with autosave branches using Git worktrees and machine-owned branches pushed to GitHub for near real-time persistence.288289290291

1.2 Architecture components

  1. Google Cloud Workstations
    • Managed cluster per region; Workstations are GCE VMs managed by the Workstations controller and reachable via a gateway.284283
    • Each Workstation VM has:
      • Docker or containerd configured with gVisor runsc runtime.287282286
      • Coditect Agent daemon (gRPC server).
      • Workspace root: /workspaces/<tenant>/<user>/<project> on the persistent disk.
  2. gVisor sandbox runtime
    • Installed via runsc install and configured as Docker runtime runsc.286287
    • Sandbox containers launched by agent as:
docker run --runtime=runsc \
--cpus=<limit> --memory=<limit> \
--network=none --read-only \
-v /workspaces/...:/workspace \
--name coditect-sb-<id> <image> sleep infinity
    • Provides stronger isolation between agent code and the Workstation OS by interposing a user-space kernel.[^17_11][^17_12][^17_13][^17_4][^17_1]

  3. Coditect Control Plane
    • Exposes HTTP API (/sandboxes, /exec, /destroy) to Coditect UI and orchestration agents.
    • Maintains metadata DB (tenants, projects, sandboxes, quotas, autosave branches).
    • Implements gRPC client to per-Workstation SandboxAgent.
  4. SandboxAgent gRPC API (per Workstation)
    • Proto (summarized):
      • CreateSandbox(CreateSandboxRequest) -> CreateSandboxResponse
      • ExecSandbox(stream ExecRequest) -> (stream ExecEvent) (bidirectional).292293294
      • DestroySandbox(DestroySandboxRequest) -> DestroySandboxResponse
      • ReportMetrics(stream MetricsReport) -> MetricsAcknowledge
    • Control Plane selects a Workstation, calls CreateSandbox, then runs exec sessions via ExecSandbox.
  5. Git autosave + worktrees
    • For each (tenant, project, user, ticket, workstation, date) Coditect creates a worktree checked out to an autosave branch:289290291288

Branch pattern:

`autosave/<ticket-id>/<tenant>/<project>/<user>/<workstation-id>/<yyyymmdd>`

  • Autosave daemon in the Workstation’s context:
    • Watches workspace changes.
    • Periodically runs git add -A, git commit with the machine commit template, and git push origin autosave/....

1.3 Data flows

  1. Sandbox lifecycle
    • Client → Control Plane: POST /sandboxes with tenant/project/session info.
    • Control Plane: quota + IAM checks; chooses Workstation; calls CreateSandbox on its agent.
    • Agent: launches gVisor container and returns sandbox/container IDs.
    • Control Plane: records sandbox metadata and returns sandbox handle.
  2. Execution + streaming
    • Client → Control Plane: POST /sandboxes/{id}/exec.
    • Control Plane ↔ Agent: ExecSandbox stream.
      • Control Plane sends ExecStart and optional ExecStdin.
      • Agent streams ExecStdout, ExecStderr, ExecStatus, ExecExit.293292
  3. Metrics + quotas
    • Agent periodically sends MetricsReport (CPU seconds, peak mem, bytes I/O) to Control Plane.295296297
    • Control Plane updates usage counters per tenant/project and may deny new sandboxes or terminate existing ones when quotas exceeded.
  4. Git autosave
    • Autosave daemon operates in autosave worktree, committing snapshots and pushing to GitHub.298288
    • Coditect central DB tracks mapping: (tenant, project, user, ticket, workstation, date) -> autosave branch.

1.4 Non-functional requirements

  • Security:
    • gVisor sandbox isolating untrusted code from Workstation host kernel and other workloads.299300301285282
    • mTLS between Control Plane and Agents, strict RBAC in Control Plane.302303
  • Reliability:
    • Resilient streaming with backpressure and reconnect semantics for stdout/stderr.294304292
    • Autosave branches on remote Git for recovery if Workstation fails.
  • Performance:
    • gVisor performance tuned with recent FS improvements (VFS2/LISAFS) to keep overhead close to containers for typical I/O patterns.297305

2. TDD – Technical Design Details

2.1 gVisor configuration on Workstations

  • Install runsc from gVisor releases.306287
  • sudo runsc install to add runsc runtime to Docker and update daemon.json.287286
  • Restart Docker; test with docker run --runtime=runsc hello-world.286
  • Hardening: configure Docker to use cgroupfs when required by gVisor, per docs.287

2.2 SandboxAgent implementation (Go)

  • gRPC server with TLS + mTLS; SandboxAgent service from proto.307308302
  • CreateSandbox builds docker run args and executes them using exec.CommandContext.
  • ExecSandbox uses bidirectional streams:292293294
    • On first message (with ExecStart), start docker exec and attach to stdout/stderr.
    • Forward stdout/stderr as chunked ExecStdout/ExecStderr events.
    • Accept ExecStdin messages and write to process stdin.
    • On termination or timeout, send ExecExit.
  • ReportMetrics reads from Docker stats/cgroup FS and streams metrics periodically to Control Plane.296295

2.3 Control Plane internals

  • Scheduler:
    • Workstation registry with capacity metrics.
    • Placement algorithm (simple round-robin, least-loaded, or tenant-aware).
  • Quota service:
    • DB schema for per-tenant/project quotas and usage counters.
    • Atomic operations to “reserve” and “release” capacity when sandboxes start/stop.
  • IAM:
    • JWTs or session tokens tying user to tenant/project; Control Plane enforces ACLs at /sandboxes API.
  • Audit logging:
    • Append-only event log of sandbox create/exec/destroy with tenant/project IDs, user, and model (e.g., Claude/Gemini) context.

2.4 Git autosave + worktrees details

  • For each active ticket/workstation/project:
git worktree add ../.wt-autosave-$ID autosave/<ticket>/<tenant>/<project>/<user>/<ws>/<yyyymmdd>
  • Autosave daemon in that worktree:290291288289
    • Debounced loop: if dirty and no index.lock, then:
git add -A
git commit -m "autosave: PROJ-1234 ws-01 alice 2026-01-14T04:07:15Z"
git push origin autosave/...
  • On non-FF push: git fetch + git rebase origin/autosave/... and retry; if conflict persists, create suffix branch and update mapping.
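The debounced commit-and-push pass can be sketched as follows (illustrative only; the injectable run parameter stands in for subprocess.run so the command sequence can be tested without a real repo):

```python
import subprocess

def autosave_snapshot(worktree: str, branch: str, ticket: str,
                      ws: str, user: str, timestamp: str,
                      run=subprocess.run) -> list[list[str]]:
    # One autosave pass in the dedicated worktree: stage everything,
    # commit with the machine template, and push the autosave branch.
    msg = f"autosave: {ticket} {ws} {user} {timestamp}"
    cmds = [
        ["git", "-C", worktree, "add", "-A"],
        ["git", "-C", worktree, "commit", "-m", msg],
        ["git", "-C", worktree, "push", "origin", branch],
    ]
    for cmd in cmds:
        run(cmd, check=True)
    return cmds
```

Non-fast-forward push failures would be caught around the push step and handed to the fetch+rebase / fork logic described above.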

2.5 Integration with Claude Code and Gemini

  • Control Plane keeps project context (Git repo URLs, file trees, tests).
  • When user invokes AI action, Control Plane:
    • Pulls autosave branch or feature branch;
    • Calls Claude Code or Gemini with repo snapshot and task;
    • Writes modifications into workspace;
    • Triggers tests inside gVisor sandbox;
    • Autosave daemon snapshots changes to GitHub.

3. ADR – Architecture Decision Record

Title: Use Google Cloud Workstations with gVisor (runsc) and Workstation-local gRPC agents for Coditect sandboxes and Git autosave.

Context

Coditect must execute untrusted, multi-tenant agent code for software development workflows, integrate with GitHub for near real-time project backups, and support AI agents (Claude Code, Gemini) operating over codebases. Requirements include:

  • Stronger isolation than plain containers on shared hosts.
  • Economic viability and simplicity on GCP.
  • Deep IDE integration via Google Cloud Workstations.283284
  • Centralized quotas, audit, and streaming execution control for sandboxes.

Decision

  1. Use Google Cloud Workstations as the primary execution environment, with each Workstation being a GCE VM managed by Workstations controller/gateway in Coditect’s VPC.284283
  2. Install gVisor’s runsc runtime on each Workstation and configure Docker to use runsc for sandbox containers, leveraging gVisor’s user-space kernel for added isolation.300282286287
  3. Deploy a Coditect Agent on each Workstation that exposes a gRPC SandboxAgent service, implements sandbox lifecycle via Docker+runsc, streams stdout/stderr, and reports metrics.293294292
  4. Keep a central Coditect Control Plane exposing /sandboxes to clients, performing IAM + quota checks, and orchestrating work across Workstations and agents.
  5. Implement Git autosave using remote-pushed autosave branches and git worktrees, with branches named autosave/<ticket>/<tenant>/<project>/<user>/<workstation>/<yyyymmdd> and machine-generated commits pushed to GitHub, giving near real-time backup.[^17_10][^17_28][^17_7][^17_8][^17_9]

Rationale

  • gVisor is designed as an application kernel for containers and is widely used within Google’s own serverless and container platforms, providing a well-tested isolation layer suitable for untrusted code.301285282299300
  • Google Cloud Workstations architecture already gives per-team VMs, persistent disks, and VPC connectivity, and can be centrally managed and snapshotted.283284
  • Using gRPC streaming between control plane and agents matches best practices for long-lived process I/O (stdout/stderr) and metrics, and keeps the control plane simple and language-agnostic.304294292293
  • Git worktrees and autosave branches provide local isolation and remote resiliency: work continues even if autosave fails, and GitHub holds recovery points if a Workstation dies.288289290291

Alternatives considered

  • Kubernetes + Kata/Firecracker: Stronger isolation but higher operational cost and complexity on GCP; less aligned with the Workstations-first developer experience.309310311
  • Remote-only autosave (no worktrees): Simpler but interferes with the developer’s active branch and makes automation riskier; worktrees were chosen to isolate automation.312313

Consequences

  • Pros
    • Stronger isolation than vanilla containers with minimal changes to the developer experience.285282300301
    • Clear separation between Control Plane and Workstation Agent responsibilities.
    • Robust autosave and recovery with GitHub as the durable store.
  • Cons
    • Requires OS-level access to Workstations to install runsc and configure Docker.306286287
    • gVisor compatibility quirks must be tested for typical dev workloads and tools.314315297

This ADR, SDD, and TDD give a concrete blueprint for implementing the Coditect Google Cloud Workstation architecture as designed in the research. 316317


Define the scope and success criteria for the SDD TDD and ADR

List stakeholders and approval authorities for the documents
Provide an outline template for the SDD specific to Cloud Workstations
Provide an outline template for the TDD focused on gVisor and runsc
Provide an ADR template with decision options and tradeoffs section

For Coditect’s Google Cloud Workstations + gVisor effort, the SDD, TDD, and ADR should have explicit scope, measurable success criteria, clear owners, and reusable templates tailored to this stack.


Scope and success criteria

SDD (System Design Document)

  • Scope
    • End-to-end system architecture for Coditect on Google Cloud Workstations: control plane, Workstations, Coditect agent, gVisor sandboxes, Git autosave, IAM, quotas, and observability.
  • Success criteria
    • All stakeholders can describe the system at a high level from the SDD alone.
    • Architecture decisions are consistent with GCP Workstations and gVisor capabilities (no “magic infra”).318319320
    • Interfaces between components (API, gRPC, Git, IAM) are unambiguous enough to drive TDD work.

TDD (Technical Design Details)

  • Scope
    • Implementation-level design for gVisor and runsc integration on Workstations, the SandboxAgent gRPC service, and autosave Git worktrees.
    • Container launch and lifecycle, resource limits, metrics collection, and integration with Docker/containerd.321322323324
  • Success criteria
    • Go/Rust implementation teams can build the agent and control plane without guessing behaviors.
    • Operational teams can configure runsc and Docker on Workstations using only TDD steps.325321
    • gRPC APIs and message schemas are stable enough to generate client/server stubs and tests.326327328

ADR (Architecture Decision Record)

  • Scope
    • Capture major architectural choices (Workstations vs GKE, gVisor vs microVMs, autosave strategy, etc.), rationale, and implications.
  • Success criteria
    • Future engineers can understand why Workstations + gVisor was chosen and what alternatives were rejected.
    • Changes to the architecture can be evaluated against documented decisions and tradeoffs.

Stakeholders and approval authorities

  • Product / Platform Lead (Coditect)
    • Owns overall platform direction and approves SDD/ADR alignment with product roadmap.
  • Chief Architect / Principal Engineer
    • Accountable for SDD and ADR technical soundness and long-term maintainability.
  • Infra / DevOps Lead
    • Approves TDD sections on Workstations provisioning, Docker/containerd + gVisor setup, monitoring, and rollout.319322318325
  • Security / Compliance Officer
    • Reviews SDD/TDD for sandbox isolation, IAM, audit logging, and data handling; signs off on ADR security tradeoffs (gVisor vs microVM).320323329330
  • Team Leads (Agent Orchestration, Git Integration)
    • Ensure SDD/TDD requirements are implementable by their teams; sign off on scope and milestones.

Approval suggestion:

  • SDD: Product Lead + Chief Architect + Security.
  • TDD: Chief Architect + Infra Lead + relevant Team Leads.
  • ADR: Chief Architect + Security Officer (and Product Lead if impact is high).

SDD outline template (Cloud Workstations–specific)

1. Document control

  • Version, date, author.
  • Reviewers and approvers (names/roles).
  • Related ADRs and TDDs.

2. Overview

  • Purpose and scope (Coditect on GCP Workstations).
  • Objectives (multi-tenant agent compute, near real-time Git persistence, safe untrusted code execution).

3. System context

  • Context diagram:
    • Coditect Control Plane, Google Cloud Workstations, GitHub, Anthropic Claude, Google Gemini, identity provider.318319
  • External dependencies (GCP services, GitHub, auth providers).

4. High-level architecture

  • Components:
    • Workstations cluster (controller/gateway, per-team VMs).319318
    • Coditect Control Plane.
    • Workstation Agent + gVisor runtime.320321
    • GitHub and autosave branches.
  • Deployment topology (regions, VPCs, projects).

5. Workstation and sandbox model

  • Workstation lifecycle (provisioning, scaling, deprovisioning).318319
  • Sandbox abstraction (one gVisor container per sandbox).323324330320
  • Workspace layout (/workspaces/<tenant>/<user>/<project>).

6. Control Plane responsibilities

  • /sandboxes API surface.
  • Scheduling logic (Workstation selection).
  • Quota enforcement and billing.
  • IAM model (tenants, projects, users, roles).

7. gRPC and messaging

  • Description of SandboxAgent gRPC services and message flows (Create/Exec/Destroy/ReportMetrics).327328326
  • Error handling and retry semantics.

8. Git integration and autosave

  • Git repository mapping (tenant/project → repo).
  • Autosave branch naming and worktree strategy.331332333334
  • Promotion from autosave to feature branches.

9. Non-functional requirements

  • Security (gVisor isolation, network policies, mTLS).329330335323320
  • Reliability and availability (Workstation/node failure behavior).
  • Performance expectations (latency, throughput, cost).336337
  • Observability (logging, metrics, tracing).

10. Risks and open questions

  • gVisor compatibility hot spots.338339340
  • Workstations lifecycle edge cases.
  • Future evolution (microVM tier, Wasm tier).

TDD outline template (gVisor + runsc–focused)

1. Document control

  • Version, date, author, reviewers.

2. Purpose and scope

  • Detailed design for:
    • gVisor installation and configuration on Workstations.
    • Docker/containerd runtime integration.
    • SandboxAgent implementation.
    • Metrics, logs, and lifecycle policies.

3. Workstation environment

  • Base OS/image and Workstations configuration.319318
  • Required packages (Docker/containerd, runsc, etc.).322325
  • Security hardening (user accounts, SSH, file permissions).

4. gVisor (runsc) setup

  • Installation steps (commands, versions) referencing gVisor docs.321322325
  • Docker/containerd configuration snippets (daemon.json, runtime definitions).
  • Validation tests (docker run --runtime=runsc hello-world).339321

5. Sandbox lifecycle implementation

  • Container naming and labels (coditect-sb-<id>).
  • CreateSandbox behavior (CPU/mem/network/volume args).
  • Exec behavior (PTY support, working dirs, env).
  • Destroy behavior and cleanup (timeouts, orphan detection).
  • Lifecycle state machine and transitions.
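A CreateSandbox implementation might assemble the docker run invocation like this (a sketch mirroring the launch command from the SDD; label keys such as coditect.sandbox_id are assumptions):

```python
def sandbox_run_args(sandbox_id: str, image: str, workspace: str,
                     cpus: float = 1.0, memory: str = "1g") -> list[str]:
    # gVisor runtime, resource limits, no network, read-only rootfs,
    # and a single bind-mounted per-tenant workspace, per the SDD.
    return [
        "docker", "run", "-d",
        "--runtime=runsc",
        f"--cpus={cpus}", f"--memory={memory}",
        "--network=none", "--read-only",
        "-v", f"{workspace}:/workspace",
        "--label", f"coditect.sandbox_id={sandbox_id}",
        "--name", f"coditect-sb-{sandbox_id}",
        image, "sleep", "infinity",
    ]
```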

6. SandboxAgent gRPC server

  • Service definitions (from proto).328326327
  • Go package layout (agent binary, config, logging).
  • Streaming implementation details:
    • stdout/stderr buffering and chunk size.
    • Stdin handling and control messages.
    • Heartbeats and idle detection.

7. Metrics and logging

  • Metrics collection (Docker stats, cgroups, sampling interval).337341342
  • Mapping to MetricsReport fields and quota counters.
  • Log routing from Workstations to central logging (e.g., via fluentd/Vector).

8. Security and authN/Z

  • mTLS configuration (cert distribution, rotation, validation).335343
  • Agent-side checks on caller identity (peer cert SANs).
  • Least-privilege OS users and Docker group configuration.

9. Failure handling and resiliency

  • Behavior on gRPC disconnects (Exec retries, metrics stream reconnection).344328
  • Handling of Docker failures and runsc errors.
  • Health checks for Agent and gVisor.

10. Testing strategy

  • Unit tests (command building, gRPC handlers).
  • Integration tests (agent + control plane on a real Workstation).
  • Performance and soak tests under gVisor (CPU/I/O patterns).337339

ADR template (with options and tradeoffs)

Title: <Short decision title>
Status: Proposed | Accepted | Superseded
Date: <YYYY-MM-DD>
Authors: <names>

1. Context

  • Problem statement and constraints.
  • Relevant background (e.g., Coditect’s need for sandboxing untrusted code on GCP Workstations).330320318319

2. Decision

  • Clear statement of the chosen option.
  • Brief description of the solution (e.g., “Use gVisor (runsc) on Workstations with Docker, plus per-VM SandboxAgent gRPC service.”).324323330320321

3. Options considered

List each option with a short summary:

  • Option A: GCP Workstations + gVisor + Docker/runsc (chosen).330320321
  • Option B: GKE + Kata Containers/Firecracker.345346347
  • Option C: GKE + gVisor (GKE Sandbox) with remote IDE.348349
  • Option D: Direct Firecracker microVM pool on GCE.350

4. Tradeoffs

For each option, discuss:

  • Security
    • Isolation strength, blast radius, kernel sharing vs VM boundary.347323345350
  • Performance
    • Cold start, I/O overhead, CPU/memory footprint.336337
  • Operational complexity
    • Setup effort (Workstations vs GKE vs custom microVM), support, upgrades.346318319
  • Developer experience
    • IDE quality, latency, friction for AI-assisted workflows.
  • Cost
    • VM count, node types, overhead from isolation layer.

Clearly highlight why the chosen option wins overall in this context (e.g., best balance of security, DX, and implementation time).

5. Consequences

  • Positive
    • List benefits and what becomes simpler.
  • Negative / Risks
    • New complexity, vendor dependencies, potential performance or compatibility issues.338339337
  • Mitigations
    • How risks will be handled (e.g., compatibility testing, fallbacks to plain containers/microVMs).
  • Links to other ADRs (e.g., autosave strategy, multi-runtime tiering, AI model choices).

This gives you a structured way to lock in the Workstations + gVisor direction while keeping future alternatives explicit and comparable.


CODITECT runs DJANGO backend with front-end REACT TYPESCRIPT strict for user, tenant, project, license management with JWT tokens for individuals with RBAC - analyze and describe how this integrates into the GOOGLE CLOUD WORKSTATION provisioning, orchestration, and gVisor assignments for individuals working within shared Google Cloud Workstations

Coditect’s Django/React SaaS becomes the control plane and identity source for who can use which Cloud Workstations and which gVisor sandboxes run on them, with JWT+RBAC driving provisioning, orchestration, and sandbox assignment.351352353354355356357


Identity, JWT, and RBAC as the control layer

  • Django issues JWTs per user that embed user_id, tenant_id, project_ids, and role claims (roles: ["tenant_admin", "developer", ...]).
  • React TS frontend authenticates via these JWTs and calls Coditect’s backend (Django REST / GraphQL), not GCP directly.
  • Coditect backend maps its internal RBAC to Google Cloud IAM:
    • E.g., a tenant admin can request workstation templates but not edit cluster-wide settings.352358351
  • JWT subject and tenant/project claims are the canonical identity for:
    • Which Workstation(s) a user may attach to.
    • Which sandboxes (gVisor containers) they may start, exec into, or destroy on those Workstations.354359
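A deny-by-default authorization check over these claims might look like the following (hypothetical helper; the role names follow the examples above):

```python
def can_exec_sandbox(claims: dict, tenant_id: str, project_id: str) -> bool:
    # Deny by default: the JWT's tenant must match, the project must be
    # listed in the token, and the user needs a role that grants
    # sandbox execution.
    if claims.get("tenant_id") != tenant_id:
        return False
    if project_id not in claims.get("project_ids", []):
        return False
    allowed_roles = {"tenant_admin", "project_admin", "developer"}
    return bool(allowed_roles & set(claims.get("roles", [])))
```

Django would run this check (or its ORM-backed equivalent) before any CreateSandbox or ExecSandbox gRPC call reaches an Agent.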

Workstation provisioning driven by Django

  • Cloud Workstations are provisioned via GCP APIs or Terraform/Pulumi from a Coditect service account, not from the browser.360361356357
  • Django integrates user/tenant/project models with Workstation configs:
    • For each tenant/team, Coditect stores the Workstation cluster and one or more workstation configs (image, machine type, disk, tools).358356357
    • When a user creates or resumes a dev session from the React UI, Django:
      • Validates they have a role allowing dev sessions in that project.
      • Calls the Workstations API (with roles/workstations.workstationUser or workstationCreator bound to its service account) to create or attach to a workstation.351352354358
  • IAM bindings are managed centrally: tenant or team groups are bound to Workstation configs with appropriate roles; individual user access comes from Coditect’s RBAC layer mapping to these groups or directly to Workstations.352360358351

gVisor sandbox assignment per user in shared Workstations

Each Workstation VM runs a Coditect Agent that trusts only the control plane’s JWT-bearing requests and never exposes raw Docker/gVisor to end users directly:

  • The React frontend calls Django with the user’s JWT to request sandbox creation or execution.
  • Django validates RBAC (e.g., project:execute_sandbox) and then calls the Agent over gRPC as the only caller, using mTLS service identity and passing user/tenant/project context in metadata.359362363
  • The Agent runs per-sandbox gVisor containers (docker run --runtime=runsc ...) tagged with labels derived from the JWT claims: tenant_id, project_id, user_id, sandbox_id.364355365366367
  • This allows:
    • Multi-user, multi-tenant isolation inside a shared Workstation: each user’s agent instructions correspond to separate gVisor sandboxes, giving a user-space kernel boundary between them.353355364
    • The control plane to enforce per-user and per-project quotas on concurrent sandboxes and resource usage before a CreateSandbox call is made.

The net effect: Django’s JWT+RBAC governs not only access to Coditect features but also which gVisor sandboxes can be created and on which Workstations, with Google IAM governing underlying Workstation access.353354358351352 368369370371372


Map required IAM roles and permissions for each service component

Describe JWT token flow and refresh strategy for tenants
Design RBAC role hierarchy for users, tenants, and projects
Specify secure storage and rotation for signing keys
List API endpoints that need tenant isolation checks; only once per request cycle is required for caching or optimization

Coditect’s stack needs clear separation between Google IAM (for Workstations) and app-level JWT/RBAC (for tenants/projects). Below is a concise mapping and design.


IAM roles and permissions per component

1. Coditect Control Plane (Django backend) Service account: coditect-control-plane-sa

  • Project-level roles on the GCP project(s) hosting Workstations:
    • roles/workstations.admin for creating/managing workstation clusters/configs (if Coditect manages infra).373374
    • Or narrower:
      • roles/workstations.workstationCreator to create Workstations from configs.374375
      • roles/workstations.workstationUser if only starting/stopping and connecting.375374
  • Possibly roles/iam.serviceAccountUser on coditect-workstation-agent-sa if Workstations run agents with that SA and need to impersonate.376374

2. Workstation Agent (per VM) Service account: coditect-workstation-agent-sa

  • Minimal roles:
    • roles/logging.logWriter to send logs to Cloud Logging.
    • roles/monitoring.metricWriter if directly pushing metrics.
  • No direct Workstations API access needed; it only talks to the Control Plane via gRPC.

3. CI / Infra automation Service account: coditect-infra-sa

  • roles/workstations.admin to create/update Workstations clusters/configs.
  • roles/iam.serviceAccountAdmin only if managing service accounts for agents.

4. Human users

  • Google IAM roles for direct Workstations usage (if ever used outside Coditect):
    • Typically roles/workstations.user or roles/workstations.workstationUser mapped to groups, but ideally humans only interact via the Coditect UI.

JWT token flow and refresh strategy

Claims (access token)

  • Standard: sub, iat, exp, iss.
  • Custom:
    • tenant_id
    • user_id
    • project_ids (or current project)
    • roles (tenant/global: ["tenant_admin", "project_admin", "developer"])
    • Optional workstation_id when bound to a session.

Flow

  1. User logs in via SSO/OIDC; Django maps identity to tenant_id and roles.
  2. Django issues a short-lived access JWT (e.g., 15–30 minutes) signed with HS256/RS256.
  3. React TS frontend attaches this JWT in Authorization: Bearer header for API calls.
  4. Django validates token and uses claims for RBAC and tenant isolation checks per request.

Refresh

  • Maintain a longer-lived refresh token (HTTP-only, Secure cookie) mapped server-side to user/device.
  • When access token is near expiry, frontend calls /auth/refresh; Django:
    • Validates refresh token.
    • Issues new access JWT with updated claims (e.g., changed roles/permissions).
  • Immediate revocation: server-side invalidation list keyed by refresh token ID; access JWTs naturally expire soon.
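A minimal sketch of this issue/verify/refresh pattern, using a stdlib HMAC-SHA256 JWT purely for illustration. The function names and the in-memory revocation set are hypothetical; in production the signing key would come from KMS, revocation state would live in Redis, and you would use a vetted library (e.g., PyJWT) rather than hand-rolled encoding:

```python
import base64, hashlib, hmac, json, time

SIGNING_KEY = b"dev-secret"   # illustration only; load from KMS in production
ACCESS_TTL = 30 * 60          # 30-minute access tokens
revoked_refresh_ids = set()   # server-side revocation list (Redis in production)

def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def issue_access_token(user_id, tenant_id, roles):
    """Short-lived HS256 JWT carrying Coditect's custom claims."""
    header = {"alg": "HS256", "typ": "JWT"}
    now = int(time.time())
    claims = {"iss": "coditect", "sub": user_id, "iat": now, "exp": now + ACCESS_TTL,
              "tenant_id": tenant_id, "user_id": user_id, "roles": roles}
    signing_input = _b64(json.dumps(header).encode()) + "." + _b64(json.dumps(claims).encode())
    sig = hmac.new(SIGNING_KEY, signing_input.encode(), hashlib.sha256).digest()
    return signing_input + "." + _b64(sig)

def verify(token):
    head_b64, claims_b64, sig_b64 = token.split(".")
    expected = hmac.new(SIGNING_KEY, (head_b64 + "." + claims_b64).encode(), hashlib.sha256).digest()
    if not hmac.compare_digest(_b64(expected), sig_b64):
        raise PermissionError("bad signature")
    claims = json.loads(base64.urlsafe_b64decode(claims_b64 + "=" * (-len(claims_b64) % 4)))
    if claims["exp"] < time.time():
        raise PermissionError("expired")
    return claims

def refresh(refresh_token_id, user_id, tenant_id, roles):
    # Immediate revocation: reject refresh tokens on the server-side list.
    if refresh_token_id in revoked_refresh_ids:
        raise PermissionError("refresh token revoked")
    # New access token picks up current claims (roles may have changed).
    return issue_access_token(user_id, tenant_id, roles)
```

The sketch only demonstrates the claim set and the revocation flow; RS256 with a kid header (covered under key rotation below) is the better production choice.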

RBAC role hierarchy (users, tenants, projects)

Structure: tenant-scoped roles + project-scoped roles.

Tenant-level roles

  • tenant_owner
    • Full management of tenant settings, billing, all projects and Workstations within tenant.
  • tenant_admin
    • Manage projects, users, licenses; cannot change billing/legal.
  • tenant_auditor
    • Read-only access to logs, audit, and project configs.

Project-level roles

  • project_admin
    • Manage project membership, settings, Workstation configs for that project.
    • Can create/destroy sandboxes and adjust quotas within limits set by tenant.
  • developer
    • Create/exec/destroy sandboxes within project.
    • Access project repo, autosave, AI tools (Claude/Gemini) according to policies.
  • viewer
    • Read-only access to logs, code (if allowed), no sandbox execution.

Role mapping and evaluation

  • JWT contains both tenant and project roles, e.g.:
{
  "tenant_id": "t-acme",
  "user_id": "u-alice",
  "tenant_roles": ["tenant_admin"],
  "project_roles": {
    "proj-foo": ["project_admin"],
    "proj-bar": ["developer"]
  }
}
  • On each request, Django:
    • Checks tenant-level role for tenant-scoped endpoints (user management, workstation config).
    • Checks project role for project-scoped endpoints (sandboxes, autosave, AI runs).

Hierarchy:

  • tenant_owner ⊇ tenant_admin ⊇ {project_admin, developer, viewer}
  • project_admin ⊇ {developer, viewer}
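One way to evaluate this hierarchy in the Django layer is to expand each granted role into the set of roles it implies before checking. A sketch with illustrative names (`effective_project_roles` and `can` are not existing Coditect code):

```python
# Role implication tables mirroring the hierarchy above.
TENANT_IMPLIES = {
    "tenant_owner": {"tenant_owner", "tenant_admin", "tenant_auditor"},
    "tenant_admin": {"tenant_admin", "tenant_auditor"},
    "tenant_auditor": {"tenant_auditor"},
}
PROJECT_IMPLIES = {
    "project_admin": {"project_admin", "developer", "viewer"},
    "developer": {"developer", "viewer"},
    "viewer": {"viewer"},
}

def effective_project_roles(claims: dict, project_id: str) -> set:
    """Expand JWT claims into the effective role set for one project."""
    roles = set()
    for r in claims.get("project_roles", {}).get(project_id, []):
        roles |= PROJECT_IMPLIES[r]
    # tenant_owner/tenant_admin sit above project_admin in the hierarchy.
    for r in claims.get("tenant_roles", []):
        if r in ("tenant_owner", "tenant_admin"):
            roles |= PROJECT_IMPLIES["project_admin"]
    return roles

def can(claims: dict, project_id: str, required: str) -> bool:
    return required in effective_project_roles(claims, project_id)

claims = {
    "tenant_id": "t-acme",
    "tenant_roles": ["tenant_admin"],
    "project_roles": {"proj-foo": ["project_admin"], "proj-bar": ["developer"]},
}
```

Precomputing implication sets keeps per-request checks to set membership, which fits the once-per-request caching pattern described later.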

Secure storage and rotation for signing keys

Key types

  • Access/refresh token signing keys (JWT).
  • mTLS certs/keys for gRPC between Control Plane and Agents.

Storage

  • Store JWT signing keys in a managed KMS (e.g., Google Cloud KMS) and never embed them in images/env vars.
    • Django uses KMS to sign/verify tokens or loads keys from KMS at startup with caching.
  • Store mTLS certs/keys in:
    • Secret manager or KMS, distributed to Workstations via startup scripts or Workstation images.

Rotation

  • JWT signing:
    • Use a key ID (kid) in JWT header and maintain a keyset (current + previous).
    • Rotate keys periodically (e.g., every 90 days) by introducing new key, updating keyset, and invalidating old one once old tokens expire.
  • mTLS certs:
    • Issue short-lived certs per agent (e.g., via internal CA or GCP CA Service).
    • Implement automated renewal and hot-reload on agents and control plane.
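The kid-based keyset above can be sketched as follows (keyset contents and the `rotate` helper are illustrative; real keys would live in KMS and rotation would be an operational workflow, not an in-process mutation):

```python
import hashlib, hmac

# Verification accepts current + previous keys so tokens signed before a
# rotation remain valid until they naturally expire.
KEYSET = {
    "2026-01": b"new-signing-key",   # current
    "2025-10": b"old-signing-key",   # previous, kept until old tokens expire
}
CURRENT_KID = "2026-01"

def sign(payload: bytes, kid: str = None):
    kid = kid or CURRENT_KID
    return kid, hmac.new(KEYSET[kid], payload, hashlib.sha256).digest()

def verify(payload: bytes, kid: str, sig: bytes) -> bool:
    key = KEYSET.get(kid)
    if key is None:          # kid already rotated out -> reject
        return False
    return hmac.compare_digest(hmac.new(key, payload, hashlib.sha256).digest(), sig)

def rotate(new_kid: str, new_key: bytes) -> None:
    """Introduce a new current key; keep only the immediately previous one."""
    global CURRENT_KID
    KEYSET[new_kid] = new_key
    for kid in list(KEYSET):
        if kid not in (new_kid, CURRENT_KID):
            del KEYSET[kid]
    CURRENT_KID = new_kid
```

The same pattern applies whether the kid selects an HMAC secret (as here) or an RSA public key fetched from a JWKS endpoint.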

Endpoints requiring tenant isolation checks (once per request)

These are endpoints where you must validate tenant and project from JWT against the resource, but can then cache the result for the duration of the request to avoid repeated lookups:

  • Auth-related endpoints
    • POST /auth/refresh (validates tenant/user but no heavy tenant lookup).
  • Tenant-scoped management
    • GET /tenants/{tenant_id}
    • GET/POST /tenants/{tenant_id}/users
    • GET/POST /tenants/{tenant_id}/workstations/configs
    • GET/POST /tenants/{tenant_id}/projects
  • Project-scoped execution and resources
    • POST /projects/{project_id}/sandboxes
    • POST /projects/{project_id}/sandboxes/{id}/exec
    • DELETE /projects/{project_id}/sandboxes/{id}
    • GET /projects/{project_id}/sandboxes (list)
    • GET/POST /projects/{project_id}/autosave/* (branch info, promotion)
    • POST /projects/{project_id}/ai/claude-run
    • POST /projects/{project_id}/ai/gemini-run

Pattern:

  1. At the beginning of each request, Django:
    • Extracts JWT; verifies signature and expiry.
    • Resolves tenant + project membership and roles once (potentially cached in Redis per token ID).
  2. Stores a request-scoped context (tenant, projects, roles).
  3. Downstream handlers and database calls rely on this context instead of re-checking tenant isolation repeatedly.

This keeps tenant isolation robust and explicit while minimizing redundant checks on a per-request basis.
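A Django-flavoured sketch of the once-per-request pattern, with a hypothetical `TenantContext` attached to the request object (the middleware class and field names are illustrative, not existing Coditect code):

```python
from dataclasses import dataclass, field

@dataclass
class TenantContext:
    tenant_id: str
    user_id: str
    roles: dict = field(default_factory=dict)   # project_id -> [roles]

class TenantIsolationMiddleware:
    """Resolve JWT claims once per request; handlers reuse request.tenant_ctx."""

    def __init__(self, get_response, verify_jwt):
        self.get_response = get_response
        self.verify_jwt = verify_jwt            # signature + expiry check

    def __call__(self, request):
        token = request.headers["Authorization"].removeprefix("Bearer ")
        claims = self.verify_jwt(token)
        # One isolation resolution per request; downstream code reads this
        # context instead of re-checking tenant membership.
        request.tenant_ctx = TenantContext(
            tenant_id=claims["tenant_id"],
            user_id=claims["user_id"],
            roles=claims.get("project_roles", {}),
        )
        return self.get_response(request)
```

In a real deployment `verify_jwt` would also consult the Redis cache keyed by token ID mentioned above, so repeated requests with the same token skip the membership lookup entirely.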


ANYTHING ELSE WE SHOULD EXPAND ON?

Two areas are worth expanding next, given your goals and stack:


1. Security / abuse-resistance of sandboxes

For Coditect-as-SaaS, you likely want a more explicit security model for:

  • Threat modeling
    • Enumerate attacker types (malicious tenant, compromised agent, model-generated exploit) and what gVisor + VM boundaries do or do not protect against.
  • Network controls inside Workstations
    • Egress controls per sandbox (DNS, HTTP, cloud metadata access), ideally via iptables/eBPF or a local proxy.
  • File system & secrets isolation
    • Exact pattern for which secrets are visible to sandboxes (e.g., no direct cloud creds; only tightly scoped per-project tokens).
  • Abuse detection
    • Metrics/heuristics for detecting abuse: long-running CPU-bound loops, port scans, mass outbound requests, etc., and how the control plane auto-kills or bans.

You could capture this as a Security Design Addendum to the SDD/TDD, tying gVisor’s documented guarantees and limitations into concrete policies.


2. Developer UX and AI workflow integration

You’ve defined infra and autosave; two UX layers would benefit from a design pass:

  • Session model and UI semantics
    • How Coditect represents “I’m working in project X on Workstation Y with sandboxes Z1–Zn” in the React UI and APIs.
    • Standard flows: start session, resume session, sync autosave -> feature branch, run tests via Claude/Gemini, review and promote changes.
  • AI tool contracts
    • Define a small, explicit “AI tool” API surface (for Claude Code / Gemini) that describes:
      • What they can read/write (files, branches).
      • How they request sandboxes (e.g., a “run tests” tool always goes through /projects/{id}/sandboxes).
    • Logging / provenance format tying AI actions to Git commits and sandbox executions for audit and explainability.

If you want to go deeper, good next prompts would be:

  • “Design a security threat model and concrete mitigations for Coditect’s gVisor-based sandboxes on Google Cloud Workstations.”
  • “Design the user-facing ‘project session’ model and API for Coditect (sessions, sandboxes, autosave, and AI runs) and how it’s represented in the React TS frontend.”


Two concrete extensions make sense: a Security Design Addendum (threat model + mitigations + abuse detection) and a Project Session model (how users see sessions/sandboxes/autosave/AI runs).


1. Security Design Addendum: gVisor sandboxes on Cloud Workstations

1.1 Threat model

Assets

  • Tenant code and data in /workspaces/<tenant>/<user>/<project>.
  • Coditect secrets (GitHub tokens, model keys, internal APIs).
  • Workstation VMs and underlying GCP project.
  • Control plane (Django, DB, audit logs).

Adversaries

  • Malicious tenant user (tries to escape sandbox, exfiltrate data or abuse compute).
  • Compromised user account (legit user’s JWT stolen).
  • Malicious or buggy AI-generated code (infinite loops, network abuse).
  • Compromised Workstation (agent host taken over).

Trust boundaries

  • gVisor sandbox boundary between untrusted workload and Workstation kernel.
  • VM boundary between Workstations and other GCP workloads.
  • mTLS + RBAC between Control Plane and Agents.

1.2 Concrete mitigations

Sandbox isolation

  • All agent code runs in containers with --runtime=runsc, --network=none (or very constrained egress), --read-only rootfs, and fixed CPU/memory limits.
  • Each sandbox mounts only its project workspace and ephemeral scratch; no host paths, no Docker socket.
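For illustration, the hardened `docker run` invocation the Agent would build might look like this. The exact flag set is an assumption; adjust to your Docker and gVisor versions (the workspace path follows the /workspaces/<tenant>/<user>/<project> convention used elsewhere in this design):

```python
def sandbox_run_args(tenant, user, project, sandbox_id, image, cmd,
                     cpus=1.0, memory_mib=2048):
    """Build the argv for a hardened gVisor sandbox container."""
    workspace = f"/workspaces/{tenant}/{user}/{project}"
    return [
        "docker", "run", "--rm",
        "--runtime=runsc",                 # gVisor; never fall back to runc
        "--network=none",                  # default-deny egress
        "--read-only",                     # immutable rootfs
        f"--cpus={cpus}", f"--memory={memory_mib}m",
        "--tmpfs", "/tmp:rw,size=256m",    # ephemeral scratch only
        "-v", f"{workspace}:/workspace:rw",  # only the project workspace
        "--label", f"coditect.tenant_id={tenant}",
        "--label", f"coditect.sandbox_id={sandbox_id}",
        image, *cmd,
    ]
```

The Agent would pass this argv to its container runtime wrapper (e.g., subprocess); pinning `--runtime=runsc` in code, rather than relying on a Docker default, enforces the "no fall-back to runc" policy.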

Secrets & identity

  • No cloud credentials or GitHub tokens inside sandbox by default; AI tools and Git operations are brokered via Coditect backend.
  • Per-project, scoped tokens if absolutely needed (e.g., Git LFS or artifact fetch).

Network controls

  • Default network=none for most sandboxes; “networked” sandboxes use:
    • Egress proxy with allowlists (GitHub, package registries).
    • Egress quotas (requests/hour) and rate limiting per tenant.
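Egress quotas of this kind are commonly implemented as a token bucket per tenant, debited by the egress proxy on each outbound request. A minimal sketch (class name and limits are illustrative):

```python
import time

class EgressBucket:
    """Per-tenant token bucket: one token per outbound request."""

    def __init__(self, capacity: int, refill_per_sec: float):
        self.capacity = capacity
        self.tokens = float(capacity)
        self.refill = refill_per_sec
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.refill)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False        # over quota: proxy returns 429 or drops the connection
```

The proxy keeps one bucket per (tenant, sandbox) key; a sustained burst that drains the bucket is also a useful abuse signal to feed into the heuristics below.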

Abuse detection metrics/heuristics

Collected via MetricsReport from agent + host-level firewalls/logs:

  • CPU abuse
    • High CPU utilization over threshold (e.g., >80% of core) for >N seconds with no I/O.
    • Many sandboxes at or near CPU limit for same tenant.
    • Mitigation:
      • Hard per-sandbox CPU time limit.
      • Tenant-level CPU budget (vCPU-seconds per hour); auto-throttle or reject new sandboxes when exceeded.
  • Memory abuse
    • Repeated OOM kills by same tenant or sandbox pattern.
    • Rapid growth of memory usage without progress signals (no logs).
    • Mitigation:
      • Strict mem limits per sandbox; repeated OOM → cool-down for tenant/project.
  • Network abuse (for allowed-network sandboxes)
    • High rate of outbound connections to distinct IPs (port scan signature).
    • Large outbound volume to non-approved domains.
    • Mitigation:
      • Egress proxy detecting port scans / connection bursts.
      • Auto-kill sandbox on detection; temporarily block tenant from networked sandboxes.
  • Filesystem abuse
    • Excessive writes (GiB/min) or inode creation in workspace or scratch.
    • Mitigation:
      • Quotas on workspace volume size and inode count.
      • Kill sandboxes exceeding thresholds; alert.
  • Command behavior heuristics
    • Detect repeated fork bombs, suspicious binaries, or known exploit toolchains via process monitoring inside the sandbox (as far as gVisor allows), plus signatures in stdout/stderr.

Automated responses

  • Sandbox-level:
    • Hard kill (SIGKILL) + mark run as “abuse suspected”.
    • Lock that sandbox ID and do not permit further execs.
  • Project-level:
    • Temporary throttle (e.g., max 1 concurrent sandbox for 30 minutes).
    • Require manual approval for networked sandboxes.
  • Tenant-level:
    • If multiple projects trigger abuse heuristics within a time window, soft-ban network access or sandbox creation, pending admin review.

All actions logged to audit logs with tenant/project/user/sandbox IDs for post-incident review.
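The graduated sandbox → project → tenant escalation can be sketched as a small policy function (thresholds, window, and return values are illustrative):

```python
import time
from collections import defaultdict

ABUSE_WINDOW = 3600          # look-back window in seconds
TENANT_BAN_THRESHOLD = 2     # distinct projects triggering abuse in the window

events = defaultdict(list)   # tenant_id -> [(timestamp, project_id)]

def record_abuse(tenant_id, project_id, now=None):
    """Record an abuse signal and return the escalation decision."""
    now = time.time() if now is None else now
    events[tenant_id].append((now, project_id))
    recent = {p for (t, p) in events[tenant_id] if now - t <= ABUSE_WINDOW}
    if len(recent) >= TENANT_BAN_THRESHOLD:
        return "tenant_soft_ban"     # block new sandboxes pending admin review
    return "project_throttle"        # e.g., max 1 concurrent sandbox for 30 min
```

The sandbox itself is always hard-killed before this function runs; the return value only governs what additional project- or tenant-level restriction the control plane applies and logs.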


2. Project Session model and API (React TS + Django)

2.1 Conceptual model

Entities

  • Tenant: organization.
  • Project: codebase + configuration (Git repo, AI tools enabled, quotas).
  • Session: a developer’s active workspace in a project, bound to a Workstation and one or more sandboxes.
  • Sandbox: a gVisor-backed execution environment inside the Workstation.
  • AI Run: an invocation of Claude or Gemini on a project (code edits or analysis).
  • Autosave: background Git snapshots in autosave/... branches.

2.2 REST/GraphQL API shape

Sessions

  • POST /projects/{project_id}/sessions
    • Creates or attaches to a session; returns session_id, workstation info, active sandboxes.
  • GET /projects/{project_id}/sessions/{session_id}
    • Returns current state: Workstation, sandboxes, autosave status, active AI runs.
  • DELETE /projects/{project_id}/sessions/{session_id}
    • Ends session (may leave Workstation running but cleans up sandboxes/autosave processes).

Sandboxes

  • POST /projects/{project_id}/sessions/{session_id}/sandboxes
    • Create sandbox; Django calls CreateSandbox on relevant Agent.
  • POST /projects/{project_id}/sessions/{session_id}/sandboxes/{sandbox_id}/exec
    • Start an exec; returns stream token or WebSocket URL for front-end to attach.
  • DELETE /projects/{project_id}/sessions/{session_id}/sandboxes/{sandbox_id}
    • Destroy sandbox.
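The endpoints above compose into a typical "run tests" lifecycle. This sketch simply enumerates the ordered calls a client would issue (paths as defined above; transport, auth headers, and response handling omitted):

```python
def run_tests_flow(project_id: str, session_id: str, sandbox_id: str):
    """Ordered (method, path) calls for one start->exec->teardown cycle."""
    return [
        ("POST",   f"/projects/{project_id}/sessions"),
        ("POST",   f"/projects/{project_id}/sessions/{session_id}/sandboxes"),
        ("POST",   f"/projects/{project_id}/sessions/{session_id}/sandboxes/{sandbox_id}/exec"),
        ("DELETE", f"/projects/{project_id}/sessions/{session_id}/sandboxes/{sandbox_id}"),
        ("DELETE", f"/projects/{project_id}/sessions/{session_id}"),
    ]
```

Each call carries the same JWT, so the per-request tenant/project isolation check described earlier runs once per step, with the exec step returning a stream token or WebSocket URL for log attachment.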

Autosave

  • GET /projects/{project_id}/autosave/status
    • Summarizes autosave branches and last snapshot time per user/session.
  • POST /projects/{project_id}/autosave/promote
    • Promotes autosave branch into a feature branch (e.g., squash merge) and opens PR.

AI runs

  • POST /projects/{project_id}/ai/claude-run
    • Body: task description, scope (files), optional session/sandbox IDs.
  • POST /projects/{project_id}/ai/gemini-run
    • Same shape.
  • GET /projects/{project_id}/ai/runs/{run_id}
    • Status, logs, diff summary, linked sandbox execs, and autosave commits.

Each endpoint enforces tenant/project isolation based on JWT once per request at the Django layer, as discussed earlier.

2.3 React TS frontend representation

State model (per user)

  • currentTenant, currentProject.
  • currentSession {id, workstation, sandboxes[], autosaveSummary, aiRuns[]}.
  • Derived UI states:
    • Session status: CONNECTING, ACTIVE, IDLE, ENDED.
    • Sandbox list: each with status (CREATING, RUNNING, IDLE, ERROR), last exec, CPU/mem utilization.
    • Autosave indicator: last snapshot time, branch, and Git status compared to main.
    • AI activity: list of recent runs with statuses (RUNNING, SUCCEEDED, FAILED).

UI affordances

  • “Open Project Session” button → POST /sessions → shows Workstation IDE URL and Coditect panel.
  • “Run in Sandbox” buttons associated with:
    • Test commands (pytest/go test).
    • Build commands.
    • Arbitrary shell commands.
  • A “Sandboxes” pane showing each sandbox’s logs (via WebSocket/stream connection to ExecSandbox forwarded by Django).
  • “Autosave” indicator (e.g., green when autosave branch is healthy and up-to-date, orange when paused due to conflicts).
  • “AI Assist” panel where Claude/Gemini suggestions are tied to AI runs and autosave commits (each suggestion lists the Git commit/branch produced).

Session lifecycle flows

  • Start: user picks project → frontend calls POST /sessions → backend provisions Workstation (if needed) and returns session.
  • Work: user edits code (via Workstation IDE), triggers tests and AI tools through Coditect UI; autosave daemon pushes to Git.
  • Pause/Resume: session can be resumed if Workstation still alive; Coditect uses session ID to rehydrate state (sandboxes, autosave, AI runs).
  • End: session ends, sandboxes are destroyed; Workstation may be stopped or left for reuse based on policy.

If you want, next steps could be:

  • A dedicated Security Design Addendum document outline (sections for threat model, STRIDE analysis, policies per risk) filled in for this architecture.
  • A React TS interface contract (types/interfaces) for Session, Sandbox, AIRun, and AutosaveSummary that you can drop straight into your front-end code.


Below is a compact Security Design Addendum outline (pre-filled for Coditect on Cloud Workstations + gVisor) and a React TS interface contract you can paste into your FE codebase.


Security Design Addendum (Coditect + Cloud Workstations + gVisor)

1. Purpose and scope

  • Document security posture and controls for Coditect’s gVisor-based sandboxes running inside Google Cloud Workstations.
  • Extend SDD/TDD with explicit threat model, STRIDE analysis, and policies for sandboxed agent execution.

Applies to:

  • Django control plane + React TS frontend.
  • Workstations VMs and Coditect Agent.
  • gVisor runsc sandboxes for untrusted agent code.
  • Git autosave and AI (Claude/Gemini) tooling.

2. Assets

  • Code & data: project repositories, configuration, secrets in .coditect/, autosave branches.
  • Identity & auth: JWTs, refresh tokens, user/tenant/project mappings.
  • Infra: Workstations VMs, gVisor runtime, Coditect Agents, Control Plane, DB, logs.
  • Third-party credentials: GitHub tokens, AI model keys, any per-tenant API keys.

3. Trust boundaries

  • Browser ↔ Django: HTTPS, JWT-based auth; browser untrusted.
  • Django ↔ Workstation Agent: gRPC over mTLS; only Coditect Control Plane may call agents.
  • Agent ↔ sandbox: Docker + gVisor runsc runtime; sandbox is untrusted code, separated from host kernel.
  • Workstation VM ↔ GCP project: hypervisor isolation; Workstations managed by Cloud Workstations controller.

4. STRIDE analysis (per threat, with mitigations/policies)

4.1 Spoofing

Risks:

  • Attacker impersonates a user or control-plane service.
  • Rogue client tries to talk directly to Workstation Agent.

Mitigations:

  • User auth: SSO/OIDC → short-lived access JWTs; refresh tokens in HTTP-only cookies; per-tenant RBAC enforced server-side.
  • Service auth: mTLS between Control Plane and Agents, with CA-issued certs and SANs (spiffe://coditect/control-plane vs .../workstation/<id>).
  • Agents reject any non-mTLS or invalid cert; only accept control-plane CN/SAN.

Policies:

  • Tokens: access tokens ≤30 min; refresh tokens revocable server-side.
  • Regular rotation of certs and JWT signing keys via KMS/CA.

4.2 Tampering

Risks:

  • Malicious sandbox modifies files outside workspace or tampers with other sandboxes.
  • Attacker alters logs or audit records.

Mitigations:

  • gVisor sandbox: untrusted code runs with --runtime=runsc, read-only rootfs, only /workspace volume mounted.
  • No hostPath or Docker socket mounts; each sandbox has its own container filesystem.
  • Central, append-only audit log in Control Plane; sandboxes cannot access it.

Policies:

  • All sandbox containers must use a hard-coded runsc runtime; no fall-back to runc for untrusted workloads.
  • Control Plane rejects any attempt to run execs on containers not labeled as Coditect-owned sandboxes.

4.3 Repudiation

Risks:

  • Users deny having run specific code or AI actions; incidents lack attribution.

Mitigations:

  • Detailed audit logs: user_id, tenant_id, project_id, sandbox_id, Workstation ID, exec commands, AI tool used, timestamps.
  • AI runs tied to autosave commits and Git author metadata.

Policies:

  • Audit events are immutable, stored in an append-only log or WORM-capable storage.
  • Any admin action manipulating sandboxes or Workstations is logged with actor ID.

4.4 Information disclosure

Risks:

  • Sandbox reads secrets or code belonging to other projects/tenants.
  • Sandbox exfiltrates data over network.

Mitigations:

  • Workspace isolation: each sandbox only mounts /workspaces/<tenant>/<user>/<project> and ephemeral scratch.
  • No global filesystem or /home mount; no cloud metadata access.
  • Default network=none or strict outbound allowlist with egress proxy.
  • Secrets kept out of sandbox: GitHub tokens, AI keys live in Control Plane; any external calls happen via backend, not directly from sandbox.

Policies:

  • Any network-enabled sandbox is tied to project policy and tenant risk level; logs of outbound requests with rate limits.
  • No direct DB or internal service endpoints exposed in sandbox environment.

4.5 Denial of Service

Risks:

  • Infinite loops / CPU bombs.
  • Memory bombs, fork bombs.
  • Port scans or outbound floods.

Mitigations:

  • gVisor with cgroup CPU/mem limits per container; enforced timeout_seconds and idle_timeout_seconds.
  • Quota service: per-tenant limits on concurrent sandboxes, vCPU-seconds, memory, and networked sandbox count.
  • Abuse heuristics:
    • CPU >80% for >N seconds with no output → flagged.
    • Repeated OOMs / process restarts → auto-kill and cool-down.
    • Outbound connection patterns matching port scans → immediate kill, tenant throttling.

Policies:

  • Sandbox is auto-terminated upon exceeding CPU-time/memory or triggering heuristics; tenant may be temporarily banned from new sandboxes based on configurable thresholds.

4.6 Elevation of privilege

Risks:

  • Sandbox escapes gVisor to host Workstation.
  • Compromised Workstation tries to impersonate Control Plane.

Mitigations:

  • gVisor: user-space kernel intercepting syscalls, reducing host attack surface.
  • Each Workstation runs under a restricted service account with minimal GCP IAM permissions.
  • Control Plane authenticates agent identity via cert SANs and uses allowlists of agent IDs; a compromised Workstation cannot impersonate another.

Policies:

  • Regularly update gVisor per the production guide and CVEs.
  • Periodic security scans of Workstation images; limit installed tooling.

React TS interface contract

You can drop this into a types/session.ts or similar.

// Session, Sandbox, AI Run, AutosaveSummary types for Coditect FE

export type SessionStatus = 'CONNECTING' | 'ACTIVE' | 'IDLE' | 'ENDING' | 'ENDED' | 'ERROR';

export interface Session {
  id: string;
  tenantId: string;
  projectId: string;
  userId: string;

  workstationId: string;
  workstationName?: string;
  workstationRegion?: string;

  status: SessionStatus;
  createdAt: string; // ISO 8601
  lastActiveAt: string; // ISO 8601

  sandboxes: Sandbox[];
  autosave: AutosaveSummary | null;
  aiRuns: AIRunSummary[];
}

export type SandboxStatus = 'CREATING' | 'RUNNING' | 'IDLE' | 'TERMINATING' | 'TERMINATED' | 'FAILED';

export interface Sandbox {
  id: string;
  sessionId: string;
  tenantId: string;
  projectId: string;
  userId: string;

  image: string;
  runtime: 'gvisor';
  cpuLimit: number; // cores
  memoryLimitMiB: number;

  status: SandboxStatus;
  createdAt: string;
  lastExecAt?: string;

  // Live metrics snapshot (optional, can be filled from metrics stream)
  metrics?: SandboxMetrics;
}

export interface SandboxMetrics {
  cpuSeconds: number;
  cpuPercent?: number;
  memoryPeakMiB: number;
  bytesRead: number;
  bytesWritten: number;
  lastUpdatedAt: string;
}

export type AIRunStatus = 'QUEUED' | 'RUNNING' | 'SUCCEEDED' | 'FAILED' | 'CANCELLED';

export interface AIRunSummary {
  id: string;
  sessionId: string;
  projectId: string;
  userId: string;

  modelProvider: 'claude' | 'gemini';
  modelName: string;

  status: AIRunStatus;
  createdAt: string;
  completedAt?: string;

  // human-readable summary for UI
  title: string;
  description?: string;

  // link to code changes produced
  autosaveBranch?: string;
  autosaveCommitId?: string;
}

export interface AutosaveSummary {
  tenantId: string;
  projectId: string;
  userId: string;
  ticketId?: string;

  branchName: string; // e.g. autosave/PROJ-1234/acme/foo/alice/ws-01/20260114
  lastCommitId: string;
  lastCommitAt: string;

  // is autosave loop healthy
  status: 'HEALTHY' | 'PAUSED' | 'ERROR';

  // optional human description for UI
  statusMessage?: string;
}

This gives you:

  • Session as the top-level unit tying together Workstation, sandboxes, autosave, and AI runs.
  • Sandbox as a gVisor-backed execution unit with metrics.
  • AIRunSummary for UI lists and details panes.
  • AutosaveSummary for status indicators and promotion flows.

Sandboxes

In addition to the Function interface, Modal has a direct interface for defining containers at runtime and securely running arbitrary code inside them. This can be useful if, for example, you want to:

  • Execute code generated by a language model.
  • Create isolated environments for running untrusted code.
  • Check out a git repository and run a command against it, like a test suite, or npm lint.
  • Run containers with arbitrary dependencies and setup scripts.

Each individual job is called a Sandbox and can be created using the Sandbox.create constructor:

import modal

app = modal.App.lookup("my-app", create_if_missing=True)

sb = modal.Sandbox.create(app=app)

p = sb.exec("python", "-c", "print('hello')", timeout=3)
print(p.stdout.read())

p = sb.exec("bash", "-c", "for i in {1..10}; do date +%T; sleep 0.5; done", timeout=5)
for line in p.stdout:
    # Avoid double newlines by using end="".
    print(line, end="")

sb.terminate()

Note: you can run the above example as a script directly with python my_script.py. modal run is not needed here since there is no entrypoint. Sandboxes require an App to be passed when spawned from outside of a Modal container. You may pass in a regular App object or look one up by name with App.lookup. The create_if_missing flag on App.lookup will create an App with the given name if it doesn’t exist.

Lifecycle

Timeouts

Sandboxes have a default maximum lifetime of 5 minutes. You can change this by passing a timeout of up to 24 hours to the Sandbox.create(...) function.

sb = modal.Sandbox.create(app=my_app, timeout=10*60)  # 10 minutes

If you need a Sandbox to run for more than 24 hours, we recommend using Filesystem Snapshots to preserve its state, and then restore from that snapshot with a subsequent Sandbox.

Idle Timeouts

Sandboxes can also be automatically terminated after a period of inactivity by setting the idle_timeout parameter. A Sandbox is considered active if any of the following are true:

  • It has an active command running (via sb.exec(...))
  • Its stdin is being written to (via sb.stdin.write())
  • It has an open TCP connection over one of its Tunnels

Configuration

Sandboxes support nearly all configuration options found in regular modal.Functions. Refer to Sandbox.create for further documentation on Sandbox configs. For example, Images and Volumes can be used just as with functions:

sb = modal.Sandbox.create(
    image=modal.Image.debian_slim().pip_install("pandas"),
    volumes={"/data": modal.Volume.from_name("my-volume")},
    workdir="/repo",
    app=my_app,
)

Environments

Environment variables

You can set environment variables using inline secrets:

secret = modal.Secret.from_dict({"MY_SECRET": "hello"})

sb = modal.Sandbox.create(
    secrets=[secret],
    app=my_app,
)
p = sb.exec("bash", "-c", "echo $MY_SECRET")
print(p.stdout.read())

Custom Images

Sandboxes support custom images just as Functions do. However, while you’ll typically invoke a Modal Function with the modal run CLI, you typically spawn a Sandbox with a simple script call. As such, you may need to manually enable output streaming to see your image build logs:

image = modal.Image.debian_slim().pip_install("pandas", "numpy")

with modal.enable_output():
    sb = modal.Sandbox.create(image=image, app=my_app)

Dynamically defined environments

Note that any valid Image or Mount can be used with a Sandbox, even if those images or mounts have not previously been defined. This also means that Images and Mounts can be built from requirements at runtime. For example, you could use a language model to write some code and define your image, and then spawn a Sandbox with it. Check out devlooper for a concrete example of this.

Running a Sandbox with an entrypoint

In most cases, Sandboxes are treated as a generic container that can run arbitrary commands. However, in some cases, you may want to run a single command or script as the entrypoint of the Sandbox. You can do this by passing command arguments to the Sandbox constructor:

sb = modal.Sandbox.create("python", "-m", "http.server", "8080", app=my_app, timeout=10)
for line in sb.stdout:
    print(line, end="")

This functionality is most useful for running long-lived services that you want to keep running in the background. See the Jupyter notebook example for a more concrete example of this.

Referencing Sandboxes from other code

If you have a running Sandbox, you can retrieve it using the from_id method.

sb = modal.Sandbox.create(app=my_app)
sb_id = sb.object_id

# ... later in the program ...

sb2 = modal.Sandbox.from_id(sb_id)

p = sb2.exec("echo", "hello")
print(p.stdout.read())
sb2.terminate()

A common use case for this is keeping a pool of Sandboxes available for executing tasks as they come in. You can keep a list of object_ids of Sandboxes that are “open” and reuse them, closing over the object_id in whatever function is using them.

Logging

You can see Sandbox execution logs using the verbose option. For example:

sb = modal.Sandbox.create(app=my_app, verbose=True)

p = sb.exec("python", "-c", "print('hello')")
print(p.stdout.read())

with sb.open("test.txt", "w") as f:
    f.write("Hello World\n")

shows Sandbox logs:

Sandbox exec started: python -c print('hello')
Opened file 'test.txt': fd-yErSQzGL9sig6WAjyNgTPR
Wrote to file: fd-yErSQzGL9sig6WAjyNgTPR
Closed file: fd-yErSQzGL9sig6WAjyNgTPR

Named Sandboxes

You can assign a name to a Sandbox when creating it. Each name must be unique within an app; only one running Sandbox can use a given name at a time. Note that the associated app must be a deployed app. Once a Sandbox completely stops running, its name becomes available for reuse. Some applications find Sandbox Names useful for ensuring that no more than one Sandbox is running per resource or project. If a Sandbox with the given name is already running, create() will raise an error.

sb1 = modal.Sandbox.create(app=my_app, name="my-name")

# This will raise a modal.exception.AlreadyExistsError.
sb2 = modal.Sandbox.create(app=my_app, name="my-name")

A named Sandbox may be fetched from a deployed app using from_name(), but only if the Sandbox is currently running. If no running Sandbox is found, from_name() will raise an error.

my_app = modal.App.lookup("my-app", create_if_missing=True)
sb1 = modal.Sandbox.create(app=my_app, name="my-name")

# Returns the currently running Sandbox with the name "my-name" from the
# deployed app named "my-app".
sb2 = modal.Sandbox.from_name("my-app", "my-name")
assert sb1.object_id == sb2.object_id  # sb1 and sb2 refer to the same Sandbox

Sandbox Names may contain only alphanumeric characters, dashes, periods, and underscores, and must be shorter than 64 characters.

Tagging

Sandboxes can also be tagged with arbitrary key-value pairs. These tags can be used to filter results in Sandbox.list.

sandbox_v1_1 = modal.Sandbox.create("sleep", "10", app=my_app)
sandbox_v1_2 = modal.Sandbox.create("sleep", "20", app=my_app)

sandbox_v1_1.set_tags({"major_version": "1", "minor_version": "1"})
sandbox_v1_2.set_tags({"major_version": "1", "minor_version": "2"})

# All sandboxes.
for sandbox in modal.Sandbox.list(app_id=my_app.app_id):
    print(sandbox.object_id)

# Also all sandboxes.
for sandbox in modal.Sandbox.list(
    app_id=my_app.app_id,
    tags={"major_version": "1"},
):
    print(sandbox.object_id)

# Just the latest sandbox.
for sandbox in modal.Sandbox.list(
    app_id=my_app.app_id,
    tags={"major_version": "1", "minor_version": "2"},
):
    print(sandbox.object_id)

Modal’s Sandbox API gives a very clear, high-level contract for “containers defined at runtime” that Coditect’s gVisor-based Workstation model should emulate conceptually, even though the underlying implementation is different.427428429430

Here’s how to translate the key ideas into Coditect’s architecture you’ve been designing:


1. Core conceptual mapping

Modal concept → Coditect equivalent:

  • modal.App (deployment context, config namespace) → Coditect Project (and Tenant): holds sandbox policies, images, volumes, secrets, tags.
  • Sandbox.create(app=..., ...) → POST /projects/{project_id}/sandboxes → Control Plane chooses Workstation + Workstation Agent → docker run --runtime=runsc ... gVisor container.428429431427
  • sb.exec(...) returning process handles and streams → POST /sandboxes/{id}/exec (REST) + gRPC ExecSandbox stream between Control Plane and Agent, with stdout/stderr and exit info.432433434
  • Timeouts / idle timeouts (5 min default, up to 24h) → Coditect sandbox-level timeout_seconds + idle_timeout_seconds enforced by Agent and tracked centrally; gVisor container is killed by Control Plane when limits exceeded.429435436
  • Images, volumes, workdir, env, secrets → Image selection and volume mounts in CreateSandboxRequest (Docker+runsc args); environment variables plus secret injection handled by Agent but sourced from tenant-scoped secret stores.431428429
  • Named sandboxes (name=, uniqueness per app) and from_name → Named Coditect sandboxes (unique sandbox_name per project) and lookups via Control Plane DB, not direct to the runtime.
  • Tags and Sandbox.list(tags=...) → Labels/metadata on Coditect sandbox records, filterable via Control Plane API (and eventually CLI/UI).

2. Patterns worth explicitly copying

Even though Modal is closed-source, its behavior model is a good blueprint:

  • Short-lived, time-bounded sandboxes (5 min default, up to 24h) with explicit timeout and idle_timeout:
    • Coditect’s API should require both, with defaults and hard caps, and enforce them via Agent + Control Plane.
  • Exec as primary interaction:
    • Keep Coditect’s sandbox interface centered on exec commands (one or many) rather than building a full “VM” abstraction.
  • Image + runtime-defined dependencies:
    • Modal’s dynamic image building (e.g., Image.debian_slim().pip_install("pandas")) maps to a build pipeline in Coditect (Dockerfile builder, Nix, etc.); you can accept high-level “requirements” and use an internal builder to create images cached per project.436428
  • Secrets as env injection:
    • Use project/tenant-scoped secret references that the Agent resolves, never push raw secrets into the sandbox API surface.
  • Named and tagged sandboxes:
    • Names for uniqueness (“only one Jupyter/HTTP server per project”), tags for arbitrary attributes (version, tool type, ticket).

3. Where Coditect diverges (by design)

  • Runtime: Coditect uses gVisor (runsc) inside VMs (Workstations) rather than a proprietary serverless microVM fleet; isolation is still improved over plain containers, but you control the whole life-cycle from inside your Workstations.430437438427428429
  • Control plane: Django + React + gRPC, not Modal’s infra; you can tailor quotas, RBAC, and multi-tenant policies tightly to your SaaS model.
  • Autosave and Git integration: Coditect’s autosave branches and worktrees are first-class; each sandbox’s work is automatically reflected in Git in near-real time, which Modal does not provide out of the box.439440441442
  • Security policy layer: you’re layering explicit abuse detection (CPU, network, filesystem) and tenant-level throttling on top of the sandbox abstraction, using metrics the Agent and Workstations can expose.435443444445446447

4. Suggested next focused spec

If you want to turn this Modal-Sandbox mental model directly into Coditect docs/specs, a good next artifact would be:

“Coditect Sandbox API: a Modal-Sandbox–inspired spec for Create/Exec/Terminate, including timeout/idle semantics, named sandboxes, tags, and a clear mapping to gVisor containers on Cloud Workstations.”

That spec would:

  • Enumerate fields like image, volumes, env, secrets, timeout, idle_timeout, name, tags, and workdir.
  • Define exact behaviors for timeouts, idle detection, and reattachment (like from_id / from_name).
  • Tie them to the Workstation Agent’s gRPC API and to your RBAC/tenant model.


Below is a “Coditect Sandbox API” spec, directly modeled on Modal Sandboxes but mapped to your gVisor-on-Workstations implementation, RBAC, and gRPC agent.


1. Overview

Coditect Sandboxes provide runtime-defined containers inside Google Cloud Workstations, backed by gVisor (runsc), for executing untrusted or model-generated code per tenant/project.448449450451452

Core capabilities:

  • Create an isolated sandbox (container) with configurable image, volumes, env, secrets, tags, timeout, idle_timeout, workdir, and optional name.
  • Execute commands (exec) with interactive stdin, streamed stdout/stderr, and timeouts.453454455456
  • Terminate sandboxes cleanly or forcefully.
  • Reattach to running sandboxes by id or name within a project.

All calls are project-scoped, tenant-isolated via JWT+RBAC, and enforced by the Django control plane.


2. REST API surface (high level)

Base path: /v1/projects/{project_id}

2.1 Create sandbox

POST /v1/projects/{project_id}/sandboxes

Request body

{
  "name": "optional-unique-name",
  "image": "ghcr.io/coditect/runtime:py310",
  "volumes": {
    "/workspace": "project-workspace",
    "/data": "project-datasets-ro"
  },
  "workdir": "/workspace",
  "env": {
    "PYTHONUNBUFFERED": "1"
  },
  "secrets": [
    "hf-token-readonly"
  ],
  "timeout_seconds": 300,
  "idle_timeout_seconds": 60,
  "tags": {
    "ticket": "PROJ-1234",
    "tool": "tests",
    "tenant": "acme"
  },
  "runtime": "gvisor"
}

Fields

  • name (optional):
    • Unique per project among running sandboxes.
    • Characters: [A-Za-z0-9._-], length < 64.
    • If a running sandbox with this name exists, creation fails with 409 AlreadyExists.
  • image: container image (string).
  • volumes: map mountPath → volume ID; e.g. project-workspace → /workspaces/<tenant>/<user>/<project>.
  • workdir: default working directory for execs.
  • env: static environment variables.
  • secrets: secret reference IDs (resolved by control plane, not by agent).
  • timeout_seconds: max wall-clock lifetime in seconds, 1–86400 (default 300).
  • idle_timeout_seconds: idle timeout (no active exec/stdin/tunnel) before auto-termination.
  • tags: arbitrary key/value strings, used for filtering and audit.
  • runtime: must be "gvisor" for this implementation.
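The field rules above can be sketched as a small validator (illustrative helper; the real control plane would implement this in a DRF serializer or similar):

```python
import re

# Name charset and length rule from the spec: [A-Za-z0-9._-], shorter than 64 chars.
NAME_RE = re.compile(r"^[A-Za-z0-9._-]{1,63}$")

def validate_create_request(body: dict) -> list[str]:
    """Return a list of validation errors for a CreateSandbox payload."""
    errors = []
    if not body.get("image"):
        errors.append("image is required")
    name = body.get("name")
    if name is not None and not NAME_RE.match(name):
        errors.append("name must match [A-Za-z0-9._-] and be shorter than 64 chars")
    timeout = body.get("timeout_seconds", 300)
    if not (1 <= timeout <= 86400):
        errors.append("timeout_seconds must be in 1-86400")
    if body.get("runtime", "gvisor") != "gvisor":
        errors.append('runtime must be "gvisor"')
    return errors
```

A failing validation maps to 400 Bad Request; name uniqueness (409) is a separate DB-level check at create time.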

Response

{
  "id": "sb-uuid",
  "project_id": "proj-uuid",
  "tenant_id": "tenant-uuid",
  "name": "optional-unique-name",
  "image": "ghcr.io/coditect/runtime:py310",
  "status": "RUNNING",
  "created_at": "2026-01-15T07:51:00Z",
  "timeout_seconds": 300,
  "idle_timeout_seconds": 60,
  "tags": {
    "ticket": "PROJ-1234",
    "tool": "tests"
  }
}

2.2 Exec in sandbox

POST /v1/projects/{project_id}/sandboxes/{sandbox_id}/exec

Request

{
  "command": ["python", "-m", "pytest"],
  "workdir": "/workspace",
  "timeout_seconds": 120,
  "pty": false
}

Behavior

  • Creates a logical exec session under the sandbox.
  • Control plane opens a gRPC ExecSandbox stream to the Agent and returns a stream token or WebSocket URL to the client.454455456453

Response

{
  "exec_id": "exec-uuid",
  "stream_url": "wss://api.coditect.io/v1/streams/exec/exec-uuid"
}

Client then:

  • Connects to stream_url.
  • Sends stdin chunks and control messages (if interactive).
  • Receives stdout/stderr and exit events in a JSON or protobuf-encoded stream.

2.3 Terminate sandbox

DELETE /v1/projects/{project_id}/sandboxes/{sandbox_id}

  • Gracefully terminates the sandbox (SIGTERM then SIGKILL); cleans up container.
  • Returns 204 on success.

2.4 Lookup by name / list with tags

  • GET /v1/projects/{project_id}/sandboxes?name=foo
    • Returns currently running sandbox with that name or 404.
  • GET /v1/projects/{project_id}/sandboxes?tags[ticket]=PROJ-1234&tags[tool]=tests
    • Returns list of matching sandboxes (aggregated from control-plane DB).

This mirrors Sandbox.from_name and Sandbox.list(tags=...) semantics.


3. Timeout and idle semantics

3.1 Total timeout

  • timeout_seconds defines the max wall-clock lifetime from sandbox creation to termination.
  • Enforced by Control Plane and Agent:
    • Control Plane stores created_at + timeout_seconds as deadline_at.
    • Background janitor / per-sandbox timer kills the sandbox when now ≥ deadline.

3.2 Idle timeout

A sandbox is considered active if any of:

  • An ExecSandbox session is running (ExecStatus not DONE/ERROR).
  • There is recent stdin activity (last ExecStdin within N seconds).
  • (Future) There is an open tunnel connection.

If none of the above for idle_timeout_seconds, Control Plane marks sandbox as idle and:

  • Sends a DestroySandbox to Agent.
  • Marks status TERMINATED with reason IDLE_TIMEOUT.

3.3 Exec-level timeout

  • timeout_seconds passed in exec request acts as per-command timeout.
  • The sandbox stays alive, but the exec process is killed when the exec deadline is reached.
  • Exec exit event: exit_code non-zero with reason = "TIMEOUT".

4. Reattachment semantics (from_id / from_name)

4.1 by id

  • GET /v1/projects/{project_id}/sandboxes/{sandbox_id}
    • Returns sandbox metadata if still running; 404 if terminated or never existed.
  • POST /v1/projects/{project_id}/sandboxes/{sandbox_id}/exec
    • Always allowed if sandbox status == RUNNING and within timeout.

4.2 by name

  • Names are unique among running sandboxes per project.
  • GET /v1/projects/{project_id}/sandboxes?name=<name> → returns the running sandbox with that name or 404.

This provides a from_name analog without exposing the underlying runtime’s object IDs.


5. Mapping to Workstation Agent gRPC

The REST API above is implemented via the Workstation Agent gRPC service defined earlier.455456457453

  • POST /sandboxes → SandboxAgent.CreateSandbox(CreateSandboxRequest)
    • Maps:
      • image → Docker image.
      • volumes → -v mount flags, workspace paths.
      • env → -e env flags.
      • timeout_seconds → stored in Control Plane, not passed directly to Docker.
      • idle_timeout_seconds → stored in Control Plane (timer logic).
  • POST /sandboxes/{id}/exec → SandboxAgent.ExecSandbox (bidirectional stream):
    • Control Plane initiates stream, sends ExecStart (command, workdir, timeout, pty), then handles stdin/ctrl.
    • Agent streams stdout/stderr/events.
  • DELETE /sandboxes/{id} → SandboxAgent.DestroySandbox:
    • docker rm -f coditect-sb-<sandbox_id> with runsc runtime.458459448

RBAC and tenancy:

  • Django reads JWT, performs tenant/project RBAC checks once per REST request, and only then calls Agent gRPC with tenant_id, project_id, user_id in metadata.
  • Agent does not perform multi-tenant checks; it trusts the Control Plane identity via mTLS, but uses the metadata for logging and metrics attribution.460461462

6. RBAC and multi-tenant constraints

  • Only roles with sandbox:create may call POST /projects/{project_id}/sandboxes.
  • Only roles with sandbox:execute may exec; sandbox:destroy to terminate.
  • name and tags are always scoped to (tenant_id, project_id); two tenants can have sandboxes named my-name without conflict.
  • Quotas (per tenant/project) enforced at Control Plane before CreateSandbox:
    • Max concurrent sandboxes.
    • Max total vCPU/memory for running sandboxes.

7. Example JSON schemas (abridged)

Sandbox

{
  "id": "sb-uuid",
  "project_id": "proj-uuid",
  "tenant_id": "tenant-uuid",
  "name": "jupyter-dev",
  "image": "ghcr.io/coditect/runtime:py310",
  "runtime": "gvisor",
  "status": "RUNNING",
  "created_at": "2026-01-15T07:51:00Z",
  "timeout_seconds": 7200,
  "idle_timeout_seconds": 300,
  "tags": {
    "ticket": "PROJ-1234",
    "tool": "jupyter"
  }
}

Exec event stream (over WebSocket)

{ "type": "status", "phase": "STARTING", "message": "exec started" }
{ "type": "stdout", "data": "Running tests...\n" }
{ "type": "stderr", "data": "warning: deprecation\n" }
{ "type": "exit", "exit_code": 0, "reason": "NORMAL" }
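A client consuming this newline-delimited event stream could use a small dispatcher like the following (sketch; a real client would read frames from the WebSocket rather than a list):

```python
import json

def consume_exec_stream(lines):
    """Collect stdout/stderr text and the final exit event from an exec stream."""
    out, err, exit_event = [], [], None
    for line in lines:
        event = json.loads(line)
        if event["type"] == "stdout":
            out.append(event["data"])
        elif event["type"] == "stderr":
            err.append(event["data"])
        elif event["type"] == "exit":
            exit_event = event
            break  # the stream ends after the exit event
        # "status" events are informational; ignore them here
    return "".join(out), "".join(err), exit_event
```

This mirrors how the React frontend would render live logs: append stdout/stderr chunks as they arrive, then surface exit_code/reason.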

This spec gives you a clear, Modal-like contract at the Coditect API layer while remaining faithful to your gVisor-on-Workstations implementation, gRPC agent model, and RBAC/tenancy requirements. 463464465466467468469470471


Draft API endpoints and payloads for Create/Exec/Terminate sandbox

Specify timeout and retry semantics for sandbox operations
Define authorization and RBAC for sandbox API calls
Describe observability metrics and logging for sandboxes
List error codes and failure handling behaviors for runsc sandboxes

Below is a concise API spec you can drop into Coditect docs.


1. REST API endpoints and payloads

Base path: /v1/projects/{project_id}

All requests require Authorization: Bearer <JWT> with tenant/project claims.

1.1 Create sandbox

POST /v1/projects/{project_id}/sandboxes

Request body

{
  "name": "optional-unique-name",
  "image": "ghcr.io/coditect/runtime:py310",
  "volumes": {
    "/workspace": "project-workspace",
    "/data": "datasets-ro"
  },
  "workdir": "/workspace",
  "env": {
    "PYTHONUNBUFFERED": "1"
  },
  "secrets": [
    "hf-token-readonly"
  ],
  "timeout_seconds": 300,
  "idle_timeout_seconds": 60,
  "tags": {
    "ticket": "PROJ-1234",
    "tool": "tests"
  },
  "runtime": "gvisor"
}

Response 201

{
  "id": "sb-uuid",
  "project_id": "proj-uuid",
  "tenant_id": "tenant-uuid",
  "name": "optional-unique-name",
  "image": "ghcr.io/coditect/runtime:py310",
  "runtime": "gvisor",
  "status": "RUNNING",
  "created_at": "2026-01-15T07:52:00Z",
  "timeout_seconds": 300,
  "idle_timeout_seconds": 60,
  "tags": {
    "ticket": "PROJ-1234",
    "tool": "tests"
  }
}

1.2 Exec in sandbox

POST /v1/projects/{project_id}/sandboxes/{sandbox_id}/exec

Request

{
  "command": ["python", "-m", "pytest"],
  "workdir": "/workspace",
  "timeout_seconds": 120,
  "pty": false
}

Response 200

{
  "exec_id": "exec-uuid",
  "stream_url": "wss://api.coditect.io/v1/streams/exec/exec-uuid"
}
  • Client connects to stream_url for bidirectional streaming of stdin/stdout/stderr (via WebSocket or SSE proxying gRPC).472473474475

1.3 Terminate sandbox

DELETE /v1/projects/{project_id}/sandboxes/{sandbox_id}

Response

  • 204 No Content on success.

Optional: force=true query param for immediate SIGKILL semantics.


2. Timeout and retry semantics

2.1 Sandbox lifetime

  • timeout_seconds (create request): max wall-clock lifetime from creation to termination.
    • Range: 1–86400 seconds; default 300.
    • Enforced by Control Plane timers and Agent; when exceeded, sandbox is killed and marked TERMINATED with reason TIMEOUT.

2.2 Idle timeout

  • idle_timeout_seconds: max period with no activity (no running exec, no stdin, no tunnel) before auto-terminate.
    • If idle period exceeds this, sandbox is terminated and marked TERMINATED with reason IDLE_TIMEOUT.

2.3 Exec timeout

  • timeout_seconds in exec request: max duration for that command.
    • If exceeded: command killed; sandbox remains; exec result has reason = "TIMEOUT".

2.4 Retries (API-level)

  • Create
    • Clients should treat 429 / 503 as transient and may retry with exponential backoff (jittered), but not automatically on 4xx other than 409 (name conflict).
  • Exec
    • If exec call fails before a stream is established (connection error), client may retry once; after ExecStart has been accepted, treat stream errors as terminal and surface to user.473476
  • Terminate
    • Idempotent: repeated DELETE on already-terminated sandbox returns 204.

3. Authorization and RBAC

All sandbox endpoints are project-scoped and require JWT with tenant_id and project_roles.

3.1 Required roles

Per project:

  • POST /projects/{project_id}/sandboxes → sandbox:create
  • POST /projects/{project_id}/sandboxes/{sandbox_id}/exec → sandbox:execute
  • DELETE /projects/{project_id}/sandboxes/{sandbox_id} → sandbox:destroy
  • GET /projects/{project_id}/sandboxes* → sandbox:view

Typical role mapping:

  • project_admin → {sandbox:create, sandbox:execute, sandbox:destroy, sandbox:view}
  • developer → {sandbox:create, sandbox:execute, sandbox:view}
  • viewer → {sandbox:view}

3.2 Enforcement

On each request:

  1. Django decodes JWT, validates signature/expiry, and extracts tenant_id, project_roles.477478
  2. Verifies that project_id in path belongs to tenant_id in token.
  3. Checks required permission for the endpoint against project_roles[project_id].
  4. Only then calls Workstation Agent gRPC with tenant_id, project_id, user_id in metadata over mTLS.479480481

No direct user calls to agents are allowed.
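Steps 1-3 reduce to a permission check over the JWT claims; a minimal sketch, assuming a project_roles claim shaped like the role mapping above (signature validation is already done by step 1):

```python
# Role → permission sets from the role mapping in section 3.1.
PERMISSIONS = {
    "project_admin": {"sandbox:create", "sandbox:execute",
                      "sandbox:destroy", "sandbox:view"},
    "developer": {"sandbox:create", "sandbox:execute", "sandbox:view"},
    "viewer": {"sandbox:view"},
}

def authorize(claims: dict, project_id: str, permission: str) -> bool:
    """True iff the token's role on this project grants the required permission."""
    role = claims.get("project_roles", {}).get(project_id)
    return role is not None and permission in PERMISSIONS.get(role, set())
```

On failure, the endpoint returns 403 (or 404 if the project is outside the token's tenant) without ever contacting the Agent.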


4. Observability: metrics and logging

4.1 Metrics (per sandbox and per tenant)

Collected via Agent and ReportMetrics stream, then exported (Prometheus/Cloud Monitoring):482483484

Per-sandbox:

  • sandbox_cpu_seconds_total{sandbox_id,tenant_id,project_id}
  • sandbox_memory_peak_mib{sandbox_id,...}
  • sandbox_io_bytes_read_total{...}
  • sandbox_io_bytes_written_total{...}
  • sandbox_executions_total{sandbox_id,...,status="success|failure|timeout"}
  • sandbox_lifetime_seconds{sandbox_id,...}

Per tenant/project:

  • Aggregated CPU seconds, memory usage, and exec counts for quota enforcement and cost reporting.

4.2 Logs

  • All stdout/stderr chunks streamed through ExecSandbox are optionally mirrored to a central logging system with labels:
    • tenant_id, project_id, sandbox_id, exec_id, workstation_id.
  • Sandbox lifecycle events (create/destroy/timeout/idle-kill) and errors logged as structured events.

4.3 Audit log

For each sandbox action:

{
  "event_type": "sandbox.exec",
  "timestamp": "2026-01-15T07:53:00Z",
  "tenant_id": "t-acme",
  "project_id": "proj-foo",
  "sandbox_id": "sb-uuid",
  "user_id": "user-uuid",
  "workstation_id": "ws-01",
  "command": ["python", "-m", "pytest"],
  "result": {
    "status": "success",
    "exit_code": 0,
    "duration_ms": 10234
  },
  "resources": {
    "cpu_seconds": 4.2,
    "memory_peak_mib": 512
  }
}

5. Error codes and failure behaviors (runsc/gVisor sandboxes)

Control Plane translates low-level Docker/runsc errors into API-level responses.485486487488

5.1 Create sandbox

  • 400 Bad Request
    • Invalid payload (missing image, invalid timeout range, invalid name).
  • 401 Unauthorized / 403 Forbidden
    • Missing/invalid JWT or insufficient RBAC.
  • 404 Not Found
    • project_id not found or not accessible to tenant.
  • 409 Conflict
    • Sandbox with name already running in this project.
  • 422 Unprocessable Entity
    • Image not found / invalid image / unsupported runtime.
  • 429 Too Many Requests
    • Tenant/project quotas exceeded (concurrent sandboxes or CPU).
  • 500 Internal Server Error
    • Unexpected Docker/runsc error; message logged, generic message returned.

Common runsc-related issues are surfaced as 422/500; internal error details are logged but redacted from the API response.487488489
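The translation layer might be a simple classifier over the agent-reported error (the error strings matched here are illustrative; a real implementation would match typed gRPC status codes from the Agent):

```python
def create_error_to_status(error: str) -> int:
    """Map a low-level create failure to the API status codes above."""
    msg = error.lower()
    if "no such image" in msg or "manifest unknown" in msg:
        return 422  # image not found / invalid image
    if "name" in msg and "already" in msg:
        return 409  # duplicate running name in this project
    if "quota" in msg:
        return 429  # tenant/project quota exceeded
    return 500      # unexpected Docker/runsc error: log details, return generic error
```

Keeping this mapping in one place makes it easy to audit which internal failures leak which status codes.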

5.2 Exec

  • 404 Not Found
    • Sandbox ID does not exist or is terminated.
  • 409 Conflict
    • Sandbox not in RUNNING state.
  • 422 Unprocessable Entity
    • Invalid command/args.
  • 504 Gateway Timeout
    • Exec setup exceeded a platform timeout before process start (e.g., Docker hung).

On-stream errors:

  • If the process exits due to a gVisor or Docker error, the client receives an exit event with reason="INTERNAL_ERROR" and a generic exit_code.

5.3 Terminate

  • 204 No Content
    • Sandbox terminated or was already gone.
  • 404 Not Found
    • Sandbox never existed (e.g., wrong ID).
  • 500 Internal Server Error
    • Agent failed to remove container; control plane marks sandbox as ERROR and alerts ops.

This spec gives you a crisp, implementable contract for Coditect Sandboxes aligned with your gVisor Workstation architecture, gRPC agent model, and tenant RBAC.


SUGGESTIONS

Three concise directions to deepen this design and make it “build-ready”:


1. Formalize everything into a single “Sandbox RFC”

  • Consolidate the SDD/TDD/ADR + Sandbox API spec into one internal RFC that:
    • Starts with the Modal mental model and explicitly maps each concept to Coditect.
    • Includes finalized OpenAPI 3.1 schema, gRPC proto, and RBAC matrix.
    • Has an implementation plan (phased rollout: “single-tenant pilot → multi-tenant beta → public GA”).
  • Use that RFC as the canonical doc for onboarding new engineers and aligning infra / product.

2. Implementation spike plan

Define 2–3 short spikes with clear goals:

  1. Workstation + gVisor spike
    • Bring up a single Workstation, install runsc, run a minimal agent, and prove:
      • Create/Exec/Terminate work reliably inside gVisor.
      • Basic metrics (CPU/mem) flow back to a dev control plane.
  2. End-to-end sandbox exec demo
    • From React → Django → Agent → gVisor → back to React:
      • Run pytest in a sandbox on a small demo repo.
      • Show streamed logs and autosave commits hitting a GitHub test org.
  3. Autosave + AI run integration
    • Wire a simple Claude/Gemini “run tests and fix failures” flow that:
      • Modifies code, runs sandboxed tests, and creates autosave commits.

Each spike should produce working code plus a short retro doc capturing surprises (especially gVisor quirks).


3. Hardening & operability checklist

Before committing this as the core Coditect runtime, line up a short checklist:

  • Compatibility matrix: which languages/tools are officially supported under gVisor (Python, Node, Go, Rust toolchains) and known caveats.
  • Abuse thresholds: concrete default limits for CPU, mem, network, sandboxes per tenant, and clear docs on what gets auto-killed vs just logged.
  • Runbooks:
    • “Sandbox stuck / cannot be terminated.”
    • “Workstation agent unreachable.”
    • “Tenant hits abuse heuristics repeatedly.”
  • Feature flags:
    • Ability to toggle gVisor vs runc per tenant/project for debugging.
    • Ability to roll back sandbox features without breaking dev workflows.

If you share which part you want to productize first (e.g., “test runner as a service” vs “full REPL/dev shell”), the next iteration can be a very focused build plan for that slice.


1. Formalize everything into a single “Sandbox RFC”


Below is a tight skeleton you can treat as the Sandbox RFC v1 and extend as needed.


1. Title, authors, status

  • Title: Coditect Sandbox Runtime on Google Cloud Workstations
  • Authors: <names>
  • Status: Draft v1
  • Reviewers: Platform, Infra, Security, Frontend
  • Date: 2026‑01‑15

2. Overview & goals

  • Goal: Provide a Modal‑like Sandbox API and runtime for executing untrusted and AI‑generated code per tenant/project, implemented on Google Cloud Workstations using gVisor (runsc) containers.490491492493
  • Non‑goals:
    • Providing general serverless compute to external services.
    • Replacing all CI/CD; initial focus is dev/test workflows and AI agents.

3. Mental model: Modal → Coditect mapping

Modal concept → Coditect equivalent:

  • App → Tenant + Project configuration namespace
  • Sandbox.create() → POST /projects/{project_id}/sandboxes
  • sb.exec() → POST /.../sandboxes/{id}/exec + gRPC ExecSandbox
  • timeout, idle_timeout → Sandbox lifetime + idle timers in Control Plane
  • Named sandboxes → name field, unique per project while running
  • Tags → tags map on sandbox metadata
  • from_id / from_name → GET /sandboxes/{id} + GET /sandboxes?name=

Under the hood, Modal uses proprietary infra; Coditect uses gVisor containers inside Workstations VMs, orchestrated via a Workstation Agent gRPC service.492493494495496


4. Architecture summary (SDD in brief)

  • Control Plane (Django + React)
    • Exposes /v1/projects/{id}/sandboxes REST API.
    • Handles JWT auth, tenant/project RBAC, quotas, and audit logs.
    • Maintains sandbox metadata and routes calls to Workstations.
  • Google Cloud Workstations
    • Per‑team Workstation VMs in our GCP projects.491490
    • Each VM runs Docker/containerd with runsc installed as an alternative runtime (--runtime=runsc).493497498
  • Coditect Agent (per Workstation)
    • gRPC server implementing SandboxAgent (CreateSandbox, ExecSandbox, DestroySandbox, ReportMetrics).499500501502
    • On Create: docker run --runtime=runsc with mounts/env/limits.494495492493
    • On Exec: docker exec wrapped in gRPC bidirectional stream; forwards stdout/stderr, handles stdin and timeouts.
    • On Destroy: docker rm -f sandbox container.
  • Git + autosave
    • Per user/project/ticket, autosave branches and worktrees push snapshots to GitHub for recovery and review.503504505506
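The Agent's CreateSandbox step (the docker run --runtime=runsc call above) can be sketched as argv construction; the exact flag choices are illustrative and would really be derived from CreateSandboxRequest:

```python
def docker_run_args(sandbox_id: str, image: str, workspace_path: str,
                    env: dict, cpu_limit: float, memory_limit_mib: int) -> list:
    """Build the docker run argv for a gVisor-backed sandbox container."""
    args = [
        "docker", "run", "-d",
        "--runtime=runsc",                      # gVisor isolation
        "--name", f"coditect-sb-{sandbox_id}",  # matches DestroySandbox's rm -f target
        "--cpus", str(cpu_limit),
        "--memory", f"{memory_limit_mib}m",
        "-v", f"{workspace_path}:/workspace",
    ]
    for key, value in sorted(env.items()):
        args += ["-e", f"{key}={value}"]
    args += [image, "sleep", "infinity"]        # idle until the first exec arrives
    return args
```

The Agent would pass this argv to subprocess (never a shell string), then run docker exec inside the same container for each ExecSandbox stream.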

5. API surface (OpenAPI 3.1 sketch)

Paths

  • POST /v1/projects/{project_id}/sandboxes – create sandbox.
  • GET /v1/projects/{project_id}/sandboxes – list, filter by name or tags.
  • GET /v1/projects/{project_id}/sandboxes/{sandbox_id} – details.
  • POST /v1/projects/{project_id}/sandboxes/{sandbox_id}/exec – start exec, return exec_id + stream_url.
  • DELETE /v1/projects/{project_id}/sandboxes/{sandbox_id} – terminate sandbox.

Core schema fields (Create body)

  • name?: string (optional, unique while running)
  • image: string
  • volumes?: { [mountPath: string]: string }
  • workdir?: string
  • env?: { [key: string]: string }
  • secrets?: string[] (secret IDs)
  • timeout_seconds?: number (default 300, max 86400)
  • idle_timeout_seconds?: number (default 60)
  • tags?: { [key: string]: string }
  • runtime?: "gvisor"

Exec request:

  • command: string[]
  • workdir?: string
  • timeout_seconds?: number
  • pty?: boolean

You already have most of this from earlier; the RFC just freezes it as the canonical API.


6. gRPC proto (TDD in brief)

Service

service SandboxAgent {
  rpc CreateSandbox(CreateSandboxRequest) returns (CreateSandboxResponse);
  rpc ExecSandbox(stream ExecRequest) returns (stream ExecEvent);
  rpc DestroySandbox(DestroySandboxRequest) returns (DestroySandboxResponse);
  rpc ReportMetrics(stream MetricsReport) returns (MetricsAcknowledge);
}

Key messages

  • CreateSandboxRequest: sandbox_id, tenant_id, project_id, image, workspace_path, cpu_limit, memory_limit_mib, idle_timeout_seconds, env.
  • ExecRequest / ExecEvent: ExecStart (command, workdir, timeout, pty), ExecStdout, ExecStderr, ExecExit, ExecStatus.
  • MetricsReport: sandbox_id, cpu_seconds, memory_peak_mib, bytes_read, bytes_written.

gRPC is mTLS‑protected; Control Plane identity and tenant metadata are carried in TLS and metadata, not user‑originating JWTs.502507508509


7. RBAC matrix

Per project:

Role → Permissions:

  • project_admin → create/exec/destroy/list sandboxes; manage quotas
  • developer → create/exec/list sandboxes
  • viewer → list/get sandboxes

Permissions:

  • sandbox:create → POST /projects/{id}/sandboxes
  • sandbox:execute → POST /projects/{id}/sandboxes/{sb}/exec
  • sandbox:destroy → DELETE /projects/{id}/sandboxes/{sb}
  • sandbox:view → GET /projects/{id}/sandboxes*

Tenant owners/admins can adjust project quotas and sandbox policies; RBAC evaluation happens once per request at Django.


8. Timeout, idle, and reattachment semantics

  • timeout_seconds: sandbox max lifetime; Control Plane enforces via deadline timers, Agent kills container when exceeded.
  • idle_timeout_seconds: auto‑terminate when no exec/stdin/tunnel activity for this period.
  • Exec calls have their own per‑exec timeout_seconds, with exit reason TIMEOUT.
  • Reattachment:
    • GET /sandboxes/{id} → lookup by ID (if status == RUNNING).
    • GET /sandboxes?name=... → unique by name per project while running; 404 otherwise.

9. Observability and security policies (summary)

  • Metrics: CPU/mem/I/O per sandbox and aggregated per tenant for quotas.510511512
  • Logs: stdout/stderr streams centralised with sandbox/tenant/project labels.
  • Abuse detection: CPU loops, OOM patterns, network scans; auto‑terminate sandbox and rate‑limit/bucket tenants.513514515
  • gVisor guarantees and limits documented; all untrusted workloads must use runsc.495496516492493494

10. Implementation plan (phased rollout)

Phase 0 – Prototype (single tenant, non‑prod)

  • Target: internal “dogfood” tenant.
  • Deliveries:
    • Single Workstation pool with runsc installed.497517493
    • Minimal SandboxAgent implementation + Control Plane integration for Create/Exec/Terminate.
    • Basic metrics and logs.

Phase 1 – Single‑tenant pilot (prod‑like)

  • Expand to 1–2 real tenants under feature flag.
  • Quotas and RBAC enforced; autosave branches wired in.504505506503
  • Abuse heuristics and kill logic in place (safeguards).

Phase 2 – Multi‑tenant beta

  • Hardened IAM and tenant isolation.518519520
  • Named sandboxes + tags, from_name-style lookups.
  • Per-tenant dashboards for resource usage.

Phase 3 – Public GA

  • Documented API, SLAs, and support runbooks.
  • Backwards compatibility guarantees for Sandbox API.
  • Optional additional runtimes (e.g., Wasm, later microVMs) hidden behind the same API.

This RFC skeleton should be the single canonical document in your repo (e.g., rfcs/000-sandbox-runtime.md), with links out to detailed OpenAPI, proto files, and TDD code stubs.


2. Implementation spike plan


Here’s a concise implementation spike plan you can plug into your backlog.


Spike 1 – Workstation + gVisor + Agent

Goal: Prove Create/Exec/Terminate works reliably via runsc on a single Workstation and that basic metrics flow to a dev control plane.

Scope

  • One Google Cloud Workstation (or equivalent GCE VM) in a dev project.521522
  • Docker/containerd configured with gVisor runsc.523524525
  • Minimal Go (or Rust) Agent implementing CreateSandbox, ExecSandbox, DestroySandbox, ReportMetrics over gRPC.526527528

Tasks

  1. Workstation and gVisor setup
    • Provision a single Workstation VM.522521
    • Install runsc and integrate as Docker runtime (runsc install, update daemon.json, restart Docker).524525523
    • Sanity check: docker run --runtime=runsc alpine echo hello.
  2. Agent skeleton
    • Implement gRPC server with:
      • CreateSandbox: docker run --runtime=runsc ... sleep infinity.
      • ExecSandbox: docker exec with stdout/stderr streaming over gRPC.528529530526
      • DestroySandbox: docker rm -f.
      • ReportMetrics: periodically emit fake metrics (or parse docker stats/cgroups).531532533
  3. Dev control plane stub
    • Simple CLI or small Django view to:
      • Call CreateSandbox, then ExecSandbox with echo hello, then DestroySandbox.
      • Print outputs and metrics.
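As a concrete sketch of what the Agent handlers and the control-plane stub would shell out to, the three lifecycle operations can be expressed as pure command builders (the image name and resource flags below are placeholder assumptions for the spike, not decided values):

```python
import shlex

SANDBOX_IMAGE = "python:3.12-slim"  # hypothetical default image for the spike

def create_cmd(sandbox_id: str, cpus: float = 1.0, mem: str = "2g") -> list[str]:
    """CreateSandbox: start a long-running container under the gVisor runtime."""
    return [
        "docker", "run", "-d",
        "--runtime=runsc",            # route the container through runsc
        f"--name=sb-{sandbox_id}",
        f"--cpus={cpus}",
        f"--memory={mem}",
        SANDBOX_IMAGE, "sleep", "infinity",
    ]

def exec_cmd(sandbox_id: str, argv: list[str]) -> list[str]:
    """ExecSandbox: run a command inside the sandbox (stdout/stderr streamed)."""
    return ["docker", "exec", f"sb-{sandbox_id}", *argv]

def destroy_cmd(sandbox_id: str) -> list[str]:
    """DestroySandbox: force-remove the container."""
    return ["docker", "rm", "-f", f"sb-{sandbox_id}"]

# The control-plane stub would issue create → exec → destroy in order, e.g.:
print(shlex.join(exec_cmd("demo01", ["echo", "hello"])))
```

Keeping command construction separate from execution makes the Agent's behavior unit-testable before any Workstation exists.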

Exit criteria

  • Can create a gVisor sandbox, run a trivial command, see output, terminate it without orphan containers.
  • CPU/mem metrics for that sandbox are visible in logs or a simple dashboard.
  • Retro doc:
    • gVisor quirks observed (e.g., runsc perf, filesystem behaviors).533534535
    • Workstation image/permissions gotchas.

Spike 2 – End-to-end sandbox exec demo (React → Django → Agent → gVisor → React)

Goal: Demonstrate a full-path workflow: user triggers tests in UI, tests run in a gVisor sandbox on a demo repo, logs stream back live.

Scope

  • Existing React TS frontend and Django backend.
  • One dev Workstation + Agent from Spike 1.
  • Demo repo (Python with pytest) in a GitHub test org.

Tasks

  1. Django integration
    • Implement REST endpoints:
      • POST /v1/projects/{project_id}/sandboxes (single image, hardcoded resource limits).
      • POST /.../sandboxes/{id}/exec → opens gRPC Exec stream, exposes WebSocket stream_url.
    • Hardcode a mapping project_id -> workstation_id for now.
  2. React integration
    • Add a “Run tests in sandbox” button for a demo project.
    • On click:
      • POST /sandboxes → get sandbox ID.
      • POST /sandboxes/{id}/exec with ["pytest", "-q"] → get stream_url.
      • Connect to stream_url and render stdout/stderr live in UI.
  3. Sandbox teardown
    • On exec completion, auto-terminate sandbox or leave it running and show “Terminate” button.
    • Ensure idle/total timeouts are configured (e.g., 300s total).

Exit criteria

  • From the browser, you can:
    • Click “Run tests”, see a sandbox created, pytest run, and logs stream in real-time.
    • Terminate sandbox and confirm container removed.
  • Retro doc:
    • Latency/UX observations (cold start, streaming behavior).529536528
    • Any gVisor compatibility issues running pytest and common dependencies.535537533

Spike 3 – Autosave + AI run integration (Claude/Gemini)

Goal: Wire a minimal AI-assisted workflow: “run tests and fix failures,” with code changes + sandboxed tests + autosave commits to GitHub.

Scope

  • Same demo project repo as Spike 2, forked into a GitHub test org.
  • Simple Claude/Gemini integration via existing Coditect AI layer.

Tasks

  1. Autosave plumbing

    • In the Workstation’s workspace:
      • Create autosave branch (e.g., autosave/PROJ-1234/demo/user/ws-01/20260115).
      • Use git worktree for autosave operations.538539540541
    • Implement a small autosave daemon:
      • Every 30 seconds if dirty: git add -A, git commit -m "autosave: ...", git push origin autosave/....
  2. AI “fix tests” action

    • In UI: button “Fix failing tests with AI.”
    • Flow:
      1. Run tests in sandbox; capture failing tests summary.
      2. Send repo context + failures + goal to Claude/Gemini.
      3. Apply model’s patch to workspace.
      4. Trigger autosave; run tests again in sandbox.
  3. Visibility in UI

    • Show list of recent autosave commits (branch + commit ID).
    • Show AI runs (model, status, link to autosave diff).
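The “fix tests” flow above can be sketched as a small loop with the sandbox runner and model call injected as plain callables, so the control flow is testable with fakes. All names here are assumptions for illustration, not the Coditect AI layer's real API:

```python
def fix_tests(run_tests, ask_model, apply_patch, autosave, max_rounds: int = 2):
    """One AI repair cycle: run_tests() -> (passed, failure_summary);
    ask_model(summary) -> patch; apply_patch and autosave are side-effecting
    hooks. Caps at max_rounds to avoid unbounded model loops."""
    passed, summary = run_tests()
    rounds = 0
    while not passed and rounds < max_rounds:
        patch = ask_model(summary)      # repo context + failures + goal -> patch
        apply_patch(patch)              # write model's changes into the workspace
        autosave()                      # autosave commit before re-running
        passed, summary = run_tests()   # re-run tests in a fresh sandbox
        rounds += 1
    return passed, rounds
```

Bounding the loop and autosaving before each re-run keeps every model attempt recoverable from the autosave branch.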

Exit criteria

  • End-to-end demo:
    • Break a test intentionally.
    • Click “Fix failing tests with AI.”
    • AI suggests changes; autosave commits appear in GitHub; sandboxed tests re-run and pass (for simple cases).
  • Retro doc:
    • Where gVisor, Git, and model behavior interacted poorly (e.g., file watchers, pytest under gVisor).533535
    • Edge cases in autosave (conflicts, push races).
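The autosave tick from Task 1 reduces to pure command construction, which also makes the push-race edge cases easier to reason about. A sketch assuming the branch layout shown above and a dirty check via `git status --porcelain` (empty output means clean):

```python
from datetime import datetime, timezone

def autosave_branch(ticket: str, project: str, user: str, workstation: str,
                    now: datetime) -> str:
    """Build the autosave branch name, e.g. autosave/PROJ-1234/demo/user/ws-01/20260115."""
    return f"autosave/{ticket}/{project}/{user}/{workstation}/{now:%Y%m%d}"

def autosave_cmds(branch: str, status_porcelain: str) -> list[list[str]]:
    """Return the git commands for one 30 s tick; empty list when the tree is clean."""
    if not status_porcelain.strip():
        return []  # nothing dirty, skip this tick
    return [
        ["git", "add", "-A"],
        ["git", "commit", "-m", f"autosave: {branch}"],
        ["git", "push", "origin", branch],
    ]

branch = autosave_branch("PROJ-1234", "demo", "user", "ws-01",
                         datetime(2026, 1, 15, tzinfo=timezone.utc))
```

The daemon itself would run these via subprocess inside the git worktree dedicated to autosave, so pushes never disturb the user's checked-out branch.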

This gives you three very focused, demonstrable increments you can assign to small squads and use to de-risk gVisor + Workstations before scaling to full multi-tenant GA.


3. Hardening & operability checklist

Before committing this as the core Coditect runtime, line up a short checklist:

  • Compatibility matrix: which languages/tools are officially supported under gVisor (Python, Node, Go, Rust toolchains) and known caveats.
  • Abuse thresholds: concrete default limits for CPU, memory, network, and sandboxes per tenant, with clear docs on what gets auto-killed vs just logged.
  • Runbooks: “Sandbox stuck / cannot be terminated,” “Workstation agent unreachable,” “Tenant hits abuse heuristics repeatedly.”
  • Feature flags: ability to toggle gVisor vs runc per tenant/project for debugging, and ability to roll back sandbox features without breaking dev workflows.

Here’s a compact hardening & operability checklist you can turn into tickets.


1. Compatibility matrix (gVisor support)

Define a “supported stack” table backed by real tests under runsc.542543544545546547

  • Languages / runtimes
    • Python: 3.9–3.12, pip, pytest, uv, poetry.
    • Node.js: LTS versions, npm, yarn, pnpm.
    • Go: go toolchain, go test, go build.
    • Rust: rustc, cargo, rustfmt, clippy.
    • Shell: bash, zsh.
  • Tools and services
    • Git, SSH (if needed inside sandbox), curl/wget for limited networked sandboxes.
    • Databases: local sqlite only; external DB access only via backend (not from sandbox).
  • Known gVisor caveats (to document)
    • Potential perf overhead on syscall-heavy tools and file watchers.545546548
    • Limited / different behavior for some /proc and networking features.544548
    • GPU access: unsupported / experimental; GPU workloads not in v1.548549

Deliverable: a small “gVisor compatibility” page listing supported stacks and what’s not supported.


2. Abuse thresholds (defaults)

All thresholds are per sandbox and per tenant, with clear behaviors.

  • Per-sandbox defaults
    • CPU: 1 vCPU limit; auto-kill if CPU > 90% sustained for > 60s with no log output.
    • Memory: 1–2 GiB limit; kill on OOM; mark run as RESOURCE_EXCEEDED.
    • Runtime:
      • timeout_seconds default 300, max 86400.
      • idle_timeout_seconds default 60.
    • Network (if enabled):
      • Max connections/minute, max destinations/minute, domain allowlist only.
  • Per-tenant defaults
    • Max concurrent sandboxes: e.g., 5 (free), 10–20 (paid tiers).
    • vCPU-seconds/hour: soft limit with warnings at 80%, hard rejection at 100%.
    • Networked sandboxes: 0 by default; explicit opt-in for “networked tools”.
  • Behavior on breach
    • Sandbox-level breach → kill sandbox, log event, increment tenant abuse counter.
    • Tenant repeated breaches (e.g., 3 in 10 minutes) → throttle new sandboxes and require manual review.

Document this as a table (“What gets auto-killed vs just logged”) and expose a summary in the tenant admin UI.
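The repeated-breach rule (e.g., 3 in 10 minutes triggers throttling) can be sketched as a sliding-window counter per tenant; timestamps are passed in explicitly so the policy is deterministic and testable:

```python
from collections import defaultdict, deque

WINDOW_S = 600      # sliding window: 10 minutes
MAX_BREACHES = 3    # breaches in the window before throttling

class AbuseTracker:
    """Tracks sandbox-level breach events per tenant and decides when to
    throttle new sandboxes pending manual review."""

    def __init__(self):
        self._events = defaultdict(deque)   # tenant_id -> breach timestamps

    def record_breach(self, tenant_id: str, now: float) -> bool:
        """Record a breach; return True if the tenant should now be throttled."""
        q = self._events[tenant_id]
        q.append(now)
        while q and now - q[0] > WINDOW_S:  # drop events outside the window
            q.popleft()
        return len(q) >= MAX_BREACHES
```

The control plane would call `record_breach` whenever a sandbox is killed with `ABUSE_SUSPECTED`, and on a True result lower concurrency quotas and block networked sandboxes for that tenant.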


3. Runbooks

3.1 “Sandbox stuck / cannot be terminated”

Symptoms: sandbox status TERMINATING, but container still exists or exec hangs.

Steps:

  1. Control Plane calls DestroySandbox (graceful).
  2. If no success within N seconds, Agent attempts docker rm -f.
  3. If still stuck:
    • Mark sandbox ERROR; stop sending new execs.
    • Emit alert tagged with sandbox_id, workstation_id.
    • Provide operator script:
      • SSH into Workstation, inspect with docker ps, docker kill, runsc logs.550551
  4. If sandbox failures correlate with specific image/tool, mark that combination as “unsupported” until fixed.
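The escalation ladder in steps 1–3 can be expressed as a small function with the destroy attempts injected as callables (returning True on success), so the runbook logic is testable without Docker; all names here are illustrative:

```python
def terminate_stuck(graceful_destroy, force_remove, mark_error, alert) -> str:
    """Escalate a stuck sandbox: graceful DestroySandbox, then force-remove,
    then mark ERROR and page an operator. Returns the final state."""
    if graceful_destroy():
        return "TERMINATED"
    if force_remove():            # Agent falls back to `docker rm -f`
        return "TERMINATED"
    mark_error()                  # stop routing new execs to this sandbox
    alert()                       # alert tagged with sandbox_id, workstation_id
    return "ERROR"
```

Encoding the ladder once in the control plane keeps the human runbook and the automated behavior from drifting apart.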

3.2 “Workstation agent unreachable”

Symptoms: gRPC errors or health checks failing for Agent.

Steps:

  1. Control Plane health-checks agents periodically; mark Workstation UNHEALTHY after M failures.
  2. Stop scheduling new sandboxes to that Workstation.
  3. If Workstation likely dead:
    • Mark associated sandboxes UNKNOWN; show “session lost” in UI.
    • Allow users to restart sessions (new Workstation).
  4. Operator actions:
    • Check Workstation status in GCP Workstations console.552553
    • Restart or recreate Workstation; redeploy Agent.

3.3 “Tenant hits abuse heuristics repeatedly”

Symptoms: frequent auto-kills for CPU/network/file abuse.

Steps:

  1. Threshold: e.g., ≥5 sandbox kills with ABUSE_SUSPECTED in 30 minutes.
  2. Automatically:
    • Lower tenant’s sandbox concurrency / CPU quotas temporarily.
    • Block networked sandboxes for that tenant.
  3. Notify:
    • Send email / in-app warning to tenant admins with log summary.
  4. Operator decision:
    • Lift or escalate restrictions after review.

4. Feature flags

Implement feature flags in the control plane config / DB:

  • sandbox_runtime_mode per tenant/project
    • Values: gvisor, runc, auto.
    • Default gvisor; runc allowed only for internal debugging and marked as unsafe.
    • Agents accept a runtime hint but may restrict to runsc in prod.
  • sandbox_enabled per tenant/project
    • Globally disable the sandbox feature for specific tenants or environments without impacting other project features.
  • sandbox_api_version
    • For future API evolution; allow routing tenants to v1 or v2 semantics (e.g., new timeouts, new fields) while keeping old ones stable.
  • autosave_enabled and ai_tools_enabled per project
    • Allow rollout/rollback of autosave and AI runs independently of sandbox runtime.

This allows safe experimentation, targeted rollbacks, and gradual enablement across tenants without breaking core dev workflows.
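Flag resolution can follow a simple precedence rule (project overrides tenant, tenant overrides global defaults), with the prod safety rail that a runc hint is never honored. A minimal sketch, with flag storage reduced to plain dicts for illustration:

```python
# Global defaults; flag names match the list above.
DEFAULTS = {
    "sandbox_runtime_mode": "gvisor",
    "sandbox_enabled": True,
    "sandbox_api_version": "v1",
    "autosave_enabled": True,
    "ai_tools_enabled": True,
}

def resolve_flag(name: str, tenant_cfg: dict, project_cfg: dict, env: str = "prod"):
    """Resolve a feature flag: project setting > tenant setting > default."""
    value = project_cfg.get(name, tenant_cfg.get(name, DEFAULTS[name]))
    # Safety rail: outside internal debugging envs, agents pin the runtime to runsc.
    if name == "sandbox_runtime_mode" and env == "prod" and value == "runc":
        return "gvisor"
    return value
```

Making the prod override live in the resolver (and again in the Agent) means a misconfigured tenant record can never downgrade isolation in production.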

Footnotes

  1. https://pypi.org/project/modal/ 2

  2. https://github.com/modal-labs/modal-client/blob/main/pyproject.toml 2

  3. https://github.com/modal-labs/modal-client 2

  4. https://github.com/modal-labs/libmodal 2

  5. https://github.com/modal-labs/libmodal/

  6. https://modal.com/docs/guide/sandbox

  7. https://modal.com/products/sandboxes

  8. https://northflank.com/blog/top-modal-sandboxes-alternatives-for-secure-ai-code-execution

  9. https://www.linkedin.com/posts/modal-labs_modal-sandboxes-are-generally-available-activity-7287589012498755585-ERGa

  10. https://x.com/modal_labs/status/1881778355043012885

  11. https://www.linkedin.com/posts/pawalt_modal-sandboxes-are-generally-available-activity-7287543122111135744-b5_b

  12. https://modal.com/docs/examples/jupyter_sandbox

  13. https://deps.dev/pypi/modal-client/0.49.2437

  14. https://github.com/modal-labs/modal-examples/blob/main/13_sandboxes/codelangchain/agent.py

  16. https://dzone.com/articles/serverless-iam-architecture-with-security-lessons

  17. https://www.ranthebuilder.cloud/post/14-aws-lambda-security-best-practices-for-building-secure-serverless-applications

  18. https://docs.cloud.google.com/kubernetes-engine/docs/best-practices/enterprise-multitenancy

  19. https://github.com/modal-labs/modal-client/blob/main/CHANGELOG.md

  20. https://stackoverflow.com/questions/76590131/error-while-build-ios-app-in-xcode-sandbox-rsync-samba-13105-deny1-file-w

  21. http://faculty.washington.edu/wlloyd/courses/tcss562_f2024/presentations/2024/team-3.pdf

  22. https://docs.aws.amazon.com/pt_br/emr/latest/EMR-Serverless-UserGuide/emr-serverless-user-guide.pdf

  23. https://inspect.aisi.org.uk/sandboxing.html

  24. https://www.luiscardoso.dev/blog/sandboxes-for-ai

  25. https://www.ikangai.com/the-complete-guide-to-sandboxing-autonomous-agents-tools-frameworks-and-safety-essentials/

  26. https://modal.com/docs/reference/modal.Sandbox

  27. https://modal.com/docs/guide/sandboxes

  28. https://github.com/cased/sandboxes

  29. https://northflank.com/blog/top-modal-sandboxes-alternatives-for-secure-ai-code-execution

  30. https://modal.com/docs/examples/safe_code_execution

  31. https://northflank.com/blog/top-modal-sandboxes-alternatives-for-secure-ai-code-execution 2 3 4 5 6 7

  32. https://cased.com/blog/2025-10-05-sandboxes 2 3 4 5

  33. https://www.luiscardoso.dev/blog/sandboxes-for-ai 2 3 4 5 6 7 8 9 10

  34. https://www.koyeb.com/blog/top-sandbox-code-execution-platforms-for-ai-code-execution-2025 2 3 4

  35. https://betterstack.com/community/comparisons/best-sandbox-runners/

  36. https://northflank.com/blog/top-vercel-sandbox-alternatives-for-secure-ai-code-execution-and-sandbox-environments

  37. https://www.runpod.io/articles/alternatives/modal

  38. https://instavm.io/blog/sandboxed-ai-code-execution-tools

  39. https://github.com/cased/sandboxes 2

  40. https://simonwillison.net/2026/Jan/6/a-field-guide-to-sandboxes-for-ai/

  41. https://modal.com/solutions/coding-agents

  42. https://luiscardoso.dev/blog

  43. https://developer.salesforce.com/docs/commerce/b2c-commerce/guide/b2c-manage-sb.html

  44. https://simonw.substack.com/p/llm-predictions-for-2026-shared-with

  45. https://manus.im/blog/best-ai-coding-assistant-tools

  46. https://github.com/restyler/awesome-sandbox 2

  47. https://www.luiscardoso.dev/blog/sandboxes-for-ai 2 3 4 5 6

  48. https://www.tencentcloud.com/techpedia/118267 2 3 4

  49. https://github.com/jakhax/sandman 2

  50. https://www.reddit.com/r/docker/comments/1fmuv5b/kata_containers_vs_firecracker_vs_gvisor/

  51. https://dev.to/agentsphere/choosing-a-workspace-for-ai-agents-the-ultimate-showdown-between-gvisor-kata-and-firecracker-b10 2

  52. https://skywork.ai/skypage/en/chris-hays-code-sandbox-ai-engineers/1980120660590239744

  53. https://cased.com/blog/2025-10-05-sandboxes 2 3

  54. https://github.com/cased/sandboxes 2

  55. https://www.alldevblogs.com/article/simon-willison/a-field-guide-to-sandboxes-for-ai

  56. https://simonwillison.net/2026/Jan/6/a-field-guide-to-sandboxes-for-ai/

  57. https://northflank.com/blog/how-to-spin-up-a-secure-code-sandbox-and-microvm-in-seconds-with-northflank-firecracker-gvisor-kata-clh

  58. https://developers.cloudflare.com/sandbox/

  59. https://www.youtube.com/watch?v=sVtqsH5oG4c

  60. https://www.youtube.com/watch?v=sV8HKlwsFag

  61. https://www.luiscardoso.dev/blog/sandboxes-for-ai 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17

  62. https://zesty.co/finops-glossary/gvisor-in-kubernetes/ 2 3 4 5 6 7 8 9 10 11 12 13 14

  63. https://cloudification.io/cloud-blog/kata-containers-workload-isolation/ 2 3 4 5 6 7 8 9 10 11 12 13 14 15

  64. https://github.com/firecracker-microvm/firecracker 2 3 4 5 6 7 8 9 10 11 12

  65. https://news.ycombinator.com/item?id=19921564 2 3 4 5

  66. https://github.com/open-lambda/gvisor 2 3

  67. https://www.anantacloud.com/post/transforming-kubernetes-security-how-kata-containers-improve-workload-isolation 2 3 4 5 6 7

  68. https://aws.amazon.com/blogs/containers/enhancing-kubernetes-workload-isolation-and-security-using-kata-containers/ 2 3 4 5

  69. https://www.anthony-balitrand.fr/2025/08/12/firecracker-microvms-the-power-behind-aws-lambda/ 2 3 4 5 6 7 8 9 10 11

  70. https://firecracker-microvm.github.io 2 3 4 5

  71. https://wasmer.io/wasmer-vs-wasmtime 2 3

  72. https://github.com/google/gvisor

  73. https://www.tencentcloud.com/techpedia/118267

  74. https://gvisor.dev/docs/

  75. https://gvisor.dev/docs/user_guide/production/

  76. https://www.upwind.io/feed/unlock-runtime-visibility-for-gvisor-sandboxed-containers

  77. https://kubernetes.io/docs/concepts/containers/runtime-class/ 2 3

  78. https://zesty.co/finops-glossary/gvisor-in-kubernetes/ 2 3

  79. https://aws.amazon.com/blogs/containers/enhancing-kubernetes-workload-isolation-and-security-using-kata-containers/ 2 3 4

  80. https://devopstales.github.io/kubernetes/firecracker-containerd/ 2 3 4

  81. https://www.cncf.io/blog/2024/03/28/webassembly-on-kubernetes-the-practice-guide-part-02/ 2 3 4 5 6 7

  82. https://gvisor.dev/docs/user_guide/quick_start/kubernetes/ 2

  83. https://egashira.dev/blog/gvisor-on-kubernetes-cluster 2 3

  84. https://labs.iximiuz.com/tutorials/kubernetes-runtime-class-61506808

  85. https://www.nops.io/blog/how-to-run-webassembly-on-kubernetes/ 2 3

  86. https://www.youtube.com/watch?v=nV2UCE5iWAU 2 3

  87. http://arun-gupta.github.io/kata-firecracker/ 2

  88. https://github.com/kata-containers/kata-containers/blob/main/docs/how-to/how-to-use-kata-containers-with-firecracker.md 2

  89. https://dev.to/signadot/creating-sandboxes-in-kubernetes-at-scale-5f6p 2

  90. https://notes.kodekloud.com/docs/Certified-Kubernetes-Security-Specialist-CKS/Minimize-Microservice-Vulnerabilities/Using-Runtimes-in-Kubernetes

  91. https://www.alibabacloud.com/blog/getting-started-with-kubernetes-|-understanding-kubernetes-runtimeclass-and-using-multiple-container-runtimes_596341

  92. https://kubernetes.io/docs/concepts/containers/runtime-class/ 2 3 4

  93. https://zesty.co/finops-glossary/gvisor-in-kubernetes/ 2 3 4 5

  94. https://kubernetes.io/docs/reference/access-authn-authz/admission-controllers/ 2 3

  95. https://blog.sighup.io/how-to-run-untrusted-containers-in-kubernetes/ 2 3 4 5 6 7

  96. https://aws.amazon.com/blogs/containers/enhancing-kubernetes-workload-isolation-and-security-using-kata-containers/ 2 3 4

  97. https://www.cncf.io/blog/2024/03/28/webassembly-on-kubernetes-the-practice-guide-part-02/ 2 3 4 5 6

  98. https://www.appsecengineer.com/blog/defending-kubernetes-clusters-against-container-escape-attacks 2 3 4 5

  99. https://news.ycombinator.com/item?id=19921564

  100. https://cloudification.io/cloud-blog/kata-containers-workload-isolation/ 2

  101. https://www.youtube.com/watch?v=0wEiizErKZw

  102. https://devopstales.github.io/kubernetes/firecracker-containerd/ 2 3 4

  103. https://arxiv.org/html/2509.09400v1 2 3 4

  104. https://news.ycombinator.com/item?id=34081170

  105. https://notes.kodekloud.com/docs/Certified-Kubernetes-Security-Specialist-CKS/Minimize-Microservice-Vulnerabilities/Using-Runtimes-in-Kubernetes 2

  106. https://docs.catalystcloud.nz/tutorials/kubernetes/sandboxed-containers-with-gvisor.html

  107. http://arun-gupta.github.io/kata-firecracker/ 2

  108. https://github.com/kata-containers/kata-containers/blob/main/docs/how-to/how-to-use-kata-containers-with-firecracker.md 2

  109. https://www.nops.io/blog/how-to-run-webassembly-on-kubernetes/ 2

  110. https://www.alibabacloud.com/blog/getting-started-with-kubernetes-|-understanding-kubernetes-runtimeclass-and-using-multiple-container-runtimes_596341

  111. https://docs.redhat.com/en/documentation/red_hat_advanced_cluster_security_for_kubernetes/3.73/html/operating/use-admission-controller-enforcement 2

  112. https://www.sysdig.com/learn-cloud-native/kubernetes-admission-controllers

  113. https://www.vcluster.com/docs/vcluster/0.27.0/configure/vcluster-yaml/sync/from-host/runtime-classes 2 3

  114. https://kubeops.net/blog/effective-container-isolation-techniques-for-secure-kubernetes 2 3 4 5

  115. https://dev.to/signadot/creating-sandboxes-in-kubernetes-at-scale-5f6p

  116. https://github.com/firecracker-microvm/firecracker/issues/908

  117. https://cloud.google.com/blog/products/containers-kubernetes/how-gvisor-protects-google-cloud-services-from-cve-2020-14386 2 3 4

  118. https://gvisor.dev/docs/user_guide/compatibility/ 2 3

  119. https://docs.cloud.google.com/kubernetes-engine/docs/concepts/sandbox-pods 2 3 4

  120. https://katacontainers.io/blog/kata-containers-northflank-case-study/ 2 3 4 5

  121. https://publish.obsidian.md/kruzenshtern/writings/2021-02-24-running-firecracker-on-google-kubernetes-engine 2 3 4 5

  122. https://gvisor.dev 2 3 4

  123. https://cloud.google.com/blog/products/containers-kubernetes/gvisor-file-system-improvements-for-gke-and-serverless

  124. https://cloud.google.com/blog/products/identity-security/open-sourcing-gvisor-a-sandboxed-container-runtime 2

  125. https://github.com/kata-containers/kata-containers/issues/10536 2 3

  126. https://devopstales.github.io/kubernetes/firecracker-containerd/

  127. http://arun-gupta.github.io/kata-firecracker/

  128. https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/latest/gpu-operator-kata.html

  129. https://docs.cloud.google.com/kubernetes-engine/docs/how-to/configure-gke-service-extensions 2

  130. https://cloud.google.com/blog/products/application-development/go-1-24-expands-support-for-wasm 2

  131. https://itnext.io/webassembly-on-kubernetes-c5c652e8c1f1 2

  132. https://www.nops.io/blog/how-to-run-webassembly-on-kubernetes/

  133. https://www.cncf.io/blog/2024/03/28/webassembly-on-kubernetes-the-practice-guide-part-02/ 2 3

  134. https://arxiv.org/html/2509.09400v1

  135. https://stackoverflow.com/questions/69846927/how-to-run-untrusted-code-using-gvisor-on-google-cloud-run

  136. https://news.ycombinator.com/item?id=37253921

  137. https://stackoverflow.com/questions/69846927/how-to-run-untrusted-code-using-gvisor-on-google-cloud-run 2 3

  138. https://cloud.google.com/blog/products/identity-security/open-sourcing-gvisor-a-sandboxed-container-runtime 2 3 4 5 6 7 8 9 10

  139. https://gvisor.dev/docs/user_guide/install/ 2 3 4 5

  140. https://docs.readthedocs.com/dev/latest/guides/gvisor.html 2 3 4

  141. https://gvisor.dev 2 3 4 5 6

  142. https://github.com/google/gvisor/issues/11069

  143. https://dev.to/rimelek/comparing-3-docker-container-runtimes-runc-gvisor-and-kata-containers-16j 2

  144. https://cloud.google.com/blog/products/containers-kubernetes/gvisor-file-system-improvements-for-gke-and-serverless 2 3

  145. https://dev.to/rimelek/using-gvisors-container-runtime-in-docker-desktop-374m

  146. https://www.packtpub.com/sa-th/learning/tech-news/gke-sandbox-a-gvisor-based-feature-to-increase-security-and-isolation-in-containers

  147. https://docs.cloud.google.com/kubernetes-engine/docs/concepts/sandbox-pods

  148. https://www.trendmicro.com/cloudoneconformity/knowledge-base/gcp/GKE/enable-gke-sandbox-with-gvisor.html

  149. https://zesty.co/finops-glossary/gvisor-in-kubernetes/

  150. https://gvisor.dev/docs/user_guide/containerd/quick_start/

  151. https://gvisor.dev/docs/user_guide/platforms/

  152. https://grpc.io/docs/languages/go/basics/ 2 3

  153. https://codelabs.developers.google.com/grpc/getting-started-grpc-go-streaming 2 3

  154. https://cloud.google.com/blog/products/identity-security/open-sourcing-gvisor-a-sandboxed-container-runtime 2 3 4

  155. https://gvisor.dev 2 3

  156. https://gvisor.dev/docs/user_guide/install/ 2 3

  157. https://techdozo.dev/grpc-bidirectional-streaming-with-code-example/ 2

  158. https://github.com/google/gvisor/issues/190 2 3

  159. https://gvisor.dev/docs/architecture_guide/performance/ 2 3

  160. https://github.com/Datadog/datadog-agent/issues/44084 2 3

  161. https://docs.readthedocs.com/dev/latest/guides/gvisor.html

  162. https://rewind.com/blog/build-vs-buy-backup-solutions-github/

  163. https://www.reddit.com/r/github/comments/ujvsdb/github_backup/

  164. https://github.com/erikw/restic-automatic-backup-scheduler

  165. https://cloud.google.com/blog/products/containers-kubernetes/gvisor-file-system-improvements-for-gke-and-serverless

  166. https://dev.to/rimelek/comparing-3-docker-container-runtimes-runc-gvisor-and-kata-containers-16j

  167. https://github.com/pahanini/go-grpc-bidirectional-streaming-example

  168. https://dev.to/yash_mahakal/implementing-bidirectional-grpc-streaming-a-practical-guide-3afi

  169. https://grpc.io/docs/languages/cpp/basics/

  170. https://stackoverflow.com/questions/56076703/how-do-i-make-sure-that-a-message-was-received-in-grpc-bidirectional-streaming

  171. https://www.thesocialrobot.org/posts/grpc-brain-2/

  172. https://zesty.co/finops-glossary/gvisor-in-kubernetes/

  173. https://grpc.io/docs/languages/go/basics/ 2 3 4

  174. https://grpc.io/docs/guides/auth/ 2 3

  175. https://gvisor.dev/docs/user_guide/quick_start/docker/ 2 3

  176. https://cloud.google.com/blog/products/identity-security/open-sourcing-gvisor-a-sandboxed-container-runtime 2

  177. https://gvisor.dev 2

  178. https://gvisor.dev/docs/user_guide/install/ 2 3

  179. https://codelabs.developers.google.com/grpc/getting-started-grpc-go-streaming 2

  180. https://victoriametrics.com/blog/go-grpc-basic-streaming-interceptor/ 2 3

  181. https://dev.to/ramonberrutti/grpc-streaming-best-practices-and-performance-insights-219g 2 3

  182. https://nanikgolang.netlify.app/post/runsc/ 2

  183. https://docs.readthedocs.com/dev/latest/guides/gvisor.html

  184. https://grpc.io/docs/guides/performance/ 2 3

  185. https://knabben.github.io/2023/0608/ 2

  186. https://doc.akka.io/libraries/akka-grpc/current/mtls.html

  187. https://techdozo.dev/grpc-bidirectional-streaming-with-code-example/

  188. https://gvisor.dev/docs/tutorials/docker-in-gvisor/ 2 3

  189. https://cloud.google.com/blog/products/containers-kubernetes/gvisor-file-system-improvements-for-gke-and-serverless

  190. https://gvisor.dev/docs/architecture_guide/performance/ 2 3 4

  191. https://dev.to/rimelek/comparing-3-docker-container-runtimes-runc-gvisor-and-kata-containers-16j 2

  192. https://github.com/google/gvisor/issues/190 2

  193. https://github.com/Datadog/datadog-agent/issues/44084 2

  194. https://blog.sighup.io/how-to-run-untrusted-containers-in-kubernetes/

  195. https://www.appsecengineer.com/blog/defending-kubernetes-clusters-against-container-escape-attacks

  196. https://gvisor.dev/docs/user_guide/compatibility/

  197. https://dev.to/yash_mahakal/implementing-bidirectional-grpc-streaming-a-practical-guide-3afi

  198. https://www.talentica.com/blogs/part-3-building-a-bidirectional-streaming-grpc-service-using-golang/

  199. https://dev.to/rimelek/using-gvisors-container-runtime-in-docker-desktop-374m

  200. https://www.infracloud.io/blogs/understanding-grpc-concepts-best-practices/

  201. https://about.gitlab.com/blog/keeping-git-commit-history-clean/ 2

  202. https://sethrobertson.github.io/GitBestPractices/ 2 3

  203. https://justinjoyce.dev/git-commit-and-commit-message-best-practices/ 2 3

  204. https://stackoverflow.com/questions/273695/what-are-some-examples-of-commonly-used-practices-for-naming-git-branches

  205. https://www.geeksforgeeks.org/git/how-to-naming-conventions-for-git-branches/ 2 3

  206. https://digilent.com/reference/software/development/git/start 2

  207. https://pullpanda.io/blog/git-branch-naming-conventions-best-practices 2

  208. https://www.zignuts.com/blog/master-git-branch-naming-conventions

  209. https://gist.github.com/luismts/495d982e8c5b1a0ced4a57cf3d93cf60

  210. https://dev.to/shnjd/git-good-best-practices-for-branch-naming-and-commit-messages-oj4

  211. https://www.reddit.com/r/git/comments/1b1ormd/best_practices_for_automation_of_private_local/ 2 3 4

  212. https://conventional-branch.github.io

  213. https://dev.to/varbsan/a-simplified-convention-for-naming-branches-and-commits-in-git-il4

  214. https://graphite.com/guides/git-branch-naming-conventions

  215. https://www.reddit.com/r/git/comments/xddr5d/manual_branches_names_convention/

  216. https://sethrobertson.github.io/GitBestPractices/ 2

  217. https://www.reddit.com/r/git/comments/1b1ormd/best_practices_for_automation_of_private_local/

  218. https://about.gitlab.com/blog/keeping-git-commit-history-clean/

  219. http://contextkeeper.io/blog/continuous-auto-saving-branch-snapshots-and-git-worktree-support/ 2 3

  220. https://git-scm.com/docs/git-worktree 2 3

  221. https://nx.dev/blog/git-worktrees-ai-agents 2 3

  222. https://www.datacamp.com/pt/tutorial/git-worktree-tutorial 2 3

  223. https://dev.to/jps27cse/github-branching-name-best-practices-49ei 2

  224. https://architecture.lullabot.com/adr/20220920-git-branch-naming/ 2

  225. https://graphite.com/guides/git-branch-naming-conventions 2

  226. https://gist.github.com/jasonk/c29679fa77f4c81d20a31608795ab265

  227. https://about.gitlab.com/blog/keeping-git-commit-history-clean/

  228. https://justinjoyce.dev/git-commit-and-commit-message-best-practices/

  229. https://dev.to/shnjd/git-good-best-practices-for-branch-naming-and-commit-messages-oj4

  230. https://verreauxblack.hashnode.dev/git-release-management-should-you-use-branches-or-tags

  231. https://circleci.com/blog/git-tags-vs-branches/

  232. https://dev.to/livecodelife/how-i-supercharged-my-workflow-with-git-worktrees-2jgj

  233. https://stackoverflow.com/questions/31935776/what-would-i-use-git-worktree-for

  234. https://www.reddit.com/r/golang/comments/s0m0vz/showcase_autosaved_a_utility_that_autosaves/

  235. https://github.com/mateimicu/auto-tag

  236. https://irskep.github.io/autowt/

  237. https://www.luiscardoso.dev/blog/sandboxes-for-ai

  238. https://zesty.co/finops-glossary/gvisor-in-kubernetes/

  239. https://aws.amazon.com/blogs/containers/enhancing-kubernetes-workload-isolation-and-security-using-kata-containers/

  240. https://cloudification.io/cloud-blog/kata-containers-workload-isolation/

  241. https://github.com/firecracker-microvm/firecracker

  242. https://www.cncf.io/blog/2024/03/28/webassembly-on-kubernetes-the-practice-guide-part-02/ 2

  243. https://kubernetes.io/docs/concepts/containers/runtime-class/

  244. https://gvisor.dev/docs/user_guide/quick_start/kubernetes/

  245. https://devopstales.github.io/kubernetes/firecracker-containerd/

  246. https://labs.iximiuz.com/tutorials/kubernetes-runtime-class-61506808

  247. https://cloud.google.com/blog/products/containers-kubernetes/how-gvisor-protects-google-cloud-services-from-cve-2020-14386

  248. https://cloud.google.com/blog/products/containers-kubernetes/gvisor-file-system-improvements-for-gke-and-serverless

  249. https://docs.cloud.google.com/kubernetes-engine/docs/concepts/sandbox-pods

  250. https://cloud.google.com/blog/products/identity-security/open-sourcing-gvisor-a-sandboxed-container-runtime 2 3

  251. https://gvisor.dev 2 3

  252. https://gvisor.dev/docs/user_guide/install/ 2

  253. https://gvisor.dev/docs/user_guide/quick_start/docker/ 2

  254. https://grpc.io/docs/languages/go/basics/ 2

  255. https://codelabs.developers.google.com/grpc/getting-started-grpc-go-streaming

  256. https://victoriametrics.com/blog/go-grpc-basic-streaming-interceptor/ 2

  257. https://dev.to/ramonberrutti/grpc-streaming-best-practices-and-performance-insights-219g 2

  258. https://docs.readthedocs.com/dev/latest/guides/gvisor.html

  259. https://nanikgolang.netlify.app/post/runsc/

  260. https://grpc.io/docs/guides/auth/

  261. https://knabben.github.io/2023/0608/

  262. https://grpc.io/docs/guides/performance/ 2

  263. https://techdozo.dev/grpc-bidirectional-streaming-with-code-example/

  264. https://gvisor.dev/docs/user_guide/compatibility/

  265. https://gvisor.dev/docs/architecture_guide/performance/

  266. https://gvisor.dev/docs/tutorials/docker-in-gvisor/

  267. https://dev.to/rimelek/comparing-3-docker-container-runtimes-runc-gvisor-and-kata-containers-16j

  268. https://stackoverflow.com/questions/273695/what-are-some-examples-of-commonly-used-practices-for-naming-git-branches

  269. https://www.geeksforgeeks.org/git/how-to-naming-conventions-for-git-branches/

  270. https://dev.to/jps27cse/github-branching-name-best-practices-49ei

  271. https://architecture.lullabot.com/adr/20220920-git-branch-naming/

  272. https://graphite.com/guides/git-branch-naming-conventions

  273. http://contextkeeper.io/blog/continuous-auto-saving-branch-snapshots-and-git-worktree-support/

  274. https://git-scm.com/docs/git-worktree

  275. https://nx.dev/blog/git-worktrees-ai-agents

  276. https://www.datacamp.com/pt/tutorial/git-worktree-tutorial

  277. https://about.gitlab.com/blog/keeping-git-commit-history-clean/

  278. https://sethrobertson.github.io/GitBestPractices/

  279. https://www.reddit.com/r/git/comments/1b1ormd/best_practices_for_automation_of_private_local/

  280. https://justinjoyce.dev/git-commit-and-commit-message-best-practices/

  281. https://circleci.com/blog/git-tags-vs-branches/

  282. https://gvisor.dev

  283. https://docs.cloud.google.com/workstations/docs/architecture

  284. https://cloud.google.com/workstations/docs/architecture

  285. https://cloud.google.com/blog/products/identity-security/open-sourcing-gvisor-a-sandboxed-container-runtime

  286. https://gvisor.dev/docs/user_guide/quick_start/docker/

  287. https://docs.readthedocs.com/dev/latest/guides/gvisor.html

  288. http://contextkeeper.io/blog/continuous-auto-saving-branch-snapshots-and-git-worktree-support/

  289. https://git-scm.com/docs/git-worktree

  290. https://nx.dev/blog/git-worktrees-ai-agents

  291. https://www.datacamp.com/pt/tutorial/git-worktree-tutorial

  292. https://dev.to/ramonberrutti/grpc-streaming-best-practices-and-performance-insights-219g

  293. https://codelabs.developers.google.com/grpc/getting-started-grpc-go-streaming

  294. https://programmingpercy.tech/blog/streaming-data-with-grpc/

  295. https://github.com/google/gvisor/issues/190

  296. https://github.com/Datadog/datadog-agent/issues/44084

  297. https://gvisor.dev/docs/architecture_guide/performance/

  298. https://www.reddit.com/r/golang/comments/s0m0vz/showcase_autosaved_a_utility_that_autosaves/

  299. https://cloud.google.com/blog/products/containers-kubernetes/how-gvisor-protects-google-cloud-services-from-cve-2020-14386

  300. https://gvisor.dev/docs/user_guide/production/

  301. https://github.com/google/gvisor

  302. https://grpc.io/docs/guides/auth/

  303. https://knabben.github.io/2023/0608/

  304. https://grpc.io/docs/guides/performance/

  305. https://cloud.google.com/blog/products/containers-kubernetes/gvisor-file-system-improvements-for-gke-and-serverless

  306. https://gvisor.dev/docs/user_guide/install/

  307. https://grpc.io/docs/languages/go/basics/

  308. https://victoriametrics.com/blog/go-grpc-basic-streaming-interceptor/

  309. https://aws.amazon.com/blogs/containers/enhancing-kubernetes-workload-isolation-and-security-using-kata-containers/

  310. https://devopstales.github.io/kubernetes/firecracker-containerd/

  311. https://cloudification.io/cloud-blog/kata-containers-workload-isolation/

  312. https://stackoverflow.com/questions/31935776/what-would-i-use-git-worktree-for

  313. https://irskep.github.io/autowt/

  314. https://gvisor.dev/docs/tutorials/docker-in-gvisor/

  315. https://gvisor.dev/docs/user_guide/compatibility/

  316. https://dev.to/rimelek/using-gvisors-container-runtime-in-docker-desktop-374m

  317. https://zesty.co/finops-glossary/gvisor-in-kubernetes/

  318. https://docs.cloud.google.com/workstations/docs/architecture

  319. https://cloud.google.com/workstations/docs/architecture

  320. https://gvisor.dev

  321. https://gvisor.dev/docs/user_guide/quick_start/docker/

  322. https://docs.readthedocs.com/dev/latest/guides/gvisor.html

  323. https://gvisor.dev/docs/user_guide/production/

  324. https://github.com/google/gvisor

  325. https://gvisor.dev/docs/user_guide/install/

  326. https://codelabs.developers.google.com/grpc/getting-started-grpc-go-streaming

  327. https://programmingpercy.tech/blog/streaming-data-with-grpc/

  328. https://dev.to/ramonberrutti/grpc-streaming-best-practices-and-performance-insights-219g

  329. https://cloud.google.com/blog/products/containers-kubernetes/how-gvisor-protects-google-cloud-services-from-cve-2020-14386

  330. https://cloud.google.com/blog/products/identity-security/open-sourcing-gvisor-a-sandboxed-container-runtime

  331. http://contextkeeper.io/blog/continuous-auto-saving-branch-snapshots-and-git-worktree-support/

  332. https://git-scm.com/docs/git-worktree

  333. https://nx.dev/blog/git-worktrees-ai-agents

  334. https://www.datacamp.com/pt/tutorial/git-worktree-tutorial

  335. https://grpc.io/docs/guides/auth/

  336. https://cloud.google.com/blog/products/containers-kubernetes/gvisor-file-system-improvements-for-gke-and-serverless

  337. https://gvisor.dev/docs/architecture_guide/performance/

  338. https://gvisor.dev/docs/user_guide/compatibility/

  339. https://gvisor.dev/docs/tutorials/docker-in-gvisor/

  340. https://dev.to/rimelek/comparing-3-docker-container-runtimes-runc-gvisor-and-kata-containers-16j

  341. https://github.com/google/gvisor/issues/190

  342. https://github.com/Datadog/datadog-agent/issues/44084

  343. https://knabben.github.io/2023/0608/

  344. https://grpc.io/docs/guides/performance/

  345. https://aws.amazon.com/blogs/containers/enhancing-kubernetes-workload-isolation-and-security-using-kata-containers/

  346. https://devopstales.github.io/kubernetes/firecracker-containerd/

  347. https://cloudification.io/cloud-blog/kata-containers-workload-isolation/

  348. https://docs.cloud.google.com/kubernetes-engine/docs/concepts/sandbox-pods

  349. https://zesty.co/finops-glossary/gvisor-in-kubernetes/

  350. https://github.com/firecracker-microvm/firecracker

  351. https://docs.cloud.google.com/workstations/docs/access-control

  352. https://docs.cloud.google.com/iam/docs/roles-permissions/workstations

  353. https://notes.kodekloud.com/docs/Certified-Kubernetes-Security-Specialist-CKS/Minimize-Microservice-Vulnerabilities/gVisor

  354. https://cloud.google.com/workstations/docs/access-control

  355. https://gvisor.dev

  356. https://docs.cloud.google.com/workstations/docs/architecture

  357. https://cloud.google.com/workstations/docs/architecture

  358. https://fotc.com/blog/cloud-workstations/

  359. https://blog.salrashid.dev/articles/2019/kubernetes_oidc_gcp_identity_platform/

  360. https://www.pulumi.com/guides/how-to/gcp-workstations-iam-binding/

  361. https://www.skills.google/focuses/60391?parent=catalog

  362. https://grpc.io/docs/guides/auth/

  363. https://knabben.github.io/2023/0608/

  364. https://zesty.co/finops-glossary/gvisor-in-kubernetes/

  365. https://gvisor.dev/docs/user_guide/quick_start/docker/

  366. https://docs.readthedocs.com/dev/latest/guides/gvisor.html

  367. https://gvisor.dev/docs/user_guide/production/

  368. https://cloud.google.com/workstations

  369. https://dzone.com/articles/google-cloud-workstations

  370. https://www.youtube.com/watch?v=-3pdAavNepg

  371. https://cloud.google.com/solutions/sap/docs/bq-connector/latest/authentication-jwt

  372. https://www.googlecloudcommunity.com/gc/Apigee/Accessing-a-Google-Cloud-Platform-based-service-using-JWT-and-a/m-p/21306

  373. https://docs.cloud.google.com/workstations/docs/access-control

  374. https://docs.cloud.google.com/iam/docs/roles-permissions/workstations

  375. https://cloud.google.com/workstations/docs/access-control

  376. https://www.skills.google/focuses/60391?parent=catalog

  377. https://fotc.com/blog/cloud-workstations/

  378. https://cloud.google.com/solutions/sap/docs/bq-connector/latest/authentication-jwt

  379. https://www.youtube.com/watch?v=-3pdAavNepg

  380. https://grpc.io/docs/guides/auth/

  381. https://knabben.github.io/2023/0608/

  382. https://docs.cloud.google.com/workstations/docs/architecture

  383. https://cloud.google.com/workstations/docs/architecture

  384. https://www.pulumi.com/guides/how-to/gcp-workstations-iam-binding/

  385. https://gvisor.dev/docs/user_guide/production/

  386. https://cloud.google.com/blog/products/containers-kubernetes/how-gvisor-protects-google-cloud-services-from-cve-2020-14386

  387. https://cloud.google.com/blog/products/identity-security/open-sourcing-gvisor-a-sandboxed-container-runtime

  388. https://gvisor.dev

  389. https://notes.kodekloud.com/docs/Certified-Kubernetes-Security-Specialist-CKS/Minimize-Microservice-Vulnerabilities/gVisor

  390. https://fotc.com/blog/cloud-workstations/

  391. https://github.com/google/gvisor

  392. https://zesty.co/finops-glossary/gvisor-in-kubernetes/

  393. https://docs.cloud.google.com/workstations/docs/architecture

  394. https://cloud.google.com/workstations/docs/architecture

  395. https://dzone.com/articles/google-cloud-workstations

  396. https://gvisor.dev/docs/user_guide/production/

  397. https://github.com/google/gvisor

  398. https://gvisor.dev

  399. https://docs.cloud.google.com/workstations/docs/architecture

  400. https://cloud.google.com/workstations/docs/architecture

  401. https://fotc.com/blog/cloud-workstations/

  402. https://grpc.io/docs/guides/auth/

  403. https://knabben.github.io/2023/0608/

  404. https://gvisor.dev/docs/user_guide/quick_start/docker/

  405. https://docs.readthedocs.com/dev/latest/guides/gvisor.html

  406. https://zesty.co/finops-glossary/gvisor-in-kubernetes/

  407. https://notes.kodekloud.com/docs/Certified-Kubernetes-Security-Specialist-CKS/Minimize-Microservice-Vulnerabilities/gVisor

  408. https://docs.cloud.google.com/workstations/docs/architecture

  409. https://cloud.google.com/workstations/docs/architecture

  410. https://gvisor.dev/docs/user_guide/production/

  411. https://github.com/google/gvisor

  412. https://cloud.google.com/blog/products/containers-kubernetes/how-gvisor-protects-google-cloud-services-from-cve-2020-14386

  413. https://cloud.google.com/blog/products/identity-security/open-sourcing-gvisor-a-sandboxed-container-runtime

  414. https://gvisor.dev

  415. https://grpc.io/docs/guides/auth/

  416. https://knabben.github.io/2023/0608/

  417. https://gvisor.dev/docs/user_guide/quick_start/docker/

  418. https://docs.readthedocs.com/dev/latest/guides/gvisor.html

  419. https://fotc.com/blog/cloud-workstations/

  420. https://cloud.google.com/solutions/sap/docs/bq-connector/latest/authentication-jwt

  421. https://www.youtube.com/watch?v=-3pdAavNepg

  422. https://zesty.co/finops-glossary/gvisor-in-kubernetes/

  423. https://notes.kodekloud.com/docs/Certified-Kubernetes-Security-Specialist-CKS/Minimize-Microservice-Vulnerabilities/gVisor

  424. https://github.com/google/gvisor/issues/190

  425. https://github.com/Datadog/datadog-agent/issues/44084

  426. https://gvisor.dev/docs/architecture_guide/performance/

  427. https://gvisor.dev

  428. https://gvisor.dev/docs/user_guide/quick_start/docker/

  429. https://gvisor.dev/docs/user_guide/production/

  430. https://github.com/google/gvisor

  431. https://docs.readthedocs.com/dev/latest/guides/gvisor.html

  432. https://codelabs.developers.google.com/grpc/getting-started-grpc-go-streaming

  433. https://programmingpercy.tech/blog/streaming-data-with-grpc/

  434. https://dev.to/ramonberrutti/grpc-streaming-best-practices-and-performance-insights-219g

  435. https://gvisor.dev/docs/architecture_guide/performance/

  436. https://gvisor.dev/docs/tutorials/docker-in-gvisor/

  437. https://cloud.google.com/blog/products/containers-kubernetes/how-gvisor-protects-google-cloud-services-from-cve-2020-14386

  438. https://cloud.google.com/blog/products/identity-security/open-sourcing-gvisor-a-sandboxed-container-runtime

  439. http://contextkeeper.io/blog/continuous-auto-saving-branch-snapshots-and-git-worktree-support/

  440. https://git-scm.com/docs/git-worktree

  441. https://nx.dev/blog/git-worktrees-ai-agents

  442. https://www.datacamp.com/pt/tutorial/git-worktree-tutorial

  443. https://github.com/google/gvisor/issues/190

  444. https://github.com/Datadog/datadog-agent/issues/44084

  445. https://notes.kodekloud.com/docs/Certified-Kubernetes-Security-Specialist-CKS/Minimize-Microservice-Vulnerabilities/gVisor

  446. https://zesty.co/finops-glossary/gvisor-in-kubernetes/

  447. https://fotc.com/blog/cloud-workstations/

  448. https://gvisor.dev/docs/user_guide/quick_start/docker/

  449. https://gvisor.dev/docs/user_guide/production/

  450. https://github.com/google/gvisor

  451. https://cloud.google.com/blog/products/identity-security/open-sourcing-gvisor-a-sandboxed-container-runtime

  452. https://gvisor.dev

  453. https://dev.to/ramonberrutti/grpc-streaming-best-practices-and-performance-insights-219g

  454. https://dzone.com/articles/advanced-grpc-in-microservices

  455. https://victoriametrics.com/blog/go-grpc-basic-streaming-interceptor/

  456. https://codelabs.developers.google.com/grpc/getting-started-grpc-go-streaming

  457. https://programmingpercy.tech/blog/streaming-data-with-grpc/

  458. https://gvisor.dev/docs/user_guide/faq/

  459. https://dev.to/rimelek/using-gvisors-container-runtime-in-docker-desktop-374m

  460. https://grpc.io/docs/guides/auth/

  461. https://knabben.github.io/2023/0608/

  462. https://blog.salrashid.dev/articles/2019/kubernetes_oidc_gcp_identity_platform/

  463. https://github.com/kubernetes/minikube/issues/5463

  464. https://pkg.go.dev/gvisor.dev/gvisor/runsc/config

  465. https://github.com/google/gvisor/issues/9368

  466. https://gvisor.dev/docs/architecture_guide/performance/

  467. https://devsecops.puziol.com.br/kubernetes/cks/solved-questions/question-10-gvisor-runtime-sandbox/

  468. https://grpc.io/docs/guides/performance/

  469. https://stackoverflow.com/questions/72392812/debug-logs-are-not-generated-for-sandboxed-container

  470. https://www.baeldung.com/java-grpc-streaming

  471. https://falco.org/docs/concepts/event-sources/gvisor/

  472. https://codelabs.developers.google.com/grpc/getting-started-grpc-go-streaming

  473. https://dev.to/ramonberrutti/grpc-streaming-best-practices-and-performance-insights-219g

  474. https://dzone.com/articles/advanced-grpc-in-microservices

  475. https://victoriametrics.com/blog/go-grpc-basic-streaming-interceptor/

  476. https://grpc.io/docs/guides/performance/

  477. https://cloud.google.com/solutions/sap/docs/bq-connector/latest/authentication-jwt

  478. https://www.youtube.com/watch?v=-3pdAavNepg

  479. https://grpc.io/docs/guides/auth/

  480. https://knabben.github.io/2023/0608/

  481. https://blog.salrashid.dev/articles/2019/kubernetes_oidc_gcp_identity_platform/

  482. https://github.com/google/gvisor/issues/190

  483. https://github.com/Datadog/datadog-agent/issues/44084

  484. https://gvisor.dev/docs/architecture_guide/performance/

  485. https://gvisor.dev/docs/user_guide/quick_start/docker/

  486. https://dev.to/rimelek/using-gvisors-container-runtime-in-docker-desktop-374m

  487. https://gvisor.dev/docs/user_guide/faq/

  488. https://devsecops.puziol.com.br/kubernetes/cks/solved-questions/question-10-gvisor-runtime-sandbox/

  489. https://github.com/kubernetes/minikube/issues/5463

  490. https://docs.cloud.google.com/workstations/docs/architecture

  491. https://cloud.google.com/workstations/docs/architecture

  492. https://gvisor.dev

  493. https://gvisor.dev/docs/user_guide/quick_start/docker/

  494. https://gvisor.dev/docs/user_guide/production/

  495. https://github.com/google/gvisor

  496. https://cloud.google.com/blog/products/identity-security/open-sourcing-gvisor-a-sandboxed-container-runtime

  497. https://gvisor.dev/docs/user_guide/install/

  498. https://docs.readthedocs.com/dev/latest/guides/gvisor.html

  499. https://codelabs.developers.google.com/grpc/getting-started-grpc-go-streaming

  500. https://programmingpercy.tech/blog/streaming-data-with-grpc/

  501. https://dev.to/ramonberrutti/grpc-streaming-best-practices-and-performance-insights-219g

  502. https://victoriametrics.com/blog/go-grpc-basic-streaming-interceptor/

  503. http://contextkeeper.io/blog/continuous-auto-saving-branch-snapshots-and-git-worktree-support/

  504. https://git-scm.com/docs/git-worktree

  505. https://nx.dev/blog/git-worktrees-ai-agents

  506. https://www.datacamp.com/pt/tutorial/git-worktree-tutorial

  507. https://grpc.io/docs/guides/auth/

  508. https://knabben.github.io/2023/0608/

  509. https://grpc.io/docs/guides/performance/

  510. https://github.com/google/gvisor/issues/190

  511. https://github.com/Datadog/datadog-agent/issues/44084

  512. https://gvisor.dev/docs/architecture_guide/performance/

  513. https://notes.kodekloud.com/docs/Certified-Kubernetes-Security-Specialist-CKS/Minimize-Microservice-Vulnerabilities/gVisor

  514. https://zesty.co/finops-glossary/gvisor-in-kubernetes/

  515. https://fotc.com/blog/cloud-workstations/

  516. https://gvisor.dev/docs/user_guide/faq/

  517. https://dev.to/rimelek/using-gvisors-container-runtime-in-docker-desktop-374m

  518. https://docs.cloud.google.com/workstations/docs/access-control

  519. https://docs.cloud.google.com/iam/docs/roles-permissions/workstations

  520. https://cloud.google.com/workstations/docs/access-control

  521. https://docs.cloud.google.com/workstations/docs/architecture

  522. https://cloud.google.com/workstations/docs/architecture

  523. https://gvisor.dev/docs/user_guide/install/

  524. https://docs.readthedocs.com/dev/latest/guides/gvisor.html

  525. https://gvisor.dev/docs/user_guide/quick_start/docker/

  526. https://codelabs.developers.google.com/grpc/getting-started-grpc-go-streaming

  527. https://programmingpercy.tech/blog/streaming-data-with-grpc/

  528. https://dev.to/ramonberrutti/grpc-streaming-best-practices-and-performance-insights-219g

  529. https://dzone.com/articles/advanced-grpc-in-microservices

  530. https://victoriametrics.com/blog/go-grpc-basic-streaming-interceptor/

  531. https://github.com/google/gvisor/issues/190

  532. https://github.com/Datadog/datadog-agent/issues/44084

  533. https://gvisor.dev/docs/architecture_guide/performance/

  534. https://gvisor.dev/docs/user_guide/faq/

  535. https://gvisor.dev/docs/tutorials/docker-in-gvisor/

  536. https://grpc.io/docs/guides/performance/

  537. https://gvisor.dev/docs/user_guide/compatibility/

  538. http://contextkeeper.io/blog/continuous-auto-saving-branch-snapshots-and-git-worktree-support/

  539. https://git-scm.com/docs/git-worktree

  540. https://nx.dev/blog/git-worktrees-ai-agents

  541. https://www.datacamp.com/pt/tutorial/git-worktree-tutorial

  542. https://gvisor.dev/docs/user_guide/production/

  543. https://github.com/google/gvisor

  544. https://gvisor.dev/docs/user_guide/compatibility/

  545. https://gvisor.dev/docs/architecture_guide/performance/

  546. https://gvisor.dev/docs/tutorials/docker-in-gvisor/

  547. https://dev.to/rimelek/using-gvisors-container-runtime-in-docker-desktop-374m

  548. https://gvisor.dev/docs/user_guide/faq/

  549. https://github.com/google/gvisor/issues/9368

  550. https://devsecops.puziol.com.br/kubernetes/cks/solved-questions/question-10-gvisor-runtime-sandbox/

  551. https://github.com/kubernetes/minikube/issues/5463

  552. https://docs.cloud.google.com/workstations/docs/architecture

  553. https://cloud.google.com/workstations/docs/architecture