ADR-012: CODITECT Data Isolation Strategy
Statusβ
ACCEPTED (2026-02-03)
Contextβ
CODITECT serves multiple customer organizations (tenants) on shared infrastructure while requiring:
- Complete data separation between tenants
- Compliance readiness for SOC 2, GDPR, HIPAA
- Defense in depth - multiple isolation layers
- Performance - isolation without overhead
- Auditability - prove isolation to customers
Threat Modelβ
| Threat | Impact | Mitigation Required |
|---|---|---|
| Cross-tenant data leak | Critical | Database-level enforcement |
| SQL injection bypass | Critical | RLS policies (not app-level) |
| Insider access abuse | High | Audit logging, IAM |
| Shared resource contention | Medium | Resource quotas |
| Backup data exposure | High | Encrypted, tenant-scoped |
Decisionβ
Architecture: Defense in Depth Isolationβ
Implement four layers of data isolation to ensure no single point of failure:
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β CODITECT DATA ISOLATION LAYERS β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ€
β β
β LAYER 1: NETWORK ISOLATION β
β βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ β
β β GCP VPC with private subnets β β
β β Cloud NAT for egress-only internet β β
β β Private Service Connect for GCP services β β
β β No direct database access from internet β β
β βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ β
β β
β LAYER 2: APPLICATION ISOLATION β
β βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ β
β β Django middleware sets tenant context from JWT β β
β β django-multitenant auto-filters all queries β β
β β Request-scoped tenant_id (cannot change mid-request) β β
β β API endpoints validate tenant ownership β β
β βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ β
β β
β LAYER 3: DATABASE ISOLATION (PostgreSQL RLS) β
β βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ β
β β Row-Level Security policies on ALL tenant tables β β
β β Session variable: SET app.tenant_id = 'uuid' β β
β β RLS policy: USING (tenant_id = current_setting('app.tenant_id')) β β
β β Database enforces even if app layer bypassed β β
β βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ β
β β
β LAYER 4: STORAGE ISOLATION β
β βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ β
β β Per-tenant GCS buckets (gs://coditect-{tenant-slug}-data) β β
β β Per-tenant service accounts with bucket-only access β β
β β Customer-managed encryption keys (CMEK) optional β β
β β Cross-tenant bucket access impossible via IAM β β
β βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ β
β β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
PostgreSQL Row-Level Security Implementationβ
Base Policy Pattern:
-- Enable RLS on tenant table
ALTER TABLE context_messages ENABLE ROW LEVEL SECURITY;
-- Force RLS even for table owners (security best practice)
ALTER TABLE context_messages FORCE ROW LEVEL SECURITY;
-- SELECT policy - users can only read their tenant's data
CREATE POLICY tenant_isolation_select ON context_messages
FOR SELECT
USING (tenant_id = current_setting('app.tenant_id')::UUID);
-- INSERT policy - users can only insert to their tenant
CREATE POLICY tenant_isolation_insert ON context_messages
FOR INSERT
WITH CHECK (tenant_id = current_setting('app.tenant_id')::UUID);
-- UPDATE policy - users can only update their tenant's data
CREATE POLICY tenant_isolation_update ON context_messages
FOR UPDATE
USING (tenant_id = current_setting('app.tenant_id')::UUID)
WITH CHECK (tenant_id = current_setting('app.tenant_id')::UUID);
-- DELETE policy - users can only delete their tenant's data
CREATE POLICY tenant_isolation_delete ON context_messages
FOR DELETE
USING (tenant_id = current_setting('app.tenant_id')::UUID);
Django Middleware Integration:
class TenantIsolationMiddleware:
"""
Sets PostgreSQL session variable for RLS enforcement.
Security: Even if application code has bugs, database
enforces tenant isolation at the row level.
"""
def __call__(self, request):
# Extract tenant_id from validated JWT token
tenant_id = request.auth.get('tenant_id')
user_id = request.auth.get('user_id')
if not tenant_id:
raise PermissionDenied("No tenant context")
# Set PostgreSQL session variables for RLS
with connection.cursor() as cursor:
cursor.execute(
"SET app.tenant_id = %s; SET app.user_id = %s;",
[str(tenant_id), str(user_id)]
)
# Attach to request for application use
request.tenant_id = tenant_id
request.user_id = user_id
return self.get_response(request)
RLS Policy Flowβ
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β RLS POLICY ENFORCEMENT FLOW β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ€
β β
β 1. Request arrives with JWT token β
β β β
β βΌ β
β 2. Middleware extracts tenant_id from JWT β
β β β
β βΌ β
β 3. SET app.tenant_id = 'tenant-uuid' β
β β β
β βΌ β
β 4. Application executes query: SELECT * FROM messages β
β β β
β βΌ β
β 5. PostgreSQL RLS automatically rewrites to: β
β SELECT * FROM messages WHERE tenant_id = 'tenant-uuid' β
β β β
β βΌ β
β 6. Only tenant's data returned (database-enforced) β
β β
β ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ β
β β
β ATTEMPTED BYPASS (SQL Injection): β
β β
β Attacker tries: SELECT * FROM messages WHERE tenant_id = 'other-tenant' β
β β β
β βΌ β
β RLS policy STILL enforced: β
β SELECT * FROM messages β
β WHERE tenant_id = 'other-tenant' β
β AND tenant_id = current_setting('app.tenant_id') β RLS adds this β
β β β
β βΌ β
β Result: Empty set (conditions contradict) β
β β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
Tables with RLS Enforcementβ
| Table | tenant_id Column | RLS Enabled | Indexed |
|---|---|---|---|
organizations | id (self) | Yes | Primary |
teams | tenant_id | Yes | Yes |
users | tenant_id | Yes | Yes |
projects | tenant_id | Yes | Yes |
sessions | tenant_id | Yes | Yes |
messages | tenant_id (via session) | Yes | Yes |
context_messages | tenant_id | Yes | Yes |
task_tracking | tenant_id | Yes | Yes |
sync_cursors | tenant_id | Yes | Yes |
decisions | tenant_id | Yes | Yes |
skill_learnings | tenant_id | Yes | Yes |
Local Database Isolation (ADR-118)β
For local installations, isolation is per-machine via filesystem:
~/PROJECTS/.coditect-data/context-storage/
βββ platform.db # Tier 1: Read-only component data (shared)
βββ org.db # Tier 2: User's organization data (isolated)
βββ sessions.db # Tier 3: User's session data (isolated)
βββ projects.db # Tier 4: Project-specific data (isolated)
Local Isolation Properties:
- File permissions:
0600(owner read/write only) - No network exposure (local-only)
- Machine-specific UUIDs prevent cross-device conflicts
- Cloud sync requires authentication (ADR-053)
GCS Storage Isolationβ
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β GCS BUCKET ISOLATION β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ€
β β
β Per-Tenant Bucket Structure: β
β β
β gs://coditect-acme-corp-data/ β Tenant: Acme Corp β
β βββ exports/ β
β βββ backups/ β
β βββ artifacts/ β
β β
β gs://coditect-beta-inc-data/ β Tenant: Beta Inc β
β βββ exports/ β
β βββ backups/ β
β βββ artifacts/ β
β β
β IAM Policy (per bucket): β
β βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ β
β β Principal: coditect-acme-corp-sa@project.iam.gserviceaccount.com β β
β β Role: roles/storage.objectAdmin β β
β β Resource: gs://coditect-acme-corp-data/* β β
β β β β
β β NO access to gs://coditect-beta-inc-data/* (different SA) β β
β βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ β
β β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
Audit Loggingβ
All data access is logged for compliance:
# Audit log entry structure
{
"timestamp": "2026-02-03T12:00:00Z",
"tenant_id": "tenant-uuid",
"user_id": "user-uuid",
"action": "SELECT",
"table": "context_messages",
"row_count": 42,
"source_ip": "10.0.0.1",
"user_agent": "CODITECT/2.0"
}
Audit Storage:
- Cloud Logging (30-day retention, searchable)
- BigQuery export (long-term compliance)
- Immutable audit trail
Consequencesβ
Positiveβ
- Database-enforced isolation - Cannot be bypassed by application bugs
- Defense in depth - Four independent layers
- Compliance-ready - SOC 2, GDPR, HIPAA evidence
- Auditability - Complete access logging
- Performance - RLS uses efficient index scans
- Simplicity - django-multitenant handles complexity
Negativeβ
- Complexity - Multiple isolation mechanisms to maintain
- Testing overhead - Must verify isolation in all layers
- Migration complexity - Adding RLS to existing tables
- Debugging difficulty - Must set tenant context for queries
Risksβ
| Risk | Mitigation |
|---|---|
| RLS policy misconfiguration | Automated tests, security audit |
| Session variable not set | Middleware enforces before any query |
| Backup contains other tenant data | Per-tenant backup jobs |
| Admin bypasses isolation | Separate admin audit log |
Testing Strategyβ
Isolation Test Suiteβ
class TenantIsolationTests(TestCase):
"""Verify tenant isolation at all layers."""
def test_rls_prevents_cross_tenant_select(self):
"""Tenant A cannot read Tenant B's data."""
tenant_a = create_tenant("Acme")
tenant_b = create_tenant("Beta")
# Create message for Tenant A
set_tenant(tenant_a)
msg = Message.objects.create(content="secret")
# Switch to Tenant B
set_tenant(tenant_b)
# Should not find Tenant A's message
self.assertEqual(Message.objects.count(), 0)
# Direct query should also fail
with connection.cursor() as c:
c.execute(f"SELECT * FROM messages WHERE id = '{msg.id}'")
self.assertEqual(c.fetchall(), [])
def test_rls_prevents_cross_tenant_update(self):
"""Tenant A cannot modify Tenant B's data."""
# ... similar pattern
def test_gcs_bucket_isolation(self):
"""Service account cannot access other tenant's bucket."""
# ... IAM verification
Implementationβ
Phase 1: Database RLS (Complete)β
- PostgreSQL RLS policies on all tenant tables
- django-multitenant integration
- Middleware for session variable
Phase 2: Storage Isolation (Complete)β
- Per-tenant GCS buckets
- Per-tenant service accounts
- IAM policy automation
Phase 3: Audit Logging (In Progress)β
- Cloud Logging integration
- BigQuery export
- Compliance dashboards
Phase 4: Compliance Certification (Planned)β
- SOC 2 Type II audit
- Penetration testing
- Security documentation
Relatedβ
- ADR-009: Multi-Tenant Architecture
- ADR-010: Cloud Workstations Architecture
- ADR-011: Authentication Strategy
- ADR-044: Custom REST Sync Architecture
- ADR-053: Cloud Context Sync Architecture
- ADR-089: Two-Database Architecture
- ADR-103: Four-Database Separation
- ADR-118: Four-Tier Database Architecture
Diagramsβ
diagrams/mermaid/adr012-isolation-layers.mmddiagrams/mermaid/adr012-rls-policy-flow.mmddiagrams/mermaid/adr012-tenant-boundary.mmddiagrams/mermaid/adr012-sharding-strategy.mmd
Track: D (Security Hardening) Task: F.12.2.4