ADR-CXQ-009: Multi-Tenant Platform Architecture for CODITECT Intelligence Layer
Status: Proposed Date: 2025-12-18 Deciders: Hal Casteel (Founder/CEO/CTO), CODITECT Core Team Technical Story: Transform coditect-core from single-user local tool to multi-tenant platform intelligence engine
Executive Summary
Context: CODITECT is evolving from a local development tool to a multi-tenant SaaS platform serving:
- Internal Development - Single-user local Claude Code enhancement
- Distributed Intelligence - Powers 57+ submodules via
.coditectsymlinks - Platform Engine - Drives customer projects, products, and businesses
- Enterprise SaaS - Multi-tenant organization-based licensing
Decision: Implement PostgreSQL + FoundationDB architecture for the platform layer while maintaining SQLite for local-first offline development.
Current State Analysis
CODITECT-CORE Components (December 2025)
| Category | Count | Multi-Tenant Impact |
|---|---|---|
| Agents | 119 | Need tenant-scoped memory |
| Commands | 128 | Need tenant context injection |
| Skills | 79 | Need workspace isolation |
| Scripts | 195 | Need tenant-aware configuration |
| Hooks | 37 | Need tenant event routing |
| Workflows | 1,149 | Need tenant-scoped execution |
| Total | 1,712 | All require tenant awareness |
Current Storage (Single-Tenant)
context-storage/
├── context.db (835 MB) # SQLite - NO tenant_id
├── unified_messages.jsonl (124 MB) # Flat file - NO isolation
├── checkpoints/ (82 checkpoints) # NO organization scoping
├── sessions/ (23 sessions) # NO user isolation
└── exports-archive/ (503 files) # NO access control
Database Schema Gap Analysis
Current Tables (NO multi-tenancy):
-- CURRENT: No tenant isolation
CREATE TABLE messages (
id INTEGER PRIMARY KEY,
hash TEXT UNIQUE,
content TEXT,
role TEXT,
session_id TEXT, -- ⚠️ No organization_id
...
);
CREATE TABLE decisions (
id INTEGER PRIMARY KEY,
message_id INTEGER,
decision_text TEXT, -- ⚠️ No tenant isolation
...
);
CREATE TABLE code_patterns (
id INTEGER PRIMARY KEY,
pattern_text TEXT, -- ⚠️ Visible to all users
...
);
Platform Architecture Vision
Three-Tier Deployment Model
┌─────────────────────────────────────────────────────────────────────────┐
│ CODITECT PLATFORM TIERS │
├─────────────────────────────────────────────────────────────────────────┤
│ │
│ TIER 1: LOCAL DEVELOPMENT (Offline-First) │
│ ├── SQLite + FTS5 (context.db) │
│ ├── Single-user, single-machine │
│ ├── Full functionality without network │
│ └── Syncs to cloud when connected │
│ │
│ TIER 2: TEAM/WORKSPACE (Connected) │
│ ├── PostgreSQL (shared team database) │
│ ├── Multi-user within organization │
│ ├── Real-time collaboration │
│ └── Shared patterns, decisions, context │
│ │
│ TIER 3: ENTERPRISE PLATFORM (Multi-Tenant SaaS) │
│ ├── PostgreSQL + pgvector (primary storage) │
│ ├── FoundationDB (distributed state, tenant isolation) │
│ ├── Redis (caching, session tracking) │
│ ├── Neo4j/PostgreSQL Graph (knowledge graphs) - Future │
│ └── Organization-based complete isolation │
│ │
└─────────────────────────────────────────────────────────────────────────┘
Data Flow Architecture
┌──────────────────────────────────────────────────────────────────────────┐
│ MULTI-TENANT DATA FLOW │
├──────────────────────────────────────────────────────────────────────────┤
│ │
│ CUSTOMER A CUSTOMER B CUSTOMER C │
│ ┌─────────┐ ┌─────────┐ ┌─────────┐ │
│ │ Local │ │ Local │ │ Local │ │
│ │ SQLite │ │ SQLite │ │ SQLite │ │
│ └────┬────┘ └────┬────┘ └────┬────┘ │
│ │ │ │ │
│ │ Sync API │ Sync API │ │
│ ▼ ▼ ▼ │
│ ┌───────────────────────────────────────────────────────────────────┐ │
│ │ CODITECT CLOUD PLATFORM │ │
│ │ ┌─────────────────────────────────────────────────────────────┐ │ │
│ │ │ django-multitenant Middleware │ │ │
│ │ │ (Automatic tenant_id filtering on ALL queries) │ │ │
│ │ └─────────────────────────────────────────────────────────────┘ │ │
│ │ │ │ │
│ │ ┌──────────────────────────────────────────────────────────────┐ │ │
│ │ │ PostgreSQL + pgvector │ │ │
│ │ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │ │ │
│ │ │ │ Org A Data │ │ Org B Data │ │ Org C Data │ │ │ │
│ │ │ │ tenant_id=A │ │ tenant_id=B │ │ tenant_id=C │ │ │ │
│ │ │ └─────────────┘ └─────────────┘ └─────────────┘ │ │ │
│ │ └──────────────────────────────────────────────────────────────┘ │ │
│ │ │ │ │
│ │ ┌──────────────────────────────────────────────────────────────┐ │ │
│ │ │ FoundationDB │ │ │
│ │ │ (Distributed state, real-time sync, tenant isolation) │ │ │
│ │ └──────────────────────────────────────────────────────────────┘ │ │
│ └───────────────────────────────────────────────────────────────────┘ │
│ │
└──────────────────────────────────────────────────────────────────────────┘
Multi-Tenant Schema Design
Organization Model (Already Exists)
# tenants/models.py (coditect-cloud-backend)
class Organization(models.Model):
id = models.UUIDField(primary_key=True)
name = models.CharField(max_length=255)
slug = models.SlugField(unique=True)
plan = models.CharField(choices=['FREE', 'PRO', 'ENTERPRISE'])
stripe_customer_id = models.CharField(unique=True)
# ... billing fields
Context Database Schema (PostgreSQL)
-- =============================================================================
-- TENANT-AWARE CONTEXT DATABASE SCHEMA
-- Extends ADR-CXQ-007 with organization-based isolation
-- =============================================================================
-- Extension for vector search
CREATE EXTENSION IF NOT EXISTS vector;
CREATE EXTENSION IF NOT EXISTS pg_trgm; -- Fuzzy text search
-- =============================================================================
-- CORE TABLES (Tenant-Scoped)
-- =============================================================================
-- Messages (session history)
CREATE TABLE cxq_messages (
id BIGSERIAL PRIMARY KEY,
organization_id UUID NOT NULL REFERENCES organizations(id),
user_id UUID REFERENCES users(id),
-- Content
hash TEXT NOT NULL,
content TEXT NOT NULL,
role TEXT NOT NULL CHECK (role IN ('user', 'assistant', 'system')),
-- Source tracking
source_type TEXT,
source_file TEXT,
session_id TEXT,
checkpoint TEXT,
-- Timestamps
message_timestamp TIMESTAMPTZ,
extracted_at TIMESTAMPTZ DEFAULT NOW(),
-- Metadata
content_length INTEGER,
has_code BOOLEAN DEFAULT FALSE,
has_markdown BOOLEAN DEFAULT FALSE,
token_count INTEGER,
-- Uniqueness per tenant
UNIQUE(organization_id, hash)
);
CREATE INDEX idx_messages_org ON cxq_messages(organization_id);
CREATE INDEX idx_messages_session ON cxq_messages(organization_id, session_id);
CREATE INDEX idx_messages_role ON cxq_messages(organization_id, role);
CREATE INDEX idx_messages_ts ON cxq_messages(organization_id, message_timestamp);
-- Full-text search
CREATE INDEX idx_messages_fts ON cxq_messages
USING GIN (to_tsvector('english', content));
-- =============================================================================
-- KNOWLEDGE EXTRACTION (Tenant-Scoped)
-- =============================================================================
-- Decisions extracted from conversations
CREATE TABLE cxq_decisions (
id BIGSERIAL PRIMARY KEY,
organization_id UUID NOT NULL REFERENCES organizations(id),
message_id BIGINT REFERENCES cxq_messages(id) ON DELETE CASCADE,
decision_text TEXT NOT NULL,
context TEXT,
category TEXT,
confidence FLOAT,
-- Sharing controls
is_shared BOOLEAN DEFAULT FALSE, -- Visible to org members
is_template BOOLEAN DEFAULT FALSE, -- Promoted to org-wide pattern
extracted_at TIMESTAMPTZ DEFAULT NOW()
);
CREATE INDEX idx_decisions_org ON cxq_decisions(organization_id);
CREATE INDEX idx_decisions_fts ON cxq_decisions
USING GIN (to_tsvector('english', decision_text || ' ' || COALESCE(context, '')));
-- Code patterns extracted from responses
CREATE TABLE cxq_code_patterns (
id BIGSERIAL PRIMARY KEY,
organization_id UUID NOT NULL REFERENCES organizations(id),
message_id BIGINT REFERENCES cxq_messages(id) ON DELETE CASCADE,
language TEXT,
pattern_text TEXT NOT NULL,
pattern_hash TEXT NOT NULL,
-- Pattern metadata
pattern_type TEXT, -- function, class, config, etc.
quality_score FLOAT,
usage_count INTEGER DEFAULT 1,
-- Sharing
is_shared BOOLEAN DEFAULT FALSE,
is_template BOOLEAN DEFAULT FALSE,
extracted_at TIMESTAMPTZ DEFAULT NOW(),
UNIQUE(organization_id, pattern_hash)
);
CREATE INDEX idx_patterns_org ON cxq_code_patterns(organization_id);
CREATE INDEX idx_patterns_lang ON cxq_code_patterns(organization_id, language);
-- =============================================================================
-- CODEBASE INDEXING (Tenant-Scoped) - From ADR-CXQ-007
-- =============================================================================
-- Source files indexed per organization
CREATE TABLE cxq_source_files (
id BIGSERIAL PRIMARY KEY,
organization_id UUID NOT NULL REFERENCES organizations(id),
workspace_id UUID, -- Optional: for multi-workspace orgs
file_hash TEXT NOT NULL,
relative_path TEXT NOT NULL,
absolute_path TEXT,
language TEXT NOT NULL,
file_type TEXT, -- source, test, config, script
line_count INTEGER DEFAULT 0,
byte_size INTEGER DEFAULT 0,
content_hash TEXT,
last_modified TIMESTAMPTZ,
indexed_at TIMESTAMPTZ DEFAULT NOW(),
-- Parsed metadata
has_imports BOOLEAN DEFAULT FALSE,
has_exports BOOLEAN DEFAULT FALSE,
has_classes BOOLEAN DEFAULT FALSE,
has_functions BOOLEAN DEFAULT FALSE,
import_count INTEGER DEFAULT 0,
function_count INTEGER DEFAULT 0,
UNIQUE(organization_id, workspace_id, relative_path)
);
CREATE INDEX idx_files_org ON cxq_source_files(organization_id);
CREATE INDEX idx_files_workspace ON cxq_source_files(organization_id, workspace_id);
CREATE INDEX idx_files_lang ON cxq_source_files(organization_id, language);
-- Symbols extracted from source files
CREATE TABLE cxq_symbols (
id BIGSERIAL PRIMARY KEY,
organization_id UUID NOT NULL REFERENCES organizations(id),
file_id BIGINT NOT NULL REFERENCES cxq_source_files(id) ON DELETE CASCADE,
symbol_hash TEXT NOT NULL,
name TEXT NOT NULL,
qualified_name TEXT,
symbol_type TEXT NOT NULL, -- function, class, method, variable, etc.
-- Location
line_start INTEGER NOT NULL,
line_end INTEGER,
column_start INTEGER,
column_end INTEGER,
-- Metadata
visibility TEXT, -- public, private, protected
is_async BOOLEAN DEFAULT FALSE,
is_exported BOOLEAN DEFAULT FALSE,
-- Documentation
docstring TEXT,
signature TEXT,
return_type TEXT,
parameters JSONB,
-- Relationships
parent_symbol_id BIGINT REFERENCES cxq_symbols(id),
indexed_at TIMESTAMPTZ DEFAULT NOW(),
UNIQUE(organization_id, symbol_hash)
);
CREATE INDEX idx_symbols_org ON cxq_symbols(organization_id);
CREATE INDEX idx_symbols_file ON cxq_symbols(file_id);
CREATE INDEX idx_symbols_name ON cxq_symbols(organization_id, name);
CREATE INDEX idx_symbols_type ON cxq_symbols(organization_id, symbol_type);
CREATE INDEX idx_symbols_fts ON cxq_symbols
USING GIN (to_tsvector('english', name || ' ' || COALESCE(docstring, '')));
-- Cross-references (symbol usages)
CREATE TABLE cxq_cross_references (
id BIGSERIAL PRIMARY KEY,
organization_id UUID NOT NULL REFERENCES organizations(id),
symbol_id BIGINT NOT NULL REFERENCES cxq_symbols(id) ON DELETE CASCADE,
file_id BIGINT NOT NULL REFERENCES cxq_source_files(id) ON DELETE CASCADE,
line_number INTEGER NOT NULL,
column_number INTEGER,
ref_type TEXT NOT NULL, -- call, instantiation, read, write, type_annotation
context_line TEXT,
indexed_at TIMESTAMPTZ DEFAULT NOW()
);
CREATE INDEX idx_xref_org ON cxq_cross_references(organization_id);
CREATE INDEX idx_xref_symbol ON cxq_cross_references(symbol_id);
CREATE INDEX idx_xref_file ON cxq_cross_references(file_id);
-- =============================================================================
-- SEMANTIC EMBEDDINGS (Tenant-Scoped)
-- =============================================================================
CREATE TABLE cxq_embeddings (
id BIGSERIAL PRIMARY KEY,
organization_id UUID NOT NULL REFERENCES organizations(id),
source_type TEXT NOT NULL, -- message, decision, pattern, symbol, docstring
source_id BIGINT NOT NULL,
embedding vector(384) NOT NULL, -- all-MiniLM-L6-v2 dimensions
model TEXT NOT NULL DEFAULT 'all-MiniLM-L6-v2',
source_text TEXT, -- For regeneration
created_at TIMESTAMPTZ DEFAULT NOW()
);
CREATE INDEX idx_emb_org ON cxq_embeddings(organization_id);
CREATE INDEX idx_emb_source ON cxq_embeddings(source_type, source_id);
-- Vector similarity index (HNSW for fast approximate search)
CREATE INDEX idx_emb_vector ON cxq_embeddings
USING hnsw (embedding vector_cosine_ops);
-- =============================================================================
-- WORKSPACE REGISTRY (Tenant-Scoped)
-- =============================================================================
CREATE TABLE cxq_workspaces (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
organization_id UUID NOT NULL REFERENCES organizations(id),
name TEXT NOT NULL,
slug TEXT NOT NULL,
root_path TEXT,
-- Git integration
git_url TEXT,
git_branch TEXT,
last_sync TIMESTAMPTZ,
-- Configuration
config JSONB DEFAULT '{}',
-- Stats
file_count INTEGER DEFAULT 0,
symbol_count INTEGER DEFAULT 0,
last_indexed TIMESTAMPTZ,
created_at TIMESTAMPTZ DEFAULT NOW(),
updated_at TIMESTAMPTZ DEFAULT NOW(),
UNIQUE(organization_id, slug)
);
CREATE INDEX idx_workspace_org ON cxq_workspaces(organization_id);
-- =============================================================================
-- DOCUMENT INDEX (Tenant-Scoped)
-- =============================================================================
CREATE TABLE cxq_documents (
id BIGSERIAL PRIMARY KEY,
organization_id UUID NOT NULL REFERENCES organizations(id),
workspace_id UUID REFERENCES cxq_workspaces(id),
file_path TEXT NOT NULL,
file_hash TEXT NOT NULL,
title TEXT,
first_heading TEXT,
summary TEXT,
category TEXT,
subcategory TEXT,
tags TEXT[],
-- Frontmatter
frontmatter JSONB,
-- Content stats
word_count INTEGER,
line_count INTEGER,
-- Timestamps
file_modified TIMESTAMPTZ,
indexed_at TIMESTAMPTZ DEFAULT NOW(),
UNIQUE(organization_id, workspace_id, file_path)
);
CREATE INDEX idx_doc_org ON cxq_documents(organization_id);
CREATE INDEX idx_doc_workspace ON cxq_documents(organization_id, workspace_id);
CREATE INDEX idx_doc_category ON cxq_documents(organization_id, category);
CREATE INDEX idx_doc_fts ON cxq_documents
USING GIN (to_tsvector('english', title || ' ' || COALESCE(summary, '')));
Component Impact Analysis
1. Agents (119 Total)
Impact Level: HIGH
| Agent Category | Count | Multi-Tenant Requirement |
|---|---|---|
| Research | 15 | Tenant-scoped memory queries |
| Development | 25 | Tenant-isolated code access |
| DevOps | 18 | Tenant infrastructure isolation |
| Security | 12 | Tenant compliance boundaries |
| Documentation | 14 | Tenant document catalogs |
| Orchestration | 8 | Tenant workflow isolation |
| Other | 27 | Context-dependent |
Required Changes:
# Agent context injection
class TenantAwareAgent:
def __init__(self, organization_id: UUID, user_id: UUID):
self.org_id = organization_id
self.user_id = user_id
self.memory = TenantScopedMemory(organization_id)
def query_memory(self, query: str) -> List[Dict]:
# All queries automatically filtered by organization_id
return self.memory.search(query)
2. Commands (128 Total)
Impact Level: MEDIUM-HIGH
| Command Category | Multi-Tenant Impact |
|---|---|
| /cxq, /cx | Must query tenant-scoped database |
| /git-sync | Must sync to tenant repositories |
| /session-* | Must isolate session data |
| /doc-* | Must query tenant document catalog |
| /research-* | Must use tenant memory |
| /deploy-* | Must use tenant infrastructure |
Required Changes:
# commands/cxq.md - Updated for multi-tenant
## Multi-Tenant Support
All /cxq commands are automatically scoped to the current organization.
### Cross-Organization Search (Enterprise Only)
/cxq --all-orgs "query" # Requires super-admin privileges
/cxq --org ORG_ID "query" # Query specific organization
### Workspace Scoping
/cxq --workspace NAME "query" # Specific workspace
/cxq --all-workspaces "query" # All org workspaces
3. Skills (79 Total)
Impact Level: MEDIUM
Skills are stateless patterns but need tenant context when:
- Accessing memory system
- Reading/writing files
- Executing workflows
- Generating reports
4. Scripts (195 Total)
Impact Level: HIGH
| Script Category | Multi-Tenant Requirement |
|---|---|
| context-db.py | Must support PostgreSQL + tenant filtering |
| session-*.py | Must isolate sessions by organization |
| export-*.py | Must scope exports to organization |
| backup-*.py | Must backup per-tenant data separately |
| sync-*.py | Must sync to tenant repositories |
Required Changes:
# scripts/context-db.py - PostgreSQL variant
class MultiTenantContextDB:
def __init__(self, organization_id: UUID):
self.org_id = organization_id
self.conn = get_postgresql_connection()
def search(self, query: str) -> List[Dict]:
# Automatic tenant filtering
return self.conn.execute("""
SELECT * FROM cxq_messages
WHERE organization_id = %s
AND to_tsvector('english', content) @@ plainto_tsquery(%s)
ORDER BY message_timestamp DESC
""", [self.org_id, query])
5. Workflows (1,149 Total)
Impact Level: HIGH
All N8N-compatible workflows need tenant context:
# workflows/research/competitive-analysis.yaml
# UPDATED for multi-tenant
input:
organization_id: "{{org_id}}"
user_id: "{{user_id}}"
workspace_id: "{{workspace_id}}"
steps:
- agent: web-search-researcher
context:
organization_id: "{{organization_id}}"
memory_scope: organization # or 'user', 'workspace', 'global'
Distributed Intelligence Impact
Current: Single-Tenant Symlinks
repo/.coditect → submodules/core/coditect-core
repo/.claude → .coditect
Future: Tenant-Aware Intelligence
repo/.coditect → submodules/core/coditect-core
│
├── .coditect/tenant-config.json
│ {
│ "organization_id": "uuid",
│ "workspace_id": "uuid",
│ "sync_enabled": true,
│ "memory_mode": "hybrid" // local + cloud
│ }
│
└── .coditect/local-cache/
├── context.db (SQLite - offline cache)
└── sync-state.json (last sync timestamp)
Customer Project Isolation Model
┌─────────────────────────────────────────────────────────────────────┐
│ CUSTOMER PROJECT ISOLATION │
├─────────────────────────────────────────────────────────────────────┤
│ │
│ ORGANIZATION: Acme Corp (tenant_id: abc-123) │
│ ├── WORKSPACE: Main Product │
│ │ ├── Codebase: 15,000 files │
│ │ ├── Symbols: 500,000 │
│ │ ├── Memory: 50,000 messages │
│ │ └── Patterns: 10,000 code patterns │
│ │ │
│ ├── WORKSPACE: Mobile App │
│ │ ├── Codebase: 8,000 files │
│ │ └── Isolated from Main Product │
│ │ │
│ └── WORKSPACE: Infrastructure │
│ └── Shared patterns with Main Product (cross-workspace) │
│ │
│ ───────────────────────────────────────────────────────────────── │
│ │
│ ORGANIZATION: Beta Inc (tenant_id: def-456) │
│ ├── COMPLETE ISOLATION from Acme Corp │
│ ├── Cannot see Acme's code, patterns, or memory │
│ └── Own workspaces, own data, own encryption keys │
│ │
└─────────────────────────────────────────────────────────────────────┘
Implementation Phases
Phase 1: Database Foundation (Weeks 1-2)
- Create PostgreSQL schema in coditect-cloud-backend
- Add cxq_* tables with organization_id
- Implement django-multitenant integration
- Create migration scripts from SQLite → PostgreSQL
- Add pgvector extension for embeddings
Phase 2: API Layer (Weeks 3-4)
- Create /api/v1/cxq/* endpoints
- Implement tenant-scoped search
- Add workspace management endpoints
- Create sync API for local ↔ cloud
Phase 3: Local-Cloud Sync (Weeks 5-6)
- Implement SQLite → PostgreSQL sync client
- Add offline-first conflict resolution
- Create background sync daemon
- Add sync status indicators
Phase 4: Component Updates (Weeks 7-8)
- Update context-db.py for dual-mode (SQLite/PostgreSQL)
- Update all /cxq commands for tenant awareness
- Update agent memory access patterns
- Update workflow execution context
Phase 5: Testing & Migration (Weeks 9-10)
- Multi-tenant isolation testing
- Performance benchmarking
- Security audit
- Beta customer migration
Security Considerations
Data Isolation Requirements
| Level | Mechanism | Enforcement |
|---|---|---|
| Organization | organization_id FK | Database-level, django-multitenant |
| Workspace | workspace_id FK | Application-level filtering |
| User | user_id FK | Row-level security |
| Encryption | Per-tenant keys | Cloud KMS integration |
Compliance
- SOC 2 Type II - Audit logging, access controls
- GDPR - Data residency, right to deletion
- HIPAA - Encryption at rest, access logging (enterprise tier)
Success Metrics
| Metric | Target | Measurement |
|---|---|---|
| Tenant Isolation | 100% | No cross-tenant data leakage |
| Query Latency P95 | < 100ms | PostgreSQL monitoring |
| Sync Latency | < 5s | Local → cloud sync time |
| Offline Capability | 100% | Full function without network |
| Migration Success | 99%+ | Data integrity verification |
Decision
Primary Decision: Implement Hybrid Architecture
Rationale:
- Local-first - SQLite for offline development (essential for developers)
- Cloud-native - PostgreSQL for multi-tenant SaaS platform
- Sync layer - Bidirectional sync with conflict resolution
- Existing infrastructure - Leverages coditect-cloud-backend PostgreSQL + FoundationDB
Technology Stack
| Layer | Technology | Purpose |
|---|---|---|
| Local Storage | SQLite + FTS5 | Offline-first, single-user |
| Cloud Primary | PostgreSQL + pgvector | Multi-tenant, vector search |
| Distributed State | FoundationDB | Real-time sync, tenant isolation |
| Caching | Redis | Hot queries, session tracking |
| Future | Neo4j | Knowledge graphs (enterprise) |
Trigger Points for Review
- First enterprise customer onboarding
- 100+ concurrent users
- Cross-organization collaboration requirements
- Knowledge graph demand validation
References
- ADR-CXQ-007: Codebase Indexing Expansion
- ADR-CXQ-008: SQLite Scalability Analysis
- coditect-cloud-backend/tenants/models.py
- django-multitenant Documentation
- pgvector Documentation
Decision Status: PROPOSED Review Required By: Engineering Lead, Security Lead Target Implementation: Q1 2026