Skip to main content

ADR-CXQ-009: Multi-Tenant Platform Architecture for CODITECT Intelligence Layer

Status: Proposed Date: 2025-12-18 Deciders: Hal Casteel (Founder/CEO/CTO), CODITECT Core Team Technical Story: Transform coditect-core from single-user local tool to multi-tenant platform intelligence engine


Executive Summary

Context: CODITECT is evolving from a local development tool to a multi-tenant SaaS platform serving:

  1. Internal Development - Single-user local Claude Code enhancement
  2. Distributed Intelligence - Powers 57+ submodules via .coditect symlinks
  3. Platform Engine - Drives customer projects, products, and businesses
  4. Enterprise SaaS - Multi-tenant organization-based licensing

Decision: Implement PostgreSQL + FoundationDB architecture for the platform layer while maintaining SQLite for local-first offline development.


Current State Analysis

CODITECT-CORE Components (December 2025)

CategoryCountMulti-Tenant Impact
Agents119Need tenant-scoped memory
Commands128Need tenant context injection
Skills79Need workspace isolation
Scripts195Need tenant-aware configuration
Hooks37Need tenant event routing
Workflows1,149Need tenant-scoped execution
Total1,712All require tenant awareness

Current Storage (Single-Tenant)

context-storage/
├── context.db (835 MB) # SQLite - NO tenant_id
├── unified_messages.jsonl (124 MB) # Flat file - NO isolation
├── checkpoints/ (82 checkpoints) # NO organization scoping
├── sessions/ (23 sessions) # NO user isolation
└── exports-archive/ (503 files) # NO access control

Database Schema Gap Analysis

Current Tables (NO multi-tenancy):

-- CURRENT: No tenant isolation
CREATE TABLE messages (
id INTEGER PRIMARY KEY,
hash TEXT UNIQUE,
content TEXT,
role TEXT,
session_id TEXT, -- ⚠️ No organization_id
...
);

CREATE TABLE decisions (
id INTEGER PRIMARY KEY,
message_id INTEGER,
decision_text TEXT, -- ⚠️ No tenant isolation
...
);

CREATE TABLE code_patterns (
id INTEGER PRIMARY KEY,
pattern_text TEXT, -- ⚠️ Visible to all users
...
);

Platform Architecture Vision

Three-Tier Deployment Model

┌─────────────────────────────────────────────────────────────────────────┐
│ CODITECT PLATFORM TIERS │
├─────────────────────────────────────────────────────────────────────────┤
│ │
│ TIER 1: LOCAL DEVELOPMENT (Offline-First) │
│ ├── SQLite + FTS5 (context.db) │
│ ├── Single-user, single-machine │
│ ├── Full functionality without network │
│ └── Syncs to cloud when connected │
│ │
│ TIER 2: TEAM/WORKSPACE (Connected) │
│ ├── PostgreSQL (shared team database) │
│ ├── Multi-user within organization │
│ ├── Real-time collaboration │
│ └── Shared patterns, decisions, context │
│ │
│ TIER 3: ENTERPRISE PLATFORM (Multi-Tenant SaaS) │
│ ├── PostgreSQL + pgvector (primary storage) │
│ ├── FoundationDB (distributed state, tenant isolation) │
│ ├── Redis (caching, session tracking) │
│ ├── Neo4j/PostgreSQL Graph (knowledge graphs) - Future │
│ └── Organization-based complete isolation │
│ │
└─────────────────────────────────────────────────────────────────────────┘

Data Flow Architecture

┌──────────────────────────────────────────────────────────────────────────┐
│ MULTI-TENANT DATA FLOW │
├──────────────────────────────────────────────────────────────────────────┤
│ │
│ CUSTOMER A CUSTOMER B CUSTOMER C │
│ ┌─────────┐ ┌─────────┐ ┌─────────┐ │
│ │ Local │ │ Local │ │ Local │ │
│ │ SQLite │ │ SQLite │ │ SQLite │ │
│ └────┬────┘ └────┬────┘ └────┬────┘ │
│ │ │ │ │
│ │ Sync API │ Sync API │ │
│ ▼ ▼ ▼ │
│ ┌───────────────────────────────────────────────────────────────────┐ │
│ │ CODITECT CLOUD PLATFORM │ │
│ │ ┌─────────────────────────────────────────────────────────────┐ │ │
│ │ │ django-multitenant Middleware │ │ │
│ │ │ (Automatic tenant_id filtering on ALL queries) │ │ │
│ │ └─────────────────────────────────────────────────────────────┘ │ │
│ │ │ │ │
│ │ ┌──────────────────────────────────────────────────────────────┐ │ │
│ │ │ PostgreSQL + pgvector │ │ │
│ │ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │ │ │
│ │ │ │ Org A Data │ │ Org B Data │ │ Org C Data │ │ │ │
│ │ │ │ tenant_id=A │ │ tenant_id=B │ │ tenant_id=C │ │ │ │
│ │ │ └─────────────┘ └─────────────┘ └─────────────┘ │ │ │
│ │ └──────────────────────────────────────────────────────────────┘ │ │
│ │ │ │ │
│ │ ┌──────────────────────────────────────────────────────────────┐ │ │
│ │ │ FoundationDB │ │ │
│ │ │ (Distributed state, real-time sync, tenant isolation) │ │ │
│ │ └──────────────────────────────────────────────────────────────┘ │ │
│ └───────────────────────────────────────────────────────────────────┘ │
│ │
└──────────────────────────────────────────────────────────────────────────┘

Multi-Tenant Schema Design

Organization Model (Already Exists)

# tenants/models.py (coditect-cloud-backend)
class Organization(models.Model):
id = models.UUIDField(primary_key=True)
name = models.CharField(max_length=255)
slug = models.SlugField(unique=True)
plan = models.CharField(choices=['FREE', 'PRO', 'ENTERPRISE'])
stripe_customer_id = models.CharField(unique=True)
# ... billing fields

Context Database Schema (PostgreSQL)

-- =============================================================================
-- TENANT-AWARE CONTEXT DATABASE SCHEMA
-- Extends ADR-CXQ-007 with organization-based isolation
-- =============================================================================

-- Extension for vector search
CREATE EXTENSION IF NOT EXISTS vector;
CREATE EXTENSION IF NOT EXISTS pg_trgm; -- Fuzzy text search

-- =============================================================================
-- CORE TABLES (Tenant-Scoped)
-- =============================================================================

-- Messages (session history)
CREATE TABLE cxq_messages (
id BIGSERIAL PRIMARY KEY,
organization_id UUID NOT NULL REFERENCES organizations(id),
user_id UUID REFERENCES users(id),

-- Content
hash TEXT NOT NULL,
content TEXT NOT NULL,
role TEXT NOT NULL CHECK (role IN ('user', 'assistant', 'system')),

-- Source tracking
source_type TEXT,
source_file TEXT,
session_id TEXT,
checkpoint TEXT,

-- Timestamps
message_timestamp TIMESTAMPTZ,
extracted_at TIMESTAMPTZ DEFAULT NOW(),

-- Metadata
content_length INTEGER,
has_code BOOLEAN DEFAULT FALSE,
has_markdown BOOLEAN DEFAULT FALSE,
token_count INTEGER,

-- Uniqueness per tenant
UNIQUE(organization_id, hash)
);

CREATE INDEX idx_messages_org ON cxq_messages(organization_id);
CREATE INDEX idx_messages_session ON cxq_messages(organization_id, session_id);
CREATE INDEX idx_messages_role ON cxq_messages(organization_id, role);
CREATE INDEX idx_messages_ts ON cxq_messages(organization_id, message_timestamp);

-- Full-text search
CREATE INDEX idx_messages_fts ON cxq_messages
USING GIN (to_tsvector('english', content));

-- =============================================================================
-- KNOWLEDGE EXTRACTION (Tenant-Scoped)
-- =============================================================================

-- Decisions extracted from conversations
CREATE TABLE cxq_decisions (
id BIGSERIAL PRIMARY KEY,
organization_id UUID NOT NULL REFERENCES organizations(id),
message_id BIGINT REFERENCES cxq_messages(id) ON DELETE CASCADE,

decision_text TEXT NOT NULL,
context TEXT,
category TEXT,
confidence FLOAT,

-- Sharing controls
is_shared BOOLEAN DEFAULT FALSE, -- Visible to org members
is_template BOOLEAN DEFAULT FALSE, -- Promoted to org-wide pattern

extracted_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE INDEX idx_decisions_org ON cxq_decisions(organization_id);
CREATE INDEX idx_decisions_fts ON cxq_decisions
USING GIN (to_tsvector('english', decision_text || ' ' || COALESCE(context, '')));

-- Code patterns extracted from responses
CREATE TABLE cxq_code_patterns (
id BIGSERIAL PRIMARY KEY,
organization_id UUID NOT NULL REFERENCES organizations(id),
message_id BIGINT REFERENCES cxq_messages(id) ON DELETE CASCADE,

language TEXT,
pattern_text TEXT NOT NULL,
pattern_hash TEXT NOT NULL,

-- Pattern metadata
pattern_type TEXT, -- function, class, config, etc.
quality_score FLOAT,
usage_count INTEGER DEFAULT 1,

-- Sharing
is_shared BOOLEAN DEFAULT FALSE,
is_template BOOLEAN DEFAULT FALSE,

extracted_at TIMESTAMPTZ DEFAULT NOW(),

UNIQUE(organization_id, pattern_hash)
);

CREATE INDEX idx_patterns_org ON cxq_code_patterns(organization_id);
CREATE INDEX idx_patterns_lang ON cxq_code_patterns(organization_id, language);

-- =============================================================================
-- CODEBASE INDEXING (Tenant-Scoped) - From ADR-CXQ-007
-- =============================================================================

-- Source files indexed per organization
CREATE TABLE cxq_source_files (
id BIGSERIAL PRIMARY KEY,
organization_id UUID NOT NULL REFERENCES organizations(id),
workspace_id UUID, -- Optional: for multi-workspace orgs

file_hash TEXT NOT NULL,
relative_path TEXT NOT NULL,
absolute_path TEXT,

language TEXT NOT NULL,
file_type TEXT, -- source, test, config, script

line_count INTEGER DEFAULT 0,
byte_size INTEGER DEFAULT 0,
content_hash TEXT,
last_modified TIMESTAMPTZ,
indexed_at TIMESTAMPTZ DEFAULT NOW(),

-- Parsed metadata
has_imports BOOLEAN DEFAULT FALSE,
has_exports BOOLEAN DEFAULT FALSE,
has_classes BOOLEAN DEFAULT FALSE,
has_functions BOOLEAN DEFAULT FALSE,
import_count INTEGER DEFAULT 0,
function_count INTEGER DEFAULT 0,

UNIQUE(organization_id, workspace_id, relative_path)
);

CREATE INDEX idx_files_org ON cxq_source_files(organization_id);
CREATE INDEX idx_files_workspace ON cxq_source_files(organization_id, workspace_id);
CREATE INDEX idx_files_lang ON cxq_source_files(organization_id, language);

-- Symbols extracted from source files
CREATE TABLE cxq_symbols (
id BIGSERIAL PRIMARY KEY,
organization_id UUID NOT NULL REFERENCES organizations(id),
file_id BIGINT NOT NULL REFERENCES cxq_source_files(id) ON DELETE CASCADE,

symbol_hash TEXT NOT NULL,
name TEXT NOT NULL,
qualified_name TEXT,
symbol_type TEXT NOT NULL, -- function, class, method, variable, etc.

-- Location
line_start INTEGER NOT NULL,
line_end INTEGER,
column_start INTEGER,
column_end INTEGER,

-- Metadata
visibility TEXT, -- public, private, protected
is_async BOOLEAN DEFAULT FALSE,
is_exported BOOLEAN DEFAULT FALSE,

-- Documentation
docstring TEXT,
signature TEXT,
return_type TEXT,
parameters JSONB,

-- Relationships
parent_symbol_id BIGINT REFERENCES cxq_symbols(id),

indexed_at TIMESTAMPTZ DEFAULT NOW(),

UNIQUE(organization_id, symbol_hash)
);

CREATE INDEX idx_symbols_org ON cxq_symbols(organization_id);
CREATE INDEX idx_symbols_file ON cxq_symbols(file_id);
CREATE INDEX idx_symbols_name ON cxq_symbols(organization_id, name);
CREATE INDEX idx_symbols_type ON cxq_symbols(organization_id, symbol_type);
CREATE INDEX idx_symbols_fts ON cxq_symbols
USING GIN (to_tsvector('english', name || ' ' || COALESCE(docstring, '')));

-- Cross-references (symbol usages)
CREATE TABLE cxq_cross_references (
id BIGSERIAL PRIMARY KEY,
organization_id UUID NOT NULL REFERENCES organizations(id),

symbol_id BIGINT NOT NULL REFERENCES cxq_symbols(id) ON DELETE CASCADE,
file_id BIGINT NOT NULL REFERENCES cxq_source_files(id) ON DELETE CASCADE,

line_number INTEGER NOT NULL,
column_number INTEGER,
ref_type TEXT NOT NULL, -- call, instantiation, read, write, type_annotation
context_line TEXT,

indexed_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE INDEX idx_xref_org ON cxq_cross_references(organization_id);
CREATE INDEX idx_xref_symbol ON cxq_cross_references(symbol_id);
CREATE INDEX idx_xref_file ON cxq_cross_references(file_id);

-- =============================================================================
-- SEMANTIC EMBEDDINGS (Tenant-Scoped)
-- =============================================================================

CREATE TABLE cxq_embeddings (
id BIGSERIAL PRIMARY KEY,
organization_id UUID NOT NULL REFERENCES organizations(id),

source_type TEXT NOT NULL, -- message, decision, pattern, symbol, docstring
source_id BIGINT NOT NULL,

embedding vector(384) NOT NULL, -- all-MiniLM-L6-v2 dimensions
model TEXT NOT NULL DEFAULT 'all-MiniLM-L6-v2',

source_text TEXT, -- For regeneration

created_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE INDEX idx_emb_org ON cxq_embeddings(organization_id);
CREATE INDEX idx_emb_source ON cxq_embeddings(source_type, source_id);

-- Vector similarity index (HNSW for fast approximate search)
CREATE INDEX idx_emb_vector ON cxq_embeddings
USING hnsw (embedding vector_cosine_ops);

-- =============================================================================
-- WORKSPACE REGISTRY (Tenant-Scoped)
-- =============================================================================

CREATE TABLE cxq_workspaces (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
organization_id UUID NOT NULL REFERENCES organizations(id),

name TEXT NOT NULL,
slug TEXT NOT NULL,
root_path TEXT,

-- Git integration
git_url TEXT,
git_branch TEXT,
last_sync TIMESTAMPTZ,

-- Configuration
config JSONB DEFAULT '{}',

-- Stats
file_count INTEGER DEFAULT 0,
symbol_count INTEGER DEFAULT 0,
last_indexed TIMESTAMPTZ,

created_at TIMESTAMPTZ DEFAULT NOW(),
updated_at TIMESTAMPTZ DEFAULT NOW(),

UNIQUE(organization_id, slug)
);

CREATE INDEX idx_workspace_org ON cxq_workspaces(organization_id);

-- =============================================================================
-- DOCUMENT INDEX (Tenant-Scoped)
-- =============================================================================

CREATE TABLE cxq_documents (
id BIGSERIAL PRIMARY KEY,
organization_id UUID NOT NULL REFERENCES organizations(id),
workspace_id UUID REFERENCES cxq_workspaces(id),

file_path TEXT NOT NULL,
file_hash TEXT NOT NULL,

title TEXT,
first_heading TEXT,
summary TEXT,

category TEXT,
subcategory TEXT,
tags TEXT[],

-- Frontmatter
frontmatter JSONB,

-- Content stats
word_count INTEGER,
line_count INTEGER,

-- Timestamps
file_modified TIMESTAMPTZ,
indexed_at TIMESTAMPTZ DEFAULT NOW(),

UNIQUE(organization_id, workspace_id, file_path)
);

CREATE INDEX idx_doc_org ON cxq_documents(organization_id);
CREATE INDEX idx_doc_workspace ON cxq_documents(organization_id, workspace_id);
CREATE INDEX idx_doc_category ON cxq_documents(organization_id, category);
CREATE INDEX idx_doc_fts ON cxq_documents
USING GIN (to_tsvector('english', title || ' ' || COALESCE(summary, '')));

Component Impact Analysis

1. Agents (119 Total)

Impact Level: HIGH

Agent CategoryCountMulti-Tenant Requirement
Research15Tenant-scoped memory queries
Development25Tenant-isolated code access
DevOps18Tenant infrastructure isolation
Security12Tenant compliance boundaries
Documentation14Tenant document catalogs
Orchestration8Tenant workflow isolation
Other27Context-dependent

Required Changes:

# Agent context injection
class TenantAwareAgent:
def __init__(self, organization_id: UUID, user_id: UUID):
self.org_id = organization_id
self.user_id = user_id
self.memory = TenantScopedMemory(organization_id)

def query_memory(self, query: str) -> List[Dict]:
# All queries automatically filtered by organization_id
return self.memory.search(query)

2. Commands (128 Total)

Impact Level: MEDIUM-HIGH

Command CategoryMulti-Tenant Impact
/cxq, /cxMust query tenant-scoped database
/git-syncMust sync to tenant repositories
/session-*Must isolate session data
/doc-*Must query tenant document catalog
/research-*Must use tenant memory
/deploy-*Must use tenant infrastructure

Required Changes:

# commands/cxq.md - Updated for multi-tenant

## Multi-Tenant Support

All /cxq commands are automatically scoped to the current organization.

### Cross-Organization Search (Enterprise Only)
/cxq --all-orgs "query" # Requires super-admin privileges
/cxq --org ORG_ID "query" # Query specific organization

### Workspace Scoping
/cxq --workspace NAME "query" # Specific workspace
/cxq --all-workspaces "query" # All org workspaces

3. Skills (79 Total)

Impact Level: MEDIUM

Skills are stateless patterns but need tenant context when:

  • Accessing memory system
  • Reading/writing files
  • Executing workflows
  • Generating reports

4. Scripts (195 Total)

Impact Level: HIGH

Script CategoryMulti-Tenant Requirement
context-db.pyMust support PostgreSQL + tenant filtering
session-*.pyMust isolate sessions by organization
export-*.pyMust scope exports to organization
backup-*.pyMust backup per-tenant data separately
sync-*.pyMust sync to tenant repositories

Required Changes:

# scripts/context-db.py - PostgreSQL variant

class MultiTenantContextDB:
def __init__(self, organization_id: UUID):
self.org_id = organization_id
self.conn = get_postgresql_connection()

def search(self, query: str) -> List[Dict]:
# Automatic tenant filtering
return self.conn.execute("""
SELECT * FROM cxq_messages
WHERE organization_id = %s
AND to_tsvector('english', content) @@ plainto_tsquery(%s)
ORDER BY message_timestamp DESC
""", [self.org_id, query])

5. Workflows (1,149 Total)

Impact Level: HIGH

All N8N-compatible workflows need tenant context:

# workflows/research/competitive-analysis.yaml
# UPDATED for multi-tenant

input:
organization_id: "{{org_id}}"
user_id: "{{user_id}}"
workspace_id: "{{workspace_id}}"

steps:
- agent: web-search-researcher
context:
organization_id: "{{organization_id}}"
memory_scope: organization # or 'user', 'workspace', 'global'

Distributed Intelligence Impact

repo/.coditect → submodules/core/coditect-core
repo/.claude → .coditect

Future: Tenant-Aware Intelligence

repo/.coditect → submodules/core/coditect-core

├── .coditect/tenant-config.json
│ {
│ "organization_id": "uuid",
│ "workspace_id": "uuid",
│ "sync_enabled": true,
│ "memory_mode": "hybrid" // local + cloud
│ }

└── .coditect/local-cache/
├── context.db (SQLite - offline cache)
└── sync-state.json (last sync timestamp)

Customer Project Isolation Model

┌─────────────────────────────────────────────────────────────────────┐
│ CUSTOMER PROJECT ISOLATION │
├─────────────────────────────────────────────────────────────────────┤
│ │
│ ORGANIZATION: Acme Corp (tenant_id: abc-123) │
│ ├── WORKSPACE: Main Product │
│ │ ├── Codebase: 15,000 files │
│ │ ├── Symbols: 500,000 │
│ │ ├── Memory: 50,000 messages │
│ │ └── Patterns: 10,000 code patterns │
│ │ │
│ ├── WORKSPACE: Mobile App │
│ │ ├── Codebase: 8,000 files │
│ │ └── Isolated from Main Product │
│ │ │
│ └── WORKSPACE: Infrastructure │
│ └── Shared patterns with Main Product (cross-workspace) │
│ │
│ ───────────────────────────────────────────────────────────────── │
│ │
│ ORGANIZATION: Beta Inc (tenant_id: def-456) │
│ ├── COMPLETE ISOLATION from Acme Corp │
│ ├── Cannot see Acme's code, patterns, or memory │
│ └── Own workspaces, own data, own encryption keys │
│ │
└─────────────────────────────────────────────────────────────────────┘

Implementation Phases

Phase 1: Database Foundation (Weeks 1-2)

  • Create PostgreSQL schema in coditect-cloud-backend
  • Add cxq_* tables with organization_id
  • Implement django-multitenant integration
  • Create migration scripts from SQLite → PostgreSQL
  • Add pgvector extension for embeddings

Phase 2: API Layer (Weeks 3-4)

  • Create /api/v1/cxq/* endpoints
  • Implement tenant-scoped search
  • Add workspace management endpoints
  • Create sync API for local ↔ cloud

Phase 3: Local-Cloud Sync (Weeks 5-6)

  • Implement SQLite → PostgreSQL sync client
  • Add offline-first conflict resolution
  • Create background sync daemon
  • Add sync status indicators

Phase 4: Component Updates (Weeks 7-8)

  • Update context-db.py for dual-mode (SQLite/PostgreSQL)
  • Update all /cxq commands for tenant awareness
  • Update agent memory access patterns
  • Update workflow execution context

Phase 5: Testing & Migration (Weeks 9-10)

  • Multi-tenant isolation testing
  • Performance benchmarking
  • Security audit
  • Beta customer migration

Security Considerations

Data Isolation Requirements

LevelMechanismEnforcement
Organizationorganization_id FKDatabase-level, django-multitenant
Workspaceworkspace_id FKApplication-level filtering
Useruser_id FKRow-level security
EncryptionPer-tenant keysCloud KMS integration

Compliance

  • SOC 2 Type II - Audit logging, access controls
  • GDPR - Data residency, right to deletion
  • HIPAA - Encryption at rest, access logging (enterprise tier)

Success Metrics

MetricTargetMeasurement
Tenant Isolation100%No cross-tenant data leakage
Query Latency P95< 100msPostgreSQL monitoring
Sync Latency< 5sLocal → cloud sync time
Offline Capability100%Full function without network
Migration Success99%+Data integrity verification

Decision

Primary Decision: Implement Hybrid Architecture

Rationale:

  1. Local-first - SQLite for offline development (essential for developers)
  2. Cloud-native - PostgreSQL for multi-tenant SaaS platform
  3. Sync layer - Bidirectional sync with conflict resolution
  4. Existing infrastructure - Leverages coditect-cloud-backend PostgreSQL + FoundationDB

Technology Stack

LayerTechnologyPurpose
Local StorageSQLite + FTS5Offline-first, single-user
Cloud PrimaryPostgreSQL + pgvectorMulti-tenant, vector search
Distributed StateFoundationDBReal-time sync, tenant isolation
CachingRedisHot queries, session tracking
FutureNeo4jKnowledge graphs (enterprise)

Trigger Points for Review

  • First enterprise customer onboarding
  • 100+ concurrent users
  • Cross-organization collaboration requirements
  • Knowledge graph demand validation

References


Decision Status: PROPOSED Review Required By: Engineering Lead, Security Lead Target Implementation: Q1 2026