Skip to main content

ADR-119: Multi-Tenant Local Database Schema

Status

ACCEPTED - January 26, 2026

Context

ADR-118 established the four-tier database architecture, separating:

  • Tier 2: org.db (irreplaceable organization data)
  • Tier 3: sessions.db (regenerable session data)

However, the current local schema lacks multi-tenant hierarchy fields:

-- Current org.db schema (MISSING multi-tenant fields)
CREATE TABLE skill_learnings (
id INTEGER PRIMARY KEY,
session_id TEXT NOT NULL,
skill_name TEXT NOT NULL,
-- NO tenant_id, user_id, team_id, project_id
);

CREATE TABLE decisions (
id INTEGER PRIMARY KEY,
project_path TEXT, -- Only has project_path, not project_id
decision_type TEXT NOT NULL,
-- NO tenant_id, user_id, team_id
);

The Django cloud models have proper hierarchy:

  • Tenant (organization) - via django-multitenant
  • Organization - foreign key relationships
  • User - user ownership
  • Project - project scoping

Critical Gap: Local data cannot sync to cloud without matching hierarchy.

Decision

1. Add Multi-Tenant Columns to Local Schema

All Tier 2 (org.db) and Tier 3 (sessions.db) tables must include:

-- REQUIRED columns for all user-scoped tables
tenant_id TEXT NOT NULL, -- Organization UUID (matches Django Tenant)
user_id TEXT NOT NULL, -- User UUID (matches Django User)

-- OPTIONAL columns for finer scoping
team_id TEXT, -- Team UUID (for team-level isolation)
project_id TEXT, -- Project UUID (for project-scoped data)

2. Updated org.db Schema

-- org.db Schema (ADR-119 Multi-Tenant)
CREATE TABLE skill_learnings (
id INTEGER PRIMARY KEY AUTOINCREMENT,

-- Multi-tenant hierarchy (REQUIRED)
tenant_id TEXT NOT NULL, -- Organization UUID
user_id TEXT NOT NULL, -- User UUID
team_id TEXT, -- Optional team scope
project_id TEXT, -- Optional project scope

-- Existing fields
session_id TEXT NOT NULL,
skill_name TEXT NOT NULL,
outcome TEXT,
effectiveness_score INTEGER,
errors TEXT,
analyzed_at TEXT NOT NULL,

-- Cloud sync metadata
cloud_id TEXT UNIQUE, -- Cloud record UUID
synced_at TEXT, -- Last sync timestamp
sync_status TEXT DEFAULT 'pending', -- pending, synced, conflict

UNIQUE(session_id, skill_name, tenant_id)
);

CREATE INDEX idx_skill_learnings_tenant ON skill_learnings(tenant_id);
CREATE INDEX idx_skill_learnings_user ON skill_learnings(tenant_id, user_id);
CREATE INDEX idx_skill_learnings_project ON skill_learnings(tenant_id, project_id);
CREATE INDEX idx_skill_learnings_sync ON skill_learnings(sync_status, synced_at);

CREATE TABLE decisions (
id INTEGER PRIMARY KEY AUTOINCREMENT,

-- Multi-tenant hierarchy (REQUIRED)
tenant_id TEXT NOT NULL,
user_id TEXT NOT NULL,
team_id TEXT,
project_id TEXT,

-- Existing fields
message_id INTEGER,
project_path TEXT, -- Retained for backward compatibility
decision_type TEXT NOT NULL,
decision TEXT NOT NULL,
rationale TEXT,
alternatives_considered TEXT,
confidence REAL DEFAULT 0.5,
tags TEXT,
created_at TEXT DEFAULT CURRENT_TIMESTAMP,

-- Cloud sync metadata
cloud_id TEXT UNIQUE,
synced_at TEXT,
sync_status TEXT DEFAULT 'pending',

FOREIGN KEY (message_id) REFERENCES messages(id) ON DELETE SET NULL
);

CREATE INDEX idx_decisions_tenant ON decisions(tenant_id);
CREATE INDEX idx_decisions_project ON decisions(tenant_id, project_id);
CREATE INDEX idx_decisions_sync ON decisions(sync_status, synced_at);

CREATE TABLE error_solutions (
id INTEGER PRIMARY KEY AUTOINCREMENT,

-- Multi-tenant hierarchy
tenant_id TEXT NOT NULL,
user_id TEXT NOT NULL,
project_id TEXT,

-- Existing fields
error_hash TEXT UNIQUE,
error_type TEXT NOT NULL,
error_signature TEXT NOT NULL,
error_context TEXT,
solution TEXT NOT NULL,
solution_code TEXT,
language TEXT,
success_count INTEGER DEFAULT 1,
failure_count INTEGER DEFAULT 0,
last_used TEXT DEFAULT CURRENT_TIMESTAMP,
created_at TEXT DEFAULT CURRENT_TIMESTAMP,

-- Cloud sync metadata
cloud_id TEXT UNIQUE,
synced_at TEXT,
sync_status TEXT DEFAULT 'pending'
);

CREATE INDEX idx_error_solutions_tenant ON error_solutions(tenant_id);

3. Cloud-First Architecture

┌─────────────────────────────────────────────────────────────────────────────┐
│ CODITECT CLOUD-FIRST ARCHITECTURE │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ CLOUD (Source of Truth) │
│ ┌───────────────────────────────────────────────────────────────────────┐ │
│ │ PostgreSQL + django-multitenant │ │
│ │ │ │
│ │ Tenant (Organization) │ │
│ │ └── Team (optional) │ │
│ │ └── User │ │
│ │ └── Project │ │
│ │ └── skill_learnings, decisions, error_solutions │ │
│ │ │ │
│ │ API: POST /api/v1/context/sync │ │
│ │ GET /api/v1/context/pull?since=<cursor> │ │
│ │ POST /api/v1/context/push (batch upsert) │ │
│ └───────────────────────────────────────────────────────────────────────┘ │
│ │ │
│ │ Near-real-time replication │
│ │ (WebSocket or long-polling) │
│ ▼ │
│ LOCAL (Cache + Offline) │
│ ┌───────────────────────────────────────────────────────────────────────┐ │
│ │ SQLite (org.db, sessions.db) │ │
│ │ │ │
│ │ ┌─────────────────────────────────────────────────────────────────┐ │ │
│ │ │ cloud_sync_state │ │ │
│ │ │ ├── last_sync_cursor -- Cursor for incremental pull │ │ │
│ │ │ ├── last_push_at -- Last successful push │ │ │
│ │ │ ├── pending_push_count -- Records awaiting sync │ │ │
│ │ │ └── conflict_count -- Conflicts requiring resolution │ │ │
│ │ └─────────────────────────────────────────────────────────────────┘ │ │
│ │ │ │
│ │ Sync binary: codi-cloud-sync (Rust) │ │
│ │ - Location: ~/.coditect/bin/codi-cloud-sync │ │
│ │ - Push local changes every 30s (or on write) │ │
│ │ - Pull cloud changes every 60s (or via WebSocket) │ │
│ │ - Conflict resolution: cloud wins (with local backup) │ │
│ └───────────────────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘

4. Identity Resolution

On session start, local client resolves identity:

# Local identity resolution
def resolve_identity():
"""Get tenant_id, user_id from cloud or license key."""

# 1. Check cached identity
if identity := get_cached_identity():
return identity

# 2. Authenticate with cloud
token = authenticate_with_cloud() # OAuth or license key

# 3. Extract identity from JWT
identity = {
'tenant_id': token['tenant_id'],
'user_id': token['user_id'],
'team_id': token.get('team_id'),
'default_project_id': token.get('default_project_id'),
}

# 4. Cache for offline use
cache_identity(identity)

return identity

5. Sync Protocol

# Push local changes to cloud
def push_to_cloud():
"""Push pending local changes to cloud."""

pending = db.execute("""
SELECT * FROM skill_learnings
WHERE sync_status = 'pending'
ORDER BY analyzed_at
LIMIT 100
""")

response = cloud_api.post('/api/v1/context/push', {
'records': [serialize(r) for r in pending],
'table': 'skill_learnings',
})

for record in response['synced']:
db.execute("""
UPDATE skill_learnings
SET cloud_id = ?, synced_at = ?, sync_status = 'synced'
WHERE id = ?
""", (record['cloud_id'], now(), record['local_id']))

# Pull cloud changes to local
def pull_from_cloud():
"""Pull new/updated records from cloud."""

cursor = get_sync_cursor()

response = cloud_api.get(f'/api/v1/context/pull?since={cursor}')

for record in response['records']:
# Upsert by cloud_id
db.execute("""
INSERT OR REPLACE INTO skill_learnings
(cloud_id, tenant_id, user_id, ..., sync_status, synced_at)
VALUES (?, ?, ?, ..., 'synced', ?)
""", (...))

save_sync_cursor(response['next_cursor'])

Migration

Phase 1: Schema Migration

# Add columns to existing tables
ALTER TABLE skill_learnings ADD COLUMN tenant_id TEXT;
ALTER TABLE skill_learnings ADD COLUMN user_id TEXT;
ALTER TABLE skill_learnings ADD COLUMN team_id TEXT;
ALTER TABLE skill_learnings ADD COLUMN project_id TEXT;
ALTER TABLE skill_learnings ADD COLUMN cloud_id TEXT;
ALTER TABLE skill_learnings ADD COLUMN synced_at TEXT;
ALTER TABLE skill_learnings ADD COLUMN sync_status TEXT DEFAULT 'pending';

-- Set default tenant/user from machine identity
UPDATE skill_learnings
SET tenant_id = (SELECT value FROM config WHERE key = 'tenant_id'),
user_id = (SELECT value FROM config WHERE key = 'user_id'),
sync_status = 'pending'
WHERE tenant_id IS NULL;

Phase 2: Backfill from Cloud

For users who already have cloud data:

python3 scripts/cloud-sync.py --backfill --from-cloud

Phase 3: Enable Real-time Sync

# Start sync daemon
codi-cloud-sync daemon --interval 30

# Or individual operations
codi-cloud-sync push --limit 100
codi-cloud-sync pull --limit 100
codi-cloud-sync status
codi-cloud-sync verify

Consequences

Positive

  1. Cloud-first: Cloud is single source of truth
  2. Offline capable: Local cache works when disconnected
  3. Multi-tenant: Proper isolation by tenant/user/team/project
  4. Syncable: Data can sync between devices and cloud
  5. Auditable: Cloud has complete history for compliance

Negative

  1. Complexity: Sync logic adds complexity
  2. Latency: Initial sync may be slow for large datasets
  3. Conflicts: Need conflict resolution strategy
  4. Identity required: Must authenticate to get tenant_id

Mitigations

  1. Incremental sync: Use cursors, batch operations
  2. Conflict resolution: Cloud wins, local backup preserved
  3. Offline grace period: 72 hours of offline operation
  4. Background sync: Non-blocking, async operations

Implementation Tasks

  1. Add multi-tenant columns to org.db schema
  2. Add multi-tenant columns to sessions.db schema
  3. Create identity resolution module (tools/cloud-sync/src/identity.rs)
  4. Create cloud sync binary (codi-cloud-sync - Rust)
  5. Add sync status tracking table
  6. Implement push/pull API endpoints (Django)
  7. Create conflict resolution logic
  8. Write migration script for existing data (scripts/migrate-to-four-tier-db.py)
  9. Add launchd service for sync daemon

Sync Binary Implementation

The codi-cloud-sync Rust binary implements the local sync engine:

Location: tools/cloud-sync/

Structure:

tools/cloud-sync/
├── Cargo.toml
├── src/
│ ├── main.rs # CLI entry point (clap)
│ ├── config.rs # Configuration loading
│ ├── database.rs # SQLite operations
│ ├── identity.rs # License key → JWT authentication
│ └── sync.rs # Sync engine (push/pull/daemon)

Authentication Flow (ADR-011, ADR-003 Addendum):

License Key → POST /api/v1/auth/license → JWT (cached)

tenant_id, user_id, team_id claims

Bearer token for sync API

Commands:

CommandDescription
codi-cloud-sync daemonRun background sync daemon
codi-cloud-sync pushPush pending local changes
codi-cloud-sync pullPull cloud changes
codi-cloud-sync statusShow sync status
codi-cloud-sync authAuthenticate with license key
codi-cloud-sync verifyVerify sync state

Related ADRs: