CODITECT Database Schema Reference

Overview

CODITECT uses a four-tier database architecture (ADR-118) with multi-tenant support (ADR-119):

┌─────────────────────────────────────────────────────────────────┐
│                    FOUR-TIER ARCHITECTURE                        │
├─────────────────────────────────────────────────────────────────┤
│  TIER 1: platform.db (Read-Only, Regenerable)                   │
│  └── Component registry: agents, skills, commands, scripts      │
│  └── Location: ~/.coditect/framework-data/                      │
├─────────────────────────────────────────────────────────────────┤
│  TIER 2: org.db (CRITICAL - Daily Backup Required)              │
│  └── IRREPLACEABLE: skill_learnings, decisions, error_solutions │
│  └── Location: ~/PROJECTS/.coditect-data/context-storage/       │
├─────────────────────────────────────────────────────────────────┤
│  TIER 3: sessions.db (Regenerable from JSONL)                   │
│  └── Messages, tool_analytics, token_economics, code_patterns   │
│  └── Location: ~/PROJECTS/.coditect-data/context-storage/       │
├─────────────────────────────────────────────────────────────────┤
│  TIER 4: project.db (Project-Specific)                          │
│  └── Project-local data, work items, sprint tracking            │
│  └── Location: <project>/.coditect-data/                        │
└─────────────────────────────────────────────────────────────────┘

Quick Reference

Tier	Database	Backup	Size	Tables	Primary Use
1	platform.db	NOT REQUIRED	101 MB	15	Component metadata
2	org.db	DAILY	416 MB	28	Critical knowledge + knowledge graph
3	sessions.db	Optional	38 GB	96	Session data + context graphs
4	projects.db	With code	9.6 MB	12	Project-specific
-	messaging.db	Optional	1.2 MB	9	Inter-session coordination (ADR-160)
-	call_graph.db	NOT REQUIRED	0 B	0	Code call graph (MCP server)

Database Locations

# Tier 1: Platform (framework installation)
~/.coditect/framework-data/platform.db

# Tier 2 & 3: User data
~/PROJECTS/.coditect-data/context-storage/org.db      # CRITICAL
~/PROJECTS/.coditect-data/context-storage/sessions.db # Regenerable

# Tier 4: Project-specific
<project>/.coditect-data/project.db

# Inter-session coordination (ADR-160)
~/PROJECTS/.coditect-data/context-storage/messaging.db  # Session bus

# Code analysis (MCP server)
~/PROJECTS/.coditect-data/context-storage/call_graph.db  # Call graph index

# DEPRECATED (do not use in new code)
~/PROJECTS/.coditect-data/context-storage/context.db  # → migrate to org.db + sessions.db

Tier 2: org.db (CRITICAL)

⚠️ BACKUP DAILY - Contains irreplaceable organizational knowledge

Tables

Table	Rows	Description	Status
skill_learnings	758K+	AI learning patterns	IRREPLACEABLE
decisions	1.8K+	Architectural decisions	IRREPLACEABLE
error_solutions	475+	Error-solution pairs	IRREPLACEABLE
kg_nodes	16.9K+	Knowledge graph nodes (ADR-151)	IRREPLACEABLE
kg_edges	7.1K+	Knowledge graph edges (ADR-151)	IRREPLACEABLE
kg_nodes_fts	-	FTS5 index for kg_nodes	Auto-synced
cloud_sync_state	-	Cloud sync tracking	Operational
local_identity	-	Tenant/user identity	Operational

Key Schemas

-- skill_learnings: Accumulated AI patterns
CREATE TABLE skill_learnings (
    id INTEGER PRIMARY KEY,
    session_id TEXT NOT NULL,
    skill_name TEXT NOT NULL,
    outcome TEXT,
    effectiveness_score INTEGER,
    errors TEXT,
    analyzed_at TEXT NOT NULL,
    -- Multi-tenant (ADR-119)
    tenant_id TEXT, user_id TEXT, team_id TEXT, project_id TEXT,
    cloud_id TEXT, synced_at TEXT, sync_status TEXT DEFAULT 'pending',
    UNIQUE(session_id, skill_name)
);

-- decisions: Architectural decisions
CREATE TABLE decisions (
    id INTEGER PRIMARY KEY,
    message_id INTEGER,
    project_path TEXT,
    decision_type TEXT NOT NULL,
    decision TEXT NOT NULL,
    rationale TEXT,
    alternatives_considered TEXT,
    confidence REAL DEFAULT 0.5,
    tags TEXT,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP,
    -- Multi-tenant (ADR-119)
    tenant_id TEXT, user_id TEXT, team_id TEXT, project_id TEXT,
    cloud_id TEXT, synced_at TEXT, sync_status TEXT DEFAULT 'pending'
);

-- error_solutions: Known fixes
CREATE TABLE error_solutions (
    id INTEGER PRIMARY KEY,
    error_hash TEXT UNIQUE,
    error_type TEXT NOT NULL,
    error_signature TEXT NOT NULL,
    solution TEXT NOT NULL,
    success_count INTEGER DEFAULT 1,
    failure_count INTEGER DEFAULT 0,
    -- Multi-tenant (ADR-119)
    tenant_id TEXT, user_id TEXT, team_id TEXT, project_id TEXT,
    cloud_id TEXT, synced_at TEXT, sync_status TEXT DEFAULT 'pending'
);

-- kg_nodes: Knowledge graph entities (ADR-151)
CREATE TABLE kg_nodes (
    id TEXT PRIMARY KEY,                      -- "{type}:{uuid}" e.g., "decision:42"
    node_type TEXT NOT NULL,                  -- component, session, decision, error_solution,
                                              -- skill_learning, file, function, track, adr
    subtype TEXT,                             -- component: agent/skill/command/script/hook
    name TEXT NOT NULL,                       -- Display name
    properties TEXT,                          -- JSON properties bag (type-specific)
    embedding BLOB,                           -- 1536-dim float32 vector
    tenant_id TEXT, project_id TEXT,
    created_at TEXT NOT NULL, updated_at TEXT NOT NULL,
    source_table TEXT, source_id TEXT         -- Provenance tracking
);
-- Indexes: node_type, subtype, tenant, project, source, updated_at
-- FTS5: kg_nodes_fts(name, properties)

-- kg_edges: Knowledge graph relationships (ADR-151)
CREATE TABLE kg_edges (
    id TEXT PRIMARY KEY,
    edge_type TEXT NOT NULL,                  -- INVOKES, PRODUCES, SOLVES, BELONGS_TO,
                                              -- DEFINES, SIMILAR_TO, REFERENCES, CALLS,
                                              -- USES, GOVERNED_BY, CREATED_BY
    from_node TEXT NOT NULL,                  -- Source kg_nodes.id
    to_node TEXT NOT NULL,                    -- Target kg_nodes.id
    properties TEXT,                          -- JSON: weight, context, metadata
    tenant_id TEXT,
    created_at TEXT NOT NULL,
    UNIQUE(from_node, to_node, edge_type),
    FOREIGN KEY (from_node) REFERENCES kg_nodes(id) ON DELETE CASCADE,
    FOREIGN KEY (to_node) REFERENCES kg_nodes(id) ON DELETE CASCADE
);
-- Indexes: edge_type, from_node, to_node, tenant, (from_node, to_node)

Knowledge Graph Statistics (ADR-151)

Node types (16,932 total):

Type	Count	Source
function	7,066	Call graph analysis
component	5,411	Component indexer
decision	1,856	/cx extraction
skill_learning	1,029	Session retrospectives
file	577	Project indexer
error_solution	475	/cx extraction
session	449	Session tracking
track	37	PILOT tracks A-AK
adr	32	ADR documents

Edge types (7,118 total):

Type	Count	Meaning
INVOKES	3,363	Component invokes another
CALLS	2,565	Function calls function
USES	493	Entity uses another
SOLVES	475	Error solution resolves error
BELONGS_TO	220	Membership relationship
DEFINES	1	Definition relationship
REFERENCES	1	Reference relationship

Tier 3: sessions.db (Regenerable)

Regenerable from unified_messages.jsonl - backup optional

Tables

Table	Rows	Description
messages	251K+	Session messages
tool_analytics	8.2M+	Tool usage analytics
token_economics	17.8M+	Token usage and costs
session_insights	5.7K+	Session-level insights
code_patterns	10K+	Extracted code patterns
embeddings	143K+	Vector embeddings
context_graphs	4+	Stored context graph projections (ADR-151)
context_graph_nodes	56+	Nodes included in context graphs
context_graph_usage	-	Graph usage tracking and feedback

Key Schemas

-- messages: Session content
CREATE TABLE messages (
    id INTEGER PRIMARY KEY,
    hash TEXT UNIQUE NOT NULL,  -- SHA256 for dedup
    content TEXT NOT NULL,
    role TEXT NOT NULL,         -- user, assistant, system
    session_id TEXT,
    timestamp TEXT,
    has_code BOOLEAN,
    tool_use TEXT,              -- JSON tool calls
    tool_result TEXT,           -- JSON tool results
    -- Multi-tenant (ADR-119)
    tenant_id TEXT, user_id TEXT, team_id TEXT, project_id TEXT,
    cloud_id TEXT, synced_at TEXT, sync_status TEXT DEFAULT 'pending'
);

-- session_log_entries: Indexed session log entries (J.32)
CREATE TABLE session_log_entries (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    hash TEXT UNIQUE NOT NULL,           -- SHA256(source_file:entry_line_number)
    source_file TEXT NOT NULL,           -- Full path to SESSION-LOG-YYYY-MM-DD.md
    source_file_mtime REAL,              -- File mtime for change detection
    project_id TEXT,                     -- From directory path (e.g., "PILOT")
    machine_uuid TEXT,                   -- From directory path
    log_date TEXT NOT NULL,              -- YYYY-MM-DD from filename
    entry_timestamp TEXT,                -- ISO 8601 from heading (nullable)
    task_ids TEXT,                       -- Comma-separated (e.g., "H.13.9,H.0")
    author TEXT,                         -- e.g., "Claude (Opus 4.6)"
    heading TEXT NOT NULL,               -- Entry heading text
    content TEXT NOT NULL,               -- Full markdown content
    content_length INTEGER DEFAULT 0,
    files_modified TEXT,                 -- JSON array of files mentioned
    has_tables INTEGER DEFAULT 0,
    has_code INTEGER DEFAULT 0,
    session_uuid TEXT,                   -- From YAML frontmatter sessions[].id
    extracted_at TEXT NOT NULL
);

CREATE INDEX idx_sle_project ON session_log_entries(project_id);
CREATE INDEX idx_sle_date ON session_log_entries(log_date);
CREATE INDEX idx_sle_task ON session_log_entries(task_ids);
CREATE INDEX idx_sle_source ON session_log_entries(source_file);

-- tool_analytics: Tool execution tracking
CREATE TABLE tool_analytics (
    id INTEGER PRIMARY KEY,
    session_id TEXT NOT NULL,
    tool_name TEXT NOT NULL,
    status TEXT NOT NULL,       -- success, failed, timeout
    execution_time_ms INTEGER,
    error_message TEXT,
    -- Multi-tenant (ADR-119)
    tenant_id TEXT, user_id TEXT, team_id TEXT, project_id TEXT
);

-- token_economics: Cost tracking
CREATE TABLE token_economics (
    id INTEGER PRIMARY KEY,
    session_id TEXT NOT NULL,
    model_name TEXT NOT NULL,   -- opus, sonnet, haiku
    token_input INTEGER DEFAULT 0,
    token_output INTEGER DEFAULT 0,
    cost_total_usd REAL,
    -- Multi-tenant (ADR-119)
    tenant_id TEXT, user_id TEXT, team_id TEXT, project_id TEXT
);

-- context_graphs: Stored graph projections (ADR-151)
CREATE TABLE context_graphs (
    id TEXT PRIMARY KEY,                      -- "cg:{timestamp}:{hash}"
    name TEXT,                                -- Human-readable name
    task_description TEXT,                    -- Task/query that prompted this graph
    seed_nodes TEXT,                          -- JSON array of seed node IDs
    seed_strategy TEXT DEFAULT 'anchor',      -- 'anchor', 'semantic', 'policy_first'
    token_budget INTEGER DEFAULT 4000,
    max_depth INTEGER DEFAULT 3,
    max_nodes INTEGER DEFAULT 128,
    relevance_threshold REAL DEFAULT 0.5,
    node_count INTEGER DEFAULT 0,
    edge_count INTEGER DEFAULT 0,
    tokens_estimated INTEGER DEFAULT 0,
    tenant_id TEXT, project_id TEXT, session_id TEXT,
    created_at TEXT NOT NULL, expires_at TEXT,
    build_time_ms INTEGER, builder_version TEXT DEFAULT '1.0.0',
    policies_applied TEXT,  -- J.25.4.4: JSON array of governance policies applied
    phi_node_count INTEGER DEFAULT 0  -- J.25.4.3: Count of nodes with potential PHI indicators
);

-- context_graph_nodes: Nodes included in a context graph
CREATE TABLE context_graph_nodes (
    context_graph_id TEXT NOT NULL,           -- References context_graphs.id
    node_id TEXT NOT NULL,                    -- References kg_nodes.id (in org.db)
    relevance_score REAL DEFAULT 1.0,         -- 0.0-1.0 relevance to task
    depth INTEGER DEFAULT 0,                  -- BFS depth from seed
    is_seed INTEGER DEFAULT 0,
    include_properties INTEGER DEFAULT 1,
    token_estimate INTEGER DEFAULT 0,
    PRIMARY KEY (context_graph_id, node_id),
    FOREIGN KEY (context_graph_id) REFERENCES context_graphs(id) ON DELETE CASCADE
);

-- context_graph_usage: Usage tracking and feedback loop
CREATE TABLE context_graph_usage (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    context_graph_id TEXT NOT NULL,
    session_id TEXT, message_id TEXT, agent_id TEXT,
    tokens_used INTEGER, retrieval_time_ms INTEGER,
    was_helpful INTEGER,                      -- 1=helpful, 0=not, NULL=unknown
    feedback_text TEXT,
    used_at TEXT NOT NULL,
    FOREIGN KEY (context_graph_id) REFERENCES context_graphs(id) ON DELETE CASCADE
);

Tier 1: platform.db (Read-Only)

Component registry - regenerated by component-indexer.py

Tables

Table	Description
components	Framework components (agents, skills, commands)
capabilities	Component capabilities
component_metadata	Extended metadata
component_frontmatter	YAML frontmatter

Multi-Tenant Columns (ADR-119)

All user data tables include:

-- Identity columns
tenant_id    TEXT NOT NULL,      -- Organization UUID
user_id      TEXT NOT NULL,      -- User UUID
team_id      TEXT,               -- Team scope (optional)
project_id   TEXT,               -- Project scope (optional)

-- Cloud sync columns
cloud_id     TEXT UNIQUE,        -- Cloud record UUID
synced_at    TEXT,               -- Last sync (RFC3339)
sync_status  TEXT DEFAULT 'pending'  -- pending | synced | conflict

Standard Indexes

CREATE INDEX idx_{table}_tenant ON {table}(tenant_id);
CREATE INDEX idx_{table}_user ON {table}(tenant_id, user_id);
CREATE INDEX idx_{table}_project ON {table}(tenant_id, project_id);
CREATE INDEX idx_{table}_sync ON {table}(sync_status, synced_at);
CREATE INDEX idx_{table}_cloud_id ON {table}(cloud_id);

Full-Text Search (FTS5)

FTS Table	Base Table	Database	Indexed Columns
messages_fts	messages	sessions.db	content, role
decisions_fts	decisions	org.db	decision, rationale
errors_fts	error_solutions	org.db	error_signature, solution
session_log_fts	session_log_entries	sessions.db	heading, content, task_ids
kg_nodes_fts	kg_nodes	org.db	name, properties
doc_search	doc_index	sessions.db	title, summary, content_preview
component_search	components	platform.db	name, description

Query Examples

-- Search messages (Tier 3)
SELECT * FROM messages
WHERE id IN (SELECT rowid FROM messages_fts WHERE messages_fts MATCH 'authentication');

-- Search decisions (Tier 2)
SELECT * FROM decisions
WHERE id IN (SELECT rowid FROM decisions_fts WHERE decisions_fts MATCH 'architecture');

Python API

from scripts.core.paths import get_org_db_path, get_sessions_db_path

# Tier 2: Critical data
ORG_DB = get_org_db_path()           # ~/PROJECTS/.coditect-data/context-storage/org.db

# Tier 3: Session data
SESSIONS_DB = get_sessions_db_path()  # ~/PROJECTS/.coditect-data/context-storage/sessions.db

# Connect
import sqlite3
conn = sqlite3.connect(ORG_DB)
cursor = conn.cursor()
cursor.execute("SELECT COUNT(*) FROM decisions")

CLI Access

# Tier 2: Decisions
sqlite3 ~/PROJECTS/.coditect-data/context-storage/org.db "SELECT COUNT(*) FROM decisions"

# Tier 3: Messages
sqlite3 ~/PROJECTS/.coditect-data/context-storage/sessions.db "SELECT COUNT(*) FROM messages"

# Via /cxq command (auto-routes to correct DB)
/cxq --decisions --limit 10
/cxq --recent 50
/cxq --stats

Migration

From context.db to Four-Tier

# 1. Dry run
python3 scripts/migrate-to-four-tier-db.py --dry-run

# 2. Execute
python3 scripts/migrate-to-four-tier-db.py

# 3. Verify
python3 scripts/migrate-to-four-tier-db.py --verify

Add Multi-Tenant Columns

python3 scripts/migrate-multi-tenant-schema.py --dry-run
python3 scripts/migrate-multi-tenant-schema.py

Backup

# Backup critical org.db to GCS (REQUIRED - daily)
~/.coditect/scripts/backup-context-db.sh

# Restore
~/.coditect/scripts/backup-context-db.sh --restore latest

Bucket: gs://coditect-cloud-infra-context-backups

Document	Description
four-tier-database-schema.md	Detailed table definitions
ADR-118	Four-tier architecture decision
ADR-119	Multi-tenant schema decision
ADR-151	Knowledge graph architecture
CONTEXT-GRAPH-GUIDE.md	Context graph usage guide
MEMORY-MANAGEMENT-GUIDE.md	/cx, /cxq usage

Statistics (as of 2026-02-19)

Database	Table	Rows	Status
org.db	skill_learnings	758,358	IRREPLACEABLE
org.db	decisions	1,856	IRREPLACEABLE
org.db	error_solutions	475	IRREPLACEABLE
org.db	kg_nodes	18,622	IRREPLACEABLE
org.db	kg_edges	12,023	IRREPLACEABLE
org.db	file_integrity_audit	388,274	Operational
sessions.db	messages	718,721	Regenerable
sessions.db	tool_analytics	53,394,316	Regenerable
sessions.db	token_economics	73,733,714	Regenerable
sessions.db	embeddings	650,425	Regenerable
sessions.db	session_log_entries	3,765	Regenerable
sessions.db	context_graphs	4	Regenerable
sessions.db	context_graph_nodes	56	Regenerable
platform.db	components	10,405	Regenerable
platform.db	capabilities	242,741	Regenerable
platform.db	triggers	82,587	Regenerable
projects.db	projects	18	Project-specific
projects.db	content_hashes	23,411	Regenerable
messaging.db	session_registry	48	Operational
messaging.db	session_messages	423	Operational
messaging.db	task_claims	2	Operational

Last Updated: 2026-02-19 Version: 2.3.0

Overview​

Quick Reference​

Database Locations​

Tier 2: org.db (CRITICAL)​

Tables​

Key Schemas​

Knowledge Graph Statistics (ADR-151)​

Tier 3: sessions.db (Regenerable)​

Tables​

Key Schemas​

Tier 1: platform.db (Read-Only)​

Tables​

Multi-Tenant Columns (ADR-119)​

Standard Indexes​

Full-Text Search (FTS5)​

Query Examples​

Python API​

CLI Access​

Migration​

From context.db to Four-Tier​

Add Multi-Tenant Columns​

Backup​

Related Documentation​

Statistics (as of 2026-02-19)​

Overview

Quick Reference

Database Locations

Tier 2: org.db (CRITICAL)

Tables

Key Schemas

Knowledge Graph Statistics (ADR-151)

Tier 3: sessions.db (Regenerable)

Tables

Key Schemas

Tier 1: platform.db (Read-Only)

Tables

Multi-Tenant Columns (ADR-119)

Standard Indexes

Full-Text Search (FTS5)

Query Examples

Python API

CLI Access

Migration

From context.db to Four-Tier

Add Multi-Tenant Columns

Backup

Related Documentation

Statistics (as of 2026-02-19)