Skip to main content

ADR-156: Project-Scoped Context Databases Architecture

Status

ACCEPTED (2026-02-03)

Context

Problem Statement

CODITECT's four-tier database architecture (ADR-118) stores context data in global databases without project isolation:

  • org.db (Tier 2): Decisions, skill_learnings, error_solutions - shared across all projects
  • sessions.db (Tier 3): Messages, tool_analytics, activity_associations - shared across all projects

This creates several problems for multi-project workflows:

  1. Decision pollution - Decisions made for one project (e.g., "use PostgreSQL for Avivatec") appear in context queries for all projects, potentially misleading agents working on different projects.

  2. Learning contamination - Skill learnings from customer projects may not apply to framework development, and vice versa. A learning like "Always use LGPD-compliant data handling" applies to Avivatec but not necessarily to all projects.

  3. Context query noise - /cxq --decisions returns decisions from all projects, reducing signal-to-noise ratio.

  4. Customer data mixing - Customer project data (potentially containing business logic, financial patterns, or proprietary information) is stored alongside internal project data.

  5. No project-specific context graphs - The context graph (ADR-151) cannot be scoped to a single project, returning nodes from unrelated projects.

Current State

DatabaseCurrent BehaviorLimitation
org.dbGlobal decisions, learningsNo project attribution
sessions.dbGlobal messages, analyticsNo project filtering
platform.dbComponent metadataN/A (project-agnostic by design)
projects.dbProject indexProject metadata only, not context

Requirements

  1. Project attribution for all context data (decisions, learnings, messages)
  2. Project-scoped queries for /cxq and context graph
  3. Backward compatibility with global context (for cross-project queries)
  4. Support for customer project isolation with tenant scoping
  5. Consistent with ADR-144 Project Registry and ADR-155 Session Logs

Decision

1. Project Attribution Strategy

Rather than creating separate databases per project (which would fragment data and complicate queries), we add project_id columns to existing tables. This enables both project-scoped queries AND cross-project queries when needed.

Approach: Column-based scoping with optional filtering, not database-per-project.

2. Schema Updates

org.db (Tier 2) Schema Extensions

-- ============================================================
-- ALTER: decisions table - add project_id column
-- ============================================================
ALTER TABLE decisions ADD COLUMN project_id TEXT;
ALTER TABLE decisions ADD COLUMN scope TEXT DEFAULT 'global'
CHECK (scope IN ('global', 'project', 'customer'));

-- Index for project-scoped queries
CREATE INDEX IF NOT EXISTS idx_decisions_project ON decisions(project_id);
CREATE INDEX IF NOT EXISTS idx_decisions_scope ON decisions(scope);

-- ============================================================
-- ALTER: skill_learnings table - add project_id column
-- ============================================================
ALTER TABLE skill_learnings ADD COLUMN project_id TEXT;
ALTER TABLE skill_learnings ADD COLUMN scope TEXT DEFAULT 'global'
CHECK (scope IN ('global', 'project', 'customer'));

CREATE INDEX IF NOT EXISTS idx_skill_learnings_project ON skill_learnings(project_id);

-- ============================================================
-- ALTER: error_solutions table - add project_id column
-- ============================================================
ALTER TABLE error_solutions ADD COLUMN project_id TEXT;
ALTER TABLE error_solutions ADD COLUMN scope TEXT DEFAULT 'global'
CHECK (scope IN ('global', 'project', 'customer'));

CREATE INDEX IF NOT EXISTS idx_error_solutions_project ON error_solutions(project_id);

-- ============================================================
-- View: project_decisions
-- Purpose: Decisions scoped to a specific project + global decisions
-- ============================================================
CREATE VIEW IF NOT EXISTS project_decisions AS
SELECT d.*, p.name AS project_name, p.scope AS project_scope
FROM decisions d
LEFT JOIN projects p ON d.project_id = p.project_id;

-- ============================================================
-- View: effective_decisions
-- Purpose: Get decisions applicable to a project (project-specific + global)
-- Parameters must be supplied via query: WHERE project_id = ? OR scope = 'global'
-- ============================================================

sessions.db (Tier 3) Schema Extensions

-- ============================================================
-- ALTER: messages table - add project_id column
-- (Also specified in ADR-155, repeated here for completeness)
-- ============================================================
ALTER TABLE messages ADD COLUMN project_id TEXT;

CREATE INDEX IF NOT EXISTS idx_messages_project ON messages(project_id);

-- ============================================================
-- ALTER: tool_analytics table - add project_id column
-- ============================================================
ALTER TABLE tool_analytics ADD COLUMN project_id TEXT;

CREATE INDEX IF NOT EXISTS idx_tool_analytics_project ON tool_analytics(project_id);

-- ============================================================
-- ALTER: activity_associations table - add project_id column
-- ============================================================
ALTER TABLE activity_associations ADD COLUMN project_id TEXT;

CREATE INDEX IF NOT EXISTS idx_activity_associations_project ON activity_associations(project_id);

-- ============================================================
-- View: project_activity
-- Purpose: Activity stream scoped to a project
-- ============================================================
CREATE VIEW IF NOT EXISTS project_activity AS
SELECT
aa.*,
p.name AS project_name,
p.scope AS project_scope,
m.content AS message_content
FROM activity_associations aa
LEFT JOIN projects p ON aa.project_id = p.project_id
LEFT JOIN messages m ON aa.message_id = m.id
WHERE aa.project_id IS NOT NULL;

3. Scope Inheritance Rules

Context data inherits scope from the active project:

Project ScopeDefault Data ScopeQueryable By
platformglobalAll projects
orgprojectSame org projects
customercustomerSame tenant only
projectprojectSame project only
def get_data_scope(project_id: str) -> str:
"""Determine data scope based on project scope."""
conn = sqlite3.connect(get_org_db_path())
result = conn.execute(
"SELECT scope FROM projects WHERE project_id = ?",
(project_id,)
).fetchone()

if not result:
return 'global' # No project = global scope

project_scope = result[0]
if project_scope == 'platform':
return 'global' # Platform decisions are global
elif project_scope == 'customer':
return 'customer' # Customer data is isolated
else:
return 'project' # Org and project are project-scoped

4. Query Patterns

Project-Scoped Query (Default for Active Project)

def query_decisions(project_id: str, include_global: bool = True) -> list:
"""Query decisions for a project, optionally including global decisions."""
conn = sqlite3.connect(get_org_db_path())

if include_global:
# Project-specific + global decisions
query = """
SELECT * FROM decisions
WHERE project_id = ? OR scope = 'global'
ORDER BY created_at DESC
"""
return conn.execute(query, (project_id,)).fetchall()
else:
# Project-specific only
query = """
SELECT * FROM decisions
WHERE project_id = ?
ORDER BY created_at DESC
"""
return conn.execute(query, (project_id,)).fetchall()

Cross-Project Query (Explicit)

def query_all_decisions(project_ids: list = None) -> list:
"""Query decisions across multiple projects."""
conn = sqlite3.connect(get_org_db_path())

if project_ids:
placeholders = ','.join('?' * len(project_ids))
query = f"""
SELECT * FROM decisions
WHERE project_id IN ({placeholders}) OR scope = 'global'
ORDER BY created_at DESC
"""
return conn.execute(query, project_ids).fetchall()
else:
# All decisions (admin view)
return conn.execute("SELECT * FROM decisions ORDER BY created_at DESC").fetchall()

Customer-Isolated Query

def query_customer_decisions(tenant_id: str) -> list:
"""Query decisions for a specific customer tenant."""
conn = sqlite3.connect(get_org_db_path())

query = """
SELECT d.* FROM decisions d
JOIN projects p ON d.project_id = p.project_id
WHERE p.tenant_id = ? AND d.scope = 'customer'
ORDER BY d.created_at DESC
"""
return conn.execute(query, (tenant_id,)).fetchall()

5. Command Updates

/cxq Enhancement

# Default: Query active project + global
/cxq --decisions
# Returns: Project-scoped decisions + global decisions

# Explicit project filter
/cxq --decisions --project CUST-avivatec-fpa
# Returns: Only Avivatec decisions + global decisions

# Exclude global (project-only)
/cxq --decisions --project-only
# Returns: Only decisions with current project_id

# Cross-project query (explicit)
/cxq --decisions --all-projects
# Returns: All decisions (requires admin context)

# Customer isolation
/cxq --decisions --tenant abc-123-uuid
# Returns: Only tenant's decisions (customer scope)

/cx Enhancement

# Extract with project attribution
/cx --project CUST-avivatec-fpa
# Attributes all extracted messages to project

# Auto-detect project from exports directory
/cx
# If processing from sessions-export-pending/CUST-avivatec-fpa/,
# attributes to that project

Decision Recording

When recording decisions via agent workflows:

def record_decision(
title: str,
content: str,
project_id: str = None,
scope: str = None
):
"""Record a decision with project attribution."""
if project_id is None:
project_id = os.environ.get('CODITECT_PROJECT')

if scope is None:
scope = get_data_scope(project_id) if project_id else 'global'

conn = sqlite3.connect(get_org_db_path())
conn.execute("""
INSERT INTO decisions (title, content, project_id, scope, created_at)
VALUES (?, ?, ?, ?, datetime('now'))
""", (title, content, project_id, scope))
conn.commit()

6. Context Graph Integration (ADR-151)

The context graph builder receives project context:

def build_context_graph(task: str, project_id: str = None) -> ContextGraph:
"""Build context graph scoped to project."""
if project_id is None:
project_id = os.environ.get('CODITECT_PROJECT')

# Seed nodes are filtered by project
seed_nodes = get_seed_nodes(task, project_id=project_id)

# Expansion respects project boundaries
graph = expand_graph(seed_nodes, project_id=project_id)

return graph

def get_seed_nodes(task: str, project_id: str) -> list:
"""Get seed nodes for graph, scoped to project + global."""
conn = sqlite3.connect(get_org_db_path())

# Decisions for this project + global
decisions = conn.execute("""
SELECT * FROM decisions
WHERE project_id = ? OR scope = 'global'
""", (project_id,)).fetchall()

# Learnings for this project + global
learnings = conn.execute("""
SELECT * FROM skill_learnings
WHERE project_id = ? OR scope = 'global'
""", (project_id,)).fetchall()

return create_nodes(decisions, learnings)

7. Data Flow Architecture

┌─────────────────────────────────────────────────────────────────┐
│ Context Data Flow │
│ │
│ ┌─────────────────┐ │
│ │ Active Project │ $CODITECT_PROJECT=CUST-avivatec-fpa │
│ └────────┬────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ Data Operations │ │
│ │ │ │
│ │ WRITE (with project attribution) │ │
│ │ ├─ /session-log → messages.project_id = project │ │
│ │ ├─ record_decision → decisions.project_id = project │ │
│ │ └─ record_learning → learnings.project_id = project │ │
│ │ │ │
│ │ READ (with project filtering) │ │
│ │ ├─ /cxq --decisions → project + global │ │
│ │ ├─ /cxq --graph → project-scoped graph │ │
│ │ └─ context loading → project + global learnings │ │
│ │ │ │
│ └─────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ Database Layer │ │
│ │ │ │
│ │ org.db (Tier 2 - Irreplaceable) │ │
│ │ ├─ decisions (project_id, scope) │ │
│ │ ├─ skill_learnings (project_id, scope) │ │
│ │ ├─ error_solutions (project_id, scope) │ │
│ │ └─ projects (registry) │ │
│ │ │ │
│ │ sessions.db (Tier 3 - Regenerable) │ │
│ │ ├─ messages (project_id) │ │
│ │ ├─ tool_analytics (project_id) │ │
│ │ └─ activity_associations (project_id) │ │
│ │ │ │
│ └─────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────┘

8. Migration Strategy

Phase 1: Schema Addition (Non-Breaking)

-- Run on org.db
ALTER TABLE decisions ADD COLUMN project_id TEXT;
ALTER TABLE decisions ADD COLUMN scope TEXT DEFAULT 'global';
ALTER TABLE skill_learnings ADD COLUMN project_id TEXT;
ALTER TABLE skill_learnings ADD COLUMN scope TEXT DEFAULT 'global';
ALTER TABLE error_solutions ADD COLUMN project_id TEXT;
ALTER TABLE error_solutions ADD COLUMN scope TEXT DEFAULT 'global';

-- Create indexes
CREATE INDEX IF NOT EXISTS idx_decisions_project ON decisions(project_id);
-- ... (additional indexes)

-- Run on sessions.db
ALTER TABLE messages ADD COLUMN project_id TEXT;
ALTER TABLE tool_analytics ADD COLUMN project_id TEXT;
ALTER TABLE activity_associations ADD COLUMN project_id TEXT;

Phase 2: Backfill Existing Data

def backfill_project_attribution():
"""
Attribute existing context data to projects based on heuristics:
1. Task IDs (e.g., FPA.A.1.1 → CUST-avivatec-fpa)
2. File paths mentioned (e.g., coditect-jv-avivatec → CUST-avivatec-fpa)
3. Session association (messages from project-scoped sessions)
4. Keywords and patterns
"""
pass

Phase 3: Write Path Updates

  1. Update decision recording to include project_id
  2. Update learning extraction to include project_id
  3. Update /cx to attribute messages to projects
  4. Update /session-log integration

Phase 4: Read Path Updates

  1. Update /cxq to support project filtering
  2. Update context graph builder for project scoping
  3. Update agent context loading

9. Customer Data Isolation Guarantees

For customer scope projects:

GuaranteeImplementation
Data never leaves tenant boundaryWHERE tenant_id = ? on all customer queries
No cross-tenant joinsQueries validated before execution
Audit loggingAll customer data access logged
Data deletionDELETE WHERE project_id = ? removes all customer data
Backup isolationCustomer data backed up to tenant-specific storage

Consequences

Positive

  1. Project-focused context - Agents receive only relevant decisions and learnings
  2. Customer data isolation - Tenant data is isolated at the database level
  3. Cleaner context graphs - Graphs are scoped to project, reducing noise
  4. Cross-project capability preserved - Global scope enables knowledge sharing
  5. Backward compatible - Existing global data continues to work
  6. Query flexibility - Support for both scoped and cross-project queries

Negative

  1. Schema complexity - Additional columns and indexes on multiple tables
  2. Migration effort - Existing data needs attribution
  3. Query overhead - Project filtering adds WHERE clauses
  4. Scope decisions - Must decide scope for each piece of data

Risks and Mitigations

RiskLikelihoodMitigation
Incorrect project attributionMediumValidation on write; backfill review
Query performance degradationLowProper indexing; query analysis
Global data orphaningLowDefault scope = 'global' for unattributed data
Customer data leakageLowStrict tenant_id filtering; audit logging

Implementation Plan

Week 1: Schema Updates

  • Add project_id and scope columns to org.db tables
  • Add project_id column to sessions.db tables
  • Create indexes and views
  • Update database initialization scripts

Week 2: Write Path

  • Update decision recording with project attribution
  • Update learning extraction with project attribution
  • Update /cx message extraction
  • Implement scope determination logic

Week 3: Read Path

  • Update /cxq for project filtering
  • Update context graph builder (ADR-151)
  • Update agent context loading
  • Add --project flags to commands

Week 4: Migration & Validation

  • Implement backfill script
  • Run migration on existing data
  • Test customer isolation
  • Performance testing
  • Update documentation
ADRRelationship
ADR-118Base database architecture being extended
ADR-144Project registry for project_id resolution
ADR-151Context graph receives project scoping
ADR-155Session logs also scoped to projects