ADR-043: Bidirectional Context Memory Sync Architecture
Status: Superseded (previously Accepted)
Date: 2025-12-29
Superseded By: ADR-044 - Custom REST Sync Architecture
Author: Hal Casteel
Stakeholders: Platform Engineering, Backend, Frontend, DevOps
⚠️ SUPERSESSION NOTICE (2025-12-29)
This ADR has been superseded by ADR-044 due to FSL license concerns.
Key Finding: PowerSync is licensed under the Functional Source License (FSL), which includes a "Competing Use" restriction. JourneyApps (PowerSync's owner) offers a mobile app builder with offline sync, potentially competing with CODITECT's development platform.
Legal Risk: Using PowerSync in the CODITECT SaaS could constitute a license violation.
Replacement Architecture: Custom REST API sync (see ADR-044)
- Same SQLite client-side storage
- Django/FastAPI REST endpoints for push/pull
- PostgreSQL as source of truth
- Zero third-party sync dependencies
5-Year TCO Savings: $12,900 (Custom REST $41,400 vs PowerSync $54,300)
Context
CODITECT manages AI session context memory across multiple machines, teams, and tenants. The context memory originates from Claude Code sessions stored locally as JSONL files, processed into a local SQLite database, and must be synchronized bidirectionally with a central PostgreSQL database that serves as the primary record of truth.
The Problem
Users run Claude Code on multiple machines (desktop, laptop, cloud workstations). Each machine generates:
- ~/.claude/projects/*/*.jsonl - Raw Claude Code session files
- CODITECT processes these into unique messages in a local context.db (SQLite)
Without synchronization:
- Context memory is fragmented across machines
- Teams cannot share learned patterns and decisions
- Multi-tenant SaaS requires centralized data management
Requirements
- Local → Cloud Push: SQLite changes push to PostgreSQL (cloud master)
- Cloud → Local Sync: PostgreSQL syncs to all SQLite clients
- PostgreSQL as Source of Truth: Central authority for all context data
- Multi-client Merge: Multiple SQLite instances write to same PostgreSQL
- Multi-tenant Isolation: Tenant data completely separated
- Offline Support: Local-first with sync on reconnect
- Conflict Resolution: Handle concurrent writes from multiple clients
Decision
Implement a bidirectional sync architecture using:
- PowerSync for PostgreSQL → SQLite downstream sync
- REST API for SQLite → PostgreSQL upstream writes
- PostgreSQL as the central source of truth
Architecture
Complete Data Flow
┌─────────────────────────────────────────────────────────────────────────────┐
│ LOCAL CLIENT (per machine) │
│ │
│ ┌──────────────────┐ ┌──────────────────┐ ┌──────────────────┐ │
│ │ ~/.claude/ │ │ CODITECT │ │ Local SQLite │ │
│ │ projects/ │───▶│ Session │───▶│ context.db │ │
│ │ *.jsonl │ │ Processor │ │ │ │
│ │ │ │ │ │ - messages │ │
│ │ (Claude Code │ │ - Deduplication │ │ - decisions │ │
│ │ raw sessions) │ │ - Extraction │ │ - patterns │ │
│ └──────────────────┘ │ - Enrichment │ │ - errors │ │
│ └──────────────────┘ └────────┬─────────┘ │
│ │ │
│ PUSH │ PULL │
│ (API) │ (PowerSync) │
│ │ │
└───────────────────────────────────────────────────────────┼─────────────────┘
│
┌──────────────────────────────┴─────────────────┐
│ │
▼ ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ GKE CLOUD CLUSTER │
│ │
│ ┌──────────────────┐ ┌──────────────────┐ │
│ │ CODITECT API │ │ PostgreSQL │ │
│ │ (Backend) │────────▶│ (Cloud SQL) │◀────┐ │
│ │ │ writes │ │ │ │
│ │ POST /api/sync │ │ PRIMARY RECORD │ │ Logical │
│ │ - messages │ │ OF TRUTH │ │ Replication │
│ │ - decisions │ │ │ │ │
│ │ - patterns │ │ - All tenants │ │ │
│ └──────────────────┘ │ - All users │ │ │
│ │ - All machines │ │ │
│ │ - Merged data │ │ │
│ └────────┬─────────┘ │ │
│ │ │ │
│ ▼ │ │
│ ┌──────────────────┐ │ │
│ │ PowerSync │─────┘ │
│ │ Service │ │
│ │ │ │
│ │ sync.coditect.ai│ │
│ └────────┬─────────┘ │
│ │ │
└────────────────────────────────────────┼────────────────────────────────────┘
│
│ Sync Protocol (downstream)
│
┌───────────────────────────────┼───────────────────────────────┐
│ │ │
▼ ▼ ▼
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│ Machine A │ │ Machine B │ │ Machine C │
│ (Desktop) │ │ (Laptop) │ │ (Team Member) │
│ │ │ │ │ │
│ SQLite ◀──────┼────────────┼─────────────────┼────────────┼──────▶ SQLite │
│ context.db │ │ SQLite │ │ context.db │
│ │ │ context.db │ │ │
│ Same user, │ │ Same user, │ │ Same tenant, │
│ different │ │ different │ │ different │
│ machine │ │ machine │ │ user │
└─────────────────┘ └─────────────────┘ └─────────────────┘
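The local ingestion step shown above (raw JSONL → CODITECT Session Processor → context.db) can be sketched as follows. The JSONL field names (role, content) and the messages schema here are illustrative assumptions, not the actual CODITECT schema; the dedup key mirrors the content_hash strategy used server-side.

```python
import hashlib
import json
import sqlite3

def ingest_jsonl(lines, db: sqlite3.Connection) -> int:
    """Deduplicate raw session lines into the local context.db.

    Returns the number of genuinely new messages inserted.
    """
    db.execute("""
        CREATE TABLE IF NOT EXISTS messages (
            id INTEGER PRIMARY KEY,
            content_hash TEXT UNIQUE,       -- dedup key across re-processed files
            role TEXT,
            content TEXT,
            sync_status TEXT DEFAULT 'pending'
        )
    """)
    inserted = 0
    for line in lines:
        record = json.loads(line)
        content = record.get("content", "")
        content_hash = hashlib.sha256(content.encode("utf-8")).hexdigest()
        # INSERT OR IGNORE skips messages this machine has already processed
        cur = db.execute(
            "INSERT OR IGNORE INTO messages (content_hash, role, content) VALUES (?, ?, ?)",
            (content_hash, record.get("role"), content),
        )
        inserted += cur.rowcount
    db.commit()
    return inserted
```

Re-running the processor over the same JSONL files is idempotent: duplicate lines hash to the same content_hash and are ignored.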
(Sequence diagram and multi-tenant data flow diagrams not reproduced here.)
Implementation Details
1. Upstream Push (SQLite → PostgreSQL)
Endpoint: POST /api/sync/push
# Client-side push logic (sketch: `api`, `config`, and `last_sync_timestamp`
# are application-level objects)
async def push_to_cloud(local_db: sqlite3.Connection):
    local_db.row_factory = sqlite3.Row

    # Get unpushed changes since last sync
    changes = local_db.execute("""
        SELECT * FROM messages
        WHERE updated_at > ? AND sync_status = 'pending'
        ORDER BY updated_at
        LIMIT 1000
    """, [last_sync_timestamp]).fetchall()

    # Push to cloud API
    response = await api.post('/api/sync/push', {
        'tenant_id': config.tenant_id,
        'user_id': config.user_id,
        'machine_id': config.machine_id,
        'changes': [dict(row) for row in changes]
    })

    # Mark as synced, recording the server-assigned cloud ID
    for change in changes:
        local_db.execute("""
            UPDATE messages
            SET sync_status = 'synced', cloud_id = ?
            WHERE id = ?
        """, [response.cloud_ids[change['id']], change['id']])
    local_db.commit()
Server-side merge logic:
# Backend API handler (db: async PostgreSQL client, e.g. asyncpg)
@app.post('/api/sync/push')
async def handle_push(request: SyncPushRequest):
    cloud_ids = {}
    for change in request.changes:
        # Upsert with conflict resolution
        await db.execute("""
            INSERT INTO messages (id, tenant_id, user_id, content, ...)
            VALUES ($1, $2, $3, $4, ...)
            ON CONFLICT (tenant_id, content_hash) DO UPDATE SET
                updated_at = GREATEST(messages.updated_at, EXCLUDED.updated_at),
                -- Merge metadata, keep newer
                metadata = messages.metadata || EXCLUDED.metadata
            WHERE EXCLUDED.updated_at > messages.updated_at
        """, change)
        cloud_ids[change.id] = change.id  # IDs are UUIDs, so cloud ID = local ID
    return {'status': 'ok', 'merged': len(request.changes), 'cloud_ids': cloud_ids}
2. Downstream Sync (PostgreSQL → SQLite)
PowerSync Sync Rules (sync_rules.yaml):
bucket_definitions:
  # User's own messages (all machines)
  - name: user_messages
    parameters:
      - name: tenant_id
      - name: user_id
    data:
      - SELECT id, content, role, tokens, model, session_id,
               created_at, updated_at
        FROM messages
        WHERE tenant_id = bucket.tenant_id
          AND user_id = bucket.user_id
        ORDER BY created_at DESC

  # User's decisions
  - name: user_decisions
    parameters:
      - name: tenant_id
      - name: user_id
    data:
      - SELECT id, title, description, context, status,
               created_at, updated_at
        FROM decisions
        WHERE tenant_id = bucket.tenant_id
          AND user_id = bucket.user_id

  # Shared team patterns (read-only for team members)
  - name: team_patterns
    parameters:
      - name: tenant_id
    data:
      - SELECT id, pattern, language, frequency, last_seen_at
        FROM code_patterns
        WHERE tenant_id = bucket.tenant_id
          AND is_shared = true
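The tenant isolation these bucket queries provide can be illustrated with a plain SQL equivalent: each client only ever receives rows matching its own bucket parameters. The sample rows below are invented for illustration.

```python
import sqlite3

# Stand-in for the user_messages bucket query from sync_rules.yaml
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE messages (id INTEGER, tenant_id TEXT, user_id TEXT, content TEXT)")
db.executemany(
    "INSERT INTO messages VALUES (?, ?, ?, ?)",
    [
        (1, "tenant-a", "alice", "alice's note"),
        (2, "tenant-a", "bob", "bob's note"),
        (3, "tenant-b", "carol", "carol's note"),
    ],
)

def bucket_rows(tenant_id: str, user_id: str):
    """Rows a single client's user_messages bucket would contain."""
    return db.execute(
        "SELECT id, content FROM messages WHERE tenant_id = ? AND user_id = ?",
        (tenant_id, user_id),
    ).fetchall()

# Alice's client never sees bob's or carol's rows.
print(bucket_rows("tenant-a", "alice"))  # [(1, "alice's note")]
```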
3. Conflict Resolution Strategy
| Scenario | Resolution |
|---|---|
| Same message from multiple machines | Deduplicate by content_hash |
| Concurrent edits to same record | Last-write-wins by updated_at |
| Delete vs Update conflict | Update wins (soft delete) |
| New record with same ID | Use UUID, no collision |
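The first two rows of this table can be expressed directly in code. Treating content_hash as a SHA-256 of the message content is an assumption about the hashing scheme; the metadata merge mirrors the `metadata || EXCLUDED.metadata` expression in the server-side upsert.

```python
import hashlib

def content_hash(content: str) -> str:
    """Dedup key: identical content from different machines hashes identically."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()

def resolve(existing: dict, incoming: dict) -> dict:
    """Last-write-wins by updated_at; metadata dicts merged, newer side wins keys."""
    if incoming["updated_at"] <= existing["updated_at"]:
        return existing
    merged = dict(existing)
    merged.update(incoming)
    # Merge metadata rather than replace, matching the PostgreSQL || merge
    merged["metadata"] = {**existing.get("metadata", {}),
                          **incoming.get("metadata", {})}
    return merged
```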
4. Sync State Tracking
-- Local SQLite: Track sync state
CREATE TABLE sync_state (
    id INTEGER PRIMARY KEY,
    last_push_at TIMESTAMP,
    last_pull_at TIMESTAMP,
    last_push_cursor TEXT,
    last_pull_cursor TEXT,
    pending_count INTEGER DEFAULT 0
);

-- PostgreSQL: Track per-client sync
CREATE TABLE client_sync_state (
    tenant_id TEXT NOT NULL,
    user_id TEXT NOT NULL,
    machine_id TEXT NOT NULL,
    last_seen_at TIMESTAMP DEFAULT NOW(),
    last_push_at TIMESTAMP,
    sync_cursor TEXT,
    PRIMARY KEY (tenant_id, user_id, machine_id)
);
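A minimal client-side helper over the sync_state table above might look like this. The single-row convention (id = 1) and the cursor semantics are assumptions for illustration.

```python
import sqlite3

SYNC_STATE_DDL = """
CREATE TABLE IF NOT EXISTS sync_state (
    id INTEGER PRIMARY KEY,
    last_push_at TIMESTAMP,
    last_pull_at TIMESTAMP,
    last_push_cursor TEXT,
    last_pull_cursor TEXT,
    pending_count INTEGER DEFAULT 0
);
"""

def record_push(db: sqlite3.Connection, cursor: str, pushed: int) -> None:
    """Advance the push cursor after a successful /api/sync/push round-trip."""
    db.executescript(SYNC_STATE_DDL)
    # Single-row table: ensure the row exists, then update it in place
    db.execute("INSERT OR IGNORE INTO sync_state (id, pending_count) VALUES (1, 0)")
    db.execute(
        """
        UPDATE sync_state SET
            last_push_at = CURRENT_TIMESTAMP,
            last_push_cursor = ?,
            pending_count = MAX(0, pending_count - ?)
        WHERE id = 1
        """,
        (cursor, pushed),
    )
    db.commit()
```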
Component Roles
| Component | Role | Direction |
|---|---|---|
| Claude Code | Generate raw session data | Source |
| CODITECT Processor | Deduplicate, extract, enrich | Local transform |
| Local SQLite | Local context storage | Local cache |
| CODITECT API | Handle push requests | Upstream gateway |
| PostgreSQL | Primary record of truth | Central authority |
| PowerSync | Stream changes to clients | Downstream sync |
| Litestream | Backup local SQLite to GCS | Disaster recovery |
API Endpoints
| Endpoint | Method | Purpose |
|---|---|---|
| /api/sync/push | POST | Push local changes to cloud |
| /api/sync/status | GET | Get sync state for client |
| /api/sync/resolve | POST | Manually resolve conflicts |
| sync.coditect.ai/* | WS | PowerSync streaming endpoint |
Rationale
Why This Architecture?
- PostgreSQL as master - Proven ACID compliance, rich querying, tenant isolation
- PowerSync for downstream - Purpose-built for PostgreSQL → SQLite sync
- REST API for upstream - Simple, debuggable, controllable write path
- Local SQLite - Fast local queries, offline support
Why Not Pure PowerSync Bidirectional?
PowerSync supports writes via their SDK, but we chose REST API for upstream because:
- More control over conflict resolution
- Easier to add business logic (validation, enrichment)
- Better audit logging
- Simpler debugging
Consequences
Positive
- Unified context across all user machines
- Team collaboration via shared patterns
- Multi-tenant isolation at database level
- Offline-first with eventual consistency
- Scalable with PostgreSQL + PowerSync
Negative
- Eventual consistency (not real-time for writes)
- Complexity of sync state management
- Infrastructure cost (PostgreSQL + PowerSync, including MongoDB for PowerSync's internal bucket storage)
Risks & Mitigations
| Risk | Mitigation |
|---|---|
| Data loss during sync | Litestream backup, sync acknowledgments |
| Sync conflicts | Content-hash dedup, last-write-wins |
| Network partitions | Queue writes locally, sync on reconnect |
| Large datasets | Pagination, incremental sync |
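The pagination/incremental-sync mitigation can be sketched as a cursor-driven batch generator over the local messages table. The batch size and the (updated_at, id) cursor encoding are assumptions; the compound cursor keeps ordering stable even when many rows share the same timestamp.

```python
import sqlite3
from typing import Iterator

def pending_batches(db: sqlite3.Connection, batch_size: int = 1000) -> Iterator[list]:
    """Yield pending rows in bounded batches so one push never grows unbounded."""
    cursor = (0, 0)  # (updated_at, id) of the last row already yielded
    while True:
        rows = db.execute(
            """
            SELECT id, updated_at, content FROM messages
            WHERE sync_status = 'pending' AND (updated_at, id) > (?, ?)
            ORDER BY updated_at, id
            LIMIT ?
            """,
            (*cursor, batch_size),
        ).fetchall()
        if not rows:
            return
        yield rows
        last = rows[-1]
        cursor = (last[1], last[0])  # resume strictly after the last yielded row
```

Each batch maps naturally onto one POST /api/sync/push call, so a network failure mid-sync loses at most one in-flight batch.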
Files
| File | Purpose |
|---|---|
| config/powersync/powersync.yaml | PowerSync service config |
| config/powersync/sync_rules.yaml | Multi-tenant sync rules |
| config/powersync/kubernetes/ | GKE deployment manifests |
| scripts/context-sync.py | Local sync client (TODO) |
| submodules/integrations/powersync-service | PowerSync source |
Related ADRs
- ADR-042: Litestream for SQLite backup to GCS
- ADR-014: Multi-Product Commerce Architecture
- ADR-CX: Context Extraction System