Inter-Session Messaging Architecture - MoE Final Verdict

Task: H.13 - Inter-Session Communication Layer
Date: 2026-02-06
Author: Claude (Opus 4.6)
MoE Phase: 4 of 4 (Final Verdict)


Process Summary

Research Phase (3 Parallel Agents)

| Agent | Scope | Key Finding |
|---|---|---|
| Agent 1 (Motia Deep-Dive) | Architecture, license, maturity, performance analysis of Motia | NOT SUITABLE: In-process event bus, ELv2 license, Rust rewrite incoming, 1.5/5 overall fit |
| Agent 2 (Alternatives) | 6 alternatives profiled with code, benchmarks, license analysis | SQLite best fit: Zero deps, ACID, aligns with ADR-118. File JSON is fallback |
| Agent 3 (Industry Landscape) | 8 competing tools, MCP/A2A protocols, market patterns | Industry consensus: File-based or SQLite. No leading AI coding tool uses message brokers for this |

Evaluation Matrix (5 Weighted Sub-Tables)

7 candidates scored across 34 attributes in 5 categories:

| Category | Weight | Purpose |
|---|---|---|
| Technical Fit | 30% | Does it solve the problem? |
| Operational | 25% | How hard to deploy/maintain? |
| Strategic Fit | 20% | Does it align with CODITECT's direction? |
| Risk Assessment | 15% | What could go wrong? |
| Long-Term Value | 10% | Will it still be right in 3 years? |
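
As a worked illustration of how the category weights above combine into a single percentage, the following minimal sketch computes a weighted sum. The weights come from the table; the per-category scores in `example` are hypothetical and are not taken from the evaluation matrix.

```python
# Category weights from the table above (they sum to 100%).
WEIGHTS = {
    "technical_fit": 0.30,
    "operational": 0.25,
    "strategic_fit": 0.20,
    "risk_assessment": 0.15,
    "long_term_value": 0.10,
}


def weighted_score(category_scores: dict) -> float:
    """Combine per-category scores (0-100) into one overall percentage."""
    missing = WEIGHTS.keys() - category_scores.keys()
    if missing:
        raise ValueError(f"missing category scores: {sorted(missing)}")
    return sum(WEIGHTS[name] * category_scores[name] for name in WEIGHTS)


# Hypothetical category scores, NOT values from the evaluation matrix.
example = {
    "technical_fit": 90.0,
    "operational": 80.0,
    "strategic_fit": 70.0,
    "risk_assessment": 60.0,
    "long_term_value": 50.0,
}
print(round(weighted_score(example), 1))  # 27 + 20 + 14 + 9 + 5 = 75.0
```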

Judge Panel (3 Perspectives)

| Judge | Perspective | Model | Confidence | Agrees? |
|---|---|---|---|---|
| Judge 1 | Technical Architecture & Risk | Claude Opus 4.6 | 82% | Yes |
| Judge 2 | Systems Engineering & Integration | Kimi k2.5 | Yes, with caveats | Yes |
| Judge 3 | Industry Ecosystem & Longevity | Gemini 2.5 Pro | 88% | Yes |

Final Ranking (Post-Judge Adjustments)

| Rank | Solution | Original Score | Judge-Adjusted Score | Verdict |
|---|---|---|---|---|
| 1 | SQLite Pub/Sub (messaging.db) | 94.6% | 92.7% | Clear winner |
| 2 | File-based JSON Manifest | 79.6% | 79.6% | Strong runner-up |
| 3 | NATS.io | 76.2% | 77.0% | Best "real" broker -- overkill |
| 4 | Unix Domain Sockets | 73.8% | 75.7% | Best latency -- daemon overhead |
| 5 | Redis Pub/Sub | 73.9% | 74.8% | Excellent -- license risk |
| 6 | Claude Code Agent Teams | 51.5% | 52.1% | Claude-only (disqualified) |
| 7 | Motia Framework | 40.1% | 40.1% | Wrong architecture (disqualified) |

Winner: SQLite with dedicated messaging.db + kqueue/inotify hybrid notification


Decision

What We Will Build

A lightweight inter-session coordination layer using:

  1. messaging.db -- A new, small (<1 MB), dedicated SQLite database in WAL mode, separate from sessions.db (18.4 GB)
  2. kqueue/inotify notification -- OS-level file system watchers on the WAL file for near-instant push notifications (macOS: kqueue, Linux: inotify)
  3. MessageBus abstraction -- Clean Python interface enabling future transport replacement for CODITECT Cloud
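
The MessageBus abstraction in item 3 could look roughly like the sketch below. This is an illustration under stated assumptions, not the planned implementation: the method signatures, the `after_id` cursor, and the trimmed table definition are hypothetical, and `subscribe()` is left out to keep the example short. Only the names `MessageBus`, `SQLiteMessageBus`, `publish()`, and `poll()` come from this document.

```python
import json
import os
import sqlite3
import tempfile
from abc import ABC, abstractmethod


class MessageBus(ABC):
    """Transport-agnostic interface (signatures are assumptions)."""

    @abstractmethod
    def publish(self, sender_id: str, channel: str, payload: dict) -> int:
        """Store one message and return its id."""

    @abstractmethod
    def poll(self, channel: str, after_id: int = 0) -> list:
        """Return (id, sender_id, payload) tuples newer than after_id."""


class SQLiteMessageBus(MessageBus):
    """SQLite-backed transport using a trimmed-down messages table."""

    def __init__(self, db_path: str) -> None:
        self._conn = sqlite3.connect(db_path)
        self._conn.execute("PRAGMA journal_mode=WAL")
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS inter_session_messages ("
            " id INTEGER PRIMARY KEY AUTOINCREMENT,"
            " sender_id TEXT NOT NULL,"
            " channel TEXT NOT NULL,"
            " payload TEXT NOT NULL)"
        )
        self._conn.commit()

    def publish(self, sender_id, channel, payload):
        with self._conn:  # commits on success, rolls back on error
            cur = self._conn.execute(
                "INSERT INTO inter_session_messages"
                " (sender_id, channel, payload) VALUES (?, ?, ?)",
                (sender_id, channel, json.dumps(payload)),
            )
        return cur.lastrowid

    def poll(self, channel, after_id=0):
        rows = self._conn.execute(
            "SELECT id, sender_id, payload FROM inter_session_messages"
            " WHERE channel = ? AND id > ? ORDER BY id",
            (channel, after_id),
        ).fetchall()
        return [(mid, sender, json.loads(body)) for mid, sender, body in rows]


# Usage: two publishes, then poll for everything after the first message.
bus = SQLiteMessageBus(os.path.join(tempfile.mkdtemp(), "messaging.db"))
first = bus.publish("s1", "state", {"task_id": "H.8.1", "status": "working"})
bus.publish("s2", "state", {"task_id": "A.9.1", "status": "idle"})
newer = bus.poll("state", after_id=first)
```

Because callers depend only on `MessageBus`, a later transport could be added as another subclass without changing session code.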

What We Will NOT Build

  • No Motia integration (wrong architecture, ELv2 license)
  • No external message broker (Redis, NATS, RabbitMQ)
  • No daemon process (no SPOF, no new LaunchAgent)
  • No dependency on Claude Code Agent Teams (Claude-only)

Why SQLite Over File-Based JSON

Despite the industry pattern of file-based coordination (Agent Teams, Cursor, Aider), SQLite wins because CODITECT's problem is harder than theirs:

| Competitor Problem | CODITECT Problem |
|---|---|
| Coordinate 2-5 Claude sessions | Coordinate 5-10 sessions across 4+ LLM vendors |
| Same-vendor, same-protocol | Mixed vendors, mixed protocols |
| Single team lead, hierarchical | Peer-to-peer, decentralized |
| File conflicts are rare (worktrees) | File conflicts are the primary risk |

SQLite provides ACID guarantees, crash recovery, message ordering (ROWID), and concurrent access control that file-based patterns cannot. The 13-point gap (92.7% vs 79.6%) is justified by the problem complexity difference.


Key Judge Findings Incorporated

From Judge 1 (Technical Architecture)

  1. Mandatory MessageBus abstraction -- Clean separation between API and transport for cloud upgrade
  2. Poll interval configuration -- Default 250ms, configurable per deployment
  3. SQLITE_BUSY retry logic -- Exponential backoff with jitter on all write paths
  4. Message TTL and cleanup -- Automatic purging to prevent unbounded growth
  5. Benchmark gate -- Required: 10 concurrent sessions, 250ms polling, measure CPU and p99 latency
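
Item 3 (SQLITE_BUSY retry with exponential backoff and jitter) might be sketched as follows. The helper name `with_busy_retry` and its default delays are hypothetical. The sketch relies on Python's `sqlite3` module surfacing a busy database as `sqlite3.OperationalError` with a "database is locked" message.

```python
import random
import sqlite3
import time


def with_busy_retry(operation, max_attempts=5, base_delay=0.01,
                    max_delay=0.5, sleep=time.sleep):
    """Run operation(); on a locked/busy error, back off and try again."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except sqlite3.OperationalError as exc:
            message = str(exc).lower()
            retryable = "locked" in message or "busy" in message
            if not retryable or attempt == max_attempts - 1:
                raise
            # Full jitter: sleep a random time up to the exponential cap.
            cap = min(max_delay, base_delay * (2 ** attempt))
            sleep(random.uniform(0, cap))


# Usage: an operation that fails twice with a busy error, then succeeds.
attempts = {"count": 0}


def flaky_write():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise sqlite3.OperationalError("database is locked")
    return "ok"


observed_delays = []
result = with_busy_retry(flaky_write, sleep=observed_delays.append)
```

The `sleep` parameter is injectable so that tests can record delays instead of actually waiting.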

From Judge 2 (Systems Engineering)

  1. CRITICAL: Use a separate messaging.db, NOT sessions.db -- sessions.db is 18.4 GB with 80+ tables. Adding pub/sub polling would create SQLITE_BUSY contention with context extraction and tool analytics
  2. Hybrid kqueue/inotify notification -- Watch the WAL file for changes so that delivery does not wait for the next poll interval
  3. No retry logic exists in the codebase today -- Must be added to the messaging.db write paths at minimum
  4. Existing message_bus.py (RabbitMQ) is dead code -- CODITECT already tried the broker path and abandoned it

From Judge 3 (Industry Ecosystem)

  1. Agent Teams composability -- Design as participant in coordination, not competitor
  2. Cloud evolution phases: SQLite (launch) -> PostgreSQL LISTEN/NOTIFY (team features) -> optional broker (1000+ sessions)
  3. Standards watch -- Monitor MCP and A2A for inter-session relevance (12-18 months)
  4. Competitive differentiation is the capability, not the transport -- Focus engineering on multi-LLM coordination features

Architecture Overview

┌──────────────────────────────────────────────────────────────┐
│                      Developer Machine                       │
│                                                              │
│  ┌────────────┐  ┌────────────┐  ┌────────────┐              │
│  │   Claude   │  │   Codex    │  │   Gemini   │  ...more     │
│  │  Session   │  │  Session   │  │  Session   │              │
│  └─────┬──────┘  └─────┬──────┘  └─────┬──────┘              │
│        │               │               │                     │
│        ▼               ▼               ▼                     │
│  ┌─────────────────────────────────────────────┐             │
│  │           MessageBus (Python API)           │             │
│  │      publish() / subscribe() / poll()       │             │
│  └─────────────────┬───────────────────────────┘             │
│                    │                                         │
│        ┌───────────┴───────────┐                             │
│        │     messaging.db      │  ← kqueue/inotify watch     │
│        │   (WAL mode, <1 MB)   │    on WAL file changes      │
│        │                       │                             │
│        │ inter_session_messages│                             │
│        │ session_registry      │                             │
│        │ file_locks            │                             │
│        └───────────────────────┘                             │
│                                                              │
│  Existing (unchanged):                                       │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐      │
│  │platform  │  │  org.db  │  │sessions  │  │context.db│      │
│  │.db (T1)  │  │   (T2)   │  │.db (T3)  │  │(LEGACY)  │      │
│  └──────────┘  └──────────┘  └──────────┘  └──────────┘      │
└──────────────────────────────────────────────────────────────┘

Database Schema (messaging.db)

-- Session registry
CREATE TABLE session_registry (
    session_id TEXT PRIMARY KEY,
    llm_vendor TEXT NOT NULL,           -- claude, codex, gemini, kimi
    llm_model TEXT,                     -- opus-4.6, o3, 2.5-pro, k2.5
    tty TEXT,
    pid INTEGER,
    project_id TEXT,
    task_id TEXT,
    active_files TEXT,                  -- JSON array of files being edited
    heartbeat_at TEXT NOT NULL,
    registered_at TEXT NOT NULL DEFAULT (datetime('now')),
    status TEXT DEFAULT 'active'        -- active, idle, terminated
);

-- Inter-session messages
CREATE TABLE inter_session_messages (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    sender_id TEXT NOT NULL,
    channel TEXT NOT NULL,              -- state, file_conflict, task_broadcast, heartbeat
    payload TEXT NOT NULL,              -- JSON
    created_at TEXT NOT NULL DEFAULT (datetime('now')),
    ttl_seconds INTEGER DEFAULT 300,    -- 5 minute default TTL
    FOREIGN KEY (sender_id) REFERENCES session_registry(session_id)
);

-- File lock tracking
CREATE TABLE file_locks (
    file_path TEXT PRIMARY KEY,
    session_id TEXT NOT NULL,
    lock_type TEXT DEFAULT 'advisory',  -- advisory, exclusive
    locked_at TEXT NOT NULL DEFAULT (datetime('now')),
    FOREIGN KEY (session_id) REFERENCES session_registry(session_id)
);

-- Indexes
CREATE INDEX idx_messages_channel_created ON inter_session_messages(channel, created_at);
CREATE INDEX idx_messages_ttl ON inter_session_messages(created_at, ttl_seconds);
CREATE INDEX idx_registry_status ON session_registry(status);
CREATE INDEX idx_registry_heartbeat ON session_registry(heartbeat_at);
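
To illustrate how the `created_at` and `ttl_seconds` columns could drive the TTL cleanup required by Judge 1, here is a minimal Python sketch. The function name `purge_expired` is hypothetical, and the table is a trimmed copy of `inter_session_messages` with the foreign key omitted so the example stands alone.

```python
import sqlite3

# Trimmed copy of inter_session_messages (foreign key omitted for the sketch).
SCHEMA = """
CREATE TABLE IF NOT EXISTS inter_session_messages (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    sender_id TEXT NOT NULL,
    channel TEXT NOT NULL,
    payload TEXT NOT NULL,
    created_at TEXT NOT NULL DEFAULT (datetime('now')),
    ttl_seconds INTEGER DEFAULT 300
);
"""


def purge_expired(conn: sqlite3.Connection) -> int:
    """Delete messages whose created_at + ttl_seconds is not in the future."""
    with conn:
        cur = conn.execute(
            "DELETE FROM inter_session_messages"
            " WHERE datetime(created_at, '+' || ttl_seconds || ' seconds')"
            " <= datetime('now')"
        )
    return cur.rowcount


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
# One heartbeat that expired 30 seconds ago, one fresh broadcast.
conn.execute(
    "INSERT INTO inter_session_messages"
    " (sender_id, channel, payload, created_at, ttl_seconds)"
    " VALUES ('s1', 'heartbeat', '{}', datetime('now', '-60 seconds'), 30)"
)
conn.execute(
    "INSERT INTO inter_session_messages (sender_id, channel, payload)"
    " VALUES ('s1', 'task_broadcast', '{}')"
)
conn.commit()
removed = purge_expired(conn)
remaining = conn.execute(
    "SELECT channel FROM inter_session_messages"
).fetchall()
```

In this sketch the expired heartbeat is removed and the fresh broadcast survives. How often the purge runs is left open here.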

Message Channels

| Channel | Purpose | TTL | Example Payload |
|---|---|---|---|
| state | Session status broadcasts | 60s | {"task_id": "H.8.1", "status": "working"} |
| file_conflict | File edit conflict warnings | 300s | {"file": "paths.py", "sessions": ["s1", "s2"]} |
| task_broadcast | Task routing between sessions | 600s | {"task_id": "A.9.1", "action": "claim"} |
| heartbeat | Session liveness detection | 30s | {"session_id": "s1", "alive": true} |
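
One way the per-channel TTLs above could be applied when a message is built is sketched below. The TTL values come from the table; the helper `build_message` and the shape of its return value are hypothetical.

```python
import json

# Per-channel TTLs in seconds, taken from the table above.
CHANNEL_TTL_SECONDS = {
    "state": 60,
    "file_conflict": 300,
    "task_broadcast": 600,
    "heartbeat": 30,
}


def build_message(sender_id: str, channel: str, payload: dict) -> dict:
    """Return a row dict ready for insertion into inter_session_messages."""
    if channel not in CHANNEL_TTL_SECONDS:
        raise ValueError(f"unknown channel: {channel!r}")
    return {
        "sender_id": sender_id,
        "channel": channel,
        "payload": json.dumps(payload, sort_keys=True),
        "ttl_seconds": CHANNEL_TTL_SECONDS[channel],
    }


heartbeat = build_message("s1", "heartbeat", {"session_id": "s1", "alive": True})
```

Rejecting unknown channels at build time keeps typos from creating messages that no session ever polls for.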

Implementation Plan

| Phase | Task | Effort | Milestone |
|---|---|---|---|
| 1 | Create scripts/core/message_bus.py with abstraction interface | 2 days | MessageBus ABC + SQLiteMessageBus |
| 2 | Create messaging.db initialization in paths.py | 0.5 day | get_messaging_db_path() |
| 3 | Add kqueue/inotify notification watcher | 1.5 days | Near-instant push on WAL change |
| 4 | Session registration hook (PreToolUse) | 1 day | Auto-register on first tool use |
| 5 | File conflict detection | 1 day | Advisory locks + conflict channel |
| 6 | Benchmark with 10 concurrent sessions | 0.5 day | Validate <50ms p99, <5% CPU |
| 7 | Write ADR-160 | 0.5 day | Architecture documented |
| Total | | 7 days | |
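
Phase 5 (file conflict detection) could build on the `file_path` primary key of `file_locks`: a second insert for the same path fails, which reveals the conflict. The sketch below assumes a trimmed table without the foreign key, and the helper name `try_lock` is hypothetical.

```python
import sqlite3


def try_lock(conn, file_path, session_id):
    """Claim an advisory lock on file_path.

    Returns None when the lock is held by session_id after the call,
    otherwise the session_id of the conflicting holder.
    """
    try:
        with conn:
            conn.execute(
                "INSERT INTO file_locks (file_path, session_id) VALUES (?, ?)",
                (file_path, session_id),
            )
        return None
    except sqlite3.IntegrityError:
        # Primary-key collision: another row already claims this path.
        row = conn.execute(
            "SELECT session_id FROM file_locks WHERE file_path = ?",
            (file_path,),
        ).fetchone()
        holder = row[0] if row else None
        return None if holder == session_id else holder


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE file_locks ("
    " file_path TEXT PRIMARY KEY,"
    " session_id TEXT NOT NULL,"
    " lock_type TEXT DEFAULT 'advisory',"
    " locked_at TEXT NOT NULL DEFAULT (datetime('now')))"
)
first_claim = try_lock(conn, "paths.py", "s1")
second_claim = try_lock(conn, "paths.py", "s2")
```

A caller that receives a holder id could then publish a message on the file_conflict channel. The sketch does not handle the case where the holder releases the lock between the failed insert and the lookup.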

Risks and Mitigations

| Risk | Probability | Severity | Mitigation |
|---|---|---|---|
| kqueue not available (non-macOS) | Low | Medium | Use inotify on Linux; otherwise fall back to 250ms polling with the watchdog library |
| SQLite BUSY under load | Low | Medium | Separate messaging.db + exponential backoff retry |
| Message table growth | Medium | Low | TTL-based cleanup cron, 5-minute default TTL |
| Schema migration | Low | Low | messaging.db version tracked in _schema_version table |
| Cloud upgrade complexity | Medium | Medium | MessageBus abstraction enables transport swap |
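
The polling fallback from the first risk row can be illustrated with the standard library alone. The sketch below is not the kqueue/inotify watcher and does not use the watchdog library. It detects WAL changes by comparing the file's modification time and size at a fixed interval. The function names are hypothetical, and a plain file stands in for the real WAL file.

```python
import os
import tempfile
import time


def wal_signature(wal_path):
    """Return (mtime_ns, size) for the WAL file, or None if it is missing."""
    try:
        st = os.stat(wal_path)
    except FileNotFoundError:
        return None
    return (st.st_mtime_ns, st.st_size)


def wait_for_wal_change(wal_path, last_signature, interval=0.25, timeout=5.0):
    """Poll until the signature changes.

    Returns (True, new_signature) on change, or (False, last_signature)
    if nothing changed before the timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        current = wal_signature(wal_path)
        if current != last_signature:
            return True, current
        if time.monotonic() >= deadline:
            return False, last_signature
        time.sleep(interval)


# Usage: a stand-in WAL file that a "writer" appends to.
demo_wal = os.path.join(tempfile.mkdtemp(), "messaging.db-wal")
open(demo_wal, "wb").close()
before = wal_signature(demo_wal)
with open(demo_wal, "ab") as handle:
    handle.write(b"frame")
changed, after = wait_for_wal_change(demo_wal, before, interval=0.01, timeout=1.0)
```

With the default 250ms interval, a change is noticed at most about one interval late, which is the latency the kqueue/inotify path is meant to avoid.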

Deliverables Checklist

  • Evaluation matrix with 5 weighted sub-tables (34 attributes, 7 candidates)
  • 3-judge MoE panel review (unanimous agreement on ranking)
  • Final verdict document (this file)
  • ADR-160: Inter-Session Messaging Architecture
  • Session log entry with findings
  • TRACK-H update with H.13 task definitions

Documents Produced

| Document | Path |
|---|---|
| Evaluation Matrix | internal/analysis/inter-session-messaging/evaluation-matrix.md |
| Final Verdict | internal/analysis/inter-session-messaging/final-verdict.md |
| Judge 3 Analysis | internal/analysis/inter-session-messaging/judge-3-industry-ecosystem-analysis.md |
| ADR-160 | internal/architecture/adrs/ADR-160-inter-session-messaging-architecture.md (pending) |