ADR-194: Codi-Watcher v3.0 — Hash-Based Change Detection
Status
Accepted — February 13, 2026
Context
Codi-Watcher v2.0 (ADR-134) monitors LLM session files across Claude, Codex, Gemini, and KIMI. When context usage thresholds are reached, it copies session files to pending directories for later manual processing via /cx.
This architecture has three problems:
- Data redundancy: The copy-on-threshold approach creates 22GB/week of duplicate session files in cusf-archive/, proven lossless by CUSF verification (every copied file is byte-identical to the source).
- Stale context: The user must manually run /cx to extract context from copied files. Between file detection and manual extraction, the context database is stale, sometimes by hours or days.
- Complexity: The threshold system (context percentage parsing, 4 trigger types, cooldown management per trigger type) is fragile and requires parsing LLM-specific context usage formats that change between versions.
With ADR-181 (incremental /cx) now implemented, the unified message extractor can process only new/changed content via stat-based file classification and seek-based append extraction. The watcher no longer needs to copy files — it only needs to detect changes and trigger the extractor.
Decision
Replace the v2.0 threshold-based export pipeline with a v3.0 hash-based change detection system that automatically triggers incremental /cx.
Architecture Change
v2.0: Poll → Detect sessions → Check context% → Copy file → Pending dir → Manual /cx
v3.0: Poll → Discover files → Detect hash changes → Auto /cx --incremental → Log results
Key Design Decisions
1. Two-phase change detection (stat + hash)
Rather than hashing every file on every poll cycle, use a two-phase approach:
- Phase 1: stat() each file for size + mtime (one syscall, O(1) per file)
- Phase 2: SHA-256 hash only when stat differs from stored values
This makes the common case (no changes) extremely fast (~0.5-1s for a full poll) while still providing cryptographic certainty when changes are detected.
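The two phases can be sketched as follows. This is a minimal illustration, not the code in monitor.rs: `FileRecord`, `Change`, and `check_file` are hypothetical names, and the hash function is passed in as a closure so the sketch needs only the standard library.

```rust
use std::collections::HashMap;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::time::SystemTime;

/// Stored per-file record (hypothetical layout, not the watcher's actual state schema).
#[derive(Clone, Debug, PartialEq)]
pub struct FileRecord {
    pub size: u64,
    pub mtime: SystemTime,
    pub hash: String,
}

/// Outcome of checking one file.
#[derive(Debug, PartialEq)]
pub enum Change {
    /// Size and mtime matched the stored record; no hash was computed.
    Unchanged,
    /// Stat differed but the content hash is the same (for example, a touch).
    TouchedOnly,
    /// The file is new or its content hash differs.
    Modified,
}

/// Checks one file and updates `known` when the file had to be hashed.
pub fn check_file<F>(
    path: &Path,
    known: &mut HashMap<PathBuf, FileRecord>,
    hash_file: F,
) -> io::Result<Change>
where
    F: Fn(&Path) -> io::Result<String>,
{
    // Phase 1: a single metadata lookup gives size and mtime.
    let meta = fs::metadata(path)?;
    let (size, mtime) = (meta.len(), meta.modified()?);

    if let Some(prev) = known.get(path) {
        if prev.size == size && prev.mtime == mtime {
            return Ok(Change::Unchanged);
        }
    }

    // Phase 2: hash only because stat differed or the file is new.
    let hash = hash_file(path)?;
    let change = match known.get(path) {
        Some(prev) if prev.hash == hash => Change::TouchedOnly,
        _ => Change::Modified,
    };
    known.insert(path.to_path_buf(), FileRecord { size, mtime, hash });
    Ok(change)
}
```

In this sketch a file whose stat values are unchanged is never hashed, which is what keeps the no-change poll cheap.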
2. SHA-256 in-process (not shell-out)
Use the sha2 Rust crate for streaming hash computation in 64KB chunks. This avoids process spawn overhead per file and is consistent with ADR-182 (file integrity) which also uses SHA-256.
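A rough sketch of the 64KB streaming loop is shown below. To stay dependency-free it does not call sha2 itself: `feed_chunks` is a hypothetical helper that hands each chunk to a caller-supplied closure. With the sha2 crate, that closure would typically forward the chunk to `Sha256::update` and the caller would then call `finalize`.

```rust
use std::io::{self, Read};

/// Chunk size named in the ADR: 64KB.
pub const CHUNK_SIZE: usize = 64 * 1024;

/// Reads `reader` to the end in chunks of at most CHUNK_SIZE bytes,
/// passing each chunk to `sink`. Returns the total number of bytes read.
/// (Hypothetical helper; in the watcher the sink would update a SHA-256 hasher.)
pub fn feed_chunks<R: Read, F: FnMut(&[u8])>(mut reader: R, mut sink: F) -> io::Result<u64> {
    let mut buf = vec![0u8; CHUNK_SIZE];
    let mut total = 0u64;
    loop {
        match reader.read(&mut buf) {
            // A zero-length read signals end of input.
            Ok(0) => return Ok(total),
            Ok(n) => {
                sink(&buf[..n]);
                total += n as u64;
            }
            // Retry reads interrupted by a signal.
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
}
```

Because only one buffer is allocated, memory use stays constant regardless of session file size.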
3. Shell-out to unified-message-extractor.py for /cx
The extractor is 3,000+ lines of battle-tested Python with SQLite integration, dedup stores, knowledge extraction, trajectory extraction, and MCP reindexing. Reimplementing in Rust would be massive scope creep. The watcher's job is detection + triggering, not extraction.
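The shell-out itself might look roughly like the sketch below. The interpreter name (`python3`) and the extractor arguments (`--incremental`, `--llm`) are assumptions made for illustration; the ADR does not specify the extractor's command-line interface. `build_cx_command` and `run_cx` are hypothetical names.

```rust
use std::io;
use std::path::Path;
use std::process::Command;

/// Builds (but does not run) an extractor invocation for one LLM.
/// ASSUMPTION: the interpreter and both flags are illustrative, not the real CLI.
pub fn build_cx_command(extractor: &Path, llm: &str) -> Command {
    let mut cmd = Command::new("python3");
    cmd.arg(extractor).arg("--incremental").arg("--llm").arg(llm);
    cmd
}

/// Runs the command and reports whether it exited successfully.
/// With `dry_run` set, nothing is spawned and success is reported.
pub fn run_cx(mut cmd: Command, dry_run: bool) -> io::Result<bool> {
    if dry_run {
        return Ok(true);
    }
    // status() waits for the child process and returns its exit status.
    Ok(cmd.status()?.success())
}
```

Keeping command construction separate from execution makes the trigger path easy to test without spawning Python.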
4. Per-LLM cooldown (not per-trigger-type)
v2.0 had cooldowns per trigger type (context%, size, time, turns). v3.0 simplifies to one cooldown per LLM: after /cx fires for an LLM, don't re-trigger within cooldown_seconds (default: 60s). This prevents /cx storms during rapid file changes.
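A per-LLM cooldown of this kind can be sketched with a map from LLM name to last trigger time. `CooldownGate` and `try_fire` are hypothetical names, not taken from trigger.rs; the current time is passed in explicitly so the behavior is easy to test.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// One cooldown per LLM (hypothetical type for illustration).
pub struct CooldownGate {
    cooldown: Duration,
    last_fired: HashMap<String, Instant>,
}

impl CooldownGate {
    /// `cooldown_seconds` mirrors the config value (default 60 in the ADR).
    pub fn new(cooldown_seconds: u64) -> Self {
        Self {
            cooldown: Duration::from_secs(cooldown_seconds),
            last_fired: HashMap::new(),
        }
    }

    /// Returns true and records `now` if `llm` may trigger /cx;
    /// returns false if its previous trigger is still inside the cooldown.
    pub fn try_fire(&mut self, llm: &str, now: Instant) -> bool {
        if let Some(&last) = self.last_fired.get(llm) {
            if now.duration_since(last) < self.cooldown {
                return false;
            }
        }
        self.last_fired.insert(llm.to_string(), now);
        true
    }
}
```

Each LLM is gated independently, so a burst of changes in one LLM's session files does not delay a trigger for another.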
5. Config schema v2.0.0 with backward compatibility
The new config replaces thresholds, triggers, and export blocks with a single trigger block. Old v1.0.0 config files are silently accepted (legacy fields ignored, defaults used for new fields).
6. State v3.0.0 with migration
State tracks per-file hashes instead of session cooldowns and export history. v1→v3 and v2→v3 migration paths exist for seamless upgrade.
Consequences
Positive
- Zero data redundancy: No file copies. 22GB/week savings.
- Near real-time context: Changes detected within poll interval (30s default), /cx triggered automatically. Context DB freshness goes from hours/days to minutes.
- Simpler architecture: No threshold parsing, no trigger types, no export pipeline. Just hash → detect → trigger.
- Faster common case: Poll cycle with no changes: ~0.5-1s (stat only) vs ~2-5s (stat + context% parse).
Negative
- Shell-out dependency: Relies on unified-message-extractor.py being available and working. If the script breaks, auto-/cx breaks.
- SHA-256 cost on change: Hashing a 50MB session file takes ~200ms. Acceptable for the uncommon case but adds latency to the trigger path.
Neutral
- CLI simplified: Removed --threshold, --max-threshold, --multi-llm flags. Added --dry-run, --once, --force-cx, --status.
- Dead code warnings: 29 Rust dead_code warnings remain for backward-compat legacy fields and future-use methods. These are intentional and expected.
Files
| File | Lines | Action |
|---|---|---|
| tools/context-watcher/Cargo.toml | 49 | Modified (version bump, sha2 dep) |
| tools/context-watcher/src/main.rs | 580 | Rewritten |
| tools/context-watcher/src/trigger.rs | 487 | New (replaces export.rs) |
| tools/context-watcher/src/monitor.rs | 815 | Rewritten |
| tools/context-watcher/src/state.rs | 700 | Rewritten |
| tools/context-watcher/src/config.rs | 541 | Rewritten |
| tools/context-watcher/src/detection.rs | 619 | Minor updates |
| tools/context-watcher/src/paths.rs | 195 | Unchanged |
| config/llm-watchers.json | 94 | Modified (v2.0.0 schema) |
Deleted:
- src/context_watcher.rs: Legacy single-LLM mode, fully superseded
- src/export.rs: Threshold-based export pipeline, replaced by trigger.rs
Verification
- cargo build: Compiles clean (29 dead_code warnings, 0 errors)
- cargo test: 25/25 tests pass (config: 5, detection: 2, monitor: 4, state: 7, trigger: 5, paths: 3)
Track
J.13: Codi-Watcher v3.0 — Hash-Based Change Detection