ADR-149: Query Language Evolution Strategy
Status
ACCEPTED (2026-02-03)
Context
Problem Statement
The /cxq command has grown to support 94 tables across 4 databases with 100+ command flags. As capabilities expand, the current approach faces limitations:
| Gap | Example Need | Current Limitation |
|---|---|---|
| Complex Filters | "Decisions about APIs made in last 7 days with confidence > 0.8" | Requires chained flags or raw SQL |
| Cross-Table Joins | "Errors that occurred during tasks with low success scores" | No cross-table filtering |
| Aggregations | "Token cost by model by week" | Limited to pre-built views |
| Temporal Patterns | "Sessions where tool failures increased over time" | No trend detection |
| Graph Traversal | "All components that invoke security agents" | No graph query syntax |
| Pipeline | "Find errors → extract patterns → suggest fixes" | Multi-step manual process |
Current Query Capabilities
The /cxq command supports:
- FTS5 Full-Text Search:
word1 word2,word1 OR word2,"phrase",word*,NEAR(a b, 5),-word - Filter Flags:
--role,--llm,--today,--since,--session - Knowledge Queries:
--decisions,--patterns,--errors,--learnings - Analytics:
--tokens,--cost,--tools,--traces - Semantic Search:
--semantic,--recall - Multi-Database:
--multi-db,--scope-framework,--scope-project - Views: 18 pre-built database views
Options Considered
Option A: Continue Flag-Based Expansion
Add more --where, --group-by, --join flags.
Pros: Familiar, backward compatible Cons: Combinatorial explosion of flags, hard to compose
Option B: Full Domain-Specific Language (DSL)
Create CODITECT Query Language (CQL):
FROM decisions d
JOIN knowledge_graph kg ON d.id = kg.source_id
WHERE d.decision_type = 'architecture'
AND d.confidence > 0.7
ORDER BY d.created_at DESC
LIMIT 10
Pros: Powerful, expressive Cons: Learning curve, implementation complexity
Option C: Progressive Enhancement (Recommended)
Three-phase evolution:
- Phase 1: Enhanced flag syntax (immediate)
- Phase 2: Query templates/presets (medium-term)
- Phase 3: Optional DSL (if needed)
Pros: Incremental value, low risk, preserves compatibility Cons: May still need full DSL eventually
Decision
Implement Option C: Progressive Enhancement Strategy with three phases.
Phase 1: Enhanced Flag Syntax (Immediate - Q1 2026)
Add structured filtering without full DSL:
# Compound filters with --where
/cxq --decisions --where "confidence > 0.8 AND decision_type = 'api'" --since 7d
# Aggregation flags
/cxq --cost --group-by model,week --sort cost_desc
# Graph traversal shortcuts
/cxq --component security-specialist --traversal invokes:2 # 2 hops
# Field-specific search
/cxq --field "role:assistant,content:error"
# Cross-database correlation
/cxq --correlate "decisions,messages" --by session_id
Implementation:
- Add
--whereflag with simple expression parser - Add
--group-by,--sortflags for aggregations - Add
--traversalflag for knowledge graph - No new dependencies, pure Python
Phase 2: Query Templates (Medium-term - Q2 2026)
Create reusable query definitions:
# ~/.coditect-data/queries/high-cost-sessions.yaml
name: high-cost-sessions
description: Sessions with unusual token costs
version: 1.0.0
author: hal@az1.ai
query:
from: token_economics
join:
- table: messages
on: session_id
where:
- "cost_total_usd > 1.0"
group_by: session_id
order_by: cost_total_usd DESC
limit: 20
output:
format: table
columns:
- session_id
- total_cost
- message_count
Usage:
# Execute saved query
/cxq --query high-cost-sessions
# List available queries
/cxq --queries
# Create query interactively
/cxq --create-query
# Share query (export to clipboard)
/cxq --export-query high-cost-sessions
Implementation:
- YAML query definition schema
- Query template registry in org.db
- Validation against database schema
Phase 3: Optional DSL (Future - if needed)
If complexity warrants, introduce CODITECT Query Language (CQL):
-- CQL Example
FROM decisions d
LEFT JOIN knowledge_graph kg ON d.id = kg.source_id
WHERE d.decision_type IN ('architecture', 'api')
AND d.confidence > @threshold
AND d.created_at > @recent(7d)
AND kg.relationship = 'affects'
ORDER BY d.confidence DESC, d.created_at DESC
LIMIT @limit
-- Variables
@threshold = 0.7
@limit = 20
CQL Design Principles:
- SQL-Compatible Core: Leverage existing SQLite knowledge
- Domain Shorthands:
@recent(7d),@high-confidence,@my-decisions - Graph Extensions:
TRAVERSE component -> invokes -> 2 hops - Temporal Functions:
TREND(cost_usd, daily, last_30d) - Natural Language Bridge: "decisions about APIs" → CQL
LLM-Assisted Query Generation:
# Natural language to CQL
/cxq "find all architecture decisions that affected the authentication system"
# → Generates CQL, shows for approval, then executes
Decision Criteria for Phase 3
Move to Phase 3 (full DSL) if:
- Query template usage exceeds 50 unique templates
- Users frequently request joins across 3+ tables
- Complex temporal analysis becomes common
- Graph traversal depth regularly exceeds 2 hops
Implementation
Phase 1 Tasks (J.4.7)
| Task | Description | Priority |
|---|---|---|
| J.4.7.1 | Add --where flag with expression parser | P1 |
| J.4.7.2 | Add --group-by and --sort flags | P1 |
| J.4.7.3 | Add --traversal for knowledge graph | P2 |
| J.4.7.4 | Add --field for field-specific search | P2 |
| J.4.7.5 | Add --correlate for cross-table queries | P3 |
| J.4.7.6 | Update cxq.md documentation | P1 |
Phase 2 Tasks (J.4.8)
| Task | Description | Priority |
|---|---|---|
| J.4.8.1 | Create query template YAML schema | P1 |
| J.4.8.2 | Implement query template registry | P1 |
| J.4.8.3 | Add --query, --queries commands | P1 |
| J.4.8.4 | Create 20 standard query templates | P2 |
| J.4.8.5 | Add --create-query interactive mode | P3 |
Standard Query Templates (Phase 2)
| Template | Purpose |
|---|---|
daily-cost-report | Token costs by model for day |
weekly-activity | Messages, decisions, patterns by day |
error-frequency | Most common errors with solutions |
high-cost-sessions | Sessions exceeding cost threshold |
stuck-tasks | Tasks with high retry counts |
model-comparison | Cost/performance by LLM model |
component-usage | Most invoked agents/skills |
decision-impact | Decisions affecting most components |
learning-gaps | Skills with low effectiveness |
session-health | Sessions with high error rates |
Consequences
Positive
- Incremental value - Phase 1 delivers immediate improvements
- Low risk - No breaking changes to existing commands
- User feedback - Phase 2 templates capture common patterns
- Deferred complexity - Full DSL only if proven necessary
- LLM compatibility - Natural language can generate queries
Negative
- Multiple syntaxes - Users may need to learn flags + templates + (optionally) CQL
- Migration - Complex queries may need rewriting as capabilities evolve
Mitigations
- Document clear guidance on when to use each approach
- Provide migration tools between query formats
- LLM assistance for query generation reduces learning curve
Query Capability Matrix
| Capability | Current | Phase 1 | Phase 2 | Phase 3 |
|---|---|---|---|---|
| FTS5 search | Yes | Yes | Yes | Yes |
| Role filtering | Yes | Yes | Yes | Yes |
| Date filtering | Yes | Yes | Yes | Yes |
| LLM filtering | Yes | Yes | Yes | Yes |
| Semantic search | Yes | Yes | Yes | Yes |
| Compound WHERE | No | Yes | Yes | Yes |
| Aggregations | Limited | Yes | Yes | Yes |
| Graph traversal | No | Basic | Yes | Full |
| Saved queries | No | No | Yes | Yes |
| Cross-table joins | No | Basic | Yes | Full |
| Temporal analysis | No | No | Limited | Full |
| Natural language | No | No | No | Yes |
Related
- ADR-118: Four-Tier Database Architecture
- ADR-020: Context Extraction System (
/cx) - ADR-021: Context Query System (
/cxq) - ADR-148: Database Schema Documentation Standard
Changelog
| Date | Change |
|---|---|
| 2026-02-03 | Initial version - three-phase strategy |
Track: J.4 (Enhanced /cxq Query System) Tasks: J.4.7, J.4.8