ADR-080: MCP Semantic Search with Hybrid RRF Fusion
Document: ADR-080-mcp-semantic-search-hybrid-rrf
Version: 1.0.0
Purpose: Document architectural decision for MCP-based semantic search with hybrid RRF fusion
Audience: Framework contributors, AI researchers, system architects
Date Created: 2026-01-17
Last Updated: 2026-01-17
Status: ACCEPTED
Task ID: H.5.5.1
Related ADRs: ADR-020 (Context Extraction), ADR-021 (Context Query), ADR-003 (ChromaDB)
Related Documents:
- tools/mcp-semantic-search/server.py
- tools/mcp-semantic-search/CLAUDE.md
- tests/tools/test_mcp_semantic_search.py
Context and Problem Statement
Background
The CODITECT framework has existing semantic search capabilities implemented in scripts/context-db.py (documented in ADR-021). However, these capabilities are:
- CLI-bound: Only accessible via the /cxq command, not programmatically by AI agents
- Single-mode: FTS5 and semantic search operate independently, not combined
- Not MCP-enabled: Cannot be used as MCP tools by Claude Code or other MCP clients
Problem Statement
How do we expose semantic search capabilities to AI agents via MCP while improving search quality through hybrid fusion algorithms?
Requirements
Must-Have:
- MCP protocol compliance for tool integration
- Combine FTS5 and vector search for better results
- Use existing infrastructure (context.db, embeddings)
- Sub-second query latency for interactive use
Should-Have:
- Configurable fusion weights
- Multiple search modes (hybrid, semantic-only, keyword-only)
- Decision and error knowledge base search
Nice-to-Have:
- Streaming results
- Search analytics
- Result caching
Decision Drivers
Technical Constraints
- T1: Must use existing SQLite database (context.db)
- T2: Must use existing embeddings (all-MiniLM-L6-v2, 384 dimensions)
- T3: Must be deployable without additional infrastructure
- T4: Must support Claude Code MCP integration
Performance Goals
- P1: Query latency <500ms for interactive use
- P2: Handle 143K+ messages corpus efficiently
- P3: Minimal memory footprint (<200MB)
User Experience
- UX1: Consistent results across search modes
- UX2: Relevance ranking that makes sense to users
- UX3: Easy integration with existing workflows
Considered Options
Option A: ChromaDB Migration (Full Rewrite)
Description: Migrate from SQLite to ChromaDB, a purpose-built vector database.
Pros:
- Native vector operations optimized for similarity search
- Built-in HNSW indexing for scalability
- Growing ecosystem and community
Cons:
- Requires data migration from SQLite
- Loses FTS5 capabilities (weaker text search)
- Additional dependency and infrastructure
- Significant implementation effort (16-24 hours)
Estimated Effort: 16-24 hours
Rejected: Too disruptive, loses FTS5 strengths, high effort.
Option B: Enhanced Hybrid Search (RRF Fusion) - SELECTED
Description: Add MCP server that combines existing FTS5 and vector search using Reciprocal Rank Fusion (RRF).
Pros:
- Uses existing infrastructure (no migration)
- Combines strengths of both search methods
- RRF is proven algorithm (used by Elastic, Cohere)
- Moderate implementation effort (6-8 hours)
- MCP-native for AI agent access
Cons:
- Brute-force vector search (no ANN indexing)
- Limited to current scale (~150K messages)
Estimated Effort: 6-8 hours
Selected: Best value/effort ratio, builds on existing infrastructure.
Option C: Full AST-Based Chunking + Dedicated Vector Store
Description: Rewrite context extraction with AST-based chunking for code-aware semantic search, plus dedicated vector store.
Pros:
- Optimal semantic chunks for code (functions, classes, methods)
- Improved code search relevance
- Future-proof architecture
Cons:
- Requires rewriting extraction pipeline
- Language-specific AST parsers needed
- Significant implementation effort (16-24 hours)
- Breaking change to existing data
Estimated Effort: 16-24 hours
Deferred: Good future direction (H.5.5.2-H.5.5.4), but too much scope for H.5.5.1.
Decision Outcome
CHOSEN: Option B - Enhanced Hybrid Search with RRF Fusion
Rationale
- Value vs Effort: Option B delivers 80% of the value at 30% of the effort
- Incremental: Builds on existing infrastructure without migration risk
- MCP-First: Designed for AI agent access from the start
- Proven Algorithm: RRF is battle-tested in production systems
Future Path
Option B enables Option C as a future enhancement:
- H.5.5.2: Add call graph navigation tools
- H.5.5.3: Create impact analysis MCP tool
- H.5.5.4: Add document RAG integration (potential AST chunking)
Technical Implementation
Architecture
┌─────────────────────────────────────────────────────────────────┐
│ MCP SEMANTIC SEARCH SERVER │
├─────────────────────────────────────────────────────────────────┤
│ │
│ MCP Client (Claude Code) │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────┐│
│ │ MCP Tools ││
│ │ • hybrid_search - FTS5 + Vector with RRF fusion ││
│ │ • semantic_search - Vector similarity only ││
│ │ • keyword_search - FTS5 only ││
│ │ • search_decisions - Decision knowledge base ││
│ │ • search_errors - Error-solution pairs ││
│ │ • context_stats - Database statistics ││
│ └─────────────────────────────────────────────────────────────┘│
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────┐│
│ │ Reciprocal Rank Fusion (RRF) ││
│ │ ││
│ │ RRF_score = Σ (weight_i / (k + rank_i)) ││
│ │ ││
│ │ k = 60 (standard constant) ││
│ │ Default weights: FTS=0.4, Vector=0.6 ││
│ └─────────────────────────────────────────────────────────────┘│
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────┐│
│ │ Search Backends ││
│ │ ││
│ │ FTS5 Search Vector Search ││
│ │ ┌────────────┐ ┌────────────┐ ││
│ │ │ messages │ │ embeddings │ ││
│ │ │ _fts │ │ table │ ││
│ │ └────────────┘ └────────────┘ ││
│ │ │ │ ││
│ │ └─────────┬───────────┘ ││
│ │ │ ││
│ │ ▼ ││
│ │ context.db ││
│ └─────────────────────────────────────────────────────────────┘│
│ │
└─────────────────────────────────────────────────────────────────┘
Reciprocal Rank Fusion (RRF) Algorithm
RRF combines ranked lists from multiple search methods into a single ranking:
from typing import Dict, List

def rrf_fusion(
    fts_results: List[Dict],
    vector_results: List[Dict],
    k: int = 60,
    fts_weight: float = 0.4,
    vector_weight: float = 0.6
) -> List[Dict]:
    """
    Reciprocal Rank Fusion algorithm.
    RRF_score(d) = Σ (weight_i / (k + rank_i(d)))
    Args:
        fts_results: Ranked results from FTS5 search
        vector_results: Ranked results from vector search
        k: RRF constant (default 60, standard value)
        fts_weight: Weight for FTS5 results (0.0-1.0)
        vector_weight: Weight for vector results (0.0-1.0)
    Returns:
        Fused results sorted by combined RRF score
    """
    scores = {}
    # Add FTS5 contributions
    for rank, result in enumerate(fts_results, start=1):
        doc_id = result['id']
        scores[doc_id] = scores.get(doc_id, 0) + fts_weight / (k + rank)
    # Add vector contributions
    for rank, result in enumerate(vector_results, start=1):
        doc_id = result['id']
        scores[doc_id] = scores.get(doc_id, 0) + vector_weight / (k + rank)
    # Sort by combined score (loop variables renamed to avoid shadowing the RRF constant k)
    return sorted(
        [{'id': doc_id, 'rrf_score': score} for doc_id, score in scores.items()],
        key=lambda x: x['rrf_score'],
        reverse=True
    )
Why k=60?
- Standard value used by Elastic, Cohere, and academic research
- Prevents top-ranked results from dominating (rank 1 contributes 1/61, not 1/1)
- Empirically validated across many retrieval tasks
Default Weights (FTS=0.4, Vector=0.6):
- Vector search better for semantic/conceptual queries
- FTS5 better for exact matches and technical terms
- 60/40 split favors semantic while preserving keyword precision
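To make the scoring concrete, here is a small worked example with the default constant and weights (document ranks are illustrative):

```python
# Worked RRF example with k=60 and the default weights.
k = 60
fts_weight, vector_weight = 0.4, 0.6

# Document A: rank 1 in FTS5 results, rank 3 in vector results
score_a = fts_weight / (k + 1) + vector_weight / (k + 3)

# Document B: rank 5 in vector results only (absent from FTS5 results)
score_b = vector_weight / (k + 5)

print(round(score_a, 6))  # 0.016081
print(round(score_b, 6))  # 0.009231
```

Note that a document found by both backends outranks one found by a single backend even at a slightly worse rank, which is exactly the behavior hybrid fusion is meant to produce.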
MCP Tool Definitions
The tool stubs below assume `from typing import Any, Dict, List, Optional` and the server's `mcp_tool` registration decorator:
@mcp_tool
def hybrid_search(
    query: str,
    limit: int = 20,
    fts_weight: float = 0.4,
    vector_weight: float = 0.6,
    vector_threshold: float = 0.3
) -> List[Dict[str, Any]]:
    """
    Hybrid search combining FTS5 keyword search and vector similarity.
    Uses Reciprocal Rank Fusion (RRF) to combine rankings.
    Best for general queries where you want both exact matches and semantic similarity.
    """

@mcp_tool
def semantic_search(
    query: str,
    limit: int = 20,
    threshold: float = 0.3
) -> List[Dict[str, Any]]:
    """
    Pure vector similarity search using embeddings.
    Best for conceptual queries where exact keywords may not be present.
    """

@mcp_tool
def keyword_search(
    query: str,
    limit: int = 20
) -> List[Dict[str, Any]]:
    """
    Pure FTS5 full-text search.
    Best for exact term matching, error messages, or technical identifiers.
    """

@mcp_tool
def search_decisions(
    query: str,
    limit: int = 10,
    decision_type: Optional[str] = None
) -> List[Dict[str, Any]]:
    """
    Search the decision knowledge base.
    Returns architectural and technical decisions with rationale.
    """

@mcp_tool
def search_errors(
    query: str,
    limit: int = 10
) -> List[Dict[str, Any]]:
    """
    Search the error-solution knowledge base.
    Returns error signatures and their verified solutions.
    """

@mcp_tool
def context_stats() -> Dict[str, Any]:
    """
    Get database statistics.
    Returns message counts, embedding coverage, knowledge base sizes.
    """
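For reference, an MCP client invokes one of these tools via a JSON-RPC `tools/call` request, per the Model Context Protocol specification (argument values below are illustrative):

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "hybrid_search",
    "arguments": {
      "query": "authentication error",
      "limit": 10,
      "fts_weight": 0.4,
      "vector_weight": 0.6
    }
  }
}
```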
Configuration
MCP Server Registration (.mcp.json):
{
  "mcpServers": {
    "coditect-semantic-search": {
      "command": "python3",
      "args": [
        "/Users/halcasteel/.coditect/tools/mcp-semantic-search/server.py",
        "--mcp"
      ],
      "env": {
        "CONTEXT_DB_PATH": "~/PROJECTS/.coditect-data/context-storage/context.db"
      }
    }
  }
}
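One caveat with the `env` block above: a tilde in `CONTEXT_DB_PATH` is typically passed through literally (tilde expansion is a shell feature), so the server has to expand it itself. A sketch of that handling (the actual server.py may already do this):

```python
import os

# "~" in an environment variable arrives as a literal character, so expand it
# before opening the database. The default mirrors the path in .mcp.json above.
db_path = os.path.expanduser(
    os.environ.get(
        "CONTEXT_DB_PATH",
        "~/PROJECTS/.coditect-data/context-storage/context.db",
    )
)
assert "~" not in db_path  # now safe to pass to sqlite3.connect(db_path)
```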
CLI Usage (non-MCP):
# Hybrid search (default)
python3 server.py "authentication error" --limit 10
# Semantic only
python3 server.py "authentication error" --mode semantic
# Keyword only
python3 server.py "authentication error" --mode keyword
# Custom weights
python3 server.py "authentication error" --fts-weight 0.3 --vector-weight 0.7
Performance Characteristics
Query Latency (143,743 embedded messages, 99.99% embedding coverage):
| Search Mode | Latency | Notes |
|---|---|---|
| FTS5 only | <50ms | SQLite native |
| Vector only | ~200ms | Brute-force cosine similarity |
| Hybrid (RRF) | ~300ms | Both + fusion |
Memory Usage:
- Embedding model: ~90MB (all-MiniLM-L6-v2)
- Query processing: <50MB
- Peak: ~150MB during embedding generation
Embedding Coverage:
- Messages: 143,743 / 143,750 (99.99%)
- Generated in: ~4 hours (batch processing)
Consequences
Positive Consequences
P1: MCP Integration
- AI agents can now access semantic search programmatically
- Claude Code can use search results for context injection
- Enables autonomous knowledge retrieval
P2: Improved Search Quality
- Hybrid search combines keyword precision + semantic recall
- RRF fusion is proven algorithm with predictable behavior
- Configurable weights allow tuning for specific use cases
P3: Minimal Disruption
- Uses existing database and embeddings
- No data migration required
- Existing /cxq command remains functional
P4: Foundation for Future Work
- H.5.5.2-H.5.5.4 can build on this infrastructure
- Option C (AST chunking) remains viable future enhancement
- MCP pattern can be applied to other tools
Negative Consequences
N1: Brute-Force Vector Search
- O(n) complexity for vector similarity
- Acceptable at current scale (143K messages)
- May need ANN indexing at 500K+ messages
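The O(n) cost comes from scoring every stored embedding against the query vector. A minimal pure-Python sketch of the brute-force scan (the production server presumably batches and vectorizes this; names and the 3-dimensional toy vectors are illustrative, real embeddings are 384-dimensional):

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def brute_force_search(query_vec, embeddings, threshold=0.3, limit=20):
    # O(n) scan: score every stored embedding, filter by threshold,
    # sort descending, truncate to limit
    scored = [
        (msg_id, cosine(query_vec, vec))
        for msg_id, vec in embeddings.items()
    ]
    hits = [(m, s) for m, s in scored if s >= threshold]
    hits.sort(key=lambda x: x[1], reverse=True)
    return hits[:limit]

# Toy corpus: m3 is orthogonal to the query and falls below the threshold
emb = {"m1": [1.0, 0.0, 0.0], "m2": [0.9, 0.1, 0.0], "m3": [0.0, 1.0, 0.0]}
print(brute_force_search([1.0, 0.0, 0.0], emb, limit=2))
```

At 143K rows this scan stays within the ~200ms budget; an ANN index (e.g. HNSW) would replace the linear scan once the corpus outgrows that.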
N2: No Real-Time Updates
- Embeddings generated in batch
- New messages require re-running /cx --with-embeddings
- Acceptable for session-based workflow
N3: Single Model Dependency
- Tied to all-MiniLM-L6-v2 embeddings
- Model upgrade requires regenerating all embeddings
- Mitigated by batch regeneration capability
Option B vs Option C: Value Analysis
| Dimension | Option B (Selected) | Option C (Deferred) |
|---|---|---|
| Implementation Effort | 6-8 hours | 16-24 hours |
| Value Delivered | 80% | 100% |
| Risk | Low (uses existing infra) | Medium (migration required) |
| Breaking Changes | None | Yes (extraction pipeline) |
| MCP Integration | Yes | Yes |
| Code-Aware Search | No | Yes (AST chunking) |
| Scalability | ~500K messages | Millions |
| Time to Value | Immediate | 2-3 weeks |
Conclusion: Option B is the right choice for H.5.5.1. Option C features can be added incrementally in H.5.5.2-H.5.5.4 without disrupting the foundation.
Competitive Analysis (January 2026)
Market Landscape
Research conducted January 2026 reveals that AST-based code analysis and call graphs are now table stakes in the AI coding assistant market.
| Tool | AST Analysis | Call Graph | Impact Analysis | Memory | License |
|---|---|---|---|---|---|
| Code Pathfinder | Yes (tree-sitter) | Yes | Partial | No | AGPL-3.0 |
| Cursor | Yes (tree-sitter) | Limited | No | No | Proprietary |
| JetBrains AI | Yes (native) | Yes | Yes | Limited | Proprietary |
| Augment Code | Yes | Yes | Limited | Limited | Proprietary |
| CODITECT | Planned | Planned | Planned | Unique | Proprietary |
Code Pathfinder (Direct Competitor)
Repository: github.com/shivasurya/code-pathfinder
Code Pathfinder is an open-source (AGPL-3.0) MCP server offering:
- Tree-sitter AST parsing (Java, Python, Dockerfile)
- Call graph generation with 5-pass analysis
- ANTLR-based query DSL
- CI/CD integration with SARIF output
Key Technical Decisions (to learn from):
- 5 parallel workers for file parsing
- SHA-256 node IDs for determinism
- Lazy loading with byte offsets (reduces memory from 2.32GB to 2.18GB)
- Object pooling for environment maps
CODITECT's True Moat
What competitors have (table stakes):
- AST parsing
- Call graph navigation
- Basic code search
What CODITECT NOW has (unique):
- Cross-session memory (ADR-020, ADR-021)
- Decision tracking with rationale
- Error-solution knowledge base
- Hybrid RRF semantic search (H.5.5.1 - this ADR)
- Memory-linked call graph (H.5.5.2 - COMPLETE)
- AST parsing via tree-sitter (Python, JavaScript, TypeScript)
The unique combination:
╔═══════════════════════════════════════════════════════════════════════════╗
║ CODITECT's True Moat ║
╠═══════════════════════════════════════════════════════════════════════════╣
║ Competitors (Cursor, JetBrains, Code Pathfinder): ║
║ ✓ AST parsing ← Table stakes ║
║ ✓ Call graph ← Table stakes ║
║ ✗ Cross-session memory ← Missing ║
║ ✗ Decision awareness ← Missing ║
╠═══════════════════════════════════════════════════════════════════════════╣
║ CODITECT: ║
║ ✓ AST parsing ← Now implemented (tree-sitter) ║
║ ✓ Call graph ← Now implemented (H.5.5.2) ║
║ ✓ Cross-session memory ← Unique (memory_linked_search) ║
║ ✓ Decision awareness ← Unique (ADR integration) ║
╚═══════════════════════════════════════════════════════════════════════════╝
No competitor can answer:
- "What did I change last session that might cause this error?"
- "Which architectural decisions constrain this refactoring?"
- "Show me all times I've fixed this type of error"
- "When was this function last discussed and what decisions affected it?" (H.5.5.2)
Strategic Implications for H.5.5.x
| Task | Original Goal | Revised Goal (Memory-Aware) | Status |
|---|---|---|---|
| H.5.5.1 | Semantic search | Hybrid RRF fusion | ✅ COMPLETE |
| H.5.5.2 | Call graph navigation | Memory-linked call graph | ✅ COMPLETE |
| H.5.5.3 | Impact analysis | Decision-aware impact | Pending |
| H.5.5.4 | Document RAG | Cross-session doc retrieval | Pending |
H.5.5.2 Implementation Details (Completed Jan 17, 2026)
MCP Call Graph Server: tools/mcp-call-graph/server.py (700+ lines)
| Tool | Description |
|---|---|
| index_file | Index source file into call graph |
| index_directory | Batch index directory |
| get_callers | Find functions that call a given function |
| get_callees | Find functions called by a given function |
| call_chain | Find call paths between functions |
| memory_linked_search | CODITECT UNIQUE - call graph with memory context |
| call_graph_stats | Database statistics |
Current Index Stats:
- Functions: 5,590
- Call Edges: 55,548
- Files Indexed: 441
Database Schema:
call_graph_functions -- Functions with signatures, docstrings
call_graph_edges -- Call relationships
call_graph_memory -- Links to session messages (CODITECT UNIQUE)
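The three-table layout can be sketched as DDL. The column set below is an assumption for illustration (the authoritative schema lives in tools/mcp-call-graph/server.py); only the table names and their roles come from this ADR:

```python
import sqlite3

# Hypothetical column layout; treat everything except the table names as a sketch.
SCHEMA = """
CREATE TABLE IF NOT EXISTS call_graph_functions (
    node_id    TEXT PRIMARY KEY,  -- deterministic ID, upsert key for reindexing
    name       TEXT NOT NULL,
    file_path  TEXT NOT NULL,
    signature  TEXT,
    docstring  TEXT
);
CREATE TABLE IF NOT EXISTS call_graph_edges (
    caller_id  TEXT NOT NULL REFERENCES call_graph_functions(node_id),
    callee_id  TEXT NOT NULL REFERENCES call_graph_functions(node_id)
);
CREATE TABLE IF NOT EXISTS call_graph_memory (
    node_id    TEXT NOT NULL REFERENCES call_graph_functions(node_id),
    message_id TEXT NOT NULL     -- link into context.db session messages
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
tables = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")]
print(tables)
```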
Tests: 22 passing (tests/tools/test_mcp_call_graph.py)
Research References
Full academic research and competitive analysis are documented in CODE-INTELLIGENCE-RESEARCH.md.
Key papers informing this decision:
- Cormack et al. (2009) - RRF algorithm foundation
- Microsoft GraphRAG (2024) - Knowledge graph + RAG
- TAILOR (2023) - Code Property Graph learning
Files Created/Modified
| File | Purpose |
|---|---|
| tools/mcp-semantic-search/server.py | MCP server implementation (500 lines) |
| tools/mcp-semantic-search/CLAUDE.md | Tool documentation |
| tools/mcp-semantic-search/requirements.txt | Dependencies |
| tests/tools/test_mcp_semantic_search.py | Unit tests (26 tests) |
| .mcp.json (rollout-master) | MCP server registration |
Validation
Test Coverage
tests/tools/test_mcp_semantic_search.py - 26 tests
├── TestEmbeddingUtils (5 tests)
├── TestFTS5Search (4 tests)
├── TestVectorSearch (3 tests)
├── TestHybridSearch (5 tests)
├── TestDecisionSearch (2 tests)
├── TestErrorSearch (2 tests)
├── TestDatabaseStats (3 tests)
└── TestRRFScoring (2 tests)
Acceptance Criteria
- MCP server starts and registers tools
- hybrid_search returns fused results
- semantic_search returns vector-only results
- keyword_search returns FTS5-only results
- search_decisions queries decision knowledge base
- search_errors queries error-solution pairs
- context_stats returns database metrics
- CLI mode works for testing
- All 26 unit tests passing
Indexing Strategy (v1.3.0)
Problem: Duplicate Data on Reindex
When reindexing files, naive append strategies cause data duplication:
- Functions: INSERT OR REPLACE works (upserts by key)
- Edges: INSERT appends, causing duplicates on each reindex
Solution: Delete+Insert Strategy
Per dbt incremental strategies and Microsoft best practices:
# Before inserting edges for a file, delete the old ones
cursor.execute("""
    DELETE FROM call_graph_edges
    WHERE caller_id IN (
        SELECT node_id FROM call_graph_functions
        WHERE file_path = ?
    )
""", (file_path,))
# Then INSERT new edges
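The key property of Delete+Insert is idempotency: reindexing the same file twice must not grow the edge table. A self-contained sketch demonstrating this (table and key columns follow this ADR; everything else is illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE call_graph_functions (node_id TEXT PRIMARY KEY, file_path TEXT);
    CREATE TABLE call_graph_edges (caller_id TEXT, callee_id TEXT);
""")

def reindex_edges(conn, file_path, edges):
    # Delete+Insert: drop edges originating in this file, then add the new set.
    cur = conn.cursor()
    cur.execute("""
        DELETE FROM call_graph_edges
        WHERE caller_id IN (
            SELECT node_id FROM call_graph_functions WHERE file_path = ?
        )
    """, (file_path,))
    cur.executemany(
        "INSERT INTO call_graph_edges (caller_id, callee_id) VALUES (?, ?)",
        edges,
    )
    conn.commit()

conn.execute("INSERT INTO call_graph_functions VALUES ('f1', 'a.py')")
reindex_edges(conn, "a.py", [("f1", "f2"), ("f1", "f3")])
reindex_edges(conn, "a.py", [("f1", "f2"), ("f1", "f3")])  # second run is a no-op
count = conn.execute("SELECT COUNT(*) FROM call_graph_edges").fetchone()[0]
print(count)  # 2, not 4 — a plain INSERT strategy would have duplicated the edges
```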
Indexing Strategy Matrix
| Operation | Strategy | Rationale |
|---|---|---|
| Initial Index | Full build | Complete scan, build all indexes |
| File Changed | Delete+Insert | Remove old edges, add new for that file |
| Periodic Maintenance | VACUUM if fragmentation >30% | SQLite optimization |
| Full Rebuild | --full-refresh flag | Drop and recreate tables |
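The >30% fragmentation trigger for VACUUM can be measured with standard SQLite pragmas (free pages relative to total pages). A sketch of that maintenance check, assuming this is how the threshold is evaluated:

```python
import sqlite3

def fragmentation_ratio(conn):
    # Pages on the freelist (reclaimable) relative to the database's total pages
    free = conn.execute("PRAGMA freelist_count").fetchone()[0]
    total = conn.execute("PRAGMA page_count").fetchone()[0]
    return free / total if total else 0.0

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x)")
conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(10_000)])
conn.execute("DELETE FROM t")          # bulk delete leaves free pages behind
conn.commit()

if fragmentation_ratio(conn) > 0.30:   # the ADR's maintenance threshold
    conn.execute("VACUUM")             # rebuild the file, reclaiming free pages
```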
Tree-sitter Incremental Parsing
Tree-sitter provides native incremental parsing:
- Shares unchanged portions of syntax tree
- Only re-parses edited portions
- Future optimization: track file edits, not full re-parse
Implementation Status
| Component | Strategy | Status |
|---|---|---|
| Messages (context.db) | INSERT OR IGNORE (by hash) | ✅ Correct |
| Functions (call_graph) | INSERT OR REPLACE (by node_id) | ✅ Correct |
| Edges (call_graph) | Delete+Insert (by file) | ✅ Fixed in v1.3.0 |
| Embeddings | Skip if exists (by message_id) | ✅ Correct |
References
Research
- Reciprocal Rank Fusion: Cormack et al., 2009 - "Reciprocal Rank Fusion Outperforms Condorcet and Individual Rank Learning Methods"
- Hybrid Search: Karpukhin et al., 2020 - "Dense Passage Retrieval for Open-Domain Question Answering"
- MCP Protocol: Anthropic Model Context Protocol
Related Systems
- Elastic Hybrid Search (uses RRF)
- Cohere Rerank (uses RRF fusion)
- Pinecone Hybrid Search
Status: ACCEPTED
Decision Date: 2026-01-17
Implementation Status: COMPLETE
Task ID: H.5.5.1
Maintainer: CODITECT Core Team
Review Date: 2026-04-17 (quarterly review)
Changelog
v1.3.0 (2026-01-17)
- Added Indexing Strategy section documenting delete+insert approach
- Fixed duplicate edges bug in call graph reindex (commit 63582c71)
- Documented Tree-sitter incremental parsing for future optimization
- Added references to dbt and Microsoft best practices
v1.2.0 (2026-01-17)
- H.5.5.2 COMPLETE: Memory-linked call graph MCP server implemented
- Added competitive position matrix showing CODITECT's moat
- Updated strategic implications table with implementation status
- Added H.5.5.2 implementation details (5,590 functions, 55,548 edges, 441 files)
- Tree-sitter AST parsing now implemented (Python, JavaScript, TypeScript)
v1.1.0 (2026-01-17)
- Added comprehensive competitive analysis section
- Analyzed Code Pathfinder as direct competitor
- Identified CODITECT's unique competitive moat (Memory + Decisions)
- Updated H.5.5.x strategic recommendations
- Added reference to CODE-INTELLIGENCE-RESEARCH.md
v1.0.0 (2026-01-17)
- Initial ADR documenting MCP semantic search with hybrid RRF fusion
- Documented Option A/B/C analysis and selection rationale
- Defined MCP tool interfaces
- Documented RRF algorithm implementation
- Added performance characteristics and validation criteria