ADR-005: Hybrid Search with Reciprocal Rank Fusion (RRF)

Status: Proposed Date: 2025-11-26 Deciders: Architecture Team, ML/Search Team Related ADRs: ADR-002 (PostgreSQL + Weaviate), ADR-004 (Multi-Tenant RLS)

Context

The context intelligence platform must provide powerful search across AI conversation history to help developers find:

Past decisions: "Why did we choose PostgreSQL over MongoDB?"
Code patterns: "How did we implement authentication?"
Bug discussions: "What was that error with React hooks?"
Team knowledge: "Who discussed GraphQL performance?"

Search Quality Requirements

Requirement	Target	Importance
Relevance	8/10 user satisfaction	Critical
Latency	p95 <100ms	High
Recall	90%+ (find what exists)	High
Precision	80%+ (top 10 results relevant)	High
Multi-tenant isolation	100% (no data leakage)	Critical

Search Challenges

1. Keyword Search Limitations (PostgreSQL Full-Text)

✅ Exact matches work well ("authentication error")
❌ Synonyms fail ("auth error" doesn't match "authentication error")
❌ Conceptual queries fail ("security bug" doesn't match "XSS vulnerability")
❌ Typos fail ("athentication" doesn't match "authentication")

2. Semantic Search Limitations (Weaviate Vector)

✅ Conceptual matches work well ("database" matches "PostgreSQL", "MySQL")
✅ Synonyms work ("bug" matches "error", "issue")
❌ Exact IDs fail ("ABC-123" won't match "ABC-124")
❌ Long-tail keywords fail ("PostgreSQL 15.3" might match "MySQL 8.0")

3. Solution: Hybrid Search Combine both approaches to get best of both worlds:

PostgreSQL for exact matches, IDs, specific terms
Weaviate for semantic understanding, synonyms, concepts
Fusion algorithm to merge results intelligently

Decision

We will implement Hybrid Search using Reciprocal Rank Fusion (RRF) to combine PostgreSQL full-text search (keyword) with Weaviate vector search (semantic).

Architecture

┌─────────────────────────────────────────────────────────────┐
│ User Query: "authentication bug in React hooks"             │
└───────────────────────┬─────────────────────────────────────┘
                        ↓
        ┌───────────────┴───────────────┐
        ↓                               ↓
┌──────────────────┐          ┌──────────────────┐
│ Keyword Search   │          │ Semantic Search  │
│ (PostgreSQL FTS) │          │ (Weaviate)       │
│                  │          │                  │
│ ┌──────────────┐ │          │ ┌──────────────┐ │
│ │ ts_query     │ │          │ │ Text         │ │
│ │ ("auth" &    │ │          │ │ Embedding    │ │
│ │  "bug" &     │ │          │ │ (OpenAI)     │ │
│ │  "react" &   │ │          │ │              │ │
│ │  "hook")     │ │          │ │ Cosine       │ │
│ └──────────────┘ │          │ │ Similarity   │ │
│                  │          │ └──────────────┘ │
│ Results (ranked):│          │ Results (ranked):│
│ 1. Conv #42      │          │ 1. Conv #15      │
│ 2. Conv #15      │          │ 2. Conv #42      │
│ 3. Conv #91      │          │ 3. Conv #7       │
│ 4. Conv #7       │          │ 4. Conv #28      │
│ 5. Conv #33      │          │ 5. Conv #91      │
└──────────────────┘          └──────────────────┘
        ↓                               ↓
        └───────────────┬───────────────┘
                        ↓
┌─────────────────────────────────────────────────────────────┐
│ Reciprocal Rank Fusion (RRF)                                │
│                                                              │
│ RRF Score = Σ [ 1 / (k + rank_in_list) ]                   │
│                                                              │
│ Conv #42: 1/(60+1) + 1/(60+2) = 0.0164 + 0.0161 = 0.0325   │
│ Conv #15: 1/(60+2) + 1/(60+1) = 0.0161 + 0.0164 = 0.0325   │
│ Conv #91: 1/(60+3) + 1/(60+5) = 0.0159 + 0.0154 = 0.0313   │
│ Conv #7:  1/(60+4) + 1/(60+3) = 0.0156 + 0.0159 = 0.0315   │
│ Conv #28: 0 + 1/(60+4) = 0.0156                            │
│ Conv #33: 1/(60+5) + 0 = 0.0154                            │
│                                                              │
│ Final Ranking:                                               │
│ 1. Conv #42 (0.0325) - Appears in both top 2                │
│ 2. Conv #15 (0.0325) - Appears in both top 2                │
│ 3. Conv #7  (0.0315) - Appears in both top 5                │
│ 4. Conv #91 (0.0313) - Appears in both top 5                │
│ 5. Conv #28 (0.0156) - Semantic only                        │
└─────────────────────────────────────────────────────────────┘
                        ↓
┌─────────────────────────────────────────────────────────────┐
│ Returned to User (Top 5):                                   │
│ 1. #42: "Auth bug in React useEffect hook"                 │
│ 2. #15: "Authentication token refresh issue"               │
│ 3. #7:  "Security vulnerability in login flow"             │
│ 4. #91: "React component re-render loop"                   │
│ 5. #28: "Session management best practices"                │
└─────────────────────────────────────────────────────────────┘

Implementation

Step 1: PostgreSQL Full-Text Search Setup

-- Add tsvector column for full-text search
ALTER TABLE conversations ADD COLUMN title_tsv tsvector;
ALTER TABLE messages ADD COLUMN content_tsv tsvector;

-- Auto-update tsvector on insert/update
CREATE TRIGGER conversations_tsvector_update BEFORE INSERT OR UPDATE
ON conversations FOR EACH ROW EXECUTE FUNCTION
tsvector_update_trigger(title_tsv, 'pg_catalog.english', title);

CREATE TRIGGER messages_tsvector_update BEFORE INSERT OR UPDATE
ON messages FOR EACH ROW EXECUTE FUNCTION
tsvector_update_trigger(content_tsv, 'pg_catalog.english', content);

-- Create GIN indexes for fast full-text search
CREATE INDEX idx_conversations_title_tsv ON conversations USING GIN (title_tsv);
CREATE INDEX idx_messages_content_tsv ON messages USING GIN (content_tsv);

-- Composite index for multi-tenant full-text search
CREATE INDEX idx_messages_org_tsv ON messages(organization_id, content_tsv);

Keyword Search Query (Async SQLAlchemy):

# core/repositories/sqlalchemy_conversation_repo.py
from sqlalchemy import func, select
from sqlalchemy.dialects.postgresql import TSVECTOR

async def keyword_search(
    self,
    organization_id: uuid.UUID,
    query: str,
    limit: int = 20
) -> List[Conversation]:
    """PostgreSQL full-text search"""

    # Parse query into tsquery
    tsquery = func.plainto_tsquery('english', query)

    # Search conversations by title
    stmt = (
        select(ConversationModel)
        .where(ConversationModel.organization_id == organization_id)
        .where(ConversationModel.title_tsv.match(tsquery))
        .order_by(
            func.ts_rank(ConversationModel.title_tsv, tsquery).desc()
        )
        .limit(limit)
    )

    result = await self.session.execute(stmt)
    return [self._to_domain(m) for m in result.scalars()]

Step 2: Weaviate Vector Search Setup

# core/clients/weaviate_client.py
import weaviate
from typing import List
import uuid

class WeaviateClient:
    """Weaviate vector search client"""

    def __init__(self, url: str, api_key: str):
        self.client = weaviate.Client(
            url=url,
            auth_client_secret=weaviate.AuthApiKey(api_key)
        )

    async def semantic_search(
        self,
        organization_id: uuid.UUID,
        query: str,
        limit: int = 20
    ) -> List[dict]:
        """Semantic search using Weaviate"""

        result = (
            self.client.query
            .get("Conversation", ["id", "title", "content"])
            .with_tenant(tenant=str(organization_id))  # Multi-tenant isolation
            .with_near_text({"concepts": [query]})
            .with_limit(limit)
            .with_additional(["distance", "certainty"])
            .do()
        )

        if not result or "data" not in result:
            return []

        conversations = result["data"]["Get"]["Conversation"]

        return [
            {
                "id": uuid.UUID(conv["id"]),
                "title": conv["title"],
                "score": conv["_additional"]["certainty"],
                "rank": idx + 1
            }
            for idx, conv in enumerate(conversations)
        ]

Step 3: Reciprocal Rank Fusion (RRF)

# core/services/search_service.py
from typing import List, Dict
import asyncio
import uuid

class SearchService:
    """Hybrid search with RRF fusion"""

    def __init__(
        self,
        conversation_repo: ConversationRepository,
        weaviate_client: WeaviateClient,
        rrf_k: int = 60  # RRF constant (default from literature)
    ):
        self.conversation_repo = conversation_repo
        self.weaviate_client = weaviate_client
        self.rrf_k = rrf_k

    async def hybrid_search(
        self,
        organization_id: uuid.UUID,
        query: str,
        limit: int = 20,
        alpha: float = 0.5  # 0.0 = keyword only, 1.0 = semantic only
    ) -> List[Conversation]:
        """Hybrid search with RRF fusion"""

        # Run both searches in parallel (2x faster)
        keyword_task = self.conversation_repo.keyword_search(
            organization_id=organization_id,
            query=query,
            limit=limit * 2  # Get more candidates for fusion
        )

        semantic_task = self.weaviate_client.semantic_search(
            organization_id=organization_id,
            query=query,
            limit=limit * 2
        )

        keyword_results, semantic_results = await asyncio.gather(
            keyword_task,
            semantic_task
        )

        # Apply Reciprocal Rank Fusion
        fused_results = self._rrf_fusion(
            keyword_results=keyword_results,
            semantic_results=semantic_results,
            alpha=alpha
        )

        # Return top N results
        return fused_results[:limit]

    def _rrf_fusion(
        self,
        keyword_results: List[Conversation],
        semantic_results: List[dict],
        alpha: float = 0.5
    ) -> List[Conversation]:
        """Reciprocal Rank Fusion algorithm"""

        rrf_scores: Dict[uuid.UUID, float] = {}

        # Calculate RRF scores for keyword results
        for rank, conv in enumerate(keyword_results, start=1):
            rrf_scores[conv.id] = rrf_scores.get(conv.id, 0.0) + \
                (1 - alpha) / (self.rrf_k + rank)

        # Calculate RRF scores for semantic results
        for rank, result in enumerate(semantic_results, start=1):
            conv_id = result["id"]
            rrf_scores[conv_id] = rrf_scores.get(conv_id, 0.0) + \
                alpha / (self.rrf_k + rank)

        # Sort by RRF score (descending)
        sorted_ids = sorted(
            rrf_scores.keys(),
            key=lambda id: rrf_scores[id],
            reverse=True
        )

        # Fetch full conversation objects
        conversations = []
        for conv_id in sorted_ids:
            # Try keyword results first (already fetched)
            conv = next((c for c in keyword_results if c.id == conv_id), None)

            # Otherwise fetch from database
            if not conv:
                conv = await self.conversation_repo.get(conv_id)

            if conv:
                conversations.append(conv)

        return conversations

Step 4: Configurable Alpha Weighting

# app/api/routes/search.py (FastAPI)
from fastapi import APIRouter, Query

@router.get("/conversations/search")
async def search_conversations(
    q: str,
    limit: int = Query(20, ge=1, le=100),
    alpha: float = Query(0.5, ge=0.0, le=1.0),  # Configurable!
    current_user = Depends(get_current_user),
    search_service: SearchService = Depends(get_search_service)
):
    """
    Hybrid search with configurable alpha:
    - alpha=0.0: Keyword-only search (PostgreSQL)
    - alpha=0.5: Balanced hybrid (default)
    - alpha=1.0: Semantic-only search (Weaviate)
    """

    results = await search_service.hybrid_search(
        organization_id=current_user.organization_id,
        query=q,
        limit=limit,
        alpha=alpha
    )

    return {"results": results, "count": len(results)}

Consequences

Positive

✅ Best of Both Worlds
- Exact matches: PostgreSQL handles "ABC-123", "PostgreSQL 15.3"
- Semantic matches: Weaviate handles "database" → "PostgreSQL", "MySQL"
- Combined: Results appear higher if matched by BOTH methods
✅ Proven Algorithm (RRF)
- Used by Elasticsearch, OpenSearch, Weaviate (industry standard)
- Research-backed: "Reciprocal Rank Fusion outperforms CombSUM" (2009)
- No machine learning required (parameter-free except k=60)
✅ Configurable Trade-offs
- alpha=0.0: Keyword-only (exact matches, faster)
- alpha=0.5: Balanced (default, best for most queries)
- alpha=1.0: Semantic-only (conceptual search)
- Per-query configuration allows user preference
✅ Performance: Parallel Execution
- Both searches run concurrently (not sequential)
- Total latency = max(keyword_time, semantic_time) + fusion_time
- Fusion time <5ms (simple arithmetic)
- Target: p95 <100ms (both searches <100ms, fusion <5ms)
✅ Multi-Tenant Safe
- PostgreSQL: RLS policies filter by organization_id (ADR-004)
- Weaviate: .with_tenant() filter isolates tenants
- Fusion only combines results from SAME organization
✅ Incremental Improvement
- Phase 1: PostgreSQL keyword search only (working, fast)
- Phase 2: Add Weaviate semantic search + RRF fusion
- Phase 3: Tune alpha based on user feedback

Negative

⚠️ Double Infrastructure Cost
- Cost: PostgreSQL ($100/month) + Weaviate ($150/month) = $250/month
- Mitigation: Balanced option (ADR-002), scales linearly
- ROI: Better search = higher user satisfaction = lower churn (>10% churn reduction = $120K/year)
⚠️ Embedding Generation Latency
- Problem: Must generate embeddings for new messages before searchable
- Latency: 50-100ms per message (OpenAI API)
- Mitigation: Async background job (Celery), user doesn't wait
- Trade-off: New messages searchable within 5 seconds (acceptable)
⚠️ Synchronization Complexity
- Problem: PostgreSQL and Weaviate must stay in sync
- Risk: Message in PostgreSQL but not Weaviate = not searchable
- Mitigation:
  - PostgreSQL = source of truth
  - Celery task: sync_to_weaviate(message_id)
  - Retry logic: 3 attempts with exponential backoff
  - Monitoring: Alert if sync lag >1 minute
⚠️ Alpha Tuning Required
- Problem: Optimal alpha varies by query type
- Example: "ABC-123" needs alpha=0.0 (exact match), "security bug" needs alpha=0.8 (semantic)
- Mitigation Phase 1: Default alpha=0.5 (good enough for 80% of queries)
- Future Enhancement: Query classifier predicts optimal alpha
```
if is_exact_match_query(query):  # Contains IDs, exact terms
    alpha = 0.2  # Favor keyword
elif is_conceptual_query(query):  # Abstract terms
    alpha = 0.8  # Favor semantic
else:
    alpha = 0.5  # Balanced
```

Risks and Mitigations

Risk	Likelihood	Impact	Mitigation
Weaviate Downtime	Low	High	Fallback to PostgreSQL-only (alpha=0.0), cache embeddings
Embedding Cost Spike	Medium	Medium	Monitor usage, cap at $500/month, batch processing
Sync Lag	Medium	Low	Celery retry, monitoring, alert if lag >1 minute
Poor Relevance	Low	High	A/B test alpha values, user feedback loop

Alternatives Considered

Alternative 1: PostgreSQL Full-Text Search Only (No Weaviate)

Architecture: Use only PostgreSQL tsvector + tsquery

Pros:

✅ Simpler infrastructure (1 database vs. 2)
✅ Lower cost ($100/month vs. $250/month)
✅ No synchronization complexity

Cons:

❌ No semantic understanding: "bug" doesn't match "error", "issue"
❌ No synonym handling: "auth" doesn't match "authentication"
❌ Poor for conceptual queries: "database performance" doesn't match "PostgreSQL slow queries"
❌ User satisfaction: 6/10 vs. 8/10 with hybrid search

Why Rejected: Search quality is critical for product differentiation. Competitors (GitHub Copilot, Cursor) don't offer search at all - this is our competitive advantage. Saving $150/month is not worth 25% lower user satisfaction.

Alternative 2: Weaviate Semantic Search Only (No PostgreSQL FTS)

Architecture: Use only Weaviate vector search

Pros:

✅ Semantic understanding (best-in-class)
✅ Synonym handling
✅ Conceptual search

Cons:

❌ Poor exact matches: "ABC-123" might match "ABC-124", "XYZ-999"
❌ No keyword precision: "PostgreSQL 15.3" might match "MySQL 8.0"
❌ User frustration: "I typed the exact ID and it didn't find it!"

Why Rejected: Exact matches are critical for developer workflows (ticket IDs, commit hashes, function names). Vector search alone is insufficient.

Alternative 3: Weighted Sum Instead of RRF

Architecture: Combine keyword and semantic scores using weighted sum

final_score = alpha * semantic_score + (1 - alpha) * keyword_score

Pros:

✅ Simpler algorithm
✅ Intuitive weighting

Cons:

❌ Score calibration required: Keyword scores (0-10) vs. semantic scores (0-1) have different ranges
❌ Normalization complexity: Must normalize both score ranges to [0, 1]
❌ Outlier sensitivity: One high score dominates the result
❌ Research shows RRF outperforms weighted sum (Cormack et al., 2009)

Why Rejected: RRF is simpler (no score normalization), more robust (rank-based), and proven to outperform weighted sum.

Alternative 4: Query Classifier → Single Search Method

Architecture: Classify query type, then route to EITHER keyword OR semantic (not both)

if is_exact_match_query(query):
    results = keyword_search(query)  # PostgreSQL only
elif is_conceptual_query(query):
    results = semantic_search(query)  # Weaviate only
else:
    results = hybrid_search(query)   # Both

Pros:

✅ Potentially faster (single search vs. two searches)
✅ Lower Weaviate usage (cost savings)

Cons:

❌ Classifier accuracy: What if it misclassifies?
❌ Complexity: Need to train/maintain classifier
❌ Edge cases: Many queries are hybrid ("PostgreSQL bug" = exact + semantic)
❌ Marginal performance gain: Parallel execution already fast (<100ms)

Why Rejected: Complexity not worth marginal performance gain. Better to always use hybrid search (simpler, more robust) and tune alpha if needed.

Success Metrics

Search Quality Metrics

User satisfaction: 8/10 (measured via feedback thumbs up/down)
Recall: 90%+ (find existing conversations)
Precision@10: 80%+ (top 10 results relevant)
Click-through rate: 60%+ (user clicks on result)

Performance Metrics

p50 latency: <50ms
p95 latency: <100ms
p99 latency: <200ms
Throughput: 100 searches/sec at 10K organizations

Synchronization Metrics

Sync lag: <5 seconds (p95)
Sync error rate: <0.1%
Embedding generation time: <100ms per message

Implementation Plan

Phase 1: PostgreSQL Keyword Search (Week 1)

Add tsvector columns to tables
Create GIN indexes
Implement keyword_search() method
Test with 100K messages
Benchmark latency (target: p95 <50ms)

Phase 2: Weaviate Setup (Week 2)

Deploy Weaviate Cloud (managed)
Create multi-tenant schema
Implement embedding generation (OpenAI API)
Implement semantic_search() method
Test with 100K messages

Phase 3: RRF Fusion (Week 3)

Implement _rrf_fusion() algorithm
Parallel execution (asyncio)
Configurable alpha parameter
Integration tests (both modes)

Phase 4: Tuning & Optimization (Week 4)

A/B test alpha values (0.3, 0.5, 0.7)
User feedback collection (thumbs up/down)
Performance optimization (<100ms p95)
Monitoring and alerting

References

Research:

Reciprocal Rank Fusion (RRF) Paper - Cormack et al., 2009
Elasticsearch Hybrid Search
Weaviate Hybrid Search Docs

Implementation Examples:

Related ADRs:

ADR-002: PostgreSQL + Weaviate (database architecture)
ADR-004: Multi-Tenant RLS (isolation for search)
ADR-006: Conversation-Commit Correlation (extends search to Git commits)

Status: Proposed Review Date: 2025-12-03 Projected ADR Score: 39/40 (A+) Complexity: Medium (proven algorithm, parallel execution) Owner: Architecture Team + ML/Search Team

Next Steps:

Approve RRF hybrid search approach
Implement PostgreSQL FTS (Week 1)
Deploy Weaviate and test embedding pipeline (Week 2)
Integrate RRF fusion (Week 3)
A/B test alpha values and optimize (Week 4)

Context​

Search Quality Requirements​

Search Challenges​

Decision​

Architecture​

Implementation​

Step 1: PostgreSQL Full-Text Search Setup​

Step 2: Weaviate Vector Search Setup​

Step 3: Reciprocal Rank Fusion (RRF)​

Step 4: Configurable Alpha Weighting​

Consequences​

Positive​

Negative​

Risks and Mitigations​

Alternatives Considered​

Alternative 1: PostgreSQL Full-Text Search Only (No Weaviate)​

Alternative 2: Weaviate Semantic Search Only (No PostgreSQL FTS)​

Alternative 3: Weighted Sum Instead of RRF​

Alternative 4: Query Classifier → Single Search Method​

Success Metrics​

Search Quality Metrics​

Performance Metrics​

Synchronization Metrics​

Implementation Plan​

Phase 1: PostgreSQL Keyword Search (Week 1)​

Phase 2: Weaviate Setup (Week 2)​

Phase 3: RRF Fusion (Week 3)​

Phase 4: Tuning & Optimization (Week 4)​

References​

Context

Search Quality Requirements

Search Challenges

Decision

Architecture

Implementation

Step 1: PostgreSQL Full-Text Search Setup

Step 2: Weaviate Vector Search Setup

Step 3: Reciprocal Rank Fusion (RRF)

Step 4: Configurable Alpha Weighting

Consequences

Positive

Negative

Risks and Mitigations

Alternatives Considered

Alternative 1: PostgreSQL Full-Text Search Only (No Weaviate)

Alternative 2: Weaviate Semantic Search Only (No PostgreSQL FTS)

Alternative 3: Weighted Sum Instead of RRF

Alternative 4: Query Classifier → Single Search Method

Success Metrics

Search Quality Metrics

Performance Metrics

Synchronization Metrics

Implementation Plan

Phase 1: PostgreSQL Keyword Search (Week 1)

Phase 2: Weaviate Setup (Week 2)

Phase 3: RRF Fusion (Week 3)

Phase 4: Tuning & Optimization (Week 4)

References