Demonstrating the Power of /semantic-search
A Real-World Example: Tracing 2FA Backup Codes Across 650K+ Messages
| Field | Value |
|---|---|
| Date | 2026-02-16 |
| Task | J.4.1 (Semantic Search Over Messages) |
| Project | PILOT |
| Author | Claude (Opus 4.6) |
| ADR | ADR-080 (Semantic Search), ADR-016 (Two-Factor Authentication) |
Background
A user found backup codes with only a timestamp:
Generated: 2026-01-02T14:30:10.611Z
No context. No label. No file path. Just codes and a date. The goal: understand what they are, why they exist, and where the architecture is documented -- using only /semantic-search against 650K+ session messages stored in sessions.db.
Step 1: Initial Hybrid Search
Query:
/semantic-search "CODITECT Backup Codes" --hybrid
Mode: Hybrid (60% semantic similarity + 40% FTS5 keyword matching via Reciprocal Rank Fusion)
Time: 25.33 seconds | Results: 10 matches
What Came Back
| Rank | Similarity | Content | Insight |
|---|---|---|---|
| 1 | 0.7584 | "what is in .coditect.backup?" | Backup infrastructure -- not 2FA codes |
| 2 | 0.7114 | Backup script coverage discussion | Same -- infrastructure backups |
| 3 | 0.6789 | "are we backing up ~/.coditect-data?" | Same topic cluster |
| 4 | 0.6667 | ADR-107 Comprehensive Backup Strategy commit | Backup strategy, not 2FA |
| 5-10 | 0.62-0.66 | GCS backups, recovery, backup-status.py | Infrastructure cluster |
Analysis
The initial search surfaced infrastructure backup discussions, not 2FA backup codes. This is expected -- "backup codes" is semantically closer to "backup strategy" than to "two-factor authentication" in the embedding space. The term "backup" dominated the semantic similarity.
Key Learning: When a query contains ambiguous terms, the first search establishes the semantic neighborhood. The results tell you what the corpus thinks you mean.
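To make the "semantic neighborhood" idea concrete, here is a minimal sketch of how a similarity score between two embeddings can be computed. It assumes the Similarity column is a cosine similarity, which this write-up does not confirm, and the three-dimensional vectors are invented toy values, not real all-MiniLM-L6-v2 output.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy vectors, invented for illustration only (real embeddings have 384 dims).
query_vec = [0.9, 0.4, 0.1]        # stands in for "CODITECT Backup Codes"
infra_backup_vec = [0.8, 0.5, 0.0]  # stands in for a backup-strategy message
two_factor_vec = [0.3, 0.2, 0.9]    # stands in for a 2FA message

candidates = {"infrastructure backup": infra_backup_vec, "2FA": two_factor_vec}

# Rank candidates by similarity to the query, highest first.
ranked = sorted(
    candidates,
    key=lambda name: cosine_similarity(query_vec, candidates[name]),
    reverse=True,
)
print(ranked)
```

With these toy values the infrastructure vector ranks first, mirroring how a shared term such as "backup" can pull a query toward the wrong cluster.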
Step 2: Pivot to Keyword Mode
Because the semantic component was pulling results toward infrastructure backups, the next search switched to exact phrase matching.
Query:
/semantic-search '"backup codes"' --keyword
Mode: Keyword (FTS5 BM25 ranking) | Time: 0.02 seconds (instant)
What Came Back
| Rank | Date | Content | Insight |
|---|---|---|---|
| 1 | 2026-01-02 14:34 | 'backup_codes': backup_codes in Response | The 2FA views.py code |
| 2 | 2026-01-02 14:34 | Two-Factor Authentication views docstring | The 2FA module |
| 3 | 2026-01-04 | PILOT-PARALLEL-EXECUTION-PLAN.md update | Planning doc referencing 2FA |
| 4 | 2026-01-04 | urls.py 2FA route additions | URL routing for 2FA endpoints |
| 5 | 2026-01-02 14:13 | "View Backup Codes" and "Disable 2FA" buttons | Frontend UI work |
| 6 | 2026-01-02 13:26 | "Deployed v1.14.0-2fa" | The deployment that created the codes |
| 7-10 | 2026-01-04 | Profile.tsx, urls.py updates | Frontend + backend integration |
Analysis
Keyword mode instantly pinpointed the exact implementation session from January 2, 2026. The BM25 ranking correctly prioritized messages where "backup codes" appeared as a distinct technical term in code context, not as general prose.
Key Learning: When you know the exact term, keyword mode is roughly 1,000x faster than hybrid mode (0.02s vs 25.33s in this example) and more precise. Use semantic for concept discovery, keyword for term lookup.
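The exact-phrase lookup can be sketched with SQLite's FTS5 extension directly. This is an illustration under assumptions, not the code in scripts/semantic-search.py: the table name `messages_fts`, its single `content` column, and the sample rows are hypothetical, and the Python build must include FTS5.

```python
import sqlite3


def keyword_search(conn, phrase, limit=10):
    """Return (rowid, content, score) rows for an exact FTS5 phrase match."""
    # Double quotes make FTS5 treat the input as one phrase;
    # embedded quotes are escaped by doubling them.
    match_expr = '"' + phrase.replace('"', '""') + '"'
    return conn.execute(
        "SELECT rowid, content, bm25(messages_fts) AS score "
        "FROM messages_fts WHERE messages_fts MATCH ? "
        "ORDER BY score LIMIT ?",  # bm25() is lower-is-better in FTS5
        (match_expr, limit),
    ).fetchall()


# Hypothetical in-memory index with a handful of sample messages.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE messages_fts USING fts5(content)")
conn.executemany(
    "INSERT INTO messages_fts(content) VALUES (?)",
    [
        ("Added View Backup Codes and Disable 2FA buttons",),
        ("ADR-107 Comprehensive Backup Strategy commit",),
        ("are we backing up the data directory?",),
        ("endpoint to regenerate backup codes for a user",),
    ],
)

hits = keyword_search(conn, "backup codes")
for rowid, content, score in hits:
    print(rowid, content)
```

Only the two messages containing the adjacent words "backup codes" are returned; the backup-strategy messages that dominated the hybrid results do not match the phrase.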
Step 3: Codebase Verification
With the session messages pointing to views_2fa.py, a direct codebase search confirmed the implementation:
Key findings from users/views_2fa.py:
| Function/Endpoint | Purpose |
|---|---|
| generate_backup_codes(10) | Creates 10 cryptographically secure codes |
| GET /api/v1/auth/2fa/backup-codes/ | Retrieve remaining codes |
| POST /api/v1/auth/2fa/regenerate-backup-codes/ | Invalidate old + generate new |
| backup_codes.pop(i) | One-time use -- consumed on login |
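The generate, hash, and consume cycle in the table can be sketched as follows. This is not the contents of users/views_2fa.py: the code format (8 hex characters), the `hash_code` and `consume_backup_code` helpers, and the in-memory list of hashes are assumptions for illustration. Only the count of 10, the SHA-256 hashing noted in ADR-016, and the `pop(i)` one-time-use behavior come from the findings.

```python
import hashlib
import hmac
import secrets


def generate_backup_codes(count=10):
    """Create `count` random codes using a cryptographically secure source."""
    # Assumed format: 4 random bytes rendered as 8 hex characters.
    return [secrets.token_hex(4) for _ in range(count)]


def hash_code(code):
    """Return the SHA-256 hex digest stored in place of the plaintext code."""
    return hashlib.sha256(code.encode("utf-8")).hexdigest()


def consume_backup_code(stored_hashes, candidate):
    """Check a submitted code and remove it on success (one-time use)."""
    candidate_hash = hash_code(candidate)
    for i, stored in enumerate(stored_hashes):
        # Constant-time comparison avoids leaking match position via timing.
        if hmac.compare_digest(stored, candidate_hash):
            stored_hashes.pop(i)
            return True
    return False


codes = generate_backup_codes(10)               # shown to the user once
stored_hashes = [hash_code(c) for c in codes]   # what the server keeps

print(consume_backup_code(stored_hashes, codes[0]))  # first use succeeds
print(consume_backup_code(stored_hashes, codes[0]))  # reuse is rejected
```

Storing only hashes means a leaked database does not expose usable codes, and removing the matched hash is what enforces single use.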
Step 4: ADR Discovery
Searching the ADR directory for 2FA-related decisions:
Pattern: "2FA|two.factor|backup.code|authentication.*factor"
Path: internal/architecture/adrs/
Found: cloud-platform/adr-016-two-factor-authentication.md
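A rough Python equivalent of this directory search is sketched below. The pattern is the one shown above; the case-insensitive flag, the restriction to `.md` files, and the `find_matching_files` helper name are assumptions, since the original search tool and its options are not recorded here.

```python
import re
from pathlib import Path

# Pattern copied from the search above; IGNORECASE is an assumption.
ADR_PATTERN = re.compile(
    r"2FA|two.factor|backup.code|authentication.*factor", re.IGNORECASE
)


def find_matching_files(root, pattern=ADR_PATTERN):
    """Return Markdown files under `root` with at least one matching line."""
    matches = []
    for path in sorted(Path(root).rglob("*.md")):
        text = path.read_text(encoding="utf-8", errors="ignore")
        # Match line by line so `.*` cannot span unrelated paragraphs.
        if any(pattern.search(line) for line in text.splitlines()):
            matches.append(path)
    return matches
```

Calling `find_matching_files("internal/architecture/adrs/")` would list every ADR that mentions 2FA, two-factor, or backup codes, which is how a single relevant file can be singled out.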
ADR-016 Key Points
- Decision: Custom TOTP (pyotp) over Firebase MFA, Auth0 Guardian, Duo Security
- Backup codes: 10 per user, SHA-256 hashed, one-time use
- Rationale: Full control, no external dependencies, $0/verification (vs $0.0075/SMS)
- Date: 2026-01-02 (same day as the backup code generation timestamp)
The Complete Picture
Starting from nothing but a timestamp and mysterious codes, /semantic-search enabled reconstruction of the full story in under 2 minutes:
Timestamp: 2026-01-02T14:30:10.611Z
|
v
[Hybrid Search] --> Infrastructure backup cluster (wrong direction)
|
v (pivot strategy)
[Keyword Search] --> 2FA implementation session (0.02s, exact match)
|
v
[Codebase Search] --> views_2fa.py: generate_backup_codes(10)
|
v
[ADR Search] --> ADR-016: Two-Factor Authentication
|
v
ANSWER: 2FA recovery codes, 10 one-time-use,
for account recovery when authenticator lost,
generated during v1.14.0-2fa deployment
Search Mode Comparison (This Example)
| Mode | Query | Time | Result Quality | Role |
|---|---|---|---|---|
| Hybrid | "CODITECT Backup Codes" | 25.3s | Wrong cluster (infrastructure) | Established what "backup" means to the corpus |
| Keyword | "backup codes" | 0.02s | Exact match (2FA code) | Pinpointed the implementation session |
| Semantic | (not used alone) | ~16s | Would match concept | Better for open-ended exploration |
Lessons for /semantic-search Users
1. Start Broad, Then Narrow
Begin with hybrid mode for concept discovery. If results cluster around the wrong topic, pivot to keyword mode with exact phrases.
2. Mode Selection Matters
| Situation | Best Mode | Why |
|---|---|---|
| "What was our approach to X?" | Semantic | Finds conceptual matches |
| "Find messages about functionName" | Keyword | Exact term, instant |
| "Research topic X comprehensively" | Hybrid | Best of both worlds |
| "When did we discuss ADR-118?" | Keyword | Exact identifier lookup |
3. Timestamps Are Anchors
Session messages have timestamps. Once you find one relevant message, the timestamp tells you the session -- and nearby messages (same date, similar IDs) fill in the full context.
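One way to use a timestamp as an anchor is to pull every message within a fixed window around it. The sketch below works on a plain list of dicts. The `messages_near` helper, the two-hour window, and the sample records (minute-level times from the Step 2 table, assumed to be UTC) are illustrative assumptions, not the sessions.db schema.

```python
from datetime import datetime, timedelta


def parse_ts(value):
    """Parse an ISO 8601 timestamp, accepting a trailing 'Z' for UTC."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def messages_near(messages, anchor, window=timedelta(hours=2)):
    """Return messages within `window` of `anchor`, oldest first."""
    anchor_dt = parse_ts(anchor)
    nearby = [
        m for m in messages
        if abs(parse_ts(m["timestamp"]) - anchor_dt) <= window
    ]
    return sorted(nearby, key=lambda m: parse_ts(m["timestamp"]))


# Sample records for illustration; times are assumed to be UTC.
sample = [
    {"timestamp": "2026-01-02T14:34:00Z", "content": "2FA views docstring"},
    {"timestamp": "2026-01-04T09:00:00Z", "content": "urls.py 2FA routes"},
    {"timestamp": "2026-01-02T13:26:00Z", "content": "Deployed v1.14.0-2fa"},
    {"timestamp": "2026-01-02T14:13:00Z", "content": "View Backup Codes button"},
]

for m in messages_near(sample, "2026-01-02T14:30:10.611Z"):
    print(m["timestamp"], m["content"])
```

The three January 2 messages fall inside the window around the code-generation timestamp, while the January 4 follow-up is excluded.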
4. Cross-Reference with Codebase
/semantic-search finds the conversation. Follow up with Grep on the codebase to find the code. Then search ADRs for the decision. The three together give the complete picture:
Session Messages (what happened)
+
Source Code (what was built)
+
ADRs (why it was built that way)
=
Complete Understanding
Technical Details
| Metric | Value |
|---|---|
| Database | sessions.db (ADR-118 Tier 3) |
| Embedding Model | all-MiniLM-L6-v2 (384 dimensions) |
| Total Messages | 650,000+ |
| Embedding Coverage | 100% |
| FTS5 Index | BM25 ranking, instant keyword search |
| Hybrid Fusion | Reciprocal Rank Fusion (RRF), 60/40 semantic/FTS5 |
| Script | scripts/semantic-search.py |
| Command | /semantic-search |
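For readers unfamiliar with Reciprocal Rank Fusion, the following is a minimal sketch of a weighted variant. How scripts/semantic-search.py applies the 60/40 split is not shown in this document, so the weighting scheme here, the constant `k=60` (a common default for RRF), and the `weighted_rrf` name are all assumptions.

```python
def weighted_rrf(semantic_ids, keyword_ids, w_semantic=0.6, w_keyword=0.4, k=60):
    """Fuse two ranked ID lists; each list adds weight / (k + rank) per ID."""
    scores = {}
    for weight, ranking in ((w_semantic, semantic_ids), (w_keyword, keyword_ids)):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (k + rank)
    # Highest fused score first; ties keep first-seen order (stable sort).
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical message IDs: "c" appears in both rankings.
semantic_ranking = ["a", "b", "c"]
keyword_ranking = ["c", "d"]

fused = weighted_rrf(semantic_ranking, keyword_ranking)
print(fused)
```

An item found by both retrievers ("c") rises to the top, while an item found only by the keyword list ("d") can still sit below semantic-only hits. That is consistent with the Step 1 result, where the semantic side dominated the hybrid ranking.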
Document Path: internal/analysis/semantic-search/semantic-search-power-demo-2026-02-16.md
Related: ADR-080 (Semantic Search), J.4.1 (Semantic Search Implementation), ADR-016 (2FA)