ADR-202: MCP Call Graph — Memory-Linked Code Navigation
Document: ADR-202-mcp-call-graph-memory-linked-navigation
Version: 1.0.0
Purpose: Document architectural decision for memory-linked call graph MCP server
Audience: Framework contributors, system architects
Date Created: 2026-01-17
Last Updated: 2026-02-15
Status: ACCEPTED
Task ID: H.5.5.2, H.12.1
Related ADRs: ADR-080 (Semantic Search), ADR-118 (Database Architecture), ADR-151 (Context Graph Evolution)
Related Documents:
- tools/mcp-call-graph/server.py
- tools/mcp-call-graph/CLAUDE.md
- tests/tools/test_mcp_call_graph.py
Context and Problem Statement
Background
AI coding assistants need to understand code structure to provide meaningful suggestions, refactoring advice, and impact analysis. AST parsing and call graph navigation are now table stakes: Cursor, JetBrains AI, and Code Pathfinder all offer them to varying degrees (see the comparison under Value Proposition). However, no competitor links code navigation to session memory and architectural decisions.
CODITECT's existing infrastructure provides:
- Session memory (ADR-020, ADR-021): Cross-session message history with semantic search
- Decision tracking (org.db): Architectural decisions with rationale and confidence
- Error knowledge base (org.db): Error signatures with verified solutions
- Unified knowledge graph (ADR-151): kg_nodes and kg_edges in org.db
Problem Statement
How do we provide call graph navigation that goes beyond competitors by integrating with CODITECT's memory and decision systems, creating a uniquely valuable developer experience?
Requirements
Must-Have:
- Tree-sitter AST parsing for Python, JavaScript, TypeScript
- Function indexing with signatures, docstrings, class context
- Caller/callee lookup with file and line references
- Call chain discovery between arbitrary functions
- MCP protocol compliance for Claude Code integration
Should-Have:
- Memory-linked search combining call graph with session history
- Integration with architectural decisions (org.db)
- Unified knowledge graph storage (ADR-151 Phase 4)
- Deterministic node IDs for cross-session stability
Nice-to-Have:
- Incremental re-indexing on file change
- Cross-language call tracking
- Interactive call graph visualization
Decision Drivers
Technical Constraints
- T1: Must use tree-sitter for fast, accurate AST parsing (not regex-based)
- T2: Must integrate with ADR-118 database architecture (org.db TIER 2, sessions.db TIER 3)
- T3: Must produce deterministic node IDs for knowledge graph consistency
- T4: Must support ADR-151 Phase 4 unified knowledge graph schema
Performance Goals
- P1: File indexing under 100ms per file
- P2: Caller/callee lookup under 10ms (indexed)
- P3: Memory-linked search under 200ms (includes message search)
- P4: Index 5,000+ functions across 400+ files
Competitive Differentiation
- CD1: Memory context — "when was this function last discussed?"
- CD2: Decision awareness — "which ADRs constrain changes to this function?"
- CD3: Error history — "what past issues involved this function?"
Considered Options
Option A: Wrapper Around Existing AST Tools
Description: Wrap tree-sitter or Code Pathfinder's existing call graph output with MCP endpoints.
Pros:
- Fast implementation
- Proven AST parsing
Cons:
- No memory integration
- No decision awareness
- Commodity feature
Rejected: Provides no competitive differentiation over Cursor or JetBrains AI.
Option B: Memory-Linked Call Graph (SELECTED)
Description: Custom call graph MCP server with tree-sitter parsing that integrates with CODITECT's memory system, decision tracking, and unified knowledge graph.
Pros:
- Unique value: memory + decisions + call graph in one query
- Builds on existing infrastructure (org.db, sessions.db)
- Unified knowledge graph enables cross-domain queries
- Deterministic node IDs enable cross-session stability
Cons:
- More complex implementation
- Requires ongoing knowledge graph maintenance
Selected: Only option that provides competitive differentiation.
Decision Outcome
CHOSEN: Option B — Memory-Linked Call Graph
Rationale
- Competitive moat: No competitor links call graphs to session memory and ADRs
- Unified knowledge graph: ADR-151 provides the foundation for cross-domain queries
- Incremental value: Each new MCP server adds edges to the same knowledge graph
- Developer experience: "What decisions affected this function?" is uniquely answerable
Technical Implementation
Architecture
┌─────────────────────────────────────────────────────────────────┐
│ MCP CALL GRAPH SERVER │
├─────────────────────────────────────────────────────────────────┤
│ │
│ MCP Client (Claude Code) │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────┐│
│ │ MCP Tools ││
│ │ • index_file - Index single source file ││
│ │ • index_directory - Batch index a directory ││
│ │ • get_callers - Who calls this function? ││
│ │ • get_callees - What does this function call? ││
│ │ • call_chain - Find paths between functions ││
│ │ • memory_linked_search - CODITECT UNIQUE (call+memory+ADR) ││
│ │ • call_graph_stats - Database statistics ││
│ └─────────────────────────────────────────────────────────────┘│
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────┐│
│ │ CallGraphParser (tree-sitter) ││
│ │ ││
│ │ Python (.py) JavaScript (.js, .jsx, .mjs, .cjs) ││
│ │ TypeScript (.ts, .tsx) ││
│ │ ││
│ │ Extracts: ││
│ │ • Functions (name, signature, docstring, class context) ││
│ │ • Call edges (caller → callee with line references) ││
│ │ • Node IDs (SHA-256 deterministic) ││
│ └─────────────────────────────────────────────────────────────┘│
│ │ │ │
│ ▼ ▼ │
│ ┌──────────────┐ ┌──────────────┐ │
│ │ org.db │ │ sessions.db │ │
│ │ (TIER 2) │ │ (TIER 3) │ │
│ │ │ │ │ │
│ │ kg_nodes │ │ call_graph │ │
│ │ kg_edges │ │ _memory │ │
│ │ decisions │ │ messages │ │
│ └──────────────┘ └──────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────┘
Database Schema
Unified Knowledge Graph — org.db (ADR-151 Phase 4):
-- Function nodes in the unified knowledge graph
kg_nodes (
    id          TEXT PRIMARY KEY,  -- Format: function:{file_path}:{name}
    node_type   TEXT,              -- 'function'
    subtype     TEXT,              -- 'python', 'javascript', 'typescript'
    name        TEXT,              -- Display name (ClassName.method or function_name)
    properties  TEXT               -- JSON: {file_path, line_start, line_end,
                                   --        signature, docstring, class_name, node_id}
)
-- CALLS edges between functions
kg_edges (
    id          TEXT PRIMARY KEY,  -- Format: CALLS:{from_node_id}:{to_node_id}
    edge_type   TEXT,              -- 'CALLS'
    from_node   TEXT,              -- FK to kg_nodes (caller)
    to_node     TEXT,              -- FK to kg_nodes (callee)
    properties  TEXT               -- JSON: {call_count, call_contexts[]}
)
Session Memory — sessions.db (Ephemeral):
-- Memory links: which functions were discussed in which sessions
call_graph_memory (
    node_id      TEXT,     -- Function node reference
    message_id   INTEGER,  -- FK to messages table
    session_id   TEXT,     -- Session identifier
    change_type  TEXT      -- 'created', 'modified', 'discussed', 'reviewed'
)
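The following is a minimal sketch of how the two knowledge graph tables could back a caller lookup, using an in-memory SQLite database as a stand-in for org.db. The column set follows the schema above. The helper names (add_function, add_call, callers_of), the sample rows, and the shape of each call_contexts entry are assumptions for illustration and are not taken from server.py.

```python
import json
import sqlite3

# In-memory stand-in for org.db; the real database may hold more columns.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE kg_nodes (
    id TEXT PRIMARY KEY, node_type TEXT, subtype TEXT,
    name TEXT, properties TEXT
);
CREATE TABLE kg_edges (
    id TEXT PRIMARY KEY, edge_type TEXT,
    from_node TEXT, to_node TEXT, properties TEXT
);
""")


def add_function(file_path, name, subtype="python"):
    """Insert a function node using the documented id format."""
    node_id = f"function:{file_path}:{name}"
    conn.execute(
        "INSERT INTO kg_nodes VALUES (?, 'function', ?, ?, ?)",
        (node_id, subtype, name, json.dumps({"file_path": file_path})),
    )
    return node_id


def add_call(caller_id, callee_id, line):
    """Insert a CALLS edge; the call_contexts entry shape is hypothetical."""
    props = {"call_count": 1, "call_contexts": [{"line": line}]}
    conn.execute(
        "INSERT INTO kg_edges VALUES (?, 'CALLS', ?, ?, ?)",
        (f"CALLS:{caller_id}:{callee_id}", caller_id, callee_id, json.dumps(props)),
    )


def callers_of(name):
    """Return the names of functions with a CALLS edge into `name`."""
    rows = conn.execute(
        """SELECT caller.name FROM kg_edges e
           JOIN kg_nodes caller ON caller.id = e.from_node
           JOIN kg_nodes callee ON callee.id = e.to_node
           WHERE e.edge_type = 'CALLS' AND callee.name = ?
           ORDER BY caller.name""",
        (name,),
    ).fetchall()
    return [row[0] for row in rows]


search_id = add_function("server.py", "hybrid_search")
tool_id = add_function("server.py", "call_tool")
add_call(tool_id, search_id, line=611)
print(callers_of("hybrid_search"))  # ['call_tool']
```

Because callers and callees are both plain rows in kg_edges, a callee lookup is the same join with from_node and to_node swapped.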
Node ID Generation
Deterministic SHA-256 IDs ensure cross-session stability:
import hashlib

def generate_node_id(file_path: str, name: str, start_line: int) -> str:
    """Generate deterministic node ID using SHA-256."""
    # Hash the location triple and keep the first 16 hex characters (64 bits).
    content = f"{file_path}:{name}:{start_line}"
    return hashlib.sha256(content.encode()).hexdigest()[:16]
This approach matches Code Pathfinder's strategy (SHA-256 for determinism) while adding memory linkage.
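A short usage sketch of the function above, repeated here so the snippet runs on its own. The file path and line numbers are hypothetical. It shows that identical inputs always yield the same ID. It also shows that, because start_line is part of the hashed content, the ID changes if the function moves to a different line, so stability across reindexing presumably holds only while the definition stays in place.

```python
import hashlib


def generate_node_id(file_path: str, name: str, start_line: int) -> str:
    """Same definition as in this ADR, repeated so the example is self-contained."""
    content = f"{file_path}:{name}:{start_line}"
    return hashlib.sha256(content.encode()).hexdigest()[:16]


# Hypothetical file path and line numbers, for illustration only.
first = generate_node_id("example/module.py", "hybrid_search", 241)
again = generate_node_id("example/module.py", "hybrid_search", 241)
moved = generate_node_id("example/module.py", "hybrid_search", 245)

print(first == again)  # True: same inputs give the same ID
print(first == moved)  # False: a different start_line gives a different ID
print(len(first))      # 16
```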
MCP Tool Definitions
from typing import Dict, List, Optional

@mcp_tool
def index_file(file_path: str) -> Dict:
    """Index a single source file into the call graph.
    Parses functions and call edges using tree-sitter AST."""

@mcp_tool
def index_directory(dir_path: str, extensions: Optional[List[str]] = None) -> Dict:
    """Index all source files in a directory.
    Default extensions: .py, .js, .ts"""

@mcp_tool
def get_callers(function_name: str, file_path: Optional[str] = None) -> Dict:
    """Find all functions that call the given function.
    Returns caller name, file path, and call line number."""

@mcp_tool
def get_callees(function_name: str, file_path: Optional[str] = None) -> Dict:
    """Find all functions called by the given function.
    Returns callee name, file path, and definition location."""

@mcp_tool
def call_chain(from_func: str, to_func: str, max_depth: int = 5) -> Dict:
    """Find call paths from one function to another.
    Uses BFS traversal with configurable max depth."""

@mcp_tool
def memory_linked_search(function_name: str) -> Dict:
    """CODITECT UNIQUE: Search with memory context.
    Returns: function info + callers + callees +
    related messages from session history +
    related architectural decisions from org.db."""

@mcp_tool
def call_graph_stats() -> Dict:
    """Get call graph database statistics.
    Returns function count, edge count, file count, language breakdown."""
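The call_chain tool is described as a BFS traversal with a configurable depth limit. A minimal sketch of that idea over an in-memory adjacency map is shown here. The function name find_call_chains, the dictionary representation of the graph, and the sample function names are assumptions; the actual server reads edges from kg_edges and may differ in how it handles cycles and result limits.

```python
from collections import deque
from typing import Dict, List


def find_call_chains(
    calls: Dict[str, List[str]], from_func: str, to_func: str, max_depth: int = 5
) -> List[List[str]]:
    """Return call paths from from_func to to_func using at most max_depth edges."""
    paths: List[List[str]] = []
    queue = deque([[from_func]])
    while queue:
        path = queue.popleft()
        current = path[-1]
        if current == to_func and len(path) > 1:
            paths.append(path)  # reached the target; do not extend further
            continue
        if len(path) - 1 >= max_depth:
            continue  # depth limit reached on this branch
        for callee in calls.get(current, []):
            if callee not in path:  # skip cycles and direct recursion
                queue.append(path + [callee])
    return paths


# Hypothetical call graph: caller -> list of callees.
graph = {
    "call_tool": ["hybrid_search"],
    "hybrid_search": ["fts5_search", "vector_search"],
    "vector_search": ["embed"],
}
print(find_call_chains(graph, "call_tool", "embed"))
print(find_call_chains(graph, "call_tool", "embed", max_depth=2))
```

Because the queue is processed in FIFO order, shorter chains are reported before longer ones. Enumerating every simple path can grow quickly on dense graphs, which is one reason a depth limit is useful.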
Tree-sitter Parsing Strategy
| Language | Parser | Function Patterns |
|---|---|---|
| Python | tree-sitter-python | function_definition, class methods, decorators |
| JavaScript | tree-sitter-javascript | function_declaration, arrow functions, methods |
| TypeScript | tree-sitter-javascript | Same as JS with type annotation awareness |
Call Detection:
- Direct function calls: foo()
- Method calls: self.foo(), obj.foo()
- Chained calls: a.b.c() → resolves to c
- Constructor calls: Foo()
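The name-based resolution rules above can be illustrated with a small sketch. To keep it runnable without extra dependencies it uses Python's standard ast module as a stand-in for tree-sitter, so it covers Python sources only and is not the server's parser. The function name extract_calls and the sample source are hypothetical.

```python
import ast
from typing import List, Tuple


def extract_calls(source: str) -> List[Tuple[str, int]]:
    """Return (callee_name, line) for each call, resolved by name only."""
    calls = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if isinstance(func, ast.Name):         # foo() and Foo()
            calls.append((func.id, node.lineno))
        elif isinstance(func, ast.Attribute):  # self.foo(), a.b.c() resolves to c
            calls.append((func.attr, node.lineno))
    # Sort by line, then name, so the output order is deterministic.
    return sorted(calls, key=lambda call: (call[1], call[0]))


sample = """
def run(self, obj, a):
    foo()
    self.bar()
    a.b.c()
    Widget()
"""
print(extract_calls(sample))
# [('foo', 3), ('bar', 4), ('c', 5), ('Widget', 6)]
```

Only the final attribute name is recorded, so two unrelated methods that share a name are indistinguishable at this stage, which is the source of the false positives noted under Consequences.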
Value Proposition
What Competitors Provide (Table Stakes)
| Capability | Cursor | JetBrains AI | Code Pathfinder |
|---|---|---|---|
| AST parsing | Yes | Yes | Yes |
| Call graph | Limited | Yes | Yes |
| Caller/callee lookup | No | Yes | Yes |
| MCP integration | No | No | Yes |
What CODITECT Uniquely Provides
| Capability | CODITECT | Why It Matters |
|---|---|---|
| Memory-linked search | Yes | "When was this function last modified across sessions?" |
| Decision awareness | Yes | "Which ADRs constrain changes to this function?" |
| Error history | Yes | "What past bugs involved this function?" |
| Unified knowledge graph | Yes | Functions connected to components, decisions, sessions |
| Cross-session stability | Yes | Deterministic node IDs persist across reindexing |
Example: memory_linked_search
No competitor can answer this query in a single call:
{
"function_name": "hybrid_search",
"functions": [
{"name": "hybrid_search", "file_path": "/path/to/server.py",
"start_line": 241, "signature": "def hybrid_search(query, limit=20, ...)"}
],
"callers": [
{"name": "call_tool", "file_path": "...", "call_line": 611}
],
"callees": [
{"callee_name": "fts5_search"},
{"callee_name": "vector_search"}
],
"related_messages": [
{"content_preview": "Implementing hybrid search with RRF...",
"timestamp": "2026-01-17T10:00:00Z", "session_id": "abc123"}
],
"related_decisions": [
{"decision": "Use RRF fusion for hybrid search",
"rationale": "Combines FTS5 precision with vector recall..."}
]
}
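A rough sketch of how a response of this shape could be assembled from already-fetched rows. Everything here is an assumption for illustration: the helper name build_result, the input row layouts, the callee line numbers, and the plain substring match, which stands in for whatever message and decision lookup the server actually performs.

```python
from typing import Dict, List


def _norm(text: str) -> str:
    """Lowercase and treat underscores as spaces so hybrid_search matches 'hybrid search'."""
    return text.lower().replace("_", " ")


def build_result(
    function_name: str,
    functions: List[Dict],
    edges: List[Dict],
    messages: List[Dict],
    decisions: List[Dict],
) -> Dict:
    """Combine call graph rows with matching messages and decisions."""
    term = _norm(function_name)
    return {
        "function_name": function_name,
        "functions": [f for f in functions if f["name"] == function_name],
        "callers": [
            {"name": e["caller"], "call_line": e["call_line"]}
            for e in edges if e["callee"] == function_name
        ],
        "callees": [
            {"callee_name": e["callee"]}
            for e in edges if e["caller"] == function_name
        ],
        "related_messages": [m for m in messages if term in _norm(m["content_preview"])],
        "related_decisions": [d for d in decisions if term in _norm(d["decision"])],
    }


# Sample rows; values not present in the example response above are hypothetical.
functions = [{"name": "hybrid_search", "start_line": 241}]
edges = [
    {"caller": "call_tool", "callee": "hybrid_search", "call_line": 611},
    {"caller": "hybrid_search", "callee": "fts5_search", "call_line": 250},
    {"caller": "hybrid_search", "callee": "vector_search", "call_line": 251},
]
messages = [
    {"content_preview": "Implementing hybrid search with RRF...", "session_id": "abc123"},
    {"content_preview": "Fixing the login form", "session_id": "def456"},
]
decisions = [{"decision": "Use RRF fusion for hybrid search"}]

result = build_result("hybrid_search", functions, edges, messages, decisions)
print([c["callee_name"] for c in result["callees"]])
print(len(result["related_messages"]), len(result["related_decisions"]))
```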
How coditect-core Utilizes This Server
Direct MCP Usage
Claude Code invokes these tools via MCP during development sessions:
- get_callers/get_callees for understanding code structure before changes
- call_chain for tracing execution paths during debugging
- memory_linked_search for recalling past context about functions
Integration with Other MCP Servers
| Server | Integration |
|---|---|
| mcp-impact-analysis (ADR-203) | Uses call graph data for blast radius calculation |
| mcp-semantic-search (ADR-080) | Memory-linked search queries messages table |
| mcp-unified-gateway (ADR-204) | Routes call graph tools through single endpoint |
| mcp-context-graph (ADR-151) | Shares kg_nodes/kg_edges in org.db |
Automated Workflows
- Pre-commit hooks: Can trigger reindexing of changed files
- /cx --reindex: Manual reindex command
- Health monitoring: mcp-health-check.py validates server availability
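As a sketch of the pre-commit case, a hook could filter the changed paths down to the extensions the parser handles before asking the server to reindex them. The extension list follows the parser box in the architecture diagram. The function name files_to_reindex, the mapping constant, and the sample paths are assumptions; how the hook then invokes index_file is not specified in this ADR.

```python
from pathlib import Path
from typing import Dict, Iterable

# Extension -> language subtype, following the extensions in the architecture diagram.
LANGUAGE_BY_EXTENSION = {
    ".py": "python",
    ".js": "javascript",
    ".jsx": "javascript",
    ".mjs": "javascript",
    ".cjs": "javascript",
    ".ts": "typescript",
    ".tsx": "typescript",
}


def files_to_reindex(changed_paths: Iterable[str]) -> Dict[str, str]:
    """Map each changed file the parser supports to its language subtype."""
    selected = {}
    for raw in changed_paths:
        language = LANGUAGE_BY_EXTENSION.get(Path(raw).suffix.lower())
        if language is not None:
            selected[raw] = language
    return selected


# Hypothetical list of changed files.
changed = ["tools/mcp-call-graph/server.py", "README.md", "web/app.tsx"]
print(files_to_reindex(changed))
# {'tools/mcp-call-graph/server.py': 'python', 'web/app.tsx': 'typescript'}
```

In a hook, the list of changed paths could come from the output of git diff --cached --name-only.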
Performance Characteristics
| Operation | Latency | Notes |
|---|---|---|
| File indexing | ~50ms/file | Tree-sitter native parsing |
| Caller/callee lookup | <10ms | Indexed queries on kg_edges |
| Call chain (depth 5) | ~50ms | BFS traversal with depth limit |
| Memory-linked search | ~100ms | Includes message + decision queries |
| Batch index (441 files) | ~22s | 5,590 functions, 55,548 edges |
Database Stats (Current):
- Functions indexed: 5,590
- Call edges: 55,548
- Files indexed: 441
- Languages: Python, JavaScript, TypeScript
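The reported figures can be cross-checked with a few lines of arithmetic. The inputs are the numbers from the table and stats above (with the batch time taken as an approximate 22 seconds); the derived averages are not stated in the source and are only as precise as those inputs.

```python
# Figures reported in this ADR.
files, functions, edges = 441, 5_590, 55_548
batch_seconds = 22  # approximate batch indexing time

ms_per_file = batch_seconds * 1000 / files
functions_per_file = functions / files
edges_per_function = edges / functions

print(round(ms_per_file, 1))         # 49.9, consistent with ~50ms/file
print(round(functions_per_file, 1))  # 12.7
print(round(edges_per_function, 1))  # 9.9

# Compare against performance goals P1 and P4.
print(ms_per_file < 100, functions >= 5_000, files >= 400)  # True True True
```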
Consequences
Positive
- P1: Unique competitive moat — no competitor has memory-linked code navigation
- P2: Knowledge graph integration enables cross-domain queries (code → decisions → sessions)
- P3: Deterministic node IDs provide stability across reindexing
- P4: Tree-sitter provides sub-100ms parsing for interactive use
Negative
- N1: Tree-sitter grammars must be maintained per language
- N2: Call detection is name-based (not type-resolved) — may have false positives
- N3: Cross-language calls not yet tracked (Python calling JS)
- N4: Reindexing required on file changes (no automatic file watching)
Dependencies
mcp>=0.9.0 # MCP Python SDK
tree-sitter>=0.21.0 # AST parsing engine
tree-sitter-python>=0.21.0
tree-sitter-javascript>=0.21.0
Files
| File | Purpose |
|---|---|
| tools/mcp-call-graph/server.py | MCP server implementation (1335 lines) |
| tools/mcp-call-graph/CLAUDE.md | Tool documentation |
| tests/tools/test_mcp_call_graph.py | Unit tests (22 tests) |
Validation
- MCP server starts and registers 7 tools
- index_file parses Python/JS/TS with tree-sitter
- get_callers returns direct callers with file/line references
- get_callees returns called functions
- call_chain finds paths between functions via BFS
- memory_linked_search returns call graph + messages + decisions
- call_graph_stats returns database metrics
- All 22 unit tests passing
- ADR-151 Phase 4 compliance (kg_nodes/kg_edges in org.db)
Status: ACCEPTED
Decision Date: 2026-01-17
Implementation Status: COMPLETE
Task ID: H.5.5.2, H.12.1
Maintainer: CODITECT Core Team
Review Date: 2026-04-15 (quarterly review)