ADR-202: MCP Call Graph — Memory-Linked Code Navigation

Document: ADR-202-mcp-call-graph-memory-linked-navigation
Version: 1.0.0
Purpose: Document architectural decision for memory-linked call graph MCP server
Audience: Framework contributors, system architects
Date Created: 2026-01-17
Last Updated: 2026-02-15
Status: ACCEPTED
Task ID: H.5.5.2, H.12.1
Related ADRs: ADR-080 (Semantic Search), ADR-118 (Database Architecture), ADR-151 (Context Graph Evolution)
Related Documents:
  - tools/mcp-call-graph/server.py
  - tools/mcp-call-graph/CLAUDE.md
  - tests/tools/test_mcp_call_graph.py

Context and Problem Statement

Background

AI coding assistants need to understand code structure to provide meaningful suggestions, refactoring advice, and impact analysis. AST parsing and call graph navigation are now table stakes: Cursor, JetBrains AI, and Code Pathfinder all offer them in some form. However, no competitor links code navigation to session memory and architectural decisions.

CODITECT's existing infrastructure provides:

Session memory (ADR-020, ADR-021): Cross-session message history with semantic search
Decision tracking (org.db): Architectural decisions with rationale and confidence
Error knowledge base (org.db): Error signatures with verified solutions
Unified knowledge graph (ADR-151): kg_nodes and kg_edges in org.db

Problem Statement

How do we provide call graph navigation that goes beyond competitors by integrating with CODITECT's memory and decision systems, creating a uniquely valuable developer experience?

Requirements

Must-Have:

Tree-sitter AST parsing for Python, JavaScript, TypeScript
Function indexing with signatures, docstrings, class context
Caller/callee lookup with file and line references
Call chain discovery between arbitrary functions
MCP protocol compliance for Claude Code integration

Should-Have:

Memory-linked search combining call graph with session history
Integration with architectural decisions (org.db)
Unified knowledge graph storage (ADR-151 Phase 4)
Deterministic node IDs for cross-session stability

Nice-to-Have:

Incremental re-indexing on file change
Cross-language call tracking
Interactive call graph visualization

Decision Drivers

Technical Constraints

T1: Must use tree-sitter for fast, accurate AST parsing (not regex-based)
T2: Must integrate with ADR-118 database architecture (org.db TIER 2, sessions.db TIER 3)
T3: Must produce deterministic node IDs for knowledge graph consistency
T4: Must support ADR-151 Phase 4 unified knowledge graph schema

Performance Goals

P1: File indexing under 100ms per file
P2: Caller/callee lookup under 10ms (indexed)
P3: Memory-linked search under 200ms (includes message search)
P4: Index 5,000+ functions across 400+ files

Competitive Differentiation

CD1: Memory context — "when was this function last discussed?"
CD2: Decision awareness — "which ADRs constrain changes to this function?"
CD3: Error history — "what past issues involved this function?"

Considered Options

Option A: Wrapper Around Existing AST Tools

Description: Wrap tree-sitter or Code Pathfinder's existing call graph output with MCP endpoints.

Pros: Fast implementation, proven AST parsing

Cons: No memory integration, no decision awareness, commodity feature

Rejected: Provides no competitive differentiation over Cursor or JetBrains AI.

Option B: Memory-Linked Call Graph (SELECTED)

Description: Custom call graph MCP server with tree-sitter parsing that integrates with CODITECT's memory system, decision tracking, and unified knowledge graph.

Pros:

Unique value: memory + decisions + call graph in one query
Builds on existing infrastructure (org.db, sessions.db)
Unified knowledge graph enables cross-domain queries
Deterministic node IDs enable cross-session stability

Cons:

More complex implementation
Requires ongoing knowledge graph maintenance

Selected: Only option that provides competitive differentiation.

Decision Outcome

CHOSEN: Option B — Memory-Linked Call Graph

Rationale

Competitive moat: No competitor links call graphs to session memory and ADRs
Unified knowledge graph: ADR-151 provides the foundation for cross-domain queries
Incremental value: Each new MCP server adds edges to the same knowledge graph
Developer experience: "What decisions affected this function?" is uniquely answerable

Technical Implementation

Architecture

┌─────────────────────────────────────────────────────────────────┐
│                    MCP CALL GRAPH SERVER                          │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│  MCP Client (Claude Code)                                        │
│       │                                                          │
│       ▼                                                          │
│  ┌─────────────────────────────────────────────────────────────┐│
│  │ MCP Tools                                                   ││
│  │ • index_file          - Index single source file             ││
│  │ • index_directory     - Batch index a directory              ││
│  │ • get_callers         - Who calls this function?             ││
│  │ • get_callees         - What does this function call?        ││
│  │ • call_chain          - Find paths between functions         ││
│  │ • memory_linked_search - CODITECT UNIQUE (call+memory+ADR)  ││
│  │ • call_graph_stats    - Database statistics                  ││
│  └─────────────────────────────────────────────────────────────┘│
│       │                                                          │
│       ▼                                                          │
│  ┌─────────────────────────────────────────────────────────────┐│
│  │ CallGraphParser (tree-sitter)                               ││
│  │                                                              ││
│  │  Python (.py)       JavaScript (.js, .jsx, .mjs, .cjs)      ││
│  │  TypeScript (.ts, .tsx)                                      ││
│  │                                                              ││
│  │  Extracts:                                                   ││
│  │  • Functions (name, signature, docstring, class context)     ││
│  │  • Call edges (caller → callee with line references)         ││
│  │  • Node IDs (SHA-256 deterministic)                          ││
│  └─────────────────────────────────────────────────────────────┘│
│       │                     │                                    │
│       ▼                     ▼                                    │
│  ┌──────────────┐    ┌──────────────┐                            │
│  │   org.db      │    │ sessions.db  │                            │
│  │   (TIER 2)    │    │ (TIER 3)     │                            │
│  │               │    │              │                            │
│  │ kg_nodes      │    │ call_graph   │                            │
│  │ kg_edges      │    │ _memory      │                            │
│  │ decisions     │    │ messages     │                            │
│  └──────────────┘    └──────────────┘                            │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘

Database Schema

Unified Knowledge Graph — org.db (ADR-151 Phase 4):

-- Function nodes in the unified knowledge graph
CREATE TABLE kg_nodes (
    id TEXT PRIMARY KEY,           -- Format: function:{file_path}:{name}
    node_type TEXT,                -- 'function'
    subtype TEXT,                  -- 'python', 'javascript', 'typescript'
    name TEXT,                     -- Display name (ClassName.method or function_name)
    properties TEXT                -- JSON: {file_path, line_start, line_end,
                                   --        signature, docstring, class_name, node_id}
);

-- CALLS edges between functions
CREATE TABLE kg_edges (
    id TEXT PRIMARY KEY,           -- Format: CALLS:{from_node_id}:{to_node_id}
    edge_type TEXT,                -- 'CALLS'
    from_node TEXT,                -- FK to kg_nodes (caller)
    to_node TEXT,                  -- FK to kg_nodes (callee)
    properties TEXT                -- JSON: {call_count, call_contexts[]}
);

Session Memory — sessions.db (Ephemeral):

-- Memory links: which functions were discussed in which sessions
CREATE TABLE call_graph_memory (
    node_id TEXT,                  -- Function node reference
    message_id INTEGER,            -- FK to messages table
    session_id TEXT,               -- Session identifier
    change_type TEXT               -- 'created', 'modified', 'discussed', 'reviewed'
);
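
To make the caller lookup concrete, here is a minimal sketch that builds the two knowledge-graph tables in an in-memory SQLite database and answers "who calls this function?" with a single join. The table and column names come from the schema above; the sample rows, the helper functions, and the `line` key inside `call_contexts` are illustrative assumptions, not the server's actual code.

```python
import json
import sqlite3

# DDL mirrors the columns listed in this ADR; the real org.db may define more.
SCHEMA = """
CREATE TABLE kg_nodes (
    id TEXT PRIMARY KEY,
    node_type TEXT,
    subtype TEXT,
    name TEXT,
    properties TEXT
);
CREATE TABLE kg_edges (
    id TEXT PRIMARY KEY,
    edge_type TEXT,
    from_node TEXT,
    to_node TEXT,
    properties TEXT
);
"""


def add_function(conn, file_path, name, language):
    """Insert a function node using the function:{file_path}:{name} ID format."""
    node_id = f"function:{file_path}:{name}"
    conn.execute(
        "INSERT INTO kg_nodes VALUES (?, 'function', ?, ?, ?)",
        (node_id, language, name, json.dumps({"file_path": file_path})),
    )
    return node_id


def add_call(conn, caller_id, callee_id, call_line):
    """Insert a CALLS edge using the CALLS:{from_node_id}:{to_node_id} ID format."""
    # The 'line' key inside call_contexts is an assumed shape.
    props = {"call_count": 1, "call_contexts": [{"line": call_line}]}
    conn.execute(
        "INSERT INTO kg_edges VALUES (?, 'CALLS', ?, ?, ?)",
        (f"CALLS:{caller_id}:{callee_id}", caller_id, callee_id, json.dumps(props)),
    )


def find_callers(conn, function_name):
    """Return the names of all functions with a CALLS edge into function_name."""
    rows = conn.execute(
        """
        SELECT caller.name
        FROM kg_edges AS e
        JOIN kg_nodes AS caller ON caller.id = e.from_node
        JOIN kg_nodes AS callee ON callee.id = e.to_node
        WHERE e.edge_type = 'CALLS' AND callee.name = ?
        ORDER BY caller.name
        """,
        (function_name,),
    ).fetchall()
    return [row[0] for row in rows]


# Hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
search = add_function(conn, "server.py", "hybrid_search", "python")
tool = add_function(conn, "server.py", "call_tool", "python")
fts = add_function(conn, "server.py", "fts5_search", "python")
add_call(conn, tool, search, 611)
add_call(conn, search, fts, 250)
callers = find_callers(conn, "hybrid_search")
print(callers)
```

Reversing the join direction (filtering on `caller.name` and selecting `callee.name`) gives the matching callee lookup.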

Node ID Generation

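Deterministic SHA-256 IDs ensure cross-session stability. Because the hash input includes start_line, an ID stays the same across reindexing only while the function's file path, name, and starting line are unchanged: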

import hashlib


def generate_node_id(file_path: str, name: str, start_line: int) -> str:
    """Generate deterministic node ID using SHA-256."""
    content = f"{file_path}:{name}:{start_line}"
    return hashlib.sha256(content.encode()).hexdigest()[:16]

This approach matches Code Pathfinder's strategy (SHA-256 for determinism) while adding memory linkage.
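
A quick check of this function, as a sketch with made-up inputs: the body is copied from above, and the file path and line numbers are hypothetical. Identical inputs always produce the same 16-character ID, while a different `start_line` produces a different one.

```python
import hashlib


def generate_node_id(file_path: str, name: str, start_line: int) -> str:
    """Generate deterministic node ID using SHA-256 (copied from this ADR)."""
    content = f"{file_path}:{name}:{start_line}"
    return hashlib.sha256(content.encode()).hexdigest()[:16]


# Hypothetical inputs, chosen only for illustration.
first = generate_node_id("tools/example.py", "hybrid_search", 241)
again = generate_node_id("tools/example.py", "hybrid_search", 241)
moved = generate_node_id("tools/example.py", "hybrid_search", 245)

print(first == again)  # identical inputs give an identical ID
print(first == moved)  # a different start line gives a different ID
print(len(first))      # the hex digest is truncated to 16 characters
```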

MCP Tool Definitions

from typing import Dict, List, Optional

# mcp_tool is the server's tool-registration decorator; its definition is not shown in this ADR.

@mcp_tool
def index_file(file_path: str) -> Dict:
    """Index a single source file into the call graph.
    Parses functions and call edges using tree-sitter AST."""

@mcp_tool
def index_directory(dir_path: str, extensions: Optional[List[str]] = None) -> Dict:
    """Index all source files in a directory.
    Default extensions: .py, .js, .ts"""

@mcp_tool
def get_callers(function_name: str, file_path: Optional[str] = None) -> Dict:
    """Find all functions that call the given function.
    Returns caller name, file path, and call line number."""

@mcp_tool
def get_callees(function_name: str, file_path: Optional[str] = None) -> Dict:
    """Find all functions called by the given function.
    Returns callee name, file path, and definition location."""

@mcp_tool
def call_chain(from_func: str, to_func: str, max_depth: int = 5) -> Dict:
    """Find call paths from one function to another.
    Uses BFS traversal with configurable max depth."""

@mcp_tool
def memory_linked_search(function_name: str) -> Dict:
    """CODITECT UNIQUE: Search with memory context.
    Returns: function info + callers + callees +
    related messages from session history +
    related architectural decisions from org.db."""

@mcp_tool
def call_graph_stats() -> Dict:
    """Get call graph database statistics.
    Returns function count, edge count, file count, language breakdown."""
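
The `call_chain` docstring mentions BFS traversal with a depth limit. A minimal sketch of that traversal over an in-memory adjacency map follows. The `find_call_chain` name, the dict-based graph, and the sample edges are assumptions for illustration (the server itself would read edges from `kg_edges`), and `max_depth` is interpreted here as the maximum number of call edges in a path.

```python
from collections import deque
from typing import Dict, List, Optional


def find_call_chain(
    graph: Dict[str, List[str]],
    from_func: str,
    to_func: str,
    max_depth: int = 5,
) -> Optional[List[str]]:
    """Return a shortest call path from from_func to to_func, or None.

    max_depth is treated as the maximum number of call edges in the path.
    """
    if from_func == to_func:
        return [from_func]
    queue = deque([[from_func]])
    visited = {from_func}
    while queue:
        path = queue.popleft()
        if len(path) - 1 >= max_depth:
            continue  # extending this path would exceed the depth limit
        for callee in graph.get(path[-1], []):
            if callee == to_func:
                return path + [callee]
            if callee not in visited:
                visited.add(callee)
                queue.append(path + [callee])
    return None


# Hypothetical call edges: caller -> list of callees.
calls = {
    "call_tool": ["hybrid_search"],
    "hybrid_search": ["fts5_search", "vector_search"],
    "vector_search": ["embed_query"],
}
chain = find_call_chain(calls, "call_tool", "embed_query")
too_shallow = find_call_chain(calls, "call_tool", "embed_query", max_depth=2)
print(chain)
print(too_shallow)
```

Because BFS explores paths in order of length, the first path that reaches the target is also a shortest one, and the visited set keeps recursive or cyclic calls from looping.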

Tree-sitter Parsing Strategy

Language	Parser	Function Patterns
Python	`tree-sitter-python`	`function_definition`, class methods, decorators
JavaScript	`tree-sitter-javascript`	`function_declaration`, arrow functions, methods
TypeScript	`tree-sitter-javascript`	Same as JS with type annotation awareness

Call Detection:

Direct function calls: foo()
Method calls: self.foo(), obj.foo()
Chained calls: a.b.c() → resolves to c
Constructor calls: Foo()
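
The four call shapes above all reduce to taking the final name in the call expression. The sketch below illustrates that rule with Python's standard `ast` module instead of tree-sitter, purely so that it runs without extra dependencies; `callee_names` is a hypothetical helper, not part of the server. It also shows why resolution is name-based: `self.foo()` and `obj.foo()` both yield `foo`, regardless of the receiver's type.

```python
import ast
from typing import List


def callee_names(source: str) -> List[str]:
    """Collect the called name for every call expression, in source order."""
    calls = [n for n in ast.walk(ast.parse(source)) if isinstance(n, ast.Call)]
    calls.sort(key=lambda n: (n.lineno, n.col_offset))
    names = []
    for call in calls:
        func = call.func
        if isinstance(func, ast.Name):         # foo() or Foo()
            names.append(func.id)
        elif isinstance(func, ast.Attribute):  # self.foo(), obj.foo(), a.b.c()
            names.append(func.attr)
    return names


sample = """
def run(self, obj, a):
    foo()
    self.foo()
    obj.foo()
    a.b.c()
    Foo()
"""
detected = callee_names(sample)
print(detected)
```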

Value Proposition

What Competitors Provide (Table Stakes)

Capability	Cursor	JetBrains AI	Code Pathfinder
AST parsing	Yes	Yes	Yes
Call graph	Limited	Yes	Yes
Caller/callee lookup	No	Yes	Yes
MCP integration	No	No	Yes

What CODITECT Uniquely Provides

Capability	CODITECT	Why It Matters
Memory-linked search	Yes	"When was this function last modified across sessions?"
Decision awareness	Yes	"Which ADRs constrain changes to this function?"
Error history	Yes	"What past bugs involved this function?"
Unified knowledge graph	Yes	Functions connected to components, decisions, sessions
Cross-session stability	Yes	Deterministic node IDs persist across reindexing

Example: memory_linked_search

No competitor can answer this query in a single call:

{
  "function_name": "hybrid_search",
  "functions": [
    {"name": "hybrid_search", "file_path": "/path/to/server.py",
     "start_line": 241, "signature": "def hybrid_search(query, limit=20, ...)"}
  ],
  "callers": [
    {"name": "call_tool", "file_path": "...", "call_line": 611}
  ],
  "callees": [
    {"callee_name": "fts5_search"},
    {"callee_name": "vector_search"}
  ],
  "related_messages": [
    {"content_preview": "Implementing hybrid search with RRF...",
     "timestamp": "2026-01-17T10:00:00Z", "session_id": "abc123"}
  ],
  "related_decisions": [
    {"decision": "Use RRF fusion for hybrid search",
     "rationale": "Combines FTS5 precision with vector recall..."}
  ]
}
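
How the `related_messages` part of such a response could be assembled is sketched below as a join from `call_graph_memory` to `messages` in an in-memory SQLite database. The `call_graph_memory` columns follow the schema in this ADR; the `messages` columns (`id`, `content`, `timestamp`), the `related_messages` helper, the 40-character preview length, the node ID value, and the sample rows are all assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE messages (          -- assumed columns, not the real sessions.db schema
        id INTEGER PRIMARY KEY,
        content TEXT,
        timestamp TEXT
    );
    CREATE TABLE call_graph_memory ( -- columns as listed in this ADR
        node_id TEXT,
        message_id INTEGER,
        session_id TEXT,
        change_type TEXT
    );
    """
)
conn.executemany(
    "INSERT INTO messages VALUES (?, ?, ?)",
    [
        (1, "Implementing hybrid search with RRF fusion", "2026-01-17T10:00:00Z"),
        (2, "Unrelated note about logging", "2026-01-18T09:00:00Z"),
    ],
)
conn.execute(
    "INSERT INTO call_graph_memory VALUES (?, ?, ?, ?)",
    ("node-hybrid-search", 1, "abc123", "discussed"),
)


def related_messages(conn, node_id, limit=5, preview_chars=40):
    """Return the newest messages linked to a function node, as dicts."""
    rows = conn.execute(
        """
        SELECT substr(m.content, 1, ?), m.timestamp, l.session_id, l.change_type
        FROM call_graph_memory AS l
        JOIN messages AS m ON m.id = l.message_id
        WHERE l.node_id = ?
        ORDER BY m.timestamp DESC
        LIMIT ?
        """,
        (preview_chars, node_id, limit),
    ).fetchall()
    keys = ("content_preview", "timestamp", "session_id", "change_type")
    return [dict(zip(keys, row)) for row in rows]


linked = related_messages(conn, "node-hybrid-search")
print(linked)
```

In the real server the two tables live in sessions.db while decisions live in org.db, so the `related_decisions` part would need a second query against a different database file.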

How coditect-core Utilizes This Server

Direct MCP Usage

Claude Code invokes these tools via MCP during development sessions:

get_callers / get_callees for understanding code structure before changes
call_chain for tracing execution paths during debugging
memory_linked_search for recalling past context about functions

Integration with Other MCP Servers

Server	Integration
mcp-impact-analysis (ADR-203)	Uses call graph data for blast radius calculation
mcp-semantic-search (ADR-080)	Memory-linked search queries messages table
mcp-unified-gateway (ADR-204)	Routes call graph tools through single endpoint
mcp-context-graph (ADR-151)	Shares kg_nodes/kg_edges in org.db

Automated Workflows

Pre-commit hooks: Can trigger reindexing of changed files
/cx --reindex: Manual reindex command
Health monitoring: mcp-health-check.py validates server availability
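
For the pre-commit path, a hook would only need to pass files the parser supports to `index_file`. The sketch below shows that filtering step. The extension set is taken from the architecture diagram, while `files_to_reindex` and the sample paths are hypothetical; how the hook obtains the list of changed files is left out.

```python
from pathlib import PurePosixPath
from typing import Iterable, List

# Extensions listed for CallGraphParser in the architecture diagram.
SUPPORTED_EXTENSIONS = {".py", ".js", ".jsx", ".mjs", ".cjs", ".ts", ".tsx"}


def files_to_reindex(changed_paths: Iterable[str]) -> List[str]:
    """Keep only changed files whose extension the call graph parser handles."""
    return [
        path
        for path in changed_paths
        if PurePosixPath(path).suffix.lower() in SUPPORTED_EXTENSIONS
    ]


# Hypothetical list of files touched by a commit.
changed = ["tools/mcp-call-graph/server.py", "README.md", "web/app.tsx", "docs/adr.txt"]
selected = files_to_reindex(changed)
print(selected)
```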

Performance Characteristics

Operation	Latency	Notes
File indexing	~50ms/file	Tree-sitter native parsing
Caller/callee lookup	<10ms	Indexed queries on kg_edges
Call chain (depth 5)	~50ms	BFS traversal with depth limit
Memory-linked search	~100ms	Includes message + decision queries
Batch index (441 files)	~22s	5,590 functions, 55,548 edges

Database Stats (Current):

Functions indexed: 5,590
Call edges: 55,548
Files indexed: 441
Languages: Python, JavaScript, TypeScript
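
The batch figures can be cross-checked against the per-file estimate with simple arithmetic. The inputs below are the numbers reported in this section; the derived values are rounded to one decimal place.

```python
files = 441
functions = 5_590
edges = 55_548
batch_seconds = 22  # reported as approximately 22 s

ms_per_file = batch_seconds * 1000 / files
functions_per_file = functions / files
edges_per_function = edges / functions

print(f"{ms_per_file:.1f} ms per file")                      # 49.9 ms per file
print(f"{functions_per_file:.1f} functions per file")        # 12.7 functions per file
print(f"{edges_per_function:.1f} call edges per function")   # 9.9 call edges per function
```

The average of roughly 49.9 ms per file agrees with the ~50ms/file figure in the table.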

Consequences

Positive

P1: Unique competitive moat — no competitor has memory-linked code navigation
P2: Knowledge graph integration enables cross-domain queries (code → decisions → sessions)
P3: Deterministic node IDs provide stability across reindexing
P4: Tree-sitter provides sub-100ms parsing for interactive use

Negative

N1: Tree-sitter grammars must be maintained per language
N2: Call detection is name-based (not type-resolved), so same-named functions in different classes or modules can produce false-positive edges
N3: Cross-language calls not yet tracked (Python calling JS)
N4: Reindexing required on file changes (no automatic file watching)

Dependencies

mcp>=0.9.0              # MCP Python SDK
tree-sitter>=0.21.0     # AST parsing engine
tree-sitter-python>=0.21.0
tree-sitter-javascript>=0.21.0

Files

File	Purpose
`tools/mcp-call-graph/server.py`	MCP server implementation (1335 lines)
`tools/mcp-call-graph/CLAUDE.md`	Tool documentation
`tests/tools/test_mcp_call_graph.py`	Unit tests (22 tests)

Validation

MCP server starts and registers 7 tools
index_file parses Python/JS/TS with tree-sitter
get_callers returns direct callers with file/line references
get_callees returns called functions
call_chain finds paths between functions via BFS
memory_linked_search returns call graph + messages + decisions
call_graph_stats returns database metrics
All 22 unit tests passing
ADR-151 Phase 4 compliance (kg_nodes/kg_edges in org.db)

Status: ACCEPTED
Decision Date: 2026-01-17
Implementation Status: COMPLETE
Task ID: H.5.5.2, H.12.1
Maintainer: CODITECT Core Team
Review Date: 2026-04-15 (quarterly review)
