
ADR-202: MCP Call Graph — Memory-Linked Code Navigation

Document: ADR-202-mcp-call-graph-memory-linked-navigation
Version: 1.0.0
Purpose: Document architectural decision for memory-linked call graph MCP server
Audience: Framework contributors, system architects
Date Created: 2026-01-17
Last Updated: 2026-02-15
Status: ACCEPTED
Task ID: H.5.5.2, H.12.1
Related ADRs: ADR-080 (Semantic Search), ADR-118 (Database Architecture), ADR-151 (Context Graph Evolution)
Related Documents:
- tools/mcp-call-graph/server.py
- tools/mcp-call-graph/CLAUDE.md
- tests/tools/test_mcp_call_graph.py

Context and Problem Statement

Background

AI coding assistants need to understand code structure to provide meaningful suggestions, refactoring advice, and impact analysis. The industry standard — AST parsing and call graph navigation — is now table stakes (Cursor, JetBrains AI, Code Pathfinder all offer it). However, no competitor links code navigation to session memory and architectural decisions.

CODITECT's existing infrastructure provides:

  1. Session memory (ADR-020, ADR-021): Cross-session message history with semantic search
  2. Decision tracking (org.db): Architectural decisions with rationale and confidence
  3. Error knowledge base (org.db): Error signatures with verified solutions
  4. Unified knowledge graph (ADR-151): kg_nodes and kg_edges in org.db

Problem Statement

How do we provide call graph navigation that goes beyond competitors by integrating with CODITECT's memory and decision systems, creating a uniquely valuable developer experience?

Requirements

Must-Have:

  • Tree-sitter AST parsing for Python, JavaScript, TypeScript
  • Function indexing with signatures, docstrings, class context
  • Caller/callee lookup with file and line references
  • Call chain discovery between arbitrary functions
  • MCP protocol compliance for Claude Code integration

Should-Have:

  • Memory-linked search combining call graph with session history
  • Integration with architectural decisions (org.db)
  • Unified knowledge graph storage (ADR-151 Phase 4)
  • Deterministic node IDs for cross-session stability

Nice-to-Have:

  • Incremental re-indexing on file change
  • Cross-language call tracking
  • Interactive call graph visualization

Decision Drivers

Technical Constraints

  • T1: Must use tree-sitter for fast, accurate AST parsing (not regex-based)
  • T2: Must integrate with ADR-118 database architecture (org.db TIER 2, sessions.db TIER 3)
  • T3: Must produce deterministic node IDs for knowledge graph consistency
  • T4: Must support ADR-151 Phase 4 unified knowledge graph schema

Performance Goals

  • P1: File indexing under 100ms per file
  • P2: Caller/callee lookup under 10ms (indexed)
  • P3: Memory-linked search under 200ms (includes message search)
  • P4: Index 5,000+ functions across 400+ files

Competitive Differentiation

  • CD1: Memory context — "when was this function last discussed?"
  • CD2: Decision awareness — "which ADRs constrain changes to this function?"
  • CD3: Error history — "what past issues involved this function?"

Considered Options

Option A: Wrapper Around Existing AST Tools

Description: Wrap tree-sitter or Code Pathfinder's existing call graph output with MCP endpoints.

Pros: Fast implementation, proven AST parsing

Cons: No memory integration, no decision awareness, commodity feature

Rejected: Provides no competitive differentiation over Cursor or JetBrains AI.

Option B: Memory-Linked Call Graph (SELECTED)

Description: Custom call graph MCP server with tree-sitter parsing that integrates with CODITECT's memory system, decision tracking, and unified knowledge graph.

Pros:

  • Unique value: memory + decisions + call graph in one query
  • Builds on existing infrastructure (org.db, sessions.db)
  • Unified knowledge graph enables cross-domain queries
  • Deterministic node IDs enable cross-session stability

Cons:

  • More complex implementation
  • Requires ongoing knowledge graph maintenance

Selected: Only option that provides competitive differentiation.


Decision Outcome

CHOSEN: Option B — Memory-Linked Call Graph

Rationale

  1. Competitive moat: No competitor links call graphs to session memory and ADRs
  2. Unified knowledge graph: ADR-151 provides the foundation for cross-domain queries
  3. Incremental value: Each new MCP server adds edges to the same knowledge graph
  4. Developer experience: "What decisions affected this function?" is uniquely answerable

Technical Implementation

Architecture

┌─────────────────────────────────────────────────────────────────┐
│ MCP CALL GRAPH SERVER │
├─────────────────────────────────────────────────────────────────┤
│ │
│ MCP Client (Claude Code) │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────┐│
│ │ MCP Tools ││
│ │ • index_file - Index single source file ││
│ │ • index_directory - Batch index a directory ││
│ │ • get_callers - Who calls this function? ││
│ │ • get_callees - What does this function call? ││
│ │ • call_chain - Find paths between functions ││
│ │ • memory_linked_search - CODITECT UNIQUE (call+memory+ADR) ││
│ │ • call_graph_stats - Database statistics ││
│ └─────────────────────────────────────────────────────────────┘│
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────┐│
│ │ CallGraphParser (tree-sitter) ││
│ │ ││
│ │ Python (.py) JavaScript (.js, .jsx, .mjs, .cjs) ││
│ │ TypeScript (.ts, .tsx) ││
│ │ ││
│ │ Extracts: ││
│ │ • Functions (name, signature, docstring, class context) ││
│ │ • Call edges (caller → callee with line references) ││
│ │ • Node IDs (SHA-256 deterministic) ││
│ └─────────────────────────────────────────────────────────────┘│
│ │ │ │
│ ▼ ▼ │
│ ┌──────────────┐ ┌──────────────┐ │
│ │ org.db │ │ sessions.db │ │
│ │ (TIER 2) │ │ (TIER 3) │ │
│ │ │ │ │ │
│ │ kg_nodes │ │ call_graph │ │
│ │ kg_edges │ │ _memory │ │
│ │ decisions │ │ messages │ │
│ └──────────────┘ └──────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────┘

Database Schema

Unified Knowledge Graph — org.db (ADR-151 Phase 4):

-- Function nodes in the unified knowledge graph
CREATE TABLE kg_nodes (
    id         TEXT PRIMARY KEY, -- Format: function:{file_path}:{name}
    node_type  TEXT,             -- 'function'
    subtype    TEXT,             -- 'python', 'javascript', 'typescript'
    name       TEXT,             -- Display name (ClassName.method or function_name)
    properties TEXT              -- JSON: {file_path, line_start, line_end,
                                 --        signature, docstring, class_name, node_id}
);

-- CALLS edges between functions
CREATE TABLE kg_edges (
    id         TEXT PRIMARY KEY, -- Format: CALLS:{from_node_id}:{to_node_id}
    edge_type  TEXT,             -- 'CALLS'
    from_node  TEXT,             -- FK to kg_nodes (caller)
    to_node    TEXT,             -- FK to kg_nodes (callee)
    properties TEXT              -- JSON: {call_count, call_contexts[]}
);

Session Memory — sessions.db (Ephemeral):

-- Memory links: which functions were discussed in which sessions
CREATE TABLE call_graph_memory (
    node_id     TEXT,    -- Function node reference
    message_id  INTEGER, -- FK to messages table
    session_id  TEXT,    -- Session identifier
    change_type TEXT     -- 'created', 'modified', 'discussed', 'reviewed'
);
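To make the schema concrete, here is a minimal sketch of a caller lookup over kg_nodes and kg_edges, using an in-memory SQLite database. Only the column names and ID formats come from the schema above. The helper names (build_demo_graph, find_callers), the sample rows, and the query itself are illustrative assumptions, not the server's actual implementation.

```python
import json
import sqlite3
from typing import List


def build_demo_graph() -> sqlite3.Connection:
    """Create an in-memory database with the kg_nodes/kg_edges columns listed above."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE kg_nodes (id TEXT PRIMARY KEY, node_type TEXT, subtype TEXT,
                               name TEXT, properties TEXT);
        CREATE TABLE kg_edges (id TEXT PRIMARY KEY, edge_type TEXT, from_node TEXT,
                               to_node TEXT, properties TEXT);
    """)
    # Hypothetical sample data: call_tool -> hybrid_search -> fts5_search
    for name in ("call_tool", "hybrid_search", "fts5_search"):
        conn.execute(
            "INSERT INTO kg_nodes VALUES (?, 'function', 'python', ?, ?)",
            (f"function:server.py:{name}", name, json.dumps({"file_path": "server.py"})),
        )
    for caller, callee in (("call_tool", "hybrid_search"), ("hybrid_search", "fts5_search")):
        src = f"function:server.py:{caller}"
        dst = f"function:server.py:{callee}"
        conn.execute(
            "INSERT INTO kg_edges VALUES (?, 'CALLS', ?, ?, ?)",
            (f"CALLS:{src}:{dst}", src, dst, json.dumps({"call_count": 1})),
        )
    return conn


def find_callers(conn: sqlite3.Connection, function_name: str) -> List[str]:
    """Return the names of functions with a CALLS edge into function_name."""
    rows = conn.execute(
        """
        SELECT caller.name
        FROM kg_edges AS e
        JOIN kg_nodes AS callee ON callee.id = e.to_node
        JOIN kg_nodes AS caller ON caller.id = e.from_node
        WHERE e.edge_type = 'CALLS' AND callee.name = ?
        ORDER BY caller.name
        """,
        (function_name,),
    ).fetchall()
    return [row[0] for row in rows]


conn = build_demo_graph()
print(find_callers(conn, "hybrid_search"))
```

Swapping the roles of from_node and to_node in the join would give the corresponding callee lookup.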

Node ID Generation

Deterministic SHA-256 IDs ensure cross-session stability:

import hashlib

def generate_node_id(file_path: str, name: str, start_line: int) -> str:
    """Generate deterministic node ID using SHA-256."""
    # Hash the location triple and keep the first 16 hex characters (64 bits).
    content = f"{file_path}:{name}:{start_line}"
    return hashlib.sha256(content.encode()).hexdigest()[:16]

This approach matches Code Pathfinder's strategy (SHA-256 for determinism) while adding memory linkage.
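The determinism property can be checked directly. The sketch below repeats the function from this section and calls it with illustrative inputs (the path, function name, and line numbers are sample values only).

```python
import hashlib


def generate_node_id(file_path: str, name: str, start_line: int) -> str:
    """Generate deterministic node ID using SHA-256."""
    content = f"{file_path}:{name}:{start_line}"
    return hashlib.sha256(content.encode()).hexdigest()[:16]


first = generate_node_id("tools/mcp-call-graph/server.py", "hybrid_search", 241)
second = generate_node_id("tools/mcp-call-graph/server.py", "hybrid_search", 241)
shifted = generate_node_id("tools/mcp-call-graph/server.py", "hybrid_search", 245)

print(first == second)   # True: identical inputs always hash to the same ID
print(first == shifted)  # False: a different start_line yields a different ID
print(len(first))        # 16
```

Because start_line is part of the hashed string, an edit that moves a function to a different line produces a new ID. Stability across reindexing therefore holds for functions whose path, name, and start line are unchanged.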

MCP Tool Definitions

from typing import Dict, List, Optional

# mcp_tool is the server's tool-registration decorator; its definition is not shown here.

@mcp_tool
def index_file(file_path: str) -> Dict:
    """Index a single source file into the call graph.
    Parses functions and call edges using tree-sitter AST."""

@mcp_tool
def index_directory(dir_path: str, extensions: Optional[List[str]] = None) -> Dict:
    """Index all source files in a directory.
    Default extensions: .py, .js, .ts"""

@mcp_tool
def get_callers(function_name: str, file_path: Optional[str] = None) -> Dict:
    """Find all functions that call the given function.
    Returns caller name, file path, and call line number."""

@mcp_tool
def get_callees(function_name: str, file_path: Optional[str] = None) -> Dict:
    """Find all functions called by the given function.
    Returns callee name, file path, and definition location."""

@mcp_tool
def call_chain(from_func: str, to_func: str, max_depth: int = 5) -> Dict:
    """Find call paths from one function to another.
    Uses BFS traversal with configurable max depth."""

@mcp_tool
def memory_linked_search(function_name: str) -> Dict:
    """CODITECT UNIQUE: Search with memory context.
    Returns: function info + callers + callees +
    related messages from session history +
    related architectural decisions from org.db."""

@mcp_tool
def call_graph_stats() -> Dict:
    """Get call graph database statistics.
    Returns function count, edge count, file count, language breakdown."""
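The call_chain docstring mentions BFS traversal with a configurable max depth. A minimal sketch of that idea over an in-memory adjacency map follows. The function name find_call_chain, the dict-based graph, and the choice to return only the shortest path are assumptions for illustration. The real tool reads edges from kg_edges and may return several paths.

```python
from collections import deque
from typing import Dict, List, Optional


def find_call_chain(
    edges: Dict[str, List[str]],
    from_func: str,
    to_func: str,
    max_depth: int = 5,
) -> Optional[List[str]]:
    """Return the shortest caller-to-callee path, or None if no path
    exists within max_depth call edges."""
    queue = deque([[from_func]])
    visited = {from_func}  # guards against cycles such as mutual recursion
    while queue:
        path = queue.popleft()
        current = path[-1]
        if current == to_func:
            return path
        if len(path) - 1 >= max_depth:
            continue  # depth limit reached; do not expand this path further
        for callee in edges.get(current, []):
            if callee not in visited:
                visited.add(callee)
                queue.append(path + [callee])
    return None


# Hypothetical call graph containing a cycle (fts5_search calls back into call_tool).
graph = {
    "call_tool": ["hybrid_search"],
    "hybrid_search": ["fts5_search", "vector_search"],
    "fts5_search": ["call_tool"],
}
print(find_call_chain(graph, "call_tool", "vector_search"))
```

BFS visits shorter paths first, so the first path that reaches to_func is a shortest one, and the visited set keeps cycles from causing unbounded traversal.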

Tree-sitter Parsing Strategy

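| Language   | Parser                 | Function Patterns                               |
|------------|------------------------|-------------------------------------------------|
| Python     | tree-sitter-python     | function_definition, class methods, decorators  |
| JavaScript | tree-sitter-javascript | function_declaration, arrow functions, methods  |
| TypeScript | tree-sitter-javascript | Same as JS with type annotation awareness       |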

Call Detection:

  • Direct function calls: foo()
  • Method calls: self.foo(), obj.foo()
  • Chained calls: a.b.c() → resolves to c
  • Constructor calls: Foo()
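The four call patterns above can be illustrated with a short sketch. It deliberately uses Python's standard-library ast module as a stand-in for tree-sitter, so it covers Python source only and is not the server's parser. The function name extract_calls and the sample source are hypothetical.

```python
import ast
from typing import List, Tuple


def extract_calls(source: str) -> List[Tuple[str, int]]:
    """Return (callee_name, line) for every call in source.

    Attribute calls are resolved to their final attribute, so
    a.b.c() is recorded as 'c' and self.foo() as 'foo'.
    """
    calls = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            func = node.func
            if isinstance(func, ast.Name):         # foo(), Foo()
                calls.append((func.id, node.lineno))
            elif isinstance(func, ast.Attribute):  # self.foo(), obj.foo(), a.b.c()
                calls.append((func.attr, node.lineno))
    # ast.walk does not guarantee source order, so sort by line, then name.
    return sorted(calls, key=lambda call: (call[1], call[0]))


sample = (
    "def run(self, a):\n"
    "    foo()\n"
    "    self.bar()\n"
    "    a.b.c()\n"
    "    Widget()\n"
)
print(extract_calls(sample))
```

Since only the final name is kept, self.bar() and other.bar() both resolve to bar regardless of the receiver's type. This is the name-based matching that can produce false positives.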

Value Proposition

What Competitors Provide (Table Stakes)

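| Capability           | Cursor  | JetBrains AI | Code Pathfinder |
|----------------------|---------|--------------|-----------------|
| AST parsing          | Yes     | Yes          | Yes             |
| Call graph           | Limited | Yes          | Yes             |
| Caller/callee lookup | No      | Yes          | Yes             |
| MCP integration      | No      | No           | Yes             |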

What CODITECT Uniquely Provides

| Capability              | CODITECT | Why It Matters                                          |
|-------------------------|----------|---------------------------------------------------------|
| Memory-linked search    | Yes      | "When was this function last modified across sessions?" |
| Decision awareness      | Yes      | "Which ADRs constrain changes to this function?"        |
| Error history           | Yes      | "What past bugs involved this function?"                |
| Unified knowledge graph | Yes      | Functions connected to components, decisions, sessions  |
| Cross-session stability | Yes      | Deterministic node IDs persist across reindexing        |

No competitor listed above returns all of this from a single call. Example memory_linked_search response:

{
  "function_name": "hybrid_search",
  "functions": [
    {"name": "hybrid_search", "file_path": "/path/to/server.py",
     "start_line": 241, "signature": "def hybrid_search(query, limit=20, ...)"}
  ],
  "callers": [
    {"name": "call_tool", "file_path": "...", "call_line": 611}
  ],
  "callees": [
    {"callee_name": "fts5_search"},
    {"callee_name": "vector_search"}
  ],
  "related_messages": [
    {"content_preview": "Implementing hybrid search with RRF...",
     "timestamp": "2026-01-17T10:00:00Z", "session_id": "abc123"}
  ],
  "related_decisions": [
    {"decision": "Use RRF fusion for hybrid search",
     "rationale": "Combines FTS5 precision with vector recall..."}
  ]
}
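The related_messages part of such a response could be assembled by joining call_graph_memory to the messages table. The sketch below shows one way to do that with an in-memory SQLite database. The messages columns (id, content, timestamp, session_id) are an assumption inferred from the response fields, since this ADR does not define that table. The helper names, the node ID value, and the 40-character preview length are likewise hypothetical.

```python
import sqlite3
from typing import Dict, List


def build_demo_sessions_db() -> sqlite3.Connection:
    """Create an in-memory stand-in for sessions.db with sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Assumed minimal messages table; the real schema is not given in this ADR.
        CREATE TABLE messages (id INTEGER PRIMARY KEY, content TEXT,
                               timestamp TEXT, session_id TEXT);
        CREATE TABLE call_graph_memory (node_id TEXT, message_id INTEGER,
                                        session_id TEXT, change_type TEXT);
    """)
    conn.executemany("INSERT INTO messages VALUES (?, ?, ?, ?)", [
        (1, "Implementing hybrid search with RRF fusion", "2026-01-17T10:00:00Z", "abc123"),
        (2, "Reviewed hybrid_search limit handling", "2026-01-20T09:30:00Z", "def456"),
        (3, "Unrelated note about logging", "2026-01-21T08:00:00Z", "def456"),
    ])
    conn.executemany("INSERT INTO call_graph_memory VALUES (?, ?, ?, ?)", [
        ("node-hybrid", 1, "abc123", "created"),
        ("node-hybrid", 2, "def456", "reviewed"),
    ])
    return conn


def related_messages(conn: sqlite3.Connection, node_id: str,
                     limit: int = 5) -> List[Dict[str, str]]:
    """Return the most recent messages linked to a function node, newest first."""
    rows = conn.execute(
        """
        SELECT substr(m.content, 1, 40), m.timestamp, l.session_id, l.change_type
        FROM call_graph_memory AS l
        JOIN messages AS m ON m.id = l.message_id
        WHERE l.node_id = ?
        ORDER BY m.timestamp DESC
        LIMIT ?
        """,
        (node_id, limit),
    ).fetchall()
    keys = ("content_preview", "timestamp", "session_id", "change_type")
    return [dict(zip(keys, row)) for row in rows]


for message in related_messages(build_demo_sessions_db(), "node-hybrid"):
    print(message)
```

Ordering by timestamp works here because the sample timestamps are ISO 8601 strings in the same timezone, which sort chronologically as text.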

How coditect-core Utilizes This Server

Direct MCP Usage

Claude Code invokes these tools via MCP during development sessions:

  • get_callers / get_callees for understanding code structure before changes
  • call_chain for tracing execution paths during debugging
  • memory_linked_search for recalling past context about functions

Integration with Other MCP Servers

| Server                        | Integration                                      |
|-------------------------------|--------------------------------------------------|
| mcp-impact-analysis (ADR-203) | Uses call graph data for blast radius calculation |
| mcp-semantic-search (ADR-080) | Memory-linked search queries messages table      |
| mcp-unified-gateway (ADR-204) | Routes call graph tools through single endpoint  |
| mcp-context-graph (ADR-151)   | Shares kg_nodes/kg_edges in org.db               |

Automated Workflows

  • Pre-commit hooks: Can trigger reindexing of changed files
  • /cx --reindex: Manual reindex command
  • Health monitoring: mcp-health-check.py validates server availability
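A pre-commit hook that reindexes changed files first has to decide which paths are worth passing to index_file. The sketch below shows one possible filter. The extension set is taken from the parser box in the architecture diagram. The function name files_to_reindex is hypothetical, and both obtaining the changed paths (for example from git) and invoking the indexer are left out.

```python
from pathlib import PurePosixPath
from typing import Iterable, List

# Extensions listed for CallGraphParser in the architecture diagram.
SUPPORTED_EXTENSIONS = {".py", ".js", ".jsx", ".mjs", ".cjs", ".ts", ".tsx"}


def files_to_reindex(changed_paths: Iterable[str]) -> List[str]:
    """Keep only the changed paths that the call graph parser can index."""
    return sorted(
        path for path in changed_paths
        if PurePosixPath(path).suffix in SUPPORTED_EXTENSIONS
    )


changed = ["README.md", "src/app.tsx", "tools/mcp-call-graph/server.py"]
print(files_to_reindex(changed))
```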

Performance Characteristics

| Operation               | Latency    | Notes                               |
|-------------------------|------------|-------------------------------------|
| File indexing           | ~50ms/file | Tree-sitter native parsing          |
| Caller/callee lookup    | <10ms      | Indexed queries on kg_edges         |
| Call chain (depth 5)    | ~50ms      | BFS traversal with depth limit      |
| Memory-linked search    | ~100ms     | Includes message + decision queries |
| Batch index (441 files) | ~22s       | 5,590 functions, 55,548 edges       |

Database Stats (Current):

  • Functions indexed: 5,590
  • Call edges: 55,548
  • Files indexed: 441
  • Languages: Python, JavaScript, TypeScript

Consequences

Positive

  • P1: Unique competitive moat — no competitor has memory-linked code navigation
  • P2: Knowledge graph integration enables cross-domain queries (code → decisions → sessions)
  • P3: Deterministic node IDs provide stability across reindexing
  • P4: Tree-sitter provides sub-100ms parsing for interactive use

Negative

  • N1: Tree-sitter grammars must be maintained per language
  • N2: Call detection is name-based (not type-resolved) — may have false positives
  • N3: Cross-language calls not yet tracked (Python calling JS)
  • N4: Reindexing required on file changes (no automatic file watching)

Dependencies

mcp>=0.9.0                      # MCP Python SDK
tree-sitter>=0.21.0             # AST parsing engine
tree-sitter-python>=0.21.0
tree-sitter-javascript>=0.21.0

Files

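| File                               | Purpose                                |
|------------------------------------|----------------------------------------|
| tools/mcp-call-graph/server.py     | MCP server implementation (1335 lines) |
| tools/mcp-call-graph/CLAUDE.md     | Tool documentation                     |
| tests/tools/test_mcp_call_graph.py | Unit tests (22 tests)                  |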

Validation

  • MCP server starts and registers 7 tools
  • index_file parses Python/JS/TS with tree-sitter
  • get_callers returns direct callers with file/line references
  • get_callees returns called functions
  • call_chain finds paths between functions via BFS
  • memory_linked_search returns call graph + messages + decisions
  • call_graph_stats returns database metrics
  • All 22 unit tests passing
  • ADR-151 Phase 4 compliance (kg_nodes/kg_edges in org.db)

Status: ACCEPTED
Decision Date: 2026-01-17
Implementation Status: COMPLETE
Task ID: H.5.5.2, H.12.1
Maintainer: CODITECT Core Team
Review Date: 2026-04-15 (quarterly review)