Call Graph Migration Reference
Overview
This document describes the unification of call graph tables with the knowledge graph schema as part of ADR-151 Phase 4 (Call Graph Unification).
ADR: ADR-151
Migration Summary
Before (Legacy)
Call graph data was stored in separate tables:
| Database | Table | Purpose |
|---|---|---|
| sessions.db | call_graph_functions | Function definitions (node_id, name, file_path, etc.) |
| sessions.db | call_graph_edges | Call relationships (caller_id, callee_name, call_line) |
| sessions.db | call_graph_memory | Session-specific context (function-to-message links) |
After (Unified)
Call graph data is unified with the knowledge graph:
| Database | Table | Purpose | Migration Status |
|---|---|---|---|
| org.db | kg_nodes | All entities including functions (node_type='function') | COMPLETE |
| org.db | kg_edges | All relationships including CALLS (edge_type='CALLS') | COMPLETE |
| sessions.db | call_graph_memory | Session context (unchanged - ephemeral) | KEPT SEPARATE |
Schema Mapping
call_graph_functions -> kg_nodes
| Source Column | Target Column | Notes |
|---|---|---|
| node_id | properties.call_graph_node_id | Original ID preserved for backward compatibility |
| (generated) | id | Format: function:{file_path}:{name} |
| name | name | Display name (with class prefix if method) |
| (fixed) | node_type | Always 'function' |
| language | subtype | Python, JavaScript, TypeScript |
| file_path | properties.file_path | Full file path |
| start_line | properties.line_start | Start line number |
| end_line | properties.line_end | End line number |
| signature | properties.signature | Function signature |
| class_name | properties.class_name | Containing class (if method) |
| docstring | properties.docstring | First 500 chars |
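The column mapping above can be sketched as a small conversion function. This is an illustration only, not the extractor's actual code: the function name `function_row_to_kg_node` is hypothetical, and storing `properties` as a JSON string is an assumption.

```python
import json

DOCSTRING_LIMIT = 500  # "First 500 chars" per the mapping table


def function_row_to_kg_node(row: dict) -> dict:
    """Map one call_graph_functions row to a kg_nodes record (hypothetical helper)."""
    docstring = row.get("docstring") or ""
    properties = {
        # Original ID preserved for backward compatibility
        "call_graph_node_id": row["node_id"],
        "file_path": row["file_path"],
        "line_start": row["start_line"],
        "line_end": row["end_line"],
        "signature": row.get("signature"),
        "class_name": row.get("class_name"),
        "docstring": docstring[:DOCSTRING_LIMIT],
    }
    return {
        "id": f"function:{row['file_path']}:{row['name']}",
        "name": row["name"],
        "node_type": "function",
        "subtype": row.get("language"),
        # Assumption: properties are serialized as JSON text
        "properties": json.dumps(properties),
    }
```

The generated `id` depends only on `file_path` and `name`, so re-running the conversion on the same row yields the same node ID.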
call_graph_edges -> kg_edges
| Source Column | Target Column | Notes |
|---|---|---|
| (generated) | id | Format: CALLS:{from_node}:{to_node} |
| (fixed) | edge_type | Always 'CALLS' |
| caller_id | from_node | Resolved to kg_node ID |
| callee_name | to_node | Resolved to kg_node ID (if found) |
| (aggregated) | properties.call_count | Number of calls between pair |
| call_line, call_file | properties.call_contexts | Up to 10 call sites |
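A minimal sketch of the aggregation implied by this mapping, assuming the caller and callee have already been resolved to kg_node IDs. The function name `aggregate_calls` and the `file`/`line` keys inside `call_contexts` are assumptions made for illustration; the document does not specify the shape of a call context.

```python
from collections import defaultdict

MAX_CALL_CONTEXTS = 10  # "Up to 10 call sites" per the mapping table


def aggregate_calls(resolved_calls):
    """Collapse (from_node, to_node, call_file, call_line) tuples into CALLS edges."""
    grouped = defaultdict(list)
    for from_node, to_node, call_file, call_line in resolved_calls:
        # Assumed context shape: one dict per call site
        grouped[(from_node, to_node)].append({"file": call_file, "line": call_line})

    edges = []
    for (from_node, to_node), sites in grouped.items():
        edges.append({
            "id": f"CALLS:{from_node}:{to_node}",
            "edge_type": "CALLS",
            "from_node": from_node,
            "to_node": to_node,
            "properties": {
                "call_count": len(sites),               # every call is counted
                "call_contexts": sites[:MAX_CALL_CONTEXTS],  # only the first 10 are kept
            },
        })
    return edges
```

Note that `call_count` reflects all calls between the pair, while `call_contexts` is capped, so the two values can differ for heavily used functions.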
call_graph_memory (KEPT SEPARATE)
Decision: Keep call_graph_memory in sessions.db.
Rationale:
- Ephemeral nature: Links functions to session messages, which are session-specific
- Regenerable: Can be rebuilt from session data
- Scope alignment: Session-specific data belongs in sessions.db (Tier 3)
- Performance: Avoids cross-database joins for real-time memory context
Schema (unchanged):
CREATE TABLE call_graph_memory (
id INTEGER PRIMARY KEY AUTOINCREMENT,
node_id TEXT NOT NULL, -- Function node_id
message_id INTEGER, -- FK to messages
session_id TEXT,
change_type TEXT, -- 'created', 'modified', 'discussed', 'reviewed'
timestamp TEXT DEFAULT CURRENT_TIMESTAMP
);
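To illustrate how this table links functions to session messages, here is a small self-contained sketch against an in-memory SQLite database. The helper `messages_for_function` and the sample rows are hypothetical; only the table definition comes from this document.

```python
import sqlite3

SCHEMA = """
CREATE TABLE call_graph_memory (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    node_id TEXT NOT NULL,
    message_id INTEGER,
    session_id TEXT,
    change_type TEXT,
    timestamp TEXT DEFAULT CURRENT_TIMESTAMP
);
"""


def messages_for_function(conn, node_id):
    """Return (message_id, session_id, change_type) rows for one function, oldest first."""
    return conn.execute(
        "SELECT message_id, session_id, change_type "
        "FROM call_graph_memory WHERE node_id = ? ORDER BY id",
        (node_id,),
    ).fetchall()


# In-memory stand-in for sessions.db with made-up sample rows
conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO call_graph_memory (node_id, message_id, session_id, change_type) "
    "VALUES (?, ?, ?, ?)",
    [
        ("fn-1", 101, "s1", "created"),
        ("fn-2", 102, "s1", "discussed"),
        ("fn-1", 103, "s1", "modified"),
    ],
)
conn.commit()
```

Because the lookup stays inside sessions.db, it needs no join against org.db, which matches the performance rationale given above.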
Migration Scripts
Extractor (Phase 2)
Script: scripts/knowledge_graph/extractors/function_extractor.py
Extracts functions from call_graph_functions and creates kg_nodes entries with:
- node_type = 'function'
- subtype = language
- properties containing file_path, line numbers, signature, etc.
Edge Builder (Phase 3)
Script: scripts/knowledge_graph/edge_builders/calls_edge_builder.py
Creates CALLS edges from call_graph_edges:
- Resolves caller_id to source kg_node
- Resolves callee_name to target kg_node (same-file preference)
- Aggregates multiple calls between same pair into single edge
- Skips built-in functions and unresolved callees
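The resolution rules listed above could look roughly like the following. This is a sketch under stated assumptions, not the code in calls_edge_builder.py: the name `resolve_callee`, the `nodes_by_name` index, the use of Python's own built-in names as the skip list, and the fallback to the first candidate when no same-file match exists are all illustrative choices.

```python
import builtins

# Assumption: only Python built-ins are skipped; the real skip list is not documented here
BUILTIN_NAMES = set(dir(builtins))


def resolve_callee(callee_name, caller_file, nodes_by_name):
    """Return the kg_node ID for callee_name, or None if it should be skipped.

    nodes_by_name maps a function name to a list of (kg_node_id, file_path) pairs.
    """
    if callee_name in BUILTIN_NAMES:
        return None  # built-in function: no edge
    candidates = nodes_by_name.get(callee_name, [])
    if not candidates:
        return None  # unresolved callee: no edge
    for node_id, file_path in candidates:
        if file_path == caller_file:
            return node_id  # same-file preference
    # Assumption: otherwise fall back to the first known definition
    return candidates[0][0]
```

For example, with two definitions of `parse` in `a.py` and `b.py`, a call made from `b.py` resolves to the `b.py` definition.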
Verification (Phase 4)
Script: scripts/knowledge_graph/verify_call_graph_migration.py
Verifies migration completeness:
# Run verification
python3 scripts/knowledge_graph/verify_call_graph_migration.py
# Verbose output
python3 scripts/knowledge_graph/verify_call_graph_migration.py --verbose
# JSON output
python3 scripts/knowledge_graph/verify_call_graph_migration.py --json
MCP Server Update
Before (Legacy Query)
The MCP server queried sessions.db directly:
# Old approach - direct table access
import sqlite3

conn = sqlite3.connect(sessions_db)  # sessions_db: path to sessions.db
cursor = conn.execute("""
    SELECT * FROM call_graph_functions WHERE name = ?
""", (name,))
After (Unified Query)
The MCP server now queries kg_nodes and kg_edges in org.db:
# New approach - unified knowledge graph
import sqlite3

conn = sqlite3.connect(org_db)  # org_db: path to org.db

# Get callers: nodes with a CALLS edge pointing at this function
callers = conn.execute("""
    SELECT e.from_node, n.name, n.properties
    FROM kg_edges e
    JOIN kg_nodes n ON e.from_node = n.id
    WHERE e.edge_type = 'CALLS'
      AND e.to_node = ?
""", (function_kg_id,)).fetchall()

# Get callees: nodes this function has a CALLS edge to
callees = conn.execute("""
    SELECT e.to_node, n.name, n.properties
    FROM kg_edges e
    JOIN kg_nodes n ON e.to_node = n.id
    WHERE e.edge_type = 'CALLS'
      AND e.from_node = ?
""", (function_kg_id,)).fetchall()
Adapter Pattern
For backward compatibility, an adapter maps between:
- Legacy node_id format (SHA-256 based)
- New kg_node ID format (function:{file_path}:{name})
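One way such an adapter could work is to build lookup tables from the legacy ID preserved in properties.call_graph_node_id. The sketch below is not the actual adapter: `build_id_maps` is a hypothetical name, the kg_nodes column list is inferred from the schema mapping in this document, and it assumes `properties` holds JSON text.

```python
import json
import sqlite3


def build_id_maps(conn):
    """Return (legacy_to_kg, kg_to_legacy) dicts for function nodes."""
    legacy_to_kg = {}
    rows = conn.execute(
        "SELECT id, properties FROM kg_nodes WHERE node_type = 'function'"
    )
    for kg_id, properties in rows:
        # Assumption: properties is a JSON object stored as text
        legacy_id = json.loads(properties or "{}").get("call_graph_node_id")
        if legacy_id:
            legacy_to_kg[legacy_id] = kg_id
    kg_to_legacy = {kg_id: legacy_id for legacy_id, kg_id in legacy_to_kg.items()}
    return legacy_to_kg, kg_to_legacy


# In-memory stand-in for org.db with one made-up function node
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE kg_nodes (id TEXT PRIMARY KEY, name TEXT, node_type TEXT, "
    "subtype TEXT, properties TEXT)"
)
conn.execute(
    "INSERT INTO kg_nodes VALUES (?, ?, ?, ?, ?)",
    (
        "function:src/app.py:main",
        "main",
        "function",
        "Python",
        json.dumps({"call_graph_node_id": "legacy-id-0001", "file_path": "src/app.py"}),
    ),
)
legacy_to_kg, kg_to_legacy = build_id_maps(conn)
```

Parsing the JSON in Python rather than in SQL keeps the sketch independent of whether the SQLite build includes JSON functions.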
Database Split Summary
┌─────────────────────────────────────────────────────────────────┐
│                 ADR-118 + ADR-151 Architecture                  │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  org.db (Tier 2 - CRITICAL)                                     │
│  ├── kg_nodes (all entities)                                    │
│  │   ├── node_type='function' (7,066 nodes from call graph)     │
│  │   ├── node_type='component' (5,411 nodes)                    │
│  │   ├── node_type='decision' (1,856 nodes)                     │
│  │   └── ... other types                                        │
│  │                                                              │
│  └── kg_edges (all relationships)                               │
│      ├── edge_type='CALLS' (2,565 edges from call graph)        │
│      ├── edge_type='INVOKES' (3,363 edges)                      │
│      └── ... other types                                        │
│                                                                 │
│  sessions.db (Tier 3 - Regenerable)                             │
│  ├── call_graph_memory (session-specific context)               │
│  ├── messages                                                   │
│  └── tool_analytics                                             │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
Benefits of Unification
- Single Query Interface: All entity/relationship queries use kg_nodes/kg_edges
- Cross-Domain Traversal: Can traverse from function -> component -> decision
- Consistent Schema: All entities follow same node/edge pattern
- Enhanced Memory Search: memory_linked_search can leverage kg_edges for richer context
- Future GraphRAG: Knowledge graph enables graph-aware retrieval
Backward Compatibility
The MCP server maintains the same tool interface:
- get_callers(function_name) - Works via kg_edges CALLS
- get_callees(function_name) - Works via kg_edges CALLS
- call_chain(from_func, to_func) - Works via multi-hop kg_edges traversal
- memory_linked_search(function_name) - Enhanced with kg_nodes metadata
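The multi-hop traversal behind call_chain can be sketched as a breadth-first search over CALLS edges. This is an illustration rather than the MCP server's implementation: it works on kg_node IDs instead of function names (name-to-ID resolution is omitted), the `max_depth` parameter is an assumption, and the kg_edges column list is inferred from the schema mapping above.

```python
import sqlite3
from collections import deque


def call_chain(conn, from_id, to_id, max_depth=10):
    """Return the shortest list of kg_node IDs from from_id to to_id, or None."""
    queue = deque([[from_id]])
    visited = {from_id}
    while queue:
        path = queue.popleft()
        if path[-1] == to_id:
            return path
        if len(path) > max_depth:
            continue  # do not expand beyond max_depth hops
        rows = conn.execute(
            "SELECT to_node FROM kg_edges WHERE edge_type = 'CALLS' AND from_node = ?",
            (path[-1],),
        ).fetchall()
        for (callee,) in rows:
            if callee not in visited:
                visited.add(callee)
                queue.append(path + [callee])
    return None


# In-memory stand-in for org.db with a made-up chain a -> b -> c -> d
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE kg_edges (id TEXT PRIMARY KEY, edge_type TEXT, "
    "from_node TEXT, to_node TEXT, properties TEXT)"
)
for src, dst in [("a", "b"), ("b", "c"), ("c", "d"), ("a", "c")]:
    conn.execute(
        "INSERT INTO kg_edges VALUES (?, 'CALLS', ?, ?, NULL)",
        (f"CALLS:{src}:{dst}", src, dst),
    )
```

Breadth-first order means the first chain found is also a shortest one, and the `visited` set keeps recursive call cycles from looping forever.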
Verification Status
Run verification to check migration status:
python3 scripts/knowledge_graph/verify_call_graph_migration.py --verbose
Expected output for complete migration:
VERIFICATION SUMMARY
============================================================
Overall Status: MIGRATION_COMPLETE
Functions (CP-27):
- Source (call_graph_functions): 7066
- Target (kg_nodes function): 7066
- Status: complete
Edges (CP-28):
- Source (call_graph_edges): 73319
- Target (kg_edges CALLS): 2565
- Status: complete
Memory Split (CP-30):
- call_graph_memory stays in: sessions.db
- Rationale: Session-specific, ephemeral, regenerable
- Status: documented
Note: Target edges are fewer than source due to:
- Aggregation (multiple calls -> single edge)
- Filtering (built-ins, unresolved callees)
- Self-call removal
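Because of the reductions listed above, edge counts are not expected to match, but function counts are. A minimal sketch of such a count comparison follows; it is not the verification script's actual logic, the helper name `function_counts` and the 'incomplete' status string are assumptions, and in-memory databases stand in for sessions.db and org.db.

```python
import sqlite3


def function_counts(sessions_conn, org_conn):
    """Compare the legacy function count with the migrated kg_nodes count."""
    source = sessions_conn.execute(
        "SELECT COUNT(*) FROM call_graph_functions"
    ).fetchone()[0]
    target = org_conn.execute(
        "SELECT COUNT(*) FROM kg_nodes WHERE node_type = 'function'"
    ).fetchone()[0]
    # Assumption: anything other than an exact match is reported as 'incomplete'
    status = "complete" if source == target else "incomplete"
    return {"source": source, "target": target, "status": status}


# Made-up sample data: two legacy functions, both migrated, plus one non-function node
sessions_conn = sqlite3.connect(":memory:")
sessions_conn.execute("CREATE TABLE call_graph_functions (node_id TEXT, name TEXT)")
sessions_conn.executemany(
    "INSERT INTO call_graph_functions VALUES (?, ?)", [("n1", "f"), ("n2", "g")]
)

org_conn = sqlite3.connect(":memory:")
org_conn.execute("CREATE TABLE kg_nodes (id TEXT PRIMARY KEY, node_type TEXT)")
org_conn.executemany(
    "INSERT INTO kg_nodes VALUES (?, ?)",
    [("function:a.py:f", "function"), ("function:a.py:g", "function"), ("component:x", "component")],
)
```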
Track: J (Memory Intelligence)
Task: J.3.6 (Call Graph Unification)
Created: 2026-02-03
Author: Claude (Opus 4.5)