
Call Graph Migration Reference

Overview

This document describes the unification of call graph tables with the knowledge graph schema as part of ADR-151 Phase 4 (Call Graph Unification).

ADR: ADR-151

Migration Summary

Before (Legacy)

Call graph data was stored in separate tables:

| Database | Table | Purpose |
|----------|-------|---------|
| sessions.db | call_graph_functions | Function definitions (node_id, name, file_path, etc.) |
| sessions.db | call_graph_edges | Call relationships (caller_id, callee_name, call_line) |
| sessions.db | call_graph_memory | Session-specific context (function-to-message links) |

After (Unified)

Call graph data is unified with the knowledge graph:

| Database | Table | Purpose | Migration Status |
|----------|-------|---------|------------------|
| org.db | kg_nodes | All entities including functions (node_type='function') | COMPLETE |
| org.db | kg_edges | All relationships including CALLS (edge_type='CALLS') | COMPLETE |
| sessions.db | call_graph_memory | Session context (unchanged, ephemeral) | KEPT SEPARATE |

Schema Mapping

call_graph_functions -> kg_nodes

| Source Column | Target Column | Notes |
|---------------|---------------|-------|
| node_id | properties.call_graph_node_id | Original ID preserved for backward compatibility |
| (generated) | id | Format: function:{file_path}:{name} |
| name | name | Display name (with class prefix if method) |
| (fixed) | node_type | Always 'function' |
| language | subtype | Python, JavaScript, TypeScript |
| file_path | properties.file_path | Full file path |
| start_line | properties.line_start | Start line number |
| end_line | properties.line_end | End line number |
| signature | properties.signature | Function signature |
| class_name | properties.class_name | Containing class (if method) |
| docstring | properties.docstring | First 500 chars |
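The column mapping above can be sketched as a single conversion function. This is a minimal illustration, not the extractor's actual code: the function name `function_row_to_kg_node`, the "ClassName.method" display format, the use of the display name inside the generated ID, and storing properties as a JSON string are all assumptions.

```python
import json


def function_row_to_kg_node(row: dict) -> dict:
    """Map one call_graph_functions row to a kg_nodes row (illustrative only)."""
    class_name = row.get("class_name")
    # Assumption: methods are displayed as "ClassName.method".
    display_name = f"{class_name}.{row['name']}" if class_name else row["name"]
    docstring = row.get("docstring") or ""
    properties = {
        "call_graph_node_id": row["node_id"],  # legacy ID kept for backward compatibility
        "file_path": row["file_path"],
        "line_start": row.get("start_line"),
        "line_end": row.get("end_line"),
        "signature": row.get("signature"),
        "class_name": class_name,
        "docstring": docstring[:500],  # first 500 chars only
    }
    return {
        # Assumption: {name} in the ID format is the display name.
        "id": f"function:{row['file_path']}:{display_name}",
        "name": display_name,
        "node_type": "function",  # fixed value
        "subtype": row.get("language"),
        "properties": json.dumps(properties),  # assumption: JSON text column
    }
```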

call_graph_edges -> kg_edges

| Source Column | Target Column | Notes |
|---------------|---------------|-------|
| (generated) | id | Format: CALLS:{from_node}:{to_node} |
| (fixed) | edge_type | Always 'CALLS' |
| caller_id | from_node | Resolved to kg_node ID |
| callee_name | to_node | Resolved to kg_node ID (if found) |
| (aggregated) | properties.call_count | Number of calls between pair |
| call_line, call_file | properties.call_contexts | Up to 10 call sites |
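The aggregation described by this mapping might look like the following sketch. It assumes the calls have already been resolved to kg_node IDs; the function name `aggregate_calls` and the shape of each `call_contexts` entry (`file` and `line` keys) are hypothetical.

```python
from collections import defaultdict

MAX_CALL_CONTEXTS = 10  # the mapping keeps up to 10 call sites per edge


def aggregate_calls(resolved_calls):
    """Collapse (from_node, to_node, call_file, call_line) tuples into CALLS edges."""
    grouped = defaultdict(list)
    for from_node, to_node, call_file, call_line in resolved_calls:
        grouped[(from_node, to_node)].append({"file": call_file, "line": call_line})

    edges = []
    for (from_node, to_node), sites in grouped.items():
        edges.append({
            "id": f"CALLS:{from_node}:{to_node}",
            "edge_type": "CALLS",
            "from_node": from_node,
            "to_node": to_node,
            "properties": {
                "call_count": len(sites),                    # every call is counted
                "call_contexts": sites[:MAX_CALL_CONTEXTS],  # only the first 10 are stored
            },
        })
    return edges
```

Note that `call_count` reflects all calls between the pair, while `call_contexts` is capped, so the two can differ for heavily used functions.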

call_graph_memory (KEPT SEPARATE)

Decision: Keep call_graph_memory in sessions.db.

Rationale:

  1. Ephemeral nature: Links functions to session messages, which are session-specific
  2. Regenerable: Can be rebuilt from session data
  3. Scope alignment: Session-specific data belongs in sessions.db (Tier 3)
  4. Performance: Avoids cross-database joins for real-time memory context

Schema (unchanged):

CREATE TABLE call_graph_memory (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    node_id     TEXT NOT NULL,                  -- Function node_id
    message_id  INTEGER,                        -- FK to messages
    session_id  TEXT,
    change_type TEXT,                           -- 'created', 'modified', 'discussed', 'reviewed'
    timestamp   TEXT DEFAULT CURRENT_TIMESTAMP
);

Migration Scripts

Extractor (Phase 2)

Script: scripts/knowledge_graph/extractors/function_extractor.py

Extracts functions from call_graph_functions and creates kg_nodes entries with:

  • node_type = 'function'
  • subtype = language
  • properties containing file_path, line numbers, signature, etc.

Edge Builder (Phase 3)

Script: scripts/knowledge_graph/edge_builders/calls_edge_builder.py

Creates CALLS edges from call_graph_edges:

  • Resolves caller_id to source kg_node
  • Resolves callee_name to target kg_node (same-file preference)
  • Aggregates multiple calls between same pair into single edge
  • Skips built-in functions and unresolved callees
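The resolution rules listed above can be sketched roughly as follows. This is an illustration under assumptions, not the edge builder's actual logic: the name `resolve_callee`, the `nodes_by_name` index, detecting built-ins via Python's `builtins` module, and falling back to the first candidate when no same-file match exists are all hypothetical.

```python
import builtins

# Assumption: built-ins are detected by name against Python's builtins module.
PYTHON_BUILTINS = set(dir(builtins))


def resolve_callee(callee_name, caller_file, nodes_by_name):
    """Return the kg_node ID for callee_name, or None if the call should be skipped.

    nodes_by_name maps a function name to a list of (kg_id, file_path) pairs.
    """
    if callee_name in PYTHON_BUILTINS:
        return None  # built-in functions are skipped
    candidates = nodes_by_name.get(callee_name, [])
    if not candidates:
        return None  # unresolved callee
    for kg_id, file_path in candidates:
        if file_path == caller_file:
            return kg_id  # same-file preference
    return candidates[0][0]  # hypothetical fallback: first match in any file
```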

Verification (Phase 4)

Script: scripts/knowledge_graph/verify_call_graph_migration.py

Verifies migration completeness:

# Run verification
python3 scripts/knowledge_graph/verify_call_graph_migration.py

# Verbose output
python3 scripts/knowledge_graph/verify_call_graph_migration.py --verbose

# JSON output
python3 scripts/knowledge_graph/verify_call_graph_migration.py --json

MCP Server Update

Before (Legacy Query)

The MCP server queried sessions.db directly:

# Old approach - direct table access
import sqlite3

conn = sqlite3.connect(sessions_db)  # sessions_db: path to sessions.db
cursor = conn.execute(
    "SELECT * FROM call_graph_functions WHERE name = ?",
    (name,),
)

After (Unified Query)

The MCP server now queries kg_nodes and kg_edges in org.db:

# New approach - unified knowledge graph
import sqlite3

conn = sqlite3.connect(org_db)  # org_db: path to org.db

# Get callers via kg_edges (nodes with a CALLS edge pointing at the function)
callers = conn.execute("""
    SELECT e.from_node, n.name, n.properties
    FROM kg_edges e
    JOIN kg_nodes n ON e.from_node = n.id
    WHERE e.edge_type = 'CALLS'
      AND e.to_node = ?
""", (function_kg_id,)).fetchall()

# Get callees via kg_edges (nodes the function has a CALLS edge to)
callees = conn.execute("""
    SELECT e.to_node, n.name, n.properties
    FROM kg_edges e
    JOIN kg_nodes n ON e.to_node = n.id
    WHERE e.edge_type = 'CALLS'
      AND e.from_node = ?
""", (function_kg_id,)).fetchall()

Adapter Pattern

For backward compatibility, an adapter maps between:

  • Legacy node_id format (SHA-256 based)
  • New kg_node ID format (function:{file_path}:{name})
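One way such an adapter could work is to build a lookup from the legacy ID stored in `properties.call_graph_node_id`. The sketch below is an assumption about the approach, not the shipped adapter: the name `build_legacy_id_map` is hypothetical, and `properties` is assumed to be a JSON text column.

```python
import json
import sqlite3


def build_legacy_id_map(org_conn: sqlite3.Connection) -> dict:
    """Map legacy node_id -> kg_node ID using properties.call_graph_node_id."""
    mapping = {}
    rows = org_conn.execute(
        "SELECT id, properties FROM kg_nodes WHERE node_type = 'function'"
    )
    for kg_id, properties in rows:
        # Assumption: properties holds a JSON object serialized as text.
        props = json.loads(properties) if properties else {}
        legacy_id = props.get("call_graph_node_id")
        if legacy_id:
            mapping[legacy_id] = kg_id
    return mapping
```

A caller holding a legacy SHA-256 based node_id could then use `mapping.get(legacy_id)` to find the corresponding kg_node ID, with `None` signalling a function that was not migrated.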

Database Split Summary

┌─────────────────────────────────────────────────────────────────┐
│                 ADR-118 + ADR-151 Architecture                  │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  org.db (Tier 2 - CRITICAL)                                     │
│  ├── kg_nodes (all entities)                                    │
│  │   ├── node_type='function' (7,066 nodes from call graph)     │
│  │   ├── node_type='component' (5,411 nodes)                    │
│  │   ├── node_type='decision' (1,856 nodes)                     │
│  │   └── ... other types                                        │
│  │                                                              │
│  └── kg_edges (all relationships)                               │
│      ├── edge_type='CALLS' (2,565 edges from call graph)        │
│      ├── edge_type='INVOKES' (3,363 edges)                      │
│      └── ... other types                                        │
│                                                                 │
│  sessions.db (Tier 3 - Regenerable)                             │
│  ├── call_graph_memory (session-specific context)               │
│  ├── messages                                                   │
│  └── tool_analytics                                             │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Benefits of Unification

  1. Single Query Interface: All entity/relationship queries use kg_nodes/kg_edges
  2. Cross-Domain Traversal: Can traverse from function -> component -> decision
  3. Consistent Schema: All entities follow same node/edge pattern
  4. Enhanced Memory Search: memory_linked_search can leverage kg_edges for richer context
  5. Future GraphRAG: Knowledge graph enables graph-aware retrieval

Backward Compatibility

The MCP server maintains the same tool interface:

  • get_callers(function_name) - Works via kg_edges CALLS
  • get_callees(function_name) - Works via kg_edges CALLS
  • call_chain(from_func, to_func) - Works via multi-hop kg_edges traversal
  • memory_linked_search(function_name) - Enhanced with kg_nodes metadata
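Multi-hop traversal for a tool like call_chain can be expressed as a breadth-first search over CALLS edges. The sketch below is illustrative only: it takes kg_node IDs rather than function names, and the `max_depth` parameter and its default are assumptions, not part of the documented tool interface.

```python
import sqlite3
from collections import deque


def call_chain(conn: sqlite3.Connection, from_id: str, to_id: str, max_depth: int = 5):
    """Return the shortest list of kg_node IDs from from_id to to_id, or None.

    Follows only edge_type='CALLS' edges and explores at most max_depth hops.
    """
    queue = deque([[from_id]])
    visited = {from_id}
    while queue:
        path = queue.popleft()
        if path[-1] == to_id:
            return path
        if len(path) > max_depth:
            continue  # this path already uses max_depth hops; do not expand further
        rows = conn.execute(
            "SELECT to_node FROM kg_edges WHERE edge_type = 'CALLS' AND from_node = ?",
            (path[-1],),
        )
        for (callee,) in rows:
            if callee not in visited:
                visited.add(callee)
                queue.append(path + [callee])
    return None
```

Breadth-first order means the first path found is also the one with the fewest hops, and the `visited` set keeps recursive call cycles from looping forever.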

Verification Status

Run verification to check migration status:

python3 scripts/knowledge_graph/verify_call_graph_migration.py --verbose

Expected output for complete migration:

VERIFICATION SUMMARY
============================================================
Overall Status: MIGRATION_COMPLETE

Functions (CP-27):
- Source (call_graph_functions): 7066
- Target (kg_nodes function): 7066
- Status: complete

Edges (CP-28):
- Source (call_graph_edges): 73319
- Target (kg_edges CALLS): 2565
- Status: complete

Memory Split (CP-30):
- call_graph_memory stays in: sessions.db
- Rationale: Session-specific, ephemeral, regenerable
- Status: documented

Note: Target edges are fewer than source due to:

  • Aggregation (multiple calls -> single edge)
  • Filtering (built-ins, unresolved callees)
  • Self-call removal
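The source and target counts in the summary could be gathered with queries along these lines. This is a sketch of the kind of comparison implied by the output, not the contents of verify_call_graph_migration.py: the function name `count_migrated` and the result keys are hypothetical, and it assumes the legacy tables are still present in sessions.db.

```python
import sqlite3


def count_migrated(sessions_conn: sqlite3.Connection, org_conn: sqlite3.Connection) -> dict:
    """Return source and target row counts for functions and CALLS edges."""
    def scalar(conn, sql):
        # Run a single-value query and return that value.
        return conn.execute(sql).fetchone()[0]

    return {
        "functions_source": scalar(sessions_conn, "SELECT COUNT(*) FROM call_graph_functions"),
        "functions_target": scalar(
            org_conn, "SELECT COUNT(*) FROM kg_nodes WHERE node_type = 'function'"
        ),
        "edges_source": scalar(sessions_conn, "SELECT COUNT(*) FROM call_graph_edges"),
        "edges_target": scalar(
            org_conn, "SELECT COUNT(*) FROM kg_edges WHERE edge_type = 'CALLS'"
        ),
    }
```

Function counts are expected to match exactly, whereas the edge counts are expected to differ for the reasons listed in the note, so a check on edges needs a looser criterion than equality.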

Track: J (Memory Intelligence)
Task: J.3.6 (Call Graph Unification)
Created: 2026-02-03
Author: Claude (Opus 4.5)