ADR-204: MCP Unified Gateway — Single-Endpoint MCP Aggregator

Document: ADR-204-mcp-unified-gateway-single-endpoint
Version: 1.0.0
Purpose: Document architectural decision for unified MCP gateway aggregating 8 backend servers
Audience: Framework contributors, system architects
Date Created: 2026-02-05
Last Updated: 2026-02-15
Status: ACCEPTED
Task ID: J.4.11.4, H.12.3
Related ADRs: ADR-080, ADR-135, ADR-151, ADR-190, ADR-202, ADR-203
Related Documents:
- tools/mcp-unified-gateway/server.py
- tools/mcp-unified-gateway/CLAUDE.md

Context and Problem Statement

Background

CODITECT's MCP infrastructure grew to 8 independent servers providing 78 tools in total:

Server                 Tools   ADR
mcp-semantic-search    8       ADR-080
mcp-call-graph         7       ADR-202
mcp-impact-analysis    5       ADR-203
mcp-context-graph      16      ADR-151/154
mcp-cross-llm-bridge   17      ADR-190
mcp-skill-server       15      ADR-135
mcp-backup             5       ADR-184/185
mcp-git-status         5       ADR-205
Total                  78

Each server requires its own entry in .mcp.json or ~/.claude/settings.json, creating configuration complexity and startup overhead.

Problem Statement

How do we simplify MCP configuration from 8 separate server entries to a single endpoint while maintaining modularity, performance, and backward compatibility?

Requirements

Must-Have:

  • Single MCP endpoint replacing 8 separate configurations
  • Backward-compatible tool names (no namespace prefixes)
  • Lazy loading — backends loaded only when their tools are first called
  • Dynamic routing — tool calls forwarded to correct backend

Should-Have:

  • CLI mode for diagnostics (list tools, show stats)
  • Error wrapping with backend context
  • Predefined tool fallbacks when dynamic loading fails

Nice-to-Have:

  • Backend health monitoring
  • Tool call analytics
  • Hot reload of backend modules

Decision Drivers

Configuration Complexity

  • C1: 8 MCP server entries in settings.json is unwieldy
  • C2: Each new MCP server adds another config entry
  • C3: Users must maintain correct paths for all 8 servers

Performance

  • P1: Loading all 8 backends at startup wastes memory
  • P2: Most sessions use only 2-3 backends
  • P3: Lazy loading keeps startup time under 1 second

Maintainability

  • M1: Individual servers remain independently testable
  • M2: New backends can be added without modifying existing code
  • M3: Backend failures should be isolated (one crash doesn't affect others)

Considered Options

Option A: Namespace-Based Aggregation

Description: Prefix tool names with backend name (e.g., call_graph.get_callers).

Pros: Clear tool provenance, no name collisions

Cons: Breaking change; all existing tool references must be updated

Rejected: Breaking backward compatibility is unacceptable.

Option B: Single Gateway with Lazy Loading (SELECTED)

Description: Single MCP server that routes tool calls to lazily-loaded backend modules using a tool→backend registry.

Pros:

  • Single configuration entry
  • Backward-compatible tool names
  • Lazy loading minimizes memory
  • Backend isolation via independent module loading

Cons:

  • Additional routing layer
  • Must maintain tool→backend mapping

Selected: Best balance of simplicity, performance, and compatibility.

Option C: Process-Level Multiplexer

Description: Spawn backend servers as child processes and proxy MCP messages.

Pros: Maximum isolation (process boundaries)

Cons: High overhead (8 Python processes), complex IPC, slower startup

Rejected: Excessive resource usage for single-user CLI tool.


Decision Outcome

CHOSEN: Option B — Single Gateway with Lazy Loading

Rationale

  1. Simplicity: One config entry replaces eight
  2. Performance: Lazy loading means only used backends consume memory
  3. Compatibility: Tool names unchanged — zero migration effort
  4. Extensibility: Adding a 9th backend requires only a BackendConfig entry

Technical Implementation

Architecture

┌─────────────────────────────────────────────────────────────────┐
│                    Claude Code / MCP Client                     │
│                                                                 │
│  ┌──────────────────────────────────────────────────────────┐   │
│  │       Unified Gateway (tools/mcp-unified-gateway)        │   │
│  │                                                          │   │
│  │  ┌─────────────────┐    ┌──────────────────┐             │   │
│  │  │ _tool_registry  │    │ BackendLoader    │             │   │
│  │  │                 │    │                  │             │   │
│  │  │ hybrid_search   │    │ load_backend()   │             │   │
│  │  │  → semantic_srch│    │ importlib.util   │             │   │
│  │  │ get_callers     │    │ spec_from_file   │             │   │
│  │  │  → call_graph   │    │                  │             │   │
│  │  │ analyze_impact  │    │ _loaded_modules  │             │   │
│  │  │  → impact_anlys │    │ {name: module}   │             │   │
│  │  │ ...74+ mappings │    │                  │             │   │
│  │  └─────────────────┘    └──────────────────┘             │   │
│  └──────────────────────────────────────────────────────────┘   │
│      │          │          │          │          │              │
│      ▼          ▼          ▼          ▼          ▼              │
│  ┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐         │
│  │backup  │ │call_   │ │context │ │cross_  │ │semantic│         │
│  │(5)     │ │graph(7)│ │graph   │ │llm(17) │ │search  │         │
│  │        │ │        │ │(16)    │ │        │ │(8)     │         │
│  └────────┘ └────────┘ └────────┘ └────────┘ └────────┘         │
│  ┌────────┐ ┌────────┐ ┌────────┐                               │
│  │git_    │ │impact_ │ │skill_  │                               │
│  │status  │ │analysis│ │server  │                               │
│  │(5)     │ │(5)     │ │(15)    │                               │
│  └────────┘ └────────┘ └────────┘                               │
└─────────────────────────────────────────────────────────────────┘

Backend Configuration

BACKENDS = {
    "backup": BackendConfig("backup", "mcp-backup/server.py", "", "Backup and restore", True),
    "call_graph": BackendConfig("call_graph", "mcp-call-graph/server.py", "", "Call graph navigation", True),
    "context_graph": BackendConfig("context_graph", "mcp-context-graph/server.py", "", "Context graph ops", True),
    "cross_llm": BackendConfig("cross_llm", "mcp-cross-llm-bridge/server.py", "", "Cross-LLM orchestration", True),
    "git_status": BackendConfig("git_status", "mcp-git-status/server.py", "", "Git submodule status", True),
    "impact_analysis": BackendConfig("impact_analysis", "mcp-impact-analysis/server.py", "", "Impact analysis", True),
    "semantic_search": BackendConfig("semantic_search", "mcp-semantic-search/server.py", "", "Semantic search", True),
    "skill_server": BackendConfig("skill_server", "mcp-skill-server/server.py", "", "Skill loading", True),
}
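
The definition of BackendConfig is not shown in this ADR. The following is a minimal sketch of what it might look like, assuming five positional fields. Only module_path is confirmed by the loader code (config.module_path); the other field names (name, tool_prefix, description, enabled) and the ninth backend are hypothetical and used purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackendConfig:
    """Hypothetical shape of a backend entry; only module_path appears in the source."""
    name: str          # registry key, e.g. "call_graph"
    module_path: str   # path to server.py, relative to the tools directory
    tool_prefix: str   # assumed meaning of the empty third argument
    description: str   # human-readable summary
    enabled: bool      # assumed meaning of the trailing True


BACKENDS = {
    "call_graph": BackendConfig(
        "call_graph", "mcp-call-graph/server.py", "", "Call graph navigation", True
    ),
}

# Registering a ninth backend is one new entry (illustrative name and path).
BACKENDS["example_backend"] = BackendConfig(
    "example_backend", "mcp-example/server.py", "", "Illustrative backend", True
)
```

Under these assumptions, adding a backend touches only the BACKENDS table, which matches the extensibility rationale above.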

Lazy Loading Strategy

import importlib.util
import os
from types import ModuleType


class BackendLoader:
    def __init__(self):
        self._loaded_modules = {}  # backend_name → module; populated on first call
        self._tool_registry = {}   # tool_name → backend_name

    def load_backend(self, backend_name: str) -> ModuleType:
        """Load a backend module on the first call to one of its tools.

        Uses importlib.util.spec_from_file_location() to load the module
        by file path, without registering it in sys.modules. The module's
        top-level code still runs when exec_module() is called.
        """
        # Return the cached module if this backend was already loaded.
        if backend_name in self._loaded_modules:
            return self._loaded_modules[backend_name]

        config = BACKENDS[backend_name]
        spec = importlib.util.spec_from_file_location(
            f"backend_{backend_name}",
            os.path.join(TOOLS_DIR, config.module_path),
        )
        if spec is None or spec.loader is None:
            raise ImportError(f"Cannot load backend {backend_name!r}")
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)

        self._loaded_modules[backend_name] = module
        return module
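
To make the caching behavior concrete, here is a small self-contained sketch of the same idea. It is not the gateway's code: the LazyLoader class, the demo function, and the throwaway server.py written to a temporary directory are all assumptions made for illustration.

```python
import importlib.util
import os
import tempfile
from types import ModuleType


class LazyLoader:
    """Loads a module from a file path on first request and caches it."""

    def __init__(self, paths: dict[str, str]):
        self._paths = paths                        # backend_name -> file path
        self._loaded: dict[str, ModuleType] = {}   # filled lazily

    def load(self, name: str) -> ModuleType:
        if name in self._loaded:
            return self._loaded[name]
        spec = importlib.util.spec_from_file_location(
            f"backend_{name}", self._paths[name]
        )
        if spec is None or spec.loader is None:
            raise ImportError(f"cannot load backend {name!r}")
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        self._loaded[name] = module
        return module


def demo() -> tuple[int, int, bool]:
    """Return (modules loaded before any call, after two calls, same object?)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "server.py")
        with open(path, "w", encoding="utf-8") as fh:
            fh.write("def ping(arguments):\n    return 'pong'\n")
        loader = LazyLoader({"demo": path})
        before = len(loader._loaded)
        first = loader.load("demo")
        second = loader.load("demo")
        return before, len(loader._loaded), first is second
```

Nothing is imported until load() is first called, and the second call returns the cached module object instead of executing the file again.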

Tool Call Routing

When a tool is called, the gateway:

  1. Looks up tool_name in _tool_registry to find backend_name
  2. Lazy-loads the backend module if not already loaded
  3. Tries 3 call strategies in order:
    • module.call_tool(tool_name, arguments) — standard MCP pattern
    • module.handle_{tool_name}(arguments) — handler pattern
    • module.{tool_name}(arguments) — direct function call
  4. Returns result or wraps error with backend context
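
The routing steps above can be sketched as follows. This is a hedged illustration, not the gateway's implementation: the dispatch function name and the error message format are assumptions, the sketch picks the first strategy whose attribute exists (the ADR does not say whether a failing strategy falls through to the next), and it is synchronous, so backends that expose coroutines would need to be awaited instead.

```python
from types import SimpleNamespace
from typing import Any


def dispatch(module: Any, backend_name: str, tool_name: str, arguments: dict) -> Any:
    """Try the three call strategies in order; wrap failures with backend context."""
    candidates = [
        # 1. Standard pattern: module.call_tool(tool_name, arguments)
        (getattr(module, "call_tool", None), (tool_name, arguments)),
        # 2. Handler pattern: module.handle_<tool_name>(arguments)
        (getattr(module, f"handle_{tool_name}", None), (arguments,)),
        # 3. Direct function: module.<tool_name>(arguments)
        (getattr(module, tool_name, None), (arguments,)),
    ]
    for func, args in candidates:
        if callable(func):
            try:
                return func(*args)
            except Exception as exc:
                # Error wrapping: prefix the failure with the backend name.
                raise RuntimeError(
                    f"[{backend_name}] {tool_name} failed: {exc}"
                ) from exc
    raise LookupError(f"[{backend_name}] no handler found for {tool_name}")


# Stand-in backend that only implements the handler pattern.
handler_backend = SimpleNamespace(
    handle_get_callers=lambda arguments: {"callers": [], "of": arguments["symbol"]}
)
result = dispatch(handler_backend, "call_graph", "get_callers", {"symbol": "f"})
```

Here the stand-in backend has no call_tool attribute, so the first strategy is skipped and the handler pattern answers the call.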

Tool Name Compatibility

Tools retain their original names for backward compatibility:

hybrid_search    (not semantic_search.hybrid_search)
get_callers      (not call_graph.get_callers)
analyze_impact   (not impact_analysis.analyze_impact)
backup_status    (not backup.backup_status)

This ensures existing Claude Code sessions, scripts, and documentation continue working.
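
Because tool names carry no namespace prefix, two backends exporting the same name would collide in the registry. The ADR does not describe how the gateway handles this; the sketch below shows one possible approach, assuming a hypothetical build_registry helper that rejects duplicates when the tool→backend mapping is built.

```python
def build_registry(tools_by_backend: dict[str, list[str]]) -> dict[str, str]:
    """Build a flat tool_name -> backend_name map, rejecting name collisions."""
    registry: dict[str, str] = {}
    for backend, tools in tools_by_backend.items():
        for tool in tools:
            owner = registry.get(tool)
            if owner is not None and owner != backend:
                raise ValueError(
                    f"tool {tool!r} is exported by both {owner!r} and {backend!r}"
                )
            registry[tool] = backend
    return registry


registry = build_registry({
    "semantic_search": ["hybrid_search"],
    "call_graph": ["get_callers"],
    "impact_analysis": ["analyze_impact"],
    "backup": ["backup_status"],
})
```

Failing fast at registration time keeps un-prefixed names unambiguous, which is the property the backward-compatibility guarantee depends on.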

MCP Configuration

Before (8 entries):

{
  "mcpServers": {
    "coditect-semantic-search": { "command": "python3", "args": ["...mcp-semantic-search/server.py", "mcp"] },
    "coditect-call-graph": { "command": "python3", "args": ["...mcp-call-graph/server.py", "mcp"] },
    "coditect-impact-analysis": { "command": "python3", "args": ["...mcp-impact-analysis/server.py", "mcp"] },
    "coditect-context-graph": { "command": "python3", "args": ["...mcp-context-graph/server.py", "mcp"] },
    "coditect-cross-llm": { "command": "python3", "args": ["...mcp-cross-llm-bridge/server.py", "mcp"] },
    "coditect-skill-server": { "command": "python3", "args": ["...mcp-skill-server/server.py", "mcp"] },
    "coditect-backup": { "command": "python3", "args": ["...mcp-backup/server.py", "mcp"] },
    "coditect-git-status": { "command": "python3", "args": ["...mcp-git-status/server.py", "mcp"] }
  }
}

After (1 entry):

{
  "mcpServers": {
    "coditect-unified": {
      "command": "python3",
      "args": [
        "~/.coditect/tools/mcp-unified-gateway/server.py",
        "mcp"
      ]
    }
  }
}
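
For users with an existing configuration, the before/after change could be applied programmatically. The helper below is a hypothetical sketch, not a tool shipped with the gateway: collapse_servers and LEGACY_SERVERS are illustrative names, and the entry names are taken from the "Before" listing. It expands ~ up front on the assumption that the MCP client may pass arguments to the process without shell expansion.

```python
import os

# Server entry names from the "Before" configuration.
LEGACY_SERVERS = (
    "coditect-semantic-search", "coditect-call-graph", "coditect-impact-analysis",
    "coditect-context-graph", "coditect-cross-llm", "coditect-skill-server",
    "coditect-backup", "coditect-git-status",
)


def collapse_servers(config: dict, gateway_path: str) -> dict:
    """Return a copy of config with the legacy entries replaced by one gateway entry."""
    servers = {
        name: entry
        for name, entry in config.get("mcpServers", {}).items()
        if name not in LEGACY_SERVERS          # keep unrelated servers untouched
    }
    servers["coditect-unified"] = {
        "command": "python3",
        "args": [os.path.expanduser(gateway_path), "mcp"],
    }
    return {**config, "mcpServers": servers}
```

The function builds a new dictionary rather than mutating its input, so the original configuration can be kept as a backup.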

Value Proposition

Configuration Simplicity

  • 87.5% reduction in MCP configuration entries (8 → 1)
  • Single path to maintain across machines
  • New backends added transparently

Memory Efficiency

  • Startup loads only tool metadata, not backend modules
  • Typical session loads 2-3 backends out of 8
  • Memory savings: ~40-60% compared to loading all backends

Developer Experience

  • python3 server.py list — see all 74+ tools in one place
  • python3 server.py stats — gateway-wide statistics
  • Error messages include backend context for debugging

How coditect-core Utilizes This Server

Primary Use Case

The unified gateway is the recommended MCP configuration for Claude Code users. It replaces the need to configure individual servers.

CLI Diagnostics

# List all available tools across all backends
python3 server.py list

# Show gateway statistics (tools per backend, loaded modules)
python3 server.py stats

# Run as MCP server (normal mode)
python3 server.py mcp
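
A minimal sketch of how the three modes might be parsed and how the stats output could be computed. The function names and the shape of the statistics are assumptions; the ADR states only that stats reports tools per backend and loaded modules, and it does not say whether a default mode exists (this sketch requires one explicitly).

```python
import argparse
from collections import Counter


def parse_mode(argv: list[str]) -> str:
    """Parse the single positional mode argument: list, stats, or mcp."""
    parser = argparse.ArgumentParser(prog="server.py")
    parser.add_argument("mode", choices=["list", "stats", "mcp"])
    return parser.parse_args(argv).mode


def gateway_stats(registry: dict[str, str], loaded: set[str]) -> dict:
    """Summarize the tool registry: totals, tools per backend, loaded backends."""
    per_backend = Counter(registry.values())
    return {
        "total_tools": len(registry),
        "tools_per_backend": dict(per_backend),
        "loaded_backends": sorted(loaded),
    }
```

Both summaries can be computed from the registry alone, so list and stats would not need to import any backend module.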

Health Monitoring

The gateway integrates with scripts/mcp-health-check.py:

  • Individual backend health is checked via the gateway
  • Gateway itself is monitored for startup and routing failures

Consequences

Positive

  • P1: Dramatic simplification of MCP configuration
  • P2: Lazy loading reduces memory footprint
  • P3: Tool names unchanged — zero migration cost
  • P4: New backends can be added with just a BackendConfig entry

Negative

  • N1: Additional routing layer adds ~1ms latency per call
  • N2: Backend loading errors surface at tool call time, not startup
  • N3: Debugging requires understanding gateway → backend routing
  • N4: Predefined tool fallbacks must be maintained manually
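
One possible mitigation for N2, offered as a sketch rather than as part of the described gateway: a cheap preflight check that verifies each backend file exists without importing it. The missing_backends helper is hypothetical, and the paths in the usage comment come from the Backend Configuration section.

```python
import os


def missing_backends(tools_dir: str, module_paths: dict[str, str]) -> list[str]:
    """Return the names of backends whose server file is absent.

    Only the filesystem is touched; no backend module is imported,
    so lazy loading is preserved.
    """
    return sorted(
        name
        for name, relative_path in module_paths.items()
        if not os.path.isfile(os.path.join(tools_dir, relative_path))
    )


# Example call (the tools directory shown is an assumption):
# missing_backends(os.path.expanduser("~/.coditect/tools"),
#                  {"backup": "mcp-backup/server.py",
#                   "git_status": "mcp-git-status/server.py"})
```

This catches missing or mistyped paths at startup, although import errors inside a backend would still surface only at the first tool call.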

Dependencies

mcp>=0.9.0              # MCP Python SDK
# Backend dependencies loaded on-demand with each backend

Files

File                                    Purpose
tools/mcp-unified-gateway/server.py     Gateway implementation (505 lines)
tools/mcp-unified-gateway/CLAUDE.md     Tool documentation

Validation

  • Single MCP endpoint serves all 74+ tools
  • Lazy loading confirmed — backends load only on first call
  • Tool names backward compatible (no namespace prefixes)
  • 3 call strategies handle different backend patterns
  • Error wrapping includes backend context
  • CLI mode: list and stats commands functional
  • Predefined tool fallbacks work when dynamic loading fails

Status: ACCEPTED
Decision Date: 2026-02-05
Implementation Status: COMPLETE
Task ID: J.4.11.4, H.12.3
Maintainer: CODITECT Core Team
Review Date: 2026-05-05 (quarterly review)