ADR-204: MCP Unified Gateway — Single-Endpoint MCP Aggregator
Document: ADR-204-mcp-unified-gateway-single-endpoint
Version: 1.0.0
Purpose: Document architectural decision for unified MCP gateway aggregating 8 backend servers
Audience: Framework contributors, system architects
Date Created: 2026-02-05
Last Updated: 2026-02-15
Status: ACCEPTED
Task ID: J.4.11.4, H.12.3
Related ADRs: ADR-080, ADR-135, ADR-151, ADR-190, ADR-202, ADR-203
Related Documents:
- tools/mcp-unified-gateway/server.py
- tools/mcp-unified-gateway/CLAUDE.md
Context and Problem Statement
Background
CODITECT's MCP infrastructure grew to 8 independent servers providing 74+ tools (78 by the per-server counts below):
| Server | Tools | ADR |
|---|---|---|
| mcp-semantic-search | 8 | ADR-080 |
| mcp-call-graph | 7 | ADR-202 |
| mcp-impact-analysis | 5 | ADR-203 |
| mcp-context-graph | 16 | ADR-151/154 |
| mcp-cross-llm-bridge | 17 | ADR-190 |
| mcp-skill-server | 15 | ADR-135 |
| mcp-backup | 5 | ADR-184/185 |
| mcp-git-status | 5 | ADR-205 |
| Total | 78 | |
Each server requires its own entry in .mcp.json or ~/.claude/settings.json, creating configuration complexity and startup overhead.
Problem Statement
How do we simplify MCP configuration from 8 separate server entries to a single endpoint while maintaining modularity, performance, and backward compatibility?
Requirements
Must-Have:
- Single MCP endpoint replacing 8 separate configurations
- Backward-compatible tool names (no namespace prefixes)
- Lazy loading — backends loaded only when their tools are first called
- Dynamic routing — tool calls forwarded to correct backend
Should-Have:
- CLI mode for diagnostics (list tools, show stats)
- Error wrapping with backend context
- Predefined tool fallbacks when dynamic loading fails
Nice-to-Have:
- Backend health monitoring
- Tool call analytics
- Hot reload of backend modules
Decision Drivers
Configuration Complexity
- C1: 8 MCP server entries in settings.json is unwieldy
- C2: Each new MCP server adds another config entry
- C3: Users must maintain correct paths for all 8 servers
Performance
- P1: Loading all 8 backends at startup wastes memory
- P2: Most sessions use only 2-3 backends
- P3: Lazy loading keeps startup time under 1 second
Maintainability
- M1: Individual servers remain independently testable
- M2: New backends can be added without modifying existing code
- M3: Backend failures should be isolated (one crash doesn't affect others)
Considered Options
Option A: Namespace-Based Aggregation
Description: Prefix tool names with backend name (e.g., call_graph.get_callers).
Pros: Clear tool provenance, no name collisions
Cons: Breaking change; all existing tool references must be updated
Rejected: Breaking backward compatibility is unacceptable.
Option B: Single Gateway with Lazy Loading (SELECTED)
Description: Single MCP server that routes tool calls to lazily-loaded backend modules using a tool→backend registry.
Pros:
- Single configuration entry
- Backward-compatible tool names
- Lazy loading minimizes memory
- Backend isolation via independent module loading
Cons:
- Additional routing layer
- Must maintain tool→backend mapping
Selected: Best balance of simplicity, performance, and compatibility.
Option C: Process-Level Multiplexer
Description: Spawn backend servers as child processes and proxy MCP messages.
Pros: Maximum isolation (process boundaries)
Cons: High overhead (8 Python processes), complex IPC, slower startup
Rejected: Excessive resource usage for single-user CLI tool.
Decision Outcome
CHOSEN: Option B — Single Gateway with Lazy Loading
Rationale
- Simplicity: One config entry replaces eight
- Performance: Lazy loading means only used backends consume memory
- Compatibility: Tool names unchanged — zero migration effort
- Extensibility: Adding a 9th backend requires only a BackendConfig entry
Technical Implementation
Architecture
┌─────────────────────────────────────────────────────────────────┐
│  Claude Code / MCP Client                                       │
│                                                                 │
│  ┌──────────────────────────────────────────────────────────┐   │
│  │ Unified Gateway (tools/mcp-unified-gateway)              │   │
│  │                                                          │   │
│  │  ┌─────────────────┐    ┌──────────────────┐             │   │
│  │  │ _tool_registry  │    │ BackendLoader    │             │   │
│  │  │                 │    │                  │             │   │
│  │  │ hybrid_search   │    │ load_backend()   │             │   │
│  │  │  → semantic_srch│    │ importlib.util   │             │   │
│  │  │ get_callers     │    │ spec_from_file   │             │   │
│  │  │  → call_graph   │    │                  │             │   │
│  │  │ analyze_impact  │    │ _loaded_modules  │             │   │
│  │  │  → impact_anlys │    │ {name: module}   │             │   │
│  │  │ ...74+ mappings │    │                  │             │   │
│  │  └─────────────────┘    └──────────────────┘             │   │
│  └──────────────────────────────────────────────────────────┘   │
│      │           │           │           │           │          │
│      ▼           ▼           ▼           ▼           ▼          │
│  ┌────────┐  ┌────────┐  ┌────────┐  ┌────────┐  ┌────────┐     │
│  │backup  │  │call_   │  │context │  │cross_  │  │semantic│     │
│  │(5)     │  │graph(7)│  │graph   │  │llm(17) │  │search  │     │
│  │        │  │        │  │(16)    │  │        │  │(8)     │     │
│  └────────┘  └────────┘  └────────┘  └────────┘  └────────┘     │
│  ┌────────┐  ┌────────┐  ┌────────┐                             │
│  │git_    │  │impact_ │  │skill_  │                             │
│  │status  │  │analysis│  │server  │                             │
│  │(5)     │  │(5)     │  │(15)    │                             │
│  └────────┘  └────────┘  └────────┘                             │
└─────────────────────────────────────────────────────────────────┘
Backend Configuration
BACKENDS = {
    "backup": BackendConfig("backup", "mcp-backup/server.py", "", "Backup and restore", True),
    "call_graph": BackendConfig("call_graph", "mcp-call-graph/server.py", "", "Call graph navigation", True),
    "context_graph": BackendConfig("context_graph", "mcp-context-graph/server.py", "", "Context graph ops", True),
    "cross_llm": BackendConfig("cross_llm", "mcp-cross-llm-bridge/server.py", "", "Cross-LLM orchestration", True),
    "git_status": BackendConfig("git_status", "mcp-git-status/server.py", "", "Git submodule status", True),
    "impact_analysis": BackendConfig("impact_analysis", "mcp-impact-analysis/server.py", "", "Impact analysis", True),
    "semantic_search": BackendConfig("semantic_search", "mcp-semantic-search/server.py", "", "Semantic search", True),
    "skill_server": BackendConfig("skill_server", "mcp-skill-server/server.py", "", "Skill loading", True),
}
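The definition of BackendConfig is not shown in this ADR. The following is a minimal sketch of what such a record could look like, assuming the five positional arguments are a name, a module path, a string that is empty in every entry above (called entry_point here purely for illustration), a description, and an enabled flag. All field names, and the "mcp-example" backend, are assumptions rather than the gateway's actual definitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackendConfig:
    """Hypothetical shape of a backend entry; field names are assumptions."""
    name: str          # registry key, e.g. "call_graph"
    module_path: str   # path to the backend's server.py, relative to the tools directory
    entry_point: str   # assumed meaning of the third argument (empty in the ADR)
    description: str   # human-readable summary
    enabled: bool      # assumed on/off switch for the backend


BACKENDS = {
    "call_graph": BackendConfig(
        "call_graph", "mcp-call-graph/server.py", "", "Call graph navigation", True
    ),
}

# Registering a ninth backend is one more dictionary entry ("mcp-example" is made up).
BACKENDS["example"] = BackendConfig(
    "example", "mcp-example/server.py", "", "Illustrative backend", True
)
```

Making the record frozen is a design choice for the sketch: backend entries are configuration and should not change once the gateway is running.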
Lazy Loading Strategy
import importlib.util
import os
from types import ModuleType

# BACKENDS is the table above; TOOLS_DIR is the directory its paths are relative to.

class BackendLoader:
    def __init__(self):
        self._loaded_modules = {}  # backend_name → module, populated on first call
        self._tool_registry = {}   # tool_name → backend_name

    def load_backend(self, backend_name: str) -> ModuleType:
        """Load a backend module on its first tool call and cache it.

        Uses importlib.util.spec_from_file_location() to load the
        module by file path, so backends need not be on sys.path.
        """
        if backend_name in self._loaded_modules:
            return self._loaded_modules[backend_name]
        config = BACKENDS[backend_name]
        spec = importlib.util.spec_from_file_location(
            f"backend_{backend_name}",
            os.path.join(TOOLS_DIR, config.module_path),
        )
        if spec is None or spec.loader is None:
            raise ImportError(f"Cannot load backend {backend_name!r}")
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)  # runs the backend's top-level code
        self._loaded_modules[backend_name] = module
        return module
Tool Call Routing
When a tool is called, the gateway:
- Looks up tool_name in _tool_registry to find backend_name
- Lazy-loads the backend module if not already loaded
- Tries 3 call strategies in order:
  - module.call_tool(tool_name, arguments): standard MCP pattern
  - module.handle_{tool_name}(arguments): handler pattern
  - module.{tool_name}(arguments): direct function call
- Returns result or wraps error with backend context
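The routing steps above can be sketched as a small dispatcher. This is a synchronous illustration under stated assumptions, not the gateway's actual code. BackendCallError is a hypothetical error type. The sketch stops at the first strategy whose attribute exists, and the ADR does not say whether the real gateway falls through when a strategy raises. Real MCP handlers may also be async.

```python
from types import SimpleNamespace
from typing import Any


class BackendCallError(RuntimeError):
    """Hypothetical error type carrying the backend name for debugging."""


def dispatch(module: Any, backend_name: str, tool_name: str, arguments: dict) -> Any:
    """Try the three call strategies in order and wrap failures with backend context."""
    candidates = [
        ("call_tool", (tool_name, arguments)),   # standard MCP pattern
        (f"handle_{tool_name}", (arguments,)),   # handler pattern
        (tool_name, (arguments,)),               # direct function call
    ]
    for attr, args in candidates:
        func = getattr(module, attr, None)
        if callable(func):
            try:
                return func(*args)
            except Exception as exc:
                raise BackendCallError(
                    f"[{backend_name}] {tool_name} failed via {attr}: {exc}"
                ) from exc
    raise BackendCallError(f"[{backend_name}] no handler found for {tool_name}")


# Stand-in backends; real ones would be modules returned by the loader.
handler_style = SimpleNamespace(handle_get_callers=lambda a: {"callers": [a["symbol"]]})
direct_style = SimpleNamespace(backup_status=lambda a: "ok")

result_handler = dispatch(handler_style, "call_graph", "get_callers", {"symbol": "main"})
result_direct = dispatch(direct_style, "backup", "backup_status", {})
```

Prefixing every error with the backend name is one way to provide the "backend context" the routing step calls for. The caller sees which backend failed without inspecting the traceback.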
Tool Name Compatibility
Tools retain their original names for backward compatibility:
hybrid_search (not semantic_search.hybrid_search)
get_callers (not call_graph.get_callers)
analyze_impact (not impact_analysis.analyze_impact)
backup_status (not backup.backup_status)
This ensures existing Claude Code sessions, scripts, and documentation continue working.
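Unprefixed names only work if every tool name is unique across backends. This ADR does not describe how the gateway handles a collision. The sketch below assumes a fail-fast check while the registry is built, and build_tool_registry is an illustrative name.

```python
def build_tool_registry(tools_by_backend: dict[str, list[str]]) -> dict[str, str]:
    """Map each unprefixed tool name to its backend, rejecting duplicates."""
    registry: dict[str, str] = {}
    for backend, tools in tools_by_backend.items():
        for tool in tools:
            if tool in registry:
                raise ValueError(
                    f"tool {tool!r} is exposed by both {registry[tool]!r} and {backend!r}"
                )
            registry[tool] = backend
    return registry


registry = build_tool_registry({
    "semantic_search": ["hybrid_search"],
    "call_graph": ["get_callers"],
    "impact_analysis": ["analyze_impact"],
    "backup": ["backup_status"],
})
```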
MCP Configuration
Before (8 entries):
{
  "mcpServers": {
    "coditect-semantic-search": { "command": "python3", "args": ["...mcp-semantic-search/server.py", "mcp"] },
    "coditect-call-graph": { "command": "python3", "args": ["...mcp-call-graph/server.py", "mcp"] },
    "coditect-impact-analysis": { "command": "python3", "args": ["...mcp-impact-analysis/server.py", "mcp"] },
    "coditect-context-graph": { "command": "python3", "args": ["...mcp-context-graph/server.py", "mcp"] },
    "coditect-cross-llm": { "command": "python3", "args": ["...mcp-cross-llm-bridge/server.py", "mcp"] },
    "coditect-skill-server": { "command": "python3", "args": ["...mcp-skill-server/server.py", "mcp"] },
    "coditect-backup": { "command": "python3", "args": ["...mcp-backup/server.py", "mcp"] },
    "coditect-git-status": { "command": "python3", "args": ["...mcp-git-status/server.py", "mcp"] }
  }
}
After (1 entry):
{
  "mcpServers": {
    "coditect-unified": {
      "command": "python3",
      "args": [
        "~/.coditect/tools/mcp-unified-gateway/server.py",
        "mcp"
      ]
    }
  }
}
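Whether the leading ~ in the gateway path is expanded depends on how the MCP client launches the command. If the client passes args to the process without a shell, which is an assumption about the client and not something this ADR states, the path must already be absolute. A minimal sketch that builds the same single entry with the home directory expanded:

```python
import json
from pathlib import Path

# Same path as in the entry above, with "~" resolved for the current user.
gateway = Path("~/.coditect/tools/mcp-unified-gateway/server.py").expanduser()

config = {
    "mcpServers": {
        "coditect-unified": {
            "command": "python3",
            "args": [str(gateway), "mcp"],
        }
    }
}

config_text = json.dumps(config, indent=2)  # ready to write to the MCP config file
```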
Value Proposition
Configuration Simplicity
- 87.5% reduction in MCP configuration entries (8 → 1)
- Single path to maintain across machines
- New backends added transparently
Memory Efficiency
- Startup loads only tool metadata, not backend modules
- Typical session loads 2-3 backends out of 8
- Memory savings: ~40-60% compared to loading all backends
Developer Experience
- python3 server.py list: see all 74+ tools in one place
- python3 server.py stats: gateway-wide statistics
- Error messages include backend context for debugging
How coditect-core Utilizes This Server
Primary Use Case
The unified gateway is the recommended MCP configuration for Claude Code users. It replaces the need to configure individual servers.
CLI Diagnostics
# List all available tools across all backends
python3 server.py list
# Show gateway statistics (tools per backend, loaded modules)
python3 server.py stats
# Run as MCP server (normal mode)
python3 server.py mcp
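How server.py parses these modes is not shown here. One plausible shape, sketched with argparse and a two-entry registry as stand-in data, is below. The output formats and the run_cli name are assumptions.

```python
import argparse

# Illustrative data only; the real gateway would derive this from its registry.
TOOL_REGISTRY = {"hybrid_search": "semantic_search", "get_callers": "call_graph"}


def run_cli(argv: list[str]) -> str:
    """Parse the mode argument and return the text the mode would print."""
    parser = argparse.ArgumentParser(prog="server.py")
    parser.add_argument("mode", choices=["list", "stats", "mcp"])
    args = parser.parse_args(argv)
    if args.mode == "list":
        return "\n".join(
            f"{tool} ({backend})" for tool, backend in sorted(TOOL_REGISTRY.items())
        )
    if args.mode == "stats":
        backends = set(TOOL_REGISTRY.values())
        return f"{len(TOOL_REGISTRY)} tools across {len(backends)} backends"
    return "starting MCP server"  # placeholder: the real mode would run the server


listing = run_cli(["list"])
stats = run_cli(["stats"])
```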
Health Monitoring
The gateway integrates with scripts/mcp-health-check.py:
- Individual backend health is checked via the gateway
- Gateway itself is monitored for startup and routing failures
Consequences
Positive
- P1: Dramatic simplification of MCP configuration
- P2: Lazy loading reduces memory footprint
- P3: Tool names unchanged — zero migration cost
- P4: New backends can be added with just a BackendConfig entry
Negative
- N1: Additional routing layer adds ~1ms latency per call
- N2: Backend loading errors surface at tool call time, not startup
- N3: Debugging requires understanding gateway → backend routing
- N4: Predefined tool fallbacks must be maintained manually
Dependencies
mcp>=0.9.0 # MCP Python SDK
# Backend dependencies loaded on-demand with each backend
Files
| File | Purpose |
|---|---|
| tools/mcp-unified-gateway/server.py | Gateway implementation (505 lines) |
| tools/mcp-unified-gateway/CLAUDE.md | Tool documentation |
Validation
- Single MCP endpoint serves all 74+ tools
- Lazy loading confirmed — backends load only on first call
- Tool names backward compatible (no namespace prefixes)
- 3 call strategies handle different backend patterns
- Error wrapping includes backend context
- CLI mode: list and stats commands functional
- Predefined tool fallbacks work when dynamic loading fails
Status: ACCEPTED
Decision Date: 2026-02-05
Implementation Status: COMPLETE
Task ID: J.4.11.4, H.12.3
Maintainer: CODITECT Core Team
Review Date: 2026-05-05 (quarterly review)