ADR-204: MCP Unified Gateway — Single-Endpoint MCP Aggregator

Document: ADR-204-mcp-unified-gateway-single-endpoint
Version: 1.0.0
Purpose: Document architectural decision for unified MCP gateway aggregating 8 backend servers
Audience: Framework contributors, system architects
Date Created: 2026-02-05
Last Updated: 2026-02-15
Status: ACCEPTED
Task ID: J.4.11.4, H.12.3
Related ADRs: ADR-080, ADR-135, ADR-151, ADR-190, ADR-202, ADR-203
Related Documents:
  - tools/mcp-unified-gateway/server.py
  - tools/mcp-unified-gateway/CLAUDE.md

Context and Problem Statement

Background

CODITECT's MCP infrastructure grew to 8 independent servers providing 78 tools in total:

Server	Tools	ADR
mcp-semantic-search	8	ADR-080
mcp-call-graph	7	ADR-202
mcp-impact-analysis	5	ADR-203
mcp-context-graph	16	ADR-151/154
mcp-cross-llm-bridge	17	ADR-190
mcp-skill-server	15	ADR-135
mcp-backup	5	ADR-184/185
mcp-git-status	5	ADR-205
Total	78

Each server requires its own entry in .mcp.json or ~/.claude/settings.json, creating configuration complexity and startup overhead.

Problem Statement

How do we simplify MCP configuration from 8 separate server entries to a single endpoint while maintaining modularity, performance, and backward compatibility?

Requirements

Must-Have:

Single MCP endpoint replacing 8 separate configurations
Backward-compatible tool names (no namespace prefixes)
Lazy loading — backends loaded only when their tools are first called
Dynamic routing — tool calls forwarded to correct backend

Should-Have:

CLI mode for diagnostics (list tools, show stats)
Error wrapping with backend context
Predefined tool fallbacks when dynamic loading fails

Nice-to-Have:

Backend health monitoring
Tool call analytics
Hot reload of backend modules

Decision Drivers

Configuration Complexity

C1: 8 MCP server entries in settings.json is unwieldy
C2: Each new MCP server adds another config entry
C3: Users must maintain correct paths for all 8 servers

Performance

P1: Loading all 8 backends at startup wastes memory
P2: Most sessions use only 2-3 backends
P3: Lazy loading keeps startup time under 1 second

Maintainability

M1: Individual servers remain independently testable
M2: New backends can be added without modifying existing code
M3: Backend failures should be isolated (one crash doesn't affect others)

Considered Options

Option A: Namespace-Based Aggregation

Description: Prefix tool names with backend name (e.g., call_graph.get_callers).

Pros: Clear tool provenance, no name collisions

Cons: Breaking change; all existing tool references must be updated

Rejected: Breaking backward compatibility is unacceptable.

Option B: Single Gateway with Lazy Loading (SELECTED)

Description: Single MCP server that routes tool calls to lazily-loaded backend modules using a tool→backend registry.

Pros:

Single configuration entry
Backward-compatible tool names
Lazy loading minimizes memory
Backend isolation via independent module loading

Cons:

Additional routing layer
Must maintain tool→backend mapping

Selected: Best balance of simplicity, performance, and compatibility.

Option C: Process-Level Multiplexer

Description: Spawn backend servers as child processes and proxy MCP messages.

Pros: Maximum isolation (process boundaries)

Cons: High overhead (8 Python processes), complex IPC, slower startup

Rejected: Excessive resource usage for single-user CLI tool.

Decision Outcome

CHOSEN: Option B — Single Gateway with Lazy Loading

Rationale

Simplicity: One config entry replaces eight
Performance: Lazy loading means only used backends consume memory
Compatibility: Tool names unchanged — zero migration effort
Extensibility: Adding a 9th backend requires only a BackendConfig entry

Technical Implementation

Architecture

┌─────────────────────────────────────────────────────────────────┐
│  Claude Code / MCP Client                                        │
│                                                                  │
│  ┌──────────────────────────────────────────────────────────┐   │
│  │  Unified Gateway (tools/mcp-unified-gateway)             │   │
│  │                                                          │   │
│  │  ┌─────────────────┐   ┌──────────────────┐              │   │
│  │  │ _tool_registry   │   │ BackendLoader    │              │   │
│  │  │                  │   │                  │              │   │
│  │  │ hybrid_search    │   │ load_backend()   │              │   │
│  │  │  → semantic_srch │   │  importlib.util  │              │   │
│  │  │ get_callers      │   │  spec_from_file  │              │   │
│  │  │  → call_graph    │   │                  │              │   │
│  │  │ analyze_impact   │   │ _loaded_modules  │              │   │
│  │  │  → impact_anlys  │   │  {name: module}  │              │   │
│  │  │ ...74+ mappings  │   │                  │              │   │
│  │  └─────────────────┘   └──────────────────┘              │   │
│  └──────────────────────────────────────────────────────────┘   │
│       │           │           │           │           │         │
│       ▼           ▼           ▼           ▼           ▼         │
│  ┌────────┐  ┌────────┐  ┌────────┐  ┌────────┐  ┌────────┐   │
│  │backup  │  │call_   │  │context │  │cross_  │  │semantic│   │
│  │(5)     │  │graph(7)│  │graph   │  │llm(17) │  │search  │   │
│  │        │  │        │  │(16)    │  │        │  │(8)     │   │
│  └────────┘  └────────┘  └────────┘  └────────┘  └────────┘   │
│  ┌────────┐  ┌────────┐  ┌────────┐                            │
│  │git_    │  │impact_ │  │skill_  │                            │
│  │status  │  │analysis│  │server  │                            │
│  │(5)     │  │(5)     │  │(15)    │                            │
│  └────────┘  └────────┘  └────────┘                            │
└─────────────────────────────────────────────────────────────────┘

Backend Configuration

BACKENDS = {
    "backup":         BackendConfig("backup",         "mcp-backup/server.py",         "", "Backup and restore", True),
    "call_graph":     BackendConfig("call_graph",     "mcp-call-graph/server.py",     "", "Call graph navigation", True),
    "context_graph":  BackendConfig("context_graph",  "mcp-context-graph/server.py",  "", "Context graph ops", True),
    "cross_llm":      BackendConfig("cross_llm",      "mcp-cross-llm-bridge/server.py","","Cross-LLM orchestration", True),
    "git_status":     BackendConfig("git_status",     "mcp-git-status/server.py",     "", "Git submodule status", True),
    "impact_analysis":BackendConfig("impact_analysis","mcp-impact-analysis/server.py","", "Impact analysis", True),
    "semantic_search":BackendConfig("semantic_search","mcp-semantic-search/server.py","", "Semantic search", True),
    "skill_server":   BackendConfig("skill_server",   "mcp-skill-server/server.py",   "", "Skill loading", True),
}
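The ADR does not show the BackendConfig definition itself. The following is a minimal sketch of what such a record could look like, assuming a dataclass with five positional fields. Only module_path is confirmed by the loader code; the third field is always an empty string in the table above, so its name (prefix) is a guess, and example_backend is a hypothetical entry used to show how a further backend might be registered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackendConfig:
    name: str          # registry key, e.g. "call_graph"
    module_path: str   # path to the backend's server.py, relative to the tools directory
    prefix: str        # ASSUMPTION: the ADR never names this field; it is always ""
    description: str   # human-readable summary
    enabled: bool = True


BACKENDS = {
    "call_graph": BackendConfig(
        "call_graph", "mcp-call-graph/server.py", "", "Call graph navigation", True
    ),
    "git_status": BackendConfig(
        "git_status", "mcp-git-status/server.py", "", "Git submodule status", True
    ),
    # Hypothetical extra backend, disabled here to show the enabled flag.
    "example_backend": BackendConfig(
        "example_backend", "mcp-example/server.py", "", "Hypothetical backend", False
    ),
}


def enabled_backends(backends):
    """Return the names of enabled backends in sorted order."""
    return sorted(name for name, cfg in backends.items() if cfg.enabled)


print(enabled_backends(BACKENDS))
```

Under this sketch, registering a new backend is a one-line dictionary entry, in line with the extensibility rationale.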

Lazy Loading Strategy

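import importlib.util
import os
from types import ModuleType

# BACKENDS and TOOLS_DIR are defined elsewhere in the gateway module.

class BackendLoader:
    def __init__(self):
        self._loaded_modules = {}   # backend_name → module, populated on first call
        self._tool_registry = {}    # tool_name → backend_name

    def load_backend(self, backend_name: str) -> ModuleType:
        """Load a backend module on its first tool call and cache it.

        Uses importlib.util.spec_from_file_location() to load the module
        directly from its file path, without touching sys.path or
        sys.modules. The module's top-level code runs once, at load time.
        """
        if backend_name in self._loaded_modules:
            return self._loaded_modules[backend_name]

        config = BACKENDS[backend_name]
        path = os.path.join(TOOLS_DIR, config.module_path)
        spec = importlib.util.spec_from_file_location(f"backend_{backend_name}", path)
        if spec is None or spec.loader is None:
            raise ImportError(f"Cannot load backend {backend_name!r} from {path}")
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)

        self._loaded_modules[backend_name] = module
        return module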
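To make the caching behaviour concrete, here is a small self-contained demonstration of the same importlib technique. It is a sketch, not the gateway's code: LazyCache, load_module_from_path, and the temporary stand-in backend file are all hypothetical names introduced for illustration.

```python
import importlib.util
import os
import tempfile


def load_module_from_path(name, path):
    """Load a Python file as a module without registering it in sys.modules."""
    spec = importlib.util.spec_from_file_location(name, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load {path}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


class LazyCache:
    """Hypothetical stand-in for the loader: loads each backend at most once."""

    def __init__(self, paths):
        self._paths = dict(paths)   # backend name -> file path
        self._loaded = {}           # backend name -> module
        self.load_count = 0         # number of real loads performed

    def get(self, name):
        if name not in self._loaded:
            self._loaded[name] = load_module_from_path(
                f"backend_{name}", self._paths[name]
            )
            self.load_count += 1
        return self._loaded[name]


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "server.py")
        with open(path, "w", encoding="utf-8") as fh:
            fh.write("def ping(arguments):\n    return 'pong'\n")
        cache = LazyCache({"demo": path})
        before = cache.load_count      # nothing is loaded at construction time
        first = cache.get("demo")      # first access triggers the load
        second = cache.get("demo")     # second access is served from the cache
        return before, cache.load_count, first is second, first.ping({})


print(demo())
```

The load counter stays at zero until the first get() call and does not increase on the second one, which is the property the gateway relies on to keep startup cheap.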

Tool Call Routing

When a tool is called, the gateway:

Looks up tool_name in _tool_registry to find backend_name
Lazy-loads the backend module if not already loaded
Tries 3 call strategies in order:
- module.call_tool(tool_name, arguments) — standard MCP pattern
- module.handle_{tool_name}(arguments) — handler pattern
- module.{tool_name}(arguments) — direct function call
Returns result or wraps error with backend context
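The routing steps above can be sketched as follows. This is an illustrative, synchronous approximation under stated assumptions: ToolCallError and dispatch are hypothetical names, the sketch uses the first strategy whose attribute exists rather than falling through after a failed call, and real MCP handlers may be asynchronous.

```python
import types


class ToolCallError(RuntimeError):
    """Hypothetical error type that carries the backend name in its message."""


def dispatch(module, backend_name, tool_name, arguments):
    """Try the three call strategies in order and wrap any failure."""
    candidates = [
        # 1. standard MCP pattern: module.call_tool(tool_name, arguments)
        ("call_tool", lambda fn: fn(tool_name, arguments)),
        # 2. handler pattern: module.handle_<tool_name>(arguments)
        (f"handle_{tool_name}", lambda fn: fn(arguments)),
        # 3. direct function call: module.<tool_name>(arguments)
        (tool_name, lambda fn: fn(arguments)),
    ]
    for attr, invoke in candidates:
        fn = getattr(module, attr, None)
        if callable(fn):
            try:
                return invoke(fn)
            except Exception as exc:
                raise ToolCallError(
                    f"[{backend_name}] {tool_name} failed: {exc}"
                ) from exc
    raise ToolCallError(f"[{backend_name}] no handler found for {tool_name}")


# Usage with a stand-in backend that only exposes the handler pattern.
fake_backend = types.SimpleNamespace(
    handle_get_callers=lambda args: {"via": "handler", **args}
)
print(dispatch(fake_backend, "call_graph", "get_callers", {"fn": "main"}))
```

Prefixing every error with the backend name is one simple way to provide the backend context listed in the requirements.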

Tool Name Compatibility

Tools retain their original names for backward compatibility:

hybrid_search          (not semantic_search.hybrid_search)
get_callers            (not call_graph.get_callers)
analyze_impact         (not impact_analysis.analyze_impact)
backup_status          (not backup.backup_status)

This ensures existing Claude Code sessions, scripts, and documentation continue working.
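Because tool names are not namespaced, two backends must never expose the same name. The ADR does not say how the gateway guards against this; the sketch below shows one possible check while building a flat tool→backend registry. build_tool_registry is a hypothetical helper, and the sample data uses only the four tool names listed above.

```python
def build_tool_registry(tools_by_backend):
    """Flatten {backend: [tool, ...]} into {tool: backend}, rejecting collisions."""
    registry = {}
    for backend, tools in tools_by_backend.items():
        for tool in tools:
            owner = registry.get(tool)
            if owner is not None and owner != backend:
                raise ValueError(
                    f"tool {tool!r} is claimed by both {owner!r} and {backend!r}"
                )
            registry[tool] = backend
    return registry


registry = build_tool_registry({
    "semantic_search": ["hybrid_search"],
    "call_graph": ["get_callers"],
    "impact_analysis": ["analyze_impact"],
    "backup": ["backup_status"],
})
print(registry["get_callers"])
```

Failing fast on a duplicate name keeps the un-prefixed scheme safe as more backends are added.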

MCP Configuration

Before (8 entries):

{
  "mcpServers": {
    "coditect-semantic-search": { "command": "python3", "args": ["...mcp-semantic-search/server.py", "mcp"] },
    "coditect-call-graph": { "command": "python3", "args": ["...mcp-call-graph/server.py", "mcp"] },
    "coditect-impact-analysis": { "command": "python3", "args": ["...mcp-impact-analysis/server.py", "mcp"] },
    "coditect-context-graph": { "command": "python3", "args": ["...mcp-context-graph/server.py", "mcp"] },
    "coditect-cross-llm": { "command": "python3", "args": ["...mcp-cross-llm-bridge/server.py", "mcp"] },
    "coditect-skill-server": { "command": "python3", "args": ["...mcp-skill-server/server.py", "mcp"] },
    "coditect-backup": { "command": "python3", "args": ["...mcp-backup/server.py", "mcp"] },
    "coditect-git-status": { "command": "python3", "args": ["...mcp-git-status/server.py", "mcp"] }
  }
}

After (1 entry):

{
  "mcpServers": {
    "coditect-unified": {
      "command": "python3",
      "args": [
        "~/.coditect/tools/mcp-unified-gateway/server.py",
        "mcp"
      ]
    }
  }
}
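For users migrating an existing configuration, the change from eight entries to one could be scripted. The helper below is a hypothetical sketch, not part of the gateway: it takes an already-parsed configuration dictionary and a caller-supplied gateway path, removes the eight legacy entries named in the "Before" example, and leaves any unrelated servers untouched.

```python
LEGACY_SERVER_NAMES = {
    "coditect-semantic-search",
    "coditect-call-graph",
    "coditect-impact-analysis",
    "coditect-context-graph",
    "coditect-cross-llm",
    "coditect-skill-server",
    "coditect-backup",
    "coditect-git-status",
}


def collapse_to_unified(config, gateway_path):
    """Return a new config with the legacy entries replaced by one gateway entry."""
    servers = {
        name: entry
        for name, entry in config.get("mcpServers", {}).items()
        if name not in LEGACY_SERVER_NAMES
    }
    servers["coditect-unified"] = {
        "command": "python3",
        "args": [gateway_path, "mcp"],
    }
    # Build a new dict so the caller's original config is not mutated.
    return {**config, "mcpServers": servers}
```

The function returns a new dictionary, so the original configuration can be kept as a backup before the result is written out.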

Value Proposition

Configuration Simplicity

87.5% reduction in MCP configuration entries (8 → 1)
Single path to maintain across machines
New backends added transparently

Memory Efficiency

Startup loads only tool metadata, not backend modules
Typical session loads 2-3 backends out of 8
Memory savings: ~40-60% compared to loading all backends

Developer Experience

python3 server.py list — see all 74+ tools in one place
python3 server.py stats — gateway-wide statistics
Error messages include backend context for debugging

How coditect-core Utilizes This Server

Primary Use Case

The unified gateway is the recommended MCP configuration for Claude Code users. It replaces the need to configure individual servers.

CLI Diagnostics

# List all available tools across all backends
python3 server.py list

# Show gateway statistics (tools per backend, loaded modules)
python3 server.py stats

# Run as MCP server (normal mode)
python3 server.py mcp
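The three modes above map naturally onto a small argparse front end. The sketch below is an assumption about how such a CLI could be wired, not the actual server.py: build_parser and run are hypothetical names, and the mcp branch returns a placeholder instead of starting a server.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(
        prog="server.py", description="Unified MCP gateway (illustrative sketch)"
    )
    parser.add_argument(
        "mode",
        choices=["list", "stats", "mcp"],
        help="list tools, show statistics, or run as an MCP server",
    )
    return parser


def run(argv, tool_registry, loaded_backends):
    """Dispatch on the CLI mode. tool_registry maps tool name -> backend name."""
    args = build_parser().parse_args(argv)
    if args.mode == "list":
        return sorted(tool_registry)
    if args.mode == "stats":
        per_backend = {}
        for backend in tool_registry.values():
            per_backend[backend] = per_backend.get(backend, 0) + 1
        return {"tools_per_backend": per_backend, "loaded": sorted(loaded_backends)}
    # Placeholder: a real gateway would start its MCP server loop here.
    return "mcp"


sample_registry = {"get_callers": "call_graph", "backup_status": "backup"}
print(run(["stats"], sample_registry, set()))
```

Both diagnostic modes read only the registry and the set of loaded backends, so they can run without importing any backend module.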

Health Monitoring

The gateway integrates with scripts/mcp-health-check.py:

Individual backend health is checked via the gateway
Gateway itself is monitored for startup and routing failures

Consequences

Positive

P1: Dramatic simplification of MCP configuration
P2: Lazy loading reduces memory footprint
P3: Tool names unchanged — zero migration cost
P4: New backends can be added with just a BackendConfig entry

Negative

N1: Additional routing layer adds ~1ms latency per call
N2: Backend loading errors surface at tool call time, not startup
N3: Debugging requires understanding gateway → backend routing
N4: Predefined tool fallbacks must be maintained manually
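N4 refers to the predefined tool fallbacks from the requirements. The ADR does not show how they are implemented; the sketch below illustrates one plausible shape, in which a hand-maintained table is consulted only when dynamic discovery raises. PREDEFINED_TOOLS, list_backend_tools, and the discover callback are hypothetical names.

```python
# Hand-maintained fallback table (the manual upkeep that N4 describes).
PREDEFINED_TOOLS = {
    "backup": ["backup_status"],
    "call_graph": ["get_callers"],
}


def list_backend_tools(backend_name, discover):
    """Return (tool_names, source), preferring dynamic discovery.

    discover is any callable that takes a backend name and returns an
    iterable of tool names, raising an exception if the backend fails to load.
    """
    try:
        return list(discover(backend_name)), "dynamic"
    except Exception:
        # Loading failed: fall back to the static table so the tool list
        # can still be served.
        return list(PREDEFINED_TOOLS.get(backend_name, [])), "predefined"


def broken_discovery(backend_name):
    raise ImportError(f"cannot import {backend_name}")


print(list_backend_tools("backup", broken_discovery))
```

Returning the source alongside the names makes it visible when the static table, which can drift from the real backend, is being used.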

Dependencies

mcp>=0.9.0              # MCP Python SDK
# Backend dependencies loaded on-demand with each backend

Files

File	Purpose
`tools/mcp-unified-gateway/server.py`	Gateway implementation (505 lines)
`tools/mcp-unified-gateway/CLAUDE.md`	Tool documentation

Validation

Single MCP endpoint serves all 74+ tools
Lazy loading confirmed — backends load only on first call
Tool names backward compatible (no namespace prefixes)
3 call strategies handle different backend patterns
Error wrapping includes backend context
CLI mode: list and stats commands functional
Predefined tool fallbacks work when dynamic loading fails

Status: ACCEPTED
Decision Date: 2026-02-05
Implementation Status: COMPLETE
Task ID: J.4.11.4, H.12.3
Maintainer: CODITECT Core Team
Review Date: 2026-05-05 (quarterly review)
