
ADR-019: MCP Protocol Backend Architecture

Status: Accepted Date: 2025-10-06 Deciders: Development Team, AI/ML Team Related: ADR-010 (MCP Protocol), ADR-017 (WebSocket), ADR-015 (Multi-LLM)


Context

The AZ1.AI LLM IDE uses the Model Context Protocol (MCP), created by Anthropic, to give LLMs standardized access to tools, resources, and prompts.

MCP Overview

MCP defines three core primitives:

  1. Tools (Model-controlled):

    • The LLM decides when to invoke them
    • Examples: lmstudio_chat, file_read, web_search
  2. Resources (App-controlled):

    • Application provides context
    • Examples: file:///workspace/src/main.ts, session://current
  3. Prompts (User-controlled):

    • User-triggered templates
    • Examples: code-review, explain-code, refactor
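
On the wire, each primitive maps to its own JSON-RPC 2.0 method family. A sketch of the three request shapes (the tool, resource, and prompt names reuse the examples above; the `McpRequest` interface is a simplified stand-in, not the SDK type):

```typescript
// MCP messages are JSON-RPC 2.0; each primitive has its own method family.
interface McpRequest {
  jsonrpc: '2.0';
  id: number;
  method: string;
  params: Record<string, unknown>;
}

// Tools: the model decides to invoke these
const toolCall: McpRequest = {
  jsonrpc: '2.0',
  id: 1,
  method: 'tools/call',
  params: { name: 'web_search', arguments: { query: 'MCP spec' } },
};

// Resources: the application supplies context by URI
const resourceRead: McpRequest = {
  jsonrpc: '2.0',
  id: 2,
  method: 'resources/read',
  params: { uri: 'file:///workspace/src/main.ts' },
};

// Prompts: user-triggered templates, resolved server-side
const promptGet: McpRequest = {
  jsonrpc: '2.0',
  id: 3,
  method: 'prompts/get',
  params: { name: 'code-review', arguments: { code: 'let x = 1;' } },
};
```

The gateway below implements the server side of exactly these method families.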

Current State

  • MCP LM Studio server implemented (mcp-lmstudio/)
  • LM Studio models accessible via MCP
  • No integration with other LLM providers
  • No unified MCP backend for all services
  • Client-side MCP integration incomplete

Requirements

  1. Multi-Provider: Support all 6 LLM providers via MCP
  2. Tool Discovery: Dynamic tool registration and discovery
  3. Resource Management: Unified resource access (files, sessions, agents)
  4. Prompt Templates: User-customizable prompt library
  5. Session Isolation: Per-session MCP context
  6. Streaming: Support streaming responses
  7. Caching: Cache tool results, resources

Decision

We will implement a unified MCP backend that acts as a gateway for all LLM providers and services:

Architecture

┌─────────────────────────────────────────────────────────────┐
│ Browser (Theia Frontend) │
│ │
│ ┌───────────────────────────────────────────────────────┐ │
│ │ MCP Client │ │
│ │ ┌──────────────────────────────────────────────────┐ │ │
│ │ │ Tool Registry │ Resource Cache │ Prompts │ │ │
│ │ └──────────────────────────────────────────────────┘ │ │
│ └───────────────────────────────────────────────────────┘ │
│ │ │
│ │ WebSocket (mcp/* methods) │
└─────────────────────────┼───────────────────────────────────┘


┌─────────────────────────────────────────────────────────────┐
│ MCP Gateway (Node.js) │
│ │
│ ┌───────────────────────────────────────────────────────┐ │
│ │ MCP Protocol Handler │ │
│ │ ┌──────────────────────────────────────────────────┐ │ │
│ │ │ tools/list │ tools/call │ resources/list │ etc. │ │ │
│ │ └──────────────────────────────────────────────────┘ │ │
│ └───────────────────────────────────────────────────────┘ │
│ │ │
│ ┌───────────────┼───────────────┐ │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ ┌──────────┐ ┌─────────────┐ ┌────────────┐ │
│ │ Tool │ │ Resource │ │ Prompt │ │
│ │ Providers│ │ Providers │ │ Registry │ │
│ └──────────┘ └─────────────┘ └────────────┘ │
│ │ │ │ │
│ │ │ │ │
└─────────┼───────────────┼───────────────┼──────────────────┘
│ │ │
▼ ▼ ▼
┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│ LLM Providers│ │ Filesystem │ │FoundationDB │
│ (6 providers)│ │ Service │ │ Service │
└──────────────┘ └──────────────┘ └──────────────┘

MCP Server Components

mcp-gateway/
├── index.ts # Main MCP server
├── tools/ # Tool providers
│ ├── llm-tools.ts # LLM chat tools (6 providers)
│ ├── filesystem-tools.ts # File operations
│ ├── agent-tools.ts # Agent execution
│ └── web-tools.ts # Web search, fetch
├── resources/ # Resource providers
│ ├── file-resources.ts # File content access
│ ├── session-resources.ts # Session data
│ └── agent-resources.ts # Agent state
├── prompts/ # Prompt templates
│ ├── code-review.ts
│ ├── explain-code.ts
│ └── refactor.ts
└── services/ # Backend services
├── cache-service.ts # Result caching
├── session-service.ts # Session management
└── auth-service.ts # Authentication

Implementation

1. MCP Gateway Server

// mcp-gateway/index.ts

import { Server } from '@modelcontextprotocol/sdk/server/index.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import {
CallToolRequestSchema,
ListToolsRequestSchema,
ListResourcesRequestSchema,
ReadResourceRequestSchema,
ListPromptsRequestSchema,
GetPromptRequestSchema
} from '@modelcontextprotocol/sdk/types.js';

// Tool providers
import { llmTools } from './tools/llm-tools.js';
import { FilesystemTools } from './tools/filesystem-tools.js';
import { AgentTools } from './tools/agent-tools.js';

// Resource providers
import { FileResources } from './resources/file-resources.js';
import { SessionResources } from './resources/session-resources.js';

// Prompt registry
import { PromptRegistry } from './prompts/index.js';

// Services
import { CacheService } from './services/cache-service.js';
import { SessionService } from './services/session-service.js';

class MCPGateway {
private server: Server;
private llmTools: llmTools;
private filesystemTools: FilesystemTools;
private agentTools: AgentTools;
private fileResources: FileResources;
private sessionResources: SessionResources;
private promptRegistry: PromptRegistry;
private cacheService: CacheService;
private sessionService: SessionService;

constructor() {
this.server = new Server(
{
name: 'az1ai-mcp-gateway',
version: '1.0.0',
},
{
capabilities: {
tools: {},
resources: {},
prompts: {},
},
}
);

// Initialize services
this.cacheService = new CacheService();
this.sessionService = new SessionService();

// Initialize providers
this.llmTools = new llmTools(this.cacheService);
this.filesystemTools = new FilesystemTools();
this.agentTools = new AgentTools(this.sessionService);
this.fileResources = new FileResources();
this.sessionResources = new SessionResources(this.sessionService);
this.promptRegistry = new PromptRegistry();

this.setupHandlers();
}

private setupHandlers() {
// Tools
this.server.setRequestHandler(ListToolsRequestSchema, async () => ({
tools: [
...this.llmTools.listTools(),
...this.filesystemTools.listTools(),
...this.agentTools.listTools(),
],
}));

this.server.setRequestHandler(CallToolRequestSchema, async (request) => {
const { name, arguments: args } = request.params;

// Route to appropriate tool provider
if (name.startsWith('llm_')) {
return this.llmTools.callTool(name, args);
} else if (name.startsWith('fs_')) {
return this.filesystemTools.callTool(name, args);
} else if (name.startsWith('agent_')) {
return this.agentTools.callTool(name, args);
}

throw new Error(`Unknown tool: ${name}`);
});

// Resources
this.server.setRequestHandler(ListResourcesRequestSchema, async () => ({
resources: [
...this.fileResources.listResources(),
...this.sessionResources.listResources(),
],
}));

this.server.setRequestHandler(ReadResourceRequestSchema, async (request) => {
const { uri } = request.params;

if (uri.startsWith('file://')) {
return this.fileResources.readResource(uri);
} else if (uri.startsWith('session://')) {
return this.sessionResources.readResource(uri);
}

throw new Error(`Unknown resource scheme: ${uri}`);
});

// Prompts
this.server.setRequestHandler(ListPromptsRequestSchema, async () => ({
prompts: this.promptRegistry.listPrompts(),
}));

this.server.setRequestHandler(GetPromptRequestSchema, async (request) => {
const { name, arguments: args } = request.params;
return this.promptRegistry.getPrompt(name, args);
});
}

async run() {
const transport = new StdioServerTransport();
await this.server.connect(transport);
console.error('MCP Gateway server running on stdio');
}
}

// Start server
const gateway = new MCPGateway();
gateway.run().catch(console.error);

2. LLM Tool Provider

// mcp-gateway/tools/llm-tools.ts

import { Tool } from '@modelcontextprotocol/sdk/types.js';
import { llmService } from '../../src/browser/llm-integration/services/llm-service.js';
import { CacheService } from '../services/cache-service.js';

export class llmTools {
private llmService: llmService;
private cacheService: CacheService;

constructor(cacheService: CacheService) {
this.llmService = new llmService();
this.cacheService = cacheService;
}

listTools(): Tool[] {
return [
{
name: 'llm_chat',
description: 'Send a chat message to any LLM provider',
inputSchema: {
type: 'object',
properties: {
provider: {
type: 'string',
enum: ['lmstudio', 'claude', 'ollama', 'openai', 'gemini', 'grok'],
description: 'LLM provider to use'
},
model: {
type: 'string',
description: 'Model ID (e.g., "meta-llama-3.3-70b-instruct")'
},
messages: {
type: 'array',
items: {
type: 'object',
properties: {
role: { type: 'string', enum: ['user', 'assistant', 'system'] },
content: { type: 'string' }
},
required: ['role', 'content']
},
description: 'Conversation messages'
},
temperature: {
type: 'number',
minimum: 0,
maximum: 2,
default: 0.7,
description: 'Sampling temperature'
},
maxTokens: {
type: 'number',
default: 1000,
description: 'Maximum tokens to generate'
}
},
required: ['provider', 'model', 'messages']
}
},
{
name: 'llm_list_models',
description: 'List available models for a provider',
inputSchema: {
type: 'object',
properties: {
provider: {
type: 'string',
enum: ['lmstudio', 'claude', 'ollama', 'openai', 'gemini', 'grok'],
description: 'LLM provider'
}
},
required: ['provider']
}
},
{
name: 'llm_parallel',
description: 'Run same prompt on multiple models in parallel',
inputSchema: {
type: 'object',
properties: {
models: {
type: 'array',
items: { type: 'string' },
description: 'Model IDs to query'
},
messages: {
type: 'array',
items: {
type: 'object',
properties: {
role: { type: 'string' },
content: { type: 'string' }
}
}
}
},
required: ['models', 'messages']
}
},
{
name: 'llm_consensus',
description: 'Get consensus answer from multiple models',
inputSchema: {
type: 'object',
properties: {
models: {
type: 'array',
items: { type: 'string' },
minItems: 3,
description: 'At least 3 models for consensus'
},
prompt: {
type: 'string',
description: 'Prompt to send to all models'
}
},
required: ['models', 'prompt']
}
}
];
}

async callTool(name: string, args: any): Promise<any> {
switch (name) {
case 'llm_chat':
return this.chatCompletion(args);

case 'llm_list_models':
return this.listModels(args.provider);

case 'llm_parallel':
return this.parallelCompletion(args);

case 'llm_consensus':
return this.consensusCompletion(args);

default:
throw new Error(`Unknown LLM tool: ${name}`);
}
}

private async chatCompletion(args: any) {
const { provider, model, messages, temperature, maxTokens } = args;

// Check cache first
const cacheKey = `llm:${provider}:${model}:${JSON.stringify(messages)}`;
const cached = await this.cacheService.get(cacheKey);
if (cached) {
return {
content: [{
type: 'text',
text: cached.response,
annotations: { cached: true }
}]
};
}

// Call the LLM service
const response = await this.llmService.chatCompletion(
model,
messages,
temperature,
maxTokens
);

// Cache result
await this.cacheService.set(cacheKey, { response }, 3600); // 1 hour TTL

return {
content: [{
type: 'text',
text: response
}]
};
}

private async listModels(provider: string) {
const models = await this.llmService.getAvailableModels();
const filtered = models.filter(m => m.provider === provider);

return {
content: [{
type: 'text',
text: JSON.stringify(filtered, null, 2)
}]
};
}

private async parallelCompletion(args: any) {
const { models, messages } = args;

const results = await Promise.all(
models.map(model =>
this.llmService.chatCompletion(model, messages)
)
);

return {
content: [{
type: 'text',
text: JSON.stringify(
results.map((result, i) => ({
model: models[i],
response: result
})),
null,
2
)
}]
};
}

private async consensusCompletion(args: any) {
const { models, prompt } = args;

const messages = [{ role: 'user', content: prompt }];
const results = await Promise.all(
models.map(model =>
this.llmService.chatCompletion(model, messages)
)
);

// Simple consensus: most common response
const responseCounts = new Map<string, number>();
for (const result of results) {
responseCounts.set(result, (responseCounts.get(result) || 0) + 1);
}

let consensus = '';
let maxCount = 0;
for (const [response, count] of responseCounts.entries()) {
if (count > maxCount) {
consensus = response;
maxCount = count;
}
}

return {
content: [{
type: 'text',
text: consensus,
annotations: {
agreementCount: maxCount,
totalModels: models.length,
allResponses: results
}
}]
};
}
}

3. Resource Providers

// mcp-gateway/resources/file-resources.ts

import { Resource } from '@modelcontextprotocol/sdk/types.js';
import { promises as fs } from 'fs';
import path from 'path';

export class FileResources {
private workspaceRoot: string;

constructor(workspaceRoot = '/workspace') {
this.workspaceRoot = workspaceRoot;
}

listResources(): Resource[] {
// In production, dynamically list workspace files
return [
{
uri: 'file:///workspace/src/main.ts',
name: 'main.ts',
description: 'Application entry point',
mimeType: 'text/typescript'
},
{
uri: 'file:///workspace/package.json',
name: 'package.json',
description: 'Package configuration',
mimeType: 'application/json'
}
];
}

async readResource(uri: string) {
// Parse file:// URI
const filePath = uri.replace('file://', '');
const fullPath = path.resolve(this.workspaceRoot, filePath);

// Security: prevent path traversal; compare against the root plus a
// separator so siblings like /workspace-evil are also rejected
if (fullPath !== this.workspaceRoot &&
!fullPath.startsWith(this.workspaceRoot + path.sep)) {
throw new Error('Invalid file path');
}

const content = await fs.readFile(fullPath, 'utf-8');
const mimeType = this.getMimeType(filePath);

return {
contents: [{
uri,
mimeType,
text: content
}]
};
}

private getMimeType(filePath: string): string {
const ext = path.extname(filePath);
const mimeTypes: Record<string, string> = {
'.ts': 'text/typescript',
'.tsx': 'text/typescript',
'.js': 'text/javascript',
'.jsx': 'text/javascript',
'.json': 'application/json',
'.md': 'text/markdown',
'.txt': 'text/plain'
};
return mimeTypes[ext] || 'text/plain';
}
}
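
The security check in readResource deserves care: a bare prefix test would also accept sibling directories such as /workspace-evil, since that string does start with /workspace. A separator-aware helper (a standalone sketch, not part of the gateway code; posix paths are used so it behaves the same on any OS) makes the distinction explicit:

```typescript
import path from 'node:path';

// Containment needs a separator-aware comparison: a bare prefix test
// would accept '/workspace-evil/x' as being inside '/workspace'.
function isInside(root: string, candidate: string): boolean {
  const resolved = path.posix.resolve(root, candidate);
  return resolved === root || resolved.startsWith(root + '/');
}
```

For example, `isInside('/workspace', '/workspace/src/main.ts')` is true, while both `isInside('/workspace', '/workspace-evil/x')` and `isInside('/workspace', '../etc/passwd')` are false.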

4. Prompt Registry

// mcp-gateway/prompts/index.ts

import { Prompt } from '@modelcontextprotocol/sdk/types.js';

export class PromptRegistry {
private prompts: Map<string, Prompt> = new Map();

constructor() {
this.registerDefaultPrompts();
}

private registerDefaultPrompts() {
this.prompts.set('code-review', {
name: 'code-review',
description: 'Review code for quality, security, and best practices',
arguments: [
{
name: 'code',
description: 'Code to review',
required: true
},
{
name: 'level',
description: 'Review level: basic, intermediate, advanced',
required: false
}
]
});

this.prompts.set('explain-code', {
name: 'explain-code',
description: 'Explain how code works',
arguments: [
{
name: 'code',
description: 'Code to explain',
required: true
},
{
name: 'audience',
description: 'Target audience: beginner, intermediate, expert',
required: false
}
]
});

this.prompts.set('refactor', {
name: 'refactor',
description: 'Suggest refactoring improvements',
arguments: [
{
name: 'code',
description: 'Code to refactor',
required: true
},
{
name: 'goals',
description: 'Refactoring goals: performance, readability, maintainability',
required: false
}
]
});
}

listPrompts(): Prompt[] {
return Array.from(this.prompts.values());
}

getPrompt(name: string, args?: any) {
const prompt = this.prompts.get(name);
if (!prompt) {
throw new Error(`Prompt not found: ${name}`);
}

let messages: Array<{ role: string; content: { type: string; text: string } }> = [];

switch (name) {
case 'code-review':
messages = [{
role: 'user',
content: {
type: 'text',
text: `Review the following code (${args?.level || 'intermediate'} level):\n\n${args.code}\n\nProvide feedback on:\n- Code quality\n- Security issues\n- Best practices\n- Potential bugs`
}
}];
break;

case 'explain-code':
messages = [{
role: 'user',
content: {
type: 'text',
text: `Explain how this code works for a ${args?.audience || 'intermediate'} developer:\n\n${args.code}`
}
}];
break;

case 'refactor':
messages = [{
role: 'user',
content: {
type: 'text',
text: `Suggest refactoring improvements for this code (goals: ${args?.goals || 'general improvement'}):\n\n${args.code}\n\nProvide:\n1. Specific issues\n2. Suggested changes\n3. Refactored code`
}
}];
break;
}

return { messages };
}
}

5. Cache Service

// mcp-gateway/services/cache-service.ts

import { createClient } from 'redis';

export class CacheService {
private client: ReturnType<typeof createClient>;

constructor() {
this.client = createClient({
url: process.env.REDIS_URL || 'redis://localhost:6379'
});
// Connect in the background; log failures instead of dropping them silently
this.client.connect().catch(console.error);
}

async get(key: string): Promise<any | null> {
const value = await this.client.get(key);
return value ? JSON.parse(value) : null;
}

async set(key: string, value: any, ttl = 3600): Promise<void> {
await this.client.setEx(key, ttl, JSON.stringify(value));
}

async delete(key: string): Promise<void> {
await this.client.del(key);
}

async clear(): Promise<void> {
await this.client.flushAll();
}
}
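
Alternative 3 below rejects a pure in-memory cache for production, but for unit tests or local development a hypothetical drop-in with the same get/set/delete/clear surface is still handy. This sketch expires entries lazily on read, mirroring Redis TTL semantics:

```typescript
// Hypothetical in-memory stand-in for CacheService (same surface as the
// Redis-backed version); entries carry a deadline checked on each get.
export class InMemoryCacheService {
  private store = new Map<string, { value: unknown; expiresAt: number }>();

  async get(key: string): Promise<any | null> {
    const entry = this.store.get(key);
    if (!entry) return null;
    if (Date.now() > entry.expiresAt) {
      this.store.delete(key); // lazy expiration, like a Redis TTL
      return null;
    }
    return entry.value;
  }

  async set(key: string, value: any, ttl = 3600): Promise<void> {
    this.store.set(key, { value, expiresAt: Date.now() + ttl * 1000 });
  }

  async delete(key: string): Promise<void> {
    this.store.delete(key);
  }

  async clear(): Promise<void> {
    this.store.clear();
  }
}
```

Because the surface matches, it can be injected anywhere the gateway expects a CacheService during tests.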

Rationale

Why Unified MCP Gateway?

Single Integration Point:

  • ✅ All LLM providers accessible via one interface
  • ✅ Consistent tool/resource/prompt access
  • ✅ Easier to add new providers

Caching & Optimization:

  • ✅ Cache LLM responses (save API costs)
  • ✅ Batch similar requests
  • ✅ Rate limiting

Security & Control:

  • ✅ Centralized auth/permission checks
  • ✅ Request logging/auditing
  • ✅ Resource access control

Why Separate Tool Providers?

Modularity:

  • ✅ Easy to add/remove tool providers
  • ✅ Each provider focused on one domain
  • ✅ Testable in isolation

Performance:

  • ✅ Lazy-load providers as needed
  • ✅ Parallel tool execution
  • ✅ Provider-specific optimizations

Why Redis for Caching?

Performance:

  • ✅ In-memory, sub-millisecond latency
  • ✅ TTL support (auto-expiration)
  • ✅ Pub/sub for cache invalidation

Scalability:

  • ✅ Horizontal scaling (Redis Cluster)
  • ✅ High throughput (100K+ ops/sec)
  • ✅ Low memory overhead

Alternatives Considered

Alternative 1: Direct MCP Servers per Provider

Pros:

  • Simple (no gateway)
  • Provider isolation

Cons:

  • ❌ Client must manage multiple servers
  • ❌ No unified caching
  • ❌ Duplicate auth/logging

Rejected: Too complex for the client

Alternative 2: HTTP REST API

Pros:

  • Simple HTTP requests
  • Wide tooling support

Cons:

  • ❌ Not MCP-compliant
  • ❌ No standard for tools/resources
  • ❌ Less efficient than stdio/WebSocket

Rejected: MCP is the standard

Alternative 3: In-Memory Cache (No Redis)

Pros:

  • Simple (no external dependency)
  • Fast (in-process)

Cons:

  • ❌ Lost on restart
  • ❌ Can't scale horizontally
  • ❌ Memory limited

Rejected: Need persistence and scaling


Consequences

Positive

  • ✅ Unified Interface: Single MCP gateway for all providers
  • ✅ Efficient: Caching reduces API costs and latency
  • ✅ Extensible: Easy to add new tools/resources/prompts
  • ✅ Secure: Centralized auth and access control
  • ✅ Standard: MCP-compliant (works with any MCP client)
  • ✅ Scalable: Redis caching, horizontal scaling

Negative

  • ❌ Complexity: Extra layer between client and providers
  • ❌ Single Point of Failure: Gateway down = no MCP access
  • ❌ Latency: Extra hop adds ~10-20ms
  • ❌ Cache Invalidation: Need strategy for stale data

Mitigation

Complexity:

  • Comprehensive documentation
  • Example integrations
  • Testing framework

SPOF:

  • Deploy multiple gateway instances
  • Health checks + auto-restart
  • Fallback to direct provider access

Latency:

  • Optimize gateway code
  • Use local Redis (same machine)
  • Monitor and profile

Cache Invalidation:

  • TTL on all cache entries
  • Manual invalidation API
  • Cache versioning
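
The cache-versioning bullet can be made concrete with a small key builder. `CACHE_VERSION` is a hypothetical constant bumped whenever the cached response format changes; old entries become unreachable immediately and are reclaimed when their TTL expires, with no flush required:

```typescript
// Hypothetical versioned cache-key builder: bumping CACHE_VERSION orphans
// every previously written entry, so stale data ages out via TTL alone.
const CACHE_VERSION = 2;

interface ChatMessage { role: string; content: string; }

function llmCacheKey(provider: string, model: string, messages: ChatMessage[]): string {
  return `llm:v${CACHE_VERSION}:${provider}:${model}:${JSON.stringify(messages)}`;
}
```

The manual invalidation API then only needs to delete keys matching the current version prefix.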

Implementation Plan

Phase 1: Core Gateway ✅

  • MCP server setup
  • Tool/resource/prompt handlers
  • Basic LLM tool provider

Phase 2: Multi-Provider Support 🔲

  • All 6 LLM providers integrated
  • Provider-specific optimizations
  • Model listing per provider

Phase 3: Resource Providers 🔲

  • File resources
  • Session resources
  • Agent resources

Phase 4: Caching 🔲

  • Redis integration
  • Cache key strategy
  • TTL configuration
  • Cache invalidation API

Phase 5: Advanced Features 🔲

  • Prompt library (user-customizable)
  • Parallel/consensus tools
  • Streaming responses
  • Rate limiting

Phase 6: Production Hardening 🔲

  • Monitoring (Prometheus)
  • Logging (structured)
  • Error tracking (Sentry)
  • Performance testing

Success Metrics

Performance:

  • < 50ms gateway overhead
  • > 80% cache hit rate (for repeated queries)
  • 1000+ requests/second throughput

Reliability:

  • 99.9% uptime
  • < 0.1% error rate
  • Graceful degradation on provider failures

Developer Experience:

  • < 5 minutes to add new tool
  • Clear error messages
  • Comprehensive examples



Status: ✅ Accepted Next Review: 2025-11-06 (1 month) Last Updated: 2025-10-06