ADR-019: MCP Protocol Backend Architecture
Status: Accepted
Date: 2025-10-06
Deciders: Development Team, AI/ML Team
Related: ADR-010 (MCP Protocol), ADR-017 (WebSocket), ADR-015 (Multi-llm)
Context
The AZ1.AI llm IDE uses the Model Context Protocol (MCP) created by Anthropic to enable llms to access tools, resources, and prompts in a standardized way.
MCP Overview
MCP defines three core primitives:
- Tools (Model-controlled): the llm decides when to invoke them. Examples: lmstudio_chat, file_read, web_search
- Resources (App-controlled): the application provides context. Examples: file:///workspace/src/main.ts, session://current
- Prompts (User-controlled): user-triggered templates. Examples: code-review, explain-code, refactor
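For orientation, the three primitives map onto distinct JSON-RPC methods on the wire. A minimal sketch of the request shapes (method names follow the MCP specification; the parameter values here are purely illustrative):

```typescript
// Illustrative MCP JSON-RPC request shapes for the three primitives.
// Method names follow the MCP specification; params are example values.
interface JsonRpcRequest {
  jsonrpc: '2.0';
  id: number;
  method: string;
  params?: Record<string, unknown>;
}

// Tool invocation (model-controlled): the llm chooses to call a tool
const toolCall: JsonRpcRequest = {
  jsonrpc: '2.0',
  id: 1,
  method: 'tools/call',
  params: { name: 'file_read', arguments: { path: 'src/main.ts' } }
};

// Resource read (app-controlled): the application supplies context by URI
const resourceRead: JsonRpcRequest = {
  jsonrpc: '2.0',
  id: 2,
  method: 'resources/read',
  params: { uri: 'session://current' }
};

// Prompt retrieval (user-controlled): a user-triggered template
const promptGet: JsonRpcRequest = {
  jsonrpc: '2.0',
  id: 3,
  method: 'prompts/get',
  params: { name: 'code-review', arguments: { code: 'const x = 1;' } }
};

console.log([toolCall, resourceRead, promptGet].map(r => r.method).join(','));
// → tools/call,resources/read,prompts/get
```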
Current State
- MCP LM Studio server implemented (mcp-lmstudio/); LM Studio models accessible via MCP
- No integration with other llm providers
- No unified MCP backend for all services
- Client-side MCP integration incomplete
Requirements
- Multi-Provider: Support all 6 llm providers via MCP
- Tool Discovery: Dynamic tool registration and discovery
- Resource Management: Unified resource access (files, sessions, agents)
- Prompt Templates: User-customizable prompt library
- Session Isolation: Per-session MCP context
- Streaming: Support streaming responses
- Caching: Cache tool results, resources
Decision
We will implement a unified MCP backend that acts as a gateway for all llm providers and services:
Architecture
┌─────────────────────────────────────────────────────────────┐
│ Browser (theia Frontend) │
│ │
│ ┌───────────────────────────────────────────────────────┐ │
│ │ MCP Client │ │
│ │ ┌──────────────────────────────────────────────────┐ │ │
│ │ │ Tool Registry │ Resource Cache │ Prompts │ │ │
│ │ └──────────────────────────────────────────────────┘ │ │
│ └───────────────────────────────────────────────────────┘ │
│ │ │
│ │ WebSocket (mcp/* methods) │
└─────────────────────────┼───────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ MCP Gateway (Node.js) │
│ │
│ ┌───────────────────────────────────────────────────────┐ │
│ │ MCP Protocol Handler │ │
│ │ ┌──────────────────────────────────────────────────┐ │ │
│ │ │ tools/list │ tools/call │ resources/list │ etc. │ │ │
│ │ └──────────────────────────────────────────────────┘ │ │
│ └───────────────────────────────────────────────────────┘ │
│ │ │
│ ┌───────────────┼───────────────┐ │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ ┌──────────┐ ┌─────────────┐ ┌────────────┐ │
│ │ Tool │ │ Resource │ │ Prompt │ │
│ │ Providers│ │ Providers │ │ Registry │ │
│ └──────────┘ └─────────────┘ └────────────┘ │
│ │ │ │ │
│ │ │ │ │
└─────────┼───────────────┼───────────────┼──────────────────┘
│ │ │
▼ ▼ ▼
┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│ llm Providers│ │ Filesystem │ │FoundationDB │
│ (6 providers)│ │ Service │ │ Service │
└──────────────┘ └──────────────┘ └──────────────┘
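The WebSocket leg in the diagram carries mcp/* methods from the frontend to the gateway. A sketch of how such frames could be translated into MCP JSON-RPC requests (the mcp/ prefix convention is an assumption read off the diagram, not a fixed API):

```typescript
// Sketch: translating the frontend's WebSocket "mcp/*" frames into MCP
// JSON-RPC requests for the gateway. The "mcp/" prefix convention is an
// assumption taken from the architecture diagram above.
interface WsFrame { id: number; method: string; params?: unknown; }

let nextId = 0;
function toMcpRequest(frame: WsFrame) {
  if (!frame.method.startsWith('mcp/')) {
    throw new Error(`Not an MCP frame: ${frame.method}`);
  }
  return {
    jsonrpc: '2.0' as const,
    id: ++nextId,
    method: frame.method.slice('mcp/'.length), // "mcp/tools/call" -> "tools/call"
    params: frame.params
  };
}

const req = toMcpRequest({ id: 7, method: 'mcp/tools/list' });
console.log(req.method); // → tools/list
```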
MCP Server Components
mcp-gateway/
├── index.ts # Main MCP server
├── tools/ # Tool providers
│ ├── llm-tools.ts # llm chat tools (6 providers)
│ ├── filesystem-tools.ts # File operations
│ ├── agent-tools.ts # Agent execution
│ └── web-tools.ts # Web search, fetch
├── resources/ # Resource providers
│ ├── file-resources.ts # File content access
│ ├── session-resources.ts # Session data
│ └── agent-resources.ts # Agent state
├── prompts/ # Prompt templates
│ ├── code-review.ts
│ ├── explain-code.ts
│ └── refactor.ts
└── services/ # Backend services
├── cache-service.ts # Result caching
├── session-service.ts # Session management
└── auth-service.ts # Authentication
Implementation
1. MCP Gateway Server
// mcp-gateway/index.ts
import { Server } from '@modelcontextprotocol/sdk/server/index.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import {
CallToolRequestSchema,
ListToolsRequestSchema,
ListResourcesRequestSchema,
ReadResourceRequestSchema,
ListPromptsRequestSchema,
GetPromptRequestSchema
} from '@modelcontextprotocol/sdk/types.js';
// Tool providers
import { llmTools } from './tools/llm-tools.js';
import { FilesystemTools } from './tools/filesystem-tools.js';
import { AgentTools } from './tools/agent-tools.js';
// Resource providers
import { FileResources } from './resources/file-resources.js';
import { SessionResources } from './resources/session-resources.js';
// Prompt registry
import { PromptRegistry } from './prompts/index.js';
// Services
import { CacheService } from './services/cache-service.js';
import { SessionService } from './services/session-service.js';
class MCPGateway {
private server: Server;
private llmTools: llmTools;
private filesystemTools: FilesystemTools;
private agentTools: AgentTools;
private fileResources: FileResources;
private sessionResources: SessionResources;
private promptRegistry: PromptRegistry;
private cacheService: CacheService;
private sessionService: SessionService;
constructor() {
this.server = new Server(
{
name: 'az1ai-mcp-gateway',
version: '1.0.0',
},
{
capabilities: {
tools: {},
resources: {},
prompts: {},
},
}
);
// Initialize services
this.cacheService = new CacheService();
this.sessionService = new SessionService();
// Initialize providers
this.llmTools = new llmTools(this.cacheService);
this.filesystemTools = new FilesystemTools();
this.agentTools = new AgentTools(this.sessionService);
this.fileResources = new FileResources();
this.sessionResources = new SessionResources(this.sessionService);
this.promptRegistry = new PromptRegistry();
this.setupHandlers();
}
private setupHandlers() {
// Tools
this.server.setRequestHandler(ListToolsRequestSchema, async () => ({
tools: [
...this.llmTools.listTools(),
...this.filesystemTools.listTools(),
...this.agentTools.listTools(),
],
}));
this.server.setRequestHandler(CallToolRequestSchema, async (request) => {
const { name, arguments: args } = request.params;
// Route to appropriate tool provider
if (name.startsWith('llm_')) {
return this.llmTools.callTool(name, args);
} else if (name.startsWith('fs_')) {
return this.filesystemTools.callTool(name, args);
} else if (name.startsWith('agent_')) {
return this.agentTools.callTool(name, args);
}
throw new Error(`Unknown tool: ${name}`);
});
// Resources
this.server.setRequestHandler(ListResourcesRequestSchema, async () => ({
resources: [
...this.fileResources.listResources(),
...this.sessionResources.listResources(),
],
}));
this.server.setRequestHandler(ReadResourceRequestSchema, async (request) => {
const { uri } = request.params;
if (uri.startsWith('file://')) {
return this.fileResources.readResource(uri);
} else if (uri.startsWith('session://')) {
return this.sessionResources.readResource(uri);
}
throw new Error(`Unknown resource scheme: ${uri}`);
});
// Prompts
this.server.setRequestHandler(ListPromptsRequestSchema, async () => ({
prompts: this.promptRegistry.listPrompts(),
}));
this.server.setRequestHandler(GetPromptRequestSchema, async (request) => {
const { name, arguments: args } = request.params;
return this.promptRegistry.getPrompt(name, args);
});
}
async run() {
const transport = new StdioServerTransport();
await this.server.connect(transport);
// Log to stderr: stdout is reserved for the stdio transport
console.error('MCP Gateway server running on stdio');
}
}
// Start server
const gateway = new MCPGateway();
gateway.run().catch(console.error);
2. llm Tool Provider
// mcp-gateway/tools/llm-tools.ts
import { Tool } from '@modelcontextprotocol/sdk/types.js';
import { llmService } from '../../src/browser/llm-integration/services/llm-service.js';
import { CacheService } from '../services/cache-service.js';
export class llmTools {
private llmService: llmService;
private cacheService: CacheService;
constructor(cacheService: CacheService) {
this.llmService = new llmService();
this.cacheService = cacheService;
}
listTools(): Tool[] {
return [
{
name: 'llm_chat',
description: 'Send a chat message to any llm provider',
inputSchema: {
type: 'object',
properties: {
provider: {
type: 'string',
enum: ['lmstudio', 'claude', 'ollama', 'openai', 'gemini', 'grok'],
description: 'llm provider to use'
},
model: {
type: 'string',
description: 'Model ID (e.g., "meta-llama-3.3-70b-instruct")'
},
messages: {
type: 'array',
items: {
type: 'object',
properties: {
role: { type: 'string', enum: ['user', 'assistant', 'system'] },
content: { type: 'string' }
},
required: ['role', 'content']
},
description: 'Conversation messages'
},
temperature: {
type: 'number',
minimum: 0,
maximum: 2,
default: 0.7,
description: 'Sampling temperature'
},
maxTokens: {
type: 'number',
default: 1000,
description: 'Maximum tokens to generate'
}
},
required: ['provider', 'model', 'messages']
}
},
{
name: 'llm_list_models',
description: 'List available models for a provider',
inputSchema: {
type: 'object',
properties: {
provider: {
type: 'string',
enum: ['lmstudio', 'claude', 'ollama', 'openai', 'gemini', 'grok'],
description: 'llm provider'
}
},
required: ['provider']
}
},
{
name: 'llm_parallel',
description: 'Run same prompt on multiple models in parallel',
inputSchema: {
type: 'object',
properties: {
models: {
type: 'array',
items: { type: 'string' },
description: 'Model IDs to query'
},
messages: {
type: 'array',
items: {
type: 'object',
properties: {
role: { type: 'string' },
content: { type: 'string' }
}
}
}
},
required: ['models', 'messages']
}
},
{
name: 'llm_consensus',
description: 'Get consensus answer from multiple models',
inputSchema: {
type: 'object',
properties: {
models: {
type: 'array',
items: { type: 'string' },
minItems: 3,
description: 'At least 3 models for consensus'
},
prompt: {
type: 'string',
description: 'Prompt to send to all models'
}
},
required: ['models', 'prompt']
}
}
];
}
async callTool(name: string, args: any): Promise<any> {
switch (name) {
case 'llm_chat':
return this.chatCompletion(args);
case 'llm_list_models':
return this.listModels(args.provider);
case 'llm_parallel':
return this.parallelCompletion(args);
case 'llm_consensus':
return this.consensusCompletion(args);
default:
throw new Error(`Unknown llm tool: ${name}`);
}
}
private async chatCompletion(args: any) {
const { provider, model, messages, temperature, maxTokens } = args;
// Check cache first (key includes sampling params so different settings don't collide)
const cacheKey = `llm:${provider}:${model}:${temperature}:${maxTokens}:${JSON.stringify(messages)}`;
const cached = await this.cacheService.get(cacheKey);
if (cached) {
return {
content: [{
type: 'text',
text: cached.response,
annotations: { cached: true }
}]
};
}
// Call llm service
const response = await this.llmService.chatCompletion(
model,
messages,
temperature,
maxTokens
);
// Cache result
await this.cacheService.set(cacheKey, { response }, 3600); // 1 hour TTL
return {
content: [{
type: 'text',
text: response
}]
};
}
private async listModels(provider: string) {
const models = await this.llmService.getAvailableModels();
const filtered = models.filter(m => m.provider === provider);
return {
content: [{
type: 'text',
text: JSON.stringify(filtered, null, 2)
}]
};
}
private async parallelCompletion(args: any) {
const { models, messages } = args;
const results = await Promise.all(
models.map(model =>
this.llmService.chatCompletion(model, messages)
)
);
return {
content: [{
type: 'text',
text: JSON.stringify(
results.map((result, i) => ({
model: models[i],
response: result
})),
null,
2
)
}]
};
}
private async consensusCompletion(args: any) {
const { models, prompt } = args;
const messages = [{ role: 'user', content: prompt }];
const results = await Promise.all(
models.map(model =>
this.llmService.chatCompletion(model, messages)
)
);
// Simple consensus: most common exact response. Verbatim matches are rare for
// free-form llm output, so this works best on short, constrained answers.
const responseCounts = new Map<string, number>();
for (const result of results) {
responseCounts.set(result, (responseCounts.get(result) || 0) + 1);
}
let consensus = '';
let maxCount = 0;
for (const [response, count] of responseCounts.entries()) {
if (count > maxCount) {
consensus = response;
maxCount = count;
}
}
return {
content: [{
type: 'text',
text: consensus,
annotations: {
agreementCount: maxCount,
totalModels: models.length,
allResponses: results
}
}]
};
}
}
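The exact-match majority vote inside consensusCompletion can be exercised in isolation. This standalone sketch reproduces the counting logic; note that verbatim llm outputs rarely collide, so it is most useful on short, constrained answers (labels, yes/no):

```typescript
// Standalone version of the exact-match majority vote used by llm_consensus.
// Real llm outputs rarely match verbatim; this suits short, constrained
// answers (e.g. "yes"/"no", a single label).
function majorityVote(responses: string[]): { consensus: string; count: number } {
  const counts = new Map<string, number>();
  for (const r of responses) {
    counts.set(r, (counts.get(r) ?? 0) + 1);
  }
  let consensus = '';
  let max = 0;
  for (const [response, count] of counts) {
    if (count > max) { consensus = response; max = count; }
  }
  return { consensus, count: max };
}

const result = majorityVote(['4', '4', '5']);
console.log(result); // → { consensus: '4', count: 2 }
```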
3. Resource Providers
// mcp-gateway/resources/file-resources.ts
import { Resource } from '@modelcontextprotocol/sdk/types.js';
import { promises as fs } from 'fs';
import path from 'path';
export class FileResources {
private workspaceRoot: string;
constructor(workspaceRoot = '/workspace') {
this.workspaceRoot = workspaceRoot;
}
listResources(): Resource[] {
// In production, dynamically list workspace files
return [
{
uri: 'file:///workspace/src/main.ts',
name: 'main.ts',
description: 'Application entry point',
mimeType: 'text/typescript'
},
{
uri: 'file:///workspace/package.json',
name: 'package.json',
description: 'Package configuration',
mimeType: 'application/json'
}
];
}
async readResource(uri: string) {
// Parse the file:// URI; URL.pathname yields the absolute path portion
const filePath = decodeURIComponent(new URL(uri).pathname);
const fullPath = path.resolve(filePath);
// Security: prevent path traversal. A bare startsWith(workspaceRoot) would
// accept sibling paths like /workspace-evil, so require a separator boundary.
if (fullPath !== this.workspaceRoot && !fullPath.startsWith(this.workspaceRoot + path.sep)) {
throw new Error('Invalid file path');
}
const content = await fs.readFile(fullPath, 'utf-8');
const mimeType = this.getMimeType(filePath);
return {
contents: [{
uri,
mimeType,
text: content
}]
};
}
private getMimeType(filePath: string): string {
const ext = path.extname(filePath);
const mimeTypes: Record<string, string> = {
'.ts': 'text/typescript',
'.tsx': 'text/typescript',
'.js': 'text/javascript',
'.jsx': 'text/javascript',
'.json': 'application/json',
'.md': 'text/markdown',
'.txt': 'text/plain'
};
return mimeTypes[ext] || 'text/plain';
}
}
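The workspace containment check is worth testing on its own. A standalone sketch of a hardened guard (path.resolve plus a separator-aware prefix check, since a plain startsWith would also accept a sibling directory such as /workspace-evil):

```typescript
import path from 'node:path';

// Standalone sketch of a workspace containment check for FileResources.
// A bare startsWith('/workspace') would wrongly accept '/workspace-evil/x',
// so the check requires either exact equality or a path-separator boundary.
function isInsideWorkspace(requested: string, workspaceRoot: string): boolean {
  const full = path.resolve(workspaceRoot, requested);
  return full === workspaceRoot || full.startsWith(workspaceRoot + path.sep);
}

console.log(isInsideWorkspace('src/main.ts', '/workspace'));       // → true
console.log(isInsideWorkspace('../../etc/passwd', '/workspace'));  // → false
console.log(isInsideWorkspace('/workspace-evil/x', '/workspace')); // → false
```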
4. Prompt Registry
// mcp-gateway/prompts/index.ts
import { Prompt } from '@modelcontextprotocol/sdk/types.js';
export class PromptRegistry {
private prompts: Map<string, Prompt> = new Map();
constructor() {
this.registerDefaultPrompts();
}
private registerDefaultPrompts() {
this.prompts.set('code-review', {
name: 'code-review',
description: 'Review code for quality, security, and best practices',
arguments: [
{
name: 'code',
description: 'Code to review',
required: true
},
{
name: 'level',
description: 'Review level: basic, intermediate, advanced',
required: false
}
]
});
this.prompts.set('explain-code', {
name: 'explain-code',
description: 'Explain how code works',
arguments: [
{
name: 'code',
description: 'Code to explain',
required: true
},
{
name: 'audience',
description: 'Target audience: beginner, intermediate, expert',
required: false
}
]
});
this.prompts.set('refactor', {
name: 'refactor',
description: 'Suggest refactoring improvements',
arguments: [
{
name: 'code',
description: 'Code to refactor',
required: true
},
{
name: 'goals',
description: 'Refactoring goals: performance, readability, maintainability',
required: false
}
]
});
}
listPrompts(): Prompt[] {
return Array.from(this.prompts.values());
}
getPrompt(name: string, args?: any) {
const prompt = this.prompts.get(name);
if (!prompt) {
throw new Error(`Prompt not found: ${name}`);
}
// All registered prompts require a `code` argument; fail fast if it is missing
if (!args?.code) {
throw new Error(`Missing required argument "code" for prompt: ${name}`);
}
let messages: Array<{ role: string; content: { type: string; text: string } }> = [];
switch (name) {
case 'code-review':
messages = [{
role: 'user',
content: {
type: 'text',
text: `Review the following code (${args?.level || 'intermediate'} level):\n\n${args.code}\n\nProvide feedback on:\n- Code quality\n- Security issues\n- Best practices\n- Potential bugs`
}
}];
break;
case 'explain-code':
messages = [{
role: 'user',
content: {
type: 'text',
text: `Explain how this code works for a ${args?.audience || 'intermediate'} developer:\n\n${args.code}`
}
}];
break;
case 'refactor':
messages = [{
role: 'user',
content: {
type: 'text',
text: `Suggest refactoring improvements for this code (goals: ${args?.goals || 'general improvement'}):\n\n${args.code}\n\nProvide:\n1. Specific issues\n2. Suggested changes\n3. Refactored code`
}
}];
break;
}
return { messages };
}
}
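The per-prompt switch above grows linearly with the prompt library. A more scalable variant stores templates as strings with placeholder slots and substitutes arguments at lookup time; this is a sketch of a possible refactor, not the current implementation:

```typescript
// Generic alternative to the per-prompt switch: templates as strings with
// {{placeholder}} slots, substituted at lookup time. Sketch of a possible
// refactor, not the codebase's current approach.
const templates: Record<string, string> = {
  'explain-code': 'Explain how this code works for a {{audience}} developer:\n\n{{code}}'
};

function renderPrompt(name: string, args: Record<string, string>): string {
  const template = templates[name];
  if (!template) throw new Error(`Prompt not found: ${name}`);
  return template.replace(/\{\{(\w+)\}\}/g, (_m: string, key: string) => args[key] ?? '');
}

console.log(renderPrompt('explain-code', { audience: 'beginner', code: 'let x = 1;' }));
```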
5. Cache Service
// mcp-gateway/services/cache-service.ts
import { createClient } from 'redis';
export class CacheService {
private client: ReturnType<typeof createClient>;
constructor() {
this.client = createClient({
url: process.env.REDIS_URL || 'redis://localhost:6379'
});
// connect() is async; surface failures instead of leaving an unhandled rejection
this.client.connect().catch(err => console.error('Redis connection failed:', err));
}
async get(key: string): Promise<any | null> {
const value = await this.client.get(key);
return value ? JSON.parse(value) : null;
}
async set(key: string, value: any, ttl = 3600): Promise<void> {
await this.client.setEx(key, ttl, JSON.stringify(value));
}
async delete(key: string): Promise<void> {
await this.client.del(key);
}
async clear(): Promise<void> {
await this.client.flushAll();
}
}
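For local development without Redis, the same get/set/delete/clear surface can be backed by an in-memory Map with TTL. A minimal sketch (the class name is ours; as Alternative 3 below notes, the in-memory approach is rejected for production):

```typescript
// In-memory stand-in matching CacheService's get/set/delete/clear surface.
// Development only: entries are lost on restart and cannot be shared
// across gateway instances.
class MemoryCacheService {
  private store = new Map<string, { value: unknown; expiresAt: number }>();

  async get(key: string): Promise<unknown | null> {
    const entry = this.store.get(key);
    if (!entry) return null;
    if (Date.now() >= entry.expiresAt) { // lazy expiration on read
      this.store.delete(key);
      return null;
    }
    return entry.value;
  }

  async set(key: string, value: unknown, ttl = 3600): Promise<void> {
    // ttl is in seconds, mirroring the Redis setEx signature
    this.store.set(key, { value, expiresAt: Date.now() + ttl * 1000 });
  }

  async delete(key: string): Promise<void> { this.store.delete(key); }
  async clear(): Promise<void> { this.store.clear(); }
}

(async () => {
  const cache = new MemoryCacheService();
  await cache.set('k', { response: 'hi' }, 1);
  console.log(await cache.get('k')); // → { response: 'hi' }
})();
```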
Rationale
Why Unified MCP Gateway?
Single Integration Point:
- ✅ All llm providers accessible via one interface
- ✅ Consistent tool/resource/prompt access
- ✅ Easier to add new providers
Caching & Optimization:
- ✅ Cache llm responses (save API costs)
- ✅ Batch similar requests
- ✅ Rate limiting
Security & Control:
- ✅ Centralized auth/permission checks
- ✅ Request logging/auditing
- ✅ Resource access control
Why Separate Tool Providers?
Modularity:
- ✅ Easy to add/remove tool providers
- ✅ Each provider focused on one domain
- ✅ Testable in isolation
Performance:
- ✅ Lazy-load providers as needed
- ✅ Parallel tool execution
- ✅ Provider-specific optimizations
Why Redis for Caching?
Performance:
- ✅ In-memory, sub-millisecond latency
- ✅ TTL support (auto-expiration)
- ✅ Pub/sub for cache invalidation
Scalability:
- ✅ Horizontal scaling (Redis Cluster)
- ✅ High throughput (100K+ ops/sec)
- ✅ Low memory overhead
Alternatives Considered
Alternative 1: Direct MCP Servers per Provider
Pros:
- Simple (no gateway)
- Provider isolation
Cons:
- ❌ Client must manage multiple servers
- ❌ No unified caching
- ❌ Duplicate auth/logging
Rejected: pushes too much complexity onto the client
Alternative 2: HTTP REST API
Pros:
- Simple HTTP requests
- Wide tooling support
Cons:
- ❌ Not MCP-compliant
- ❌ No standard for tools/resources
- ❌ Less efficient than stdio/WebSocket
Rejected: MCP is the standard
Alternative 3: In-Memory Cache (No Redis)
Pros:
- Simple (no external dependency)
- Fast (in-process)
Cons:
- ❌ Lost on restart
- ❌ Can't scale horizontally
- ❌ Memory limited
Rejected: Need persistence and scaling
Consequences
Positive
- ✅ Unified Interface: Single MCP gateway for all providers
- ✅ Efficient: Caching reduces API costs and latency
- ✅ Extensible: Easy to add new tools/resources/prompts
- ✅ Secure: Centralized auth and access control
- ✅ Standard: MCP-compliant (works with any MCP client)
- ✅ Scalable: Redis caching, horizontal scaling
Negative
- ❌ Complexity: Extra layer between client and providers
- ❌ Single Point of Failure: Gateway down = no MCP access
- ❌ Latency: Extra hop adds ~10-20ms
- ❌ Cache Invalidation: Need strategy for stale data
Mitigation
Complexity:
- Comprehensive documentation
- Example integrations
- Testing framework
SPOF:
- Deploy multiple gateway instances
- Health checks + auto-restart
- Fallback to direct provider access
Latency:
- Optimize gateway code
- Use local Redis (same machine)
- Monitor and profile
Cache Invalidation:
- TTL on all cache entries
- Manual invalidation API
- Cache versioning
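Cache versioning from the list above can be as simple as baking a version segment into every key: bumping the version invalidates the whole namespace at once, and Redis TTLs reclaim the orphaned entries. A sketch (the key layout mirrors the llm:provider:model:... keys used earlier; the model name is illustrative):

```typescript
// Sketch of versioned cache keys: bumping CACHE_VERSION implicitly
// invalidates all previously written entries (old keys simply stop
// being read; Redis TTLs reclaim them).
const CACHE_VERSION = 'v1'; // bump to invalidate everything at once

function cacheKey(parts: string[]): string {
  return [CACHE_VERSION, ...parts].join(':');
}

const key = cacheKey(['llm', 'claude', 'claude-3-5-sonnet', 'somehash']);
console.log(key); // → v1:llm:claude:claude-3-5-sonnet:somehash
```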
Implementation Plan
Phase 1: Core Gateway ✅
- MCP server setup
- Tool/resource/prompt handlers
- Basic llm tool provider
Phase 2: Multi-Provider Support 🔲
- All 6 llm providers integrated
- Provider-specific optimizations
- Model listing per provider
Phase 3: Resource Providers 🔲
- File resources
- Session resources
- Agent resources
Phase 4: Caching 🔲
- Redis integration
- Cache key strategy
- TTL configuration
- Cache invalidation API
Phase 5: Advanced Features 🔲
- Prompt library (user-customizable)
- Parallel/consensus tools
- Streaming responses
- Rate limiting
Phase 6: Production Hardening 🔲
- Monitoring (Prometheus)
- Logging (structured)
- Error tracking (Sentry)
- Performance testing
Success Metrics
Performance:
- < 50ms gateway overhead
- > 80% cache hit rate (for repeated queries)
- 1000+ requests/second throughput
Reliability:
- 99.9% uptime
- < 0.1% error rate
- Graceful degradation on provider failures
Developer Experience:
- < 5 minutes to add new tool
- Clear error messages
- Comprehensive examples
Related Decisions
- ADR-010: MCP Protocol - Original MCP decision
- ADR-015: Multi-llm Providers - Provider support
- ADR-017: WebSocket Backend - Transport layer
References
MCP Specification:
Redis:
Status: ✅ Accepted
Next Review: 2025-11-06 (1 month)
Last Updated: 2025-10-06