ADR-015: Support Multiple LLM Providers
Status: Accepted
Date: 2025-10-06
Deciders: Development Team
Related: ADR-010 (MCP Protocol)
Context
The initial implementation supported only LM Studio (local) and Claude Code. Users need flexibility to:
- Use different LLM providers based on task requirements
- Compare responses across providers (parallel mode)
- Chain different models for better results (sequential mode)
- Work offline with local models or online with cloud APIs
- Optimize costs by using free local models when possible
- Leverage specialized models (coding, reasoning, etc.)
Decision
We will support seven LLM providers with a unified interface:
- Claude Code (Default) - Local CLI, no API key with Max account
- LM Studio - Local inference, 70B+ models
- Ollama - Local lightweight models
- OpenAI - Cloud GPT models
- Anthropic Claude - Cloud API
- Google Gemini - Cloud multimodal
- xAI Grok - Cloud integration
Implementation Strategy
Unified Interface:
```typescript
interface LLMService {
  chatCompletion(model: string, messages: Message[]): Promise<string>;
  streamChatCompletion(model: string, messages: Message[], onChunk: (chunk: string) => void): Promise<void>;
  getAvailableModels(): Promise<LLMModel[]>;
}
```
Provider Detection:
- Model ID prefix determines provider (`gpt-*` = OpenAI, `claude-*` = Anthropic, etc.)
- Local providers auto-detected (LM Studio, Ollama)
- Cloud providers require API keys
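Prefix-based detection can be sketched as a small lookup; the function and map names below are illustrative, not taken from the codebase:

```typescript
type Provider = 'openai' | 'anthropic' | 'gemini' | 'grok';

// Hypothetical prefix-to-provider map; unknown prefixes fall through to
// local auto-detection, so the function returns undefined for them.
const PREFIX_MAP: Array<[string, Provider]> = [
  ['gpt-', 'openai'],
  ['claude-', 'anthropic'],
  ['gemini-', 'gemini'],
  ['grok-', 'grok'],
];

function detectProvider(modelId: string): Provider | undefined {
  const match = PREFIX_MAP.find(([prefix]) => modelId.startsWith(prefix));
  return match?.[1];
}
```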
Priority Order (in model list):
- Local providers first (Claude Code, LM Studio, Ollama)
- Cloud providers second (OpenAI, Anthropic, Gemini, Grok)
- Within each group: alphabetical or by capability
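The priority order above amounts to a two-key sort: local-first, then alphabetical within each group. A minimal sketch, with `ModelEntry` as an assumed shape:

```typescript
interface ModelEntry {
  id: string;
  local: boolean; // true for Claude Code, LM Studio, Ollama
}

// Local models first, then alphabetical by id within each group.
function sortModels(models: ModelEntry[]): ModelEntry[] {
  return [...models].sort((a, b) => {
    if (a.local !== b.local) return a.local ? -1 : 1;
    return a.id.localeCompare(b.id);
  });
}
```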
Rationale
Why Multiple Providers?
Flexibility:
- Different models excel at different tasks
- Users can choose based on privacy, cost, quality needs
- No vendor lock-in
Cost Optimization:
- Use free local models for drafts
- Use expensive cloud models for final output
- Claude Code free with Max subscription
Offline Capability:
- Local models (LM Studio, Ollama, Claude Code) work offline
- Essential for privacy-sensitive work
Comparison & Synthesis:
- Parallel mode: Compare providers side-by-side
- Sequential mode: Chain models for refinement
- Consensus mode: Synthesize best answer
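Parallel mode boils down to fanning one prompt out to several models and collecting the answers. A sketch, where `ask` stands in for a bound `chatCompletion` call:

```typescript
// Query several models concurrently and key the answers by model id.
async function parallelCompare(
  ask: (model: string) => Promise<string>,
  models: string[],
): Promise<Record<string, string>> {
  const answers = await Promise.all(models.map(m => ask(m)));
  return Object.fromEntries(models.map((m, i) => [m, answers[i]]));
}
```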
Why Claude Code as Default?
- No API Key - Works with Claude Max subscription
- Tool Access - Built-in Read, Write, Bash capabilities
- Local Execution - Privacy preserved
- Best Integration - Native Theia IDE integration
- Quality - Excellent code generation and review
Why Include Cloud Providers?
OpenAI:
- Industry-leading GPT-4 models
- Large ecosystem and tooling
- Well-documented API
Anthropic Claude:
- Long context (200K tokens)
- Excellent reasoning
- Alternative to Claude Code for API usage
Google Gemini:
- Multimodal capabilities (future)
- Free tier available
- Good performance
xAI Grok:
- Real-time information (when connected)
- X platform integration
- Experimental features
Alternatives Considered
Alternative 1: Single Provider (LM Studio Only)
Pros:
- Simpler implementation
- Completely private/offline
- No API key management
Cons:
- Limited flexibility
- No access to best-in-class cloud models
- Users stuck with local hardware limitations
Rejected: Too limiting for professional use
Alternative 2: OpenAI Only
Pros:
- Single API to maintain
- Consistent quality
- Good documentation
Cons:
- Requires API key (cost)
- No offline capability
- Privacy concerns
- Vendor lock-in
Rejected: Conflicts with privacy-first goals
Alternative 3: Plugin System for Providers
Pros:
- Extensible
- Community can add providers
- Clean separation
Cons:
- Complex implementation
- Harder to maintain
- Overkill for initial version
Deferred: Consider for v2.0
Implementation Details
Provider Configuration
Environment Variables:
```sh
# Local providers (optional overrides)
LM_STUDIO_API=http://host.docker.internal:1234/v1
OLLAMA_API=http://host.docker.internal:11434/v1

# Cloud providers (required)
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GEMINI_API_KEY=...
GROK_API_KEY=...
```
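Resolving these variables can be centralized in one helper; this is a sketch where the default endpoints mirror the optional overrides above, and `resolveConfig` is an assumed name:

```typescript
interface ProviderConfig {
  lmStudioApi: string;
  ollamaApi: string;
  openaiKey: string | null;
}

// Local endpoints fall back to defaults; cloud keys have no default and
// stay null until the user configures them.
function resolveConfig(env: Record<string, string | undefined>): ProviderConfig {
  return {
    lmStudioApi: env.LM_STUDIO_API ?? 'http://host.docker.internal:1234/v1',
    ollamaApi: env.OLLAMA_API ?? 'http://host.docker.internal:11434/v1',
    openaiKey: env.OPENAI_API_KEY ?? null,
  };
}
```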
Model Detection
Auto-detection:
- LM Studio: Poll `/v1/models` endpoint
- Ollama: Poll `/api/tags` endpoint
- Claude Code: Always available (CLI-based)
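The two local listings can be merged into one model-id array. In this sketch the response shapes follow the OpenAI-style `/v1/models` and Ollama `/api/tags` formats, and `null` stands for an unreachable endpoint:

```typescript
interface LmStudioModels { data: { id: string }[] }  // GET /v1/models
interface OllamaTags { models: { name: string }[] }  // GET /api/tags

// Collect model ids from whichever local providers responded.
function normalizeLocalModels(
  lmStudio: LmStudioModels | null,
  ollama: OllamaTags | null,
): string[] {
  const ids: string[] = [];
  if (lmStudio) ids.push(...lmStudio.data.map(m => m.id));
  if (ollama) ids.push(...ollama.models.map(m => m.name));
  return ids;
}
```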
Cloud Models:
- Static list (updated periodically)
- Marked with `requiresApiKey: true`
- Disabled if API key not configured
API Compatibility
OpenAI-Compatible (same API structure):
- LM Studio
- Ollama
- OpenAI
- Anthropic (with minor header differences)
- xAI Grok
Custom Implementation:
- Google Gemini (different API format)
- Claude Code (CLI execution, not HTTP)
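Because LM Studio, Ollama, OpenAI, and Grok accept the same chat-completions body, a single request builder can serve them all. A sketch, with field names following the OpenAI chat-completions format:

```typescript
interface Message { role: 'system' | 'user' | 'assistant'; content: string }

// Build one fetch-style request usable against any OpenAI-compatible endpoint.
function buildChatRequest(model: string, messages: Message[], stream = false) {
  return {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ model, messages, stream }),
  };
}
```

Anthropic then only needs its extra headers layered on top, while Gemini and Claude Code keep their custom paths.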
Error Handling
```typescript
try {
  const response = await llmService.chatCompletion(model, messages);
} catch (error) {
  const message = error instanceof Error ? error.message : String(error);
  if (message.includes('API key')) {
    // Show API key configuration UI
  } else if (message.includes('connection')) {
    // Try fallback provider or show offline models
  } else {
    // Generic error handling
  }
}
```
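The "try fallback provider" branch can be sketched as a chain that runs attempts in order and returns the first success; `completeWithFallback` is an illustrative name, not an existing API:

```typescript
// Try each provider attempt in order; rethrow the last error if all fail.
async function completeWithFallback(
  attempts: Array<() => Promise<string>>,
): Promise<string> {
  let lastError: unknown = new Error('no providers configured');
  for (const attempt of attempts) {
    try {
      return await attempt();
    } catch (err) {
      lastError = err; // remember, then try the next provider
    }
  }
  throw lastError;
}
```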
Consequences
Positive
- ✅ Flexibility: Users choose best model for each task
- ✅ Privacy: Local models available for sensitive work
- ✅ Cost Control: Free local models reduce API costs
- ✅ Quality: Access to best-in-class cloud models when needed
- ✅ Comparison: Parallel/consensus modes enable multi-model workflows
- ✅ Offline: Full functionality without internet
- ✅ Future-proof: Easy to add new providers
Negative
- ❌ Complexity: More code to maintain
- ❌ Testing: Must test each provider integration
- ❌ Documentation: Users need guidance on provider selection
- ❌ API Keys: Users must manage multiple API keys
- ❌ Error Handling: Different error formats per provider
Mitigation
Complexity:
- Unified interface reduces code duplication
- Shared utilities for OpenAI-compatible APIs
- Well-documented provider implementations
Testing:
- Mock API responses for unit tests
- Integration tests with real APIs (optional)
- Manual testing checklist per provider
Documentation:
- Provider comparison guide
- Best practices per use case
- Clear setup instructions
API Key Management:
- Secure environment variable storage
- Clear error messages when keys missing
- Settings UI for key configuration (future)
Related Decisions
- ADR-010: MCP Protocol - MCP enables tool access for Claude
- ADR-014: Eclipse Theia - Theia provides plugin system for providers
Success Metrics
Implementation Success:
- Seven providers supported with unified interface
- Local models work offline
- Cloud models require API keys
- Model list shows all available models
- Provider auto-detection for LM Studio/Ollama
User Success:
- Users can switch providers seamlessly
- Parallel mode compares multiple providers
- Sequential mode chains providers
- Documentation guides provider selection
- Cost tracking per provider (future)
Migration Path
From Single Provider (existing users):
- Existing LM Studio usage unchanged
- Claude Code auto-available with Max subscription
- Cloud providers opt-in via API keys
- Model dropdown shows all options
- User selects provider per conversation
Adding New Providers (future):
- Implement provider-specific API client
- Add to `getAvailableModels()`
- Update `chatCompletion()` routing
- Add provider documentation
- Update UI to show new provider
Documentation
User Docs:
- Multi-LLM Provider Guide
- Quick Start
- Provider comparison matrix
- Setup instructions per provider
Developer Docs:
- LLM Service API
- Provider integration patterns
- Adding new providers guide
References
Provider APIs:
Related Work:
- LangChain (multi-provider LLM framework)
- LlamaIndex (similar provider abstraction)
- OpenRouter (unified LLM API)
Review Notes
Approved By: Development Team
Date: 2025-10-06
Next Review: 2025-11-06 (1 month)
Action Items:
- Implement provider abstraction
- Add Ollama support
- Add OpenAI support
- Add Anthropic support
- Add Gemini support
- Add Grok support
- Write provider comparison docs
- Add cost tracking
- Add provider failover
- Add settings UI for API keys
Status: ✅ Implemented
Version: 1.0
Last Updated: 2025-10-06