
ADR-015: Support Multiple LLM Providers

Status: Accepted
Date: 2025-10-06
Deciders: Development Team
Related: ADR-010 (MCP Protocol)


Context

The initial implementation supported only LM Studio (local) and Claude Code. Users need flexibility to:

  • Use different LLM providers based on task requirements
  • Compare responses across providers (parallel mode)
  • Chain different models for better results (sequential mode)
  • Work offline with local models or online with cloud APIs
  • Optimize costs by using free local models when possible
  • Leverage specialized models (coding, reasoning, etc.)

Decision

We will support seven LLM providers with a unified interface:

  1. Claude Code (Default) - Local CLI, no API key with Max account
  2. LM Studio - Local inference, 70B+ models
  3. Ollama - Local lightweight models
  4. OpenAI - Cloud GPT models
  5. Anthropic Claude - Cloud API
  6. Google Gemini - Cloud multimodal
  7. xAI Grok - Cloud integration

Implementation Strategy

Unified Interface:

interface LLMService {
  chatCompletion(model: string, messages: Message[]): Promise<string>;
  streamChatCompletion(
    model: string,
    messages: Message[],
    onChunk: (chunk: string) => void,
  ): Promise<void>;
  getAvailableModels(): Promise<LLMModel[]>;
}

Provider Detection:

  • Model ID prefix determines provider (gpt-* = OpenAI, claude-* = Anthropic, etc.)
  • Local providers auto-detected (LM Studio, Ollama)
  • Cloud providers require API keys
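The prefix rule above can be sketched as a small lookup function. This is a hypothetical illustration: the gpt-* and claude-* prefixes come from this ADR, while the gemini-* and grok-* prefixes and the fallback behavior are assumptions.

```typescript
// Hypothetical sketch of prefix-based provider detection. gpt-* and
// claude-* prefixes are stated in this ADR; gemini-* and grok-* are
// assumed to follow the same convention.
type Provider = 'openai' | 'anthropic' | 'gemini' | 'grok' | 'local';

function detectProvider(modelId: string): Provider {
  if (modelId.startsWith('gpt-')) return 'openai';
  if (modelId.startsWith('claude-')) return 'anthropic';
  if (modelId.startsWith('gemini-')) return 'gemini';
  if (modelId.startsWith('grok-')) return 'grok';
  // Anything unrecognized is assumed to be served locally
  // (LM Studio or Ollama), which are auto-detected separately.
  return 'local';
}
```

Keeping detection purely string-based means no network call is needed just to route a request.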

Priority Order (in model list):

  1. Local providers first (Claude Code, LM Studio, Ollama)
  2. Cloud providers second (OpenAI, Anthropic, Gemini, Grok)
  3. Within each group: alphabetical or by capability
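The priority order can be expressed as a single comparator. A minimal sketch, assuming a `local` flag per model entry (the field name is an assumption, not part of the ADR's interface):

```typescript
// Illustrative sketch of the priority ordering: local providers first,
// then cloud, alphabetical within each group. Field names are assumed.
interface ModelEntry {
  id: string;
  local: boolean;
}

function sortModels(models: ModelEntry[]): ModelEntry[] {
  return [...models].sort((a, b) => {
    if (a.local !== b.local) return a.local ? -1 : 1; // local group first
    return a.id.localeCompare(b.id); // alphabetical within the group
  });
}
```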

Rationale

Why Multiple Providers?

Flexibility:

  • Different models excel at different tasks
  • Users can choose based on privacy, cost, quality needs
  • No vendor lock-in

Cost Optimization:

  • Use free local models for drafts
  • Use expensive cloud models for final output
  • Claude Code free with Max subscription

Offline Capability:

  • Local models (LM Studio, Ollama, Claude Code) work offline
  • Essential for privacy-sensitive work

Comparison & Synthesis:

  • Parallel mode: Compare providers side-by-side
  • Sequential mode: Chain models for refinement
  • Consensus mode: Synthesize best answer

Why Claude Code as Default?

  1. No API Key - Works with Claude Max subscription
  2. Tool Access - Built-in Read, Write, Bash capabilities
  3. Local Execution - Privacy preserved
  5. Best Integration - Native Theia IDE integration
  5. Quality - Excellent code generation and review

Why Include Cloud Providers?

OpenAI:

  • Industry-leading GPT-4 models
  • Large ecosystem and tooling
  • Well-documented API

Anthropic Claude:

  • Long context (200K tokens)
  • Excellent reasoning
  • Alternative to Claude Code for API usage

Google Gemini:

  • Multimodal capabilities (future)
  • Free tier available
  • Good performance

xAI Grok:

  • Real-time information (when connected)
  • X platform integration
  • Experimental features

Alternatives Considered

Alternative 1: Single Provider (LM Studio Only)

Pros:

  • Simpler implementation
  • Completely private/offline
  • No API key management

Cons:

  • Limited flexibility
  • No access to best-in-class cloud models
  • Users stuck with local hardware limitations

Rejected: Too limiting for professional use

Alternative 2: OpenAI Only

Pros:

  • Single API to maintain
  • Consistent quality
  • Good documentation

Cons:

  • Requires API key (cost)
  • No offline capability
  • Privacy concerns
  • Vendor lock-in

Rejected: Conflicts with privacy-first goals

Alternative 3: Plugin System for Providers

Pros:

  • Extensible
  • Community can add providers
  • Clean separation

Cons:

  • Complex implementation
  • Harder to maintain
  • Overkill for initial version

Deferred: Consider for v2.0


Implementation Details

Provider Configuration

Environment Variables:

# Local providers (optional overrides)
LM_STUDIO_API=http://host.docker.internal:1234/v1
OLLAMA_API=http://host.docker.internal:11434/v1

# Cloud providers (required only for the providers you use)
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GEMINI_API_KEY=...
GROK_API_KEY=...

Model Detection

Auto-detection:

  • LM Studio: Poll /v1/models endpoint
  • Ollama: Poll /api/tags endpoint
  • Claude Code: Always available (CLI-based)

Cloud Models:

  • Static list (updated periodically)
  • Marked with requiresApiKey: true
  • Disabled if API key not configured
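The gating of cloud models can be sketched as a pure function over the environment. The `requiresApiKey` field comes from this ADR; the `envKey` field, the function name, and the `disabled` flag's exact shape are assumptions for illustration.

```typescript
// Hypothetical sketch: mark static cloud models disabled when their
// API key environment variable is missing.
interface CloudModel {
  id: string;
  envKey: string; // e.g. 'OPENAI_API_KEY'
  requiresApiKey: boolean;
  disabled?: boolean;
}

function gateCloudModels(
  models: CloudModel[],
  env: Record<string, string | undefined>,
): CloudModel[] {
  return models.map((m) => ({
    ...m,
    disabled: m.requiresApiKey && !env[m.envKey],
  }));
}
```

Passing the environment in as a parameter (rather than reading `process.env` directly) keeps the function trivially testable.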

API Compatibility

OpenAI-Compatible (same API structure):

  • LM Studio
  • Ollama
  • OpenAI
  • Anthropic (with minor header differences)
  • xAI Grok

Custom Implementation:

  • Google Gemini (different API format)
  • Claude Code (CLI execution, not HTTP)
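The split above suggests a simple routing layer: OpenAI-compatible providers share one HTTP client, while Gemini and Claude Code get their own implementations. A sketch, with provider names chosen for illustration:

```typescript
// Sketch of the routing implied by the compatibility split above.
// Provider identifiers here are illustrative, not from the codebase.
const OPENAI_COMPATIBLE = new Set([
  'lmstudio',
  'ollama',
  'openai',
  'anthropic', // minor header differences handled inside the shared client
  'grok',
]);

function clientKind(provider: string): 'openai-compatible' | 'custom' {
  return OPENAI_COMPATIBLE.has(provider) ? 'openai-compatible' : 'custom';
}
```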

Error Handling

try {
  const response = await llmService.chatCompletion(model, messages);
  // ... use response ...
} catch (error) {
  const message = error instanceof Error ? error.message : String(error);
  if (message.includes('API key')) {
    // Show API key configuration UI
  } else if (message.includes('connection')) {
    // Try fallback provider or show offline models
  } else {
    // Generic error handling
  }
}

Consequences

Positive

✅ Flexibility: Users choose the best model for each task
✅ Privacy: Local models available for sensitive work
✅ Cost Control: Free local models reduce API costs
✅ Quality: Access to best-in-class cloud models when needed
✅ Comparison: Parallel/consensus modes enable multi-model workflows
✅ Offline: Full functionality without internet
✅ Future-proof: Easy to add new providers

Negative

❌ Complexity: More code to maintain
❌ Testing: Must test each provider integration
❌ Documentation: Users need guidance on provider selection
❌ API Keys: Users must manage multiple API keys
❌ Error Handling: Different error formats per provider

Mitigation

Complexity:

  • Unified interface reduces code duplication
  • Shared utilities for OpenAI-compatible APIs
  • Well-documented provider implementations

Testing:

  • Mock API responses for unit tests
  • Integration tests with real APIs (optional)
  • Manual testing checklist per provider

Documentation:

  • Provider comparison guide
  • Best practices per use case
  • Clear setup instructions

API Key Management:

  • Secure environment variable storage
  • Clear error messages when keys missing
  • Settings UI for key configuration (future)


Success Metrics

Implementation Success:

  • Seven providers supported with a unified interface
  • Local models work offline
  • Cloud models require API keys
  • Model list shows all available models
  • Provider auto-detection for LM Studio/Ollama

User Success:

  • Users can switch providers seamlessly
  • Parallel mode compares multiple providers
  • Sequential mode chains providers
  • Documentation guides provider selection
  • Cost tracking per provider (future)

Migration Path

From Single Provider (existing users):

  1. Existing LM Studio usage unchanged
  2. Claude Code auto-available with Max subscription
  3. Cloud providers opt-in via API keys
  4. Model dropdown shows all options
  5. User selects provider per conversation

Adding New Providers (future):

  1. Implement provider-specific API client
  2. Add to getAvailableModels()
  3. Update chatCompletion() routing
  4. Add provider documentation
  5. Update UI to show new provider

Documentation

User Docs:

Developer Docs:


References

Provider APIs:

Related Work:

  • LangChain (multi-provider LLM framework)
  • LlamaIndex (similar provider abstraction)
  • OpenRouter (unified LLM API)

Review Notes

Approved By: Development Team
Date: 2025-10-06
Next Review: 2025-11-06 (1 month)

Action Items:

  • Implement provider abstraction
  • Add Ollama support
  • Add OpenAI support
  • Add Anthropic support
  • Add Gemini support
  • Add Grok support
  • Write provider comparison docs
  • Add cost tracking
  • Add provider failover
  • Add settings UI for API keys

Status: ✅ Implemented
Version: 1.0
Last Updated: 2025-10-06