
ADR-201: Multi-Provider LLM Gateway

Status: Accepted
Date: 2026-02-15
Author: Hal Casteel
Track: H.3.10

Context

Problem Statement

Claude Code is locked to Anthropic's API by default. However, multiple LLM providers now expose Anthropic Messages API-compatible endpoints, meaning the same anthropic SDK works with only a base_url change. Additionally, proxy services like OpenRouter provide Anthropic-compatible access to 400+ models from providers that don't natively support the Anthropic API format.

Users need a zero-friction way to switch between providers for:

  1. Cost optimization — DeepSeek input tokens cost $0.28/M vs Anthropic's ~$3/M (10x cheaper)
  2. Capability matching — MiniMax scores 80.2% on SWE-Bench; DeepSeek excels at reasoning
  3. Redundancy — Provider outages shouldn't block development
  4. Experimentation — Compare model quality on the same task across providers
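The cost-optimization claim above is simple arithmetic on the quoted input prices:

```shell
# Ratio of Anthropic's ~$3/M input price to DeepSeek's $0.28/M (figures from above).
awk 'BEGIN { printf "%.1f\n", 3 / 0.28 }'    # ≈10.7, i.e. roughly 10x
```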

Current State

  • ADR-200 added MiniMax as the 5th provider in coditect-core's MoE system (backend)
  • Claude Code supports ANTHROPIC_BASE_URL and ANTHROPIC_AUTH_TOKEN env var overrides
  • ANTHROPIC_AUTH_TOKEN takes priority over ANTHROPIC_API_KEY in the auth chain
  • No built-in Claude Code mechanism for provider switching (no --provider flag)

Requirements

  1. Switch providers with a single command (no file edits)
  2. Maintain existing Anthropic API key for default Claude usage
  3. Securely store per-provider API keys
  4. Support both direct providers and proxy gateways
  5. Allow model override at invocation time

Decision

Implement a shell alias-based multi-provider gateway using Claude Code's native environment variable override mechanism. Each provider gets a dedicated alias (claude-{provider}) that sets the required env vars inline.

Architecture

┌─────────────────────────────────────────────────────────────┐
│ User's Terminal                                             │
│                                                             │
│ claude            → Anthropic  (default, ANTHROPIC_API_KEY) │
│ claude-minimax    → MiniMax    (api.minimax.io/anthropic)   │
│ claude-deepseek   → DeepSeek   (api.deepseek.com/anthropic) │
│ claude-kimi       → Kimi       (api.kimi.com/coding)        │
│ claude-glm        → GLM/z.ai   (api.z.ai/api/anthropic)     │
│ claude-openrouter → OpenRouter (openrouter.ai/api)          │
│                                                             │
│ All aliases → same `claude` binary                          │
│ Different env vars → different LLM backend                  │
└─────────────────────────────────────────────────────────────┘

Environment Variable Chain

Claude Code resolves credentials in this priority order:

  1. ANTHROPIC_AUTH_TOKEN (highest — used by aliases)
  2. ANTHROPIC_API_KEY (default — set in .zshrc)
  3. Settings file (~/.claude/settings.json)

This means aliases using ANTHROPIC_AUTH_TOKEN naturally override the default ANTHROPIC_API_KEY without unsetting it.
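The priority order can be sketched as a small shell function. This is a sketch of the documented resolution behavior, not Claude Code's actual implementation:

```shell
# Sketch of the documented credential resolution order; Claude Code's
# real internal logic may differ.
resolve_credential() {
  if [ -n "${ANTHROPIC_AUTH_TOKEN:-}" ]; then
    echo "$ANTHROPIC_AUTH_TOKEN"          # 1. alias-provided token wins
  elif [ -n "${ANTHROPIC_API_KEY:-}" ]; then
    echo "$ANTHROPIC_API_KEY"             # 2. default key from .zshrc
  else
    echo "(settings.json fallback)"       # 3. ~/.claude/settings.json
  fi
}

export ANTHROPIC_API_KEY="sk-ant-default"
resolve_credential                                        # prints sk-ant-default
ANTHROPIC_AUTH_TOKEN="provider-key" resolve_credential    # prints provider-key
```

The per-command prefix assignment in the last line is exactly what the aliases do: it shadows the default key for a single invocation without unsetting it.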

Alias Pattern

Each alias follows this template:

# Standard pattern (API key from file)
alias claude-{provider}='\
ANTHROPIC_BASE_URL="{endpoint}" \
ANTHROPIC_AUTH_TOKEN="$(cat ~/.{Provider}AZ1.api.key 2>/dev/null | tr -d '"'"'[:space:]'"'"')" \
ANTHROPIC_MODEL="{primary_model}" \
ANTHROPIC_SMALL_FAST_MODEL="{fast_model}" \
CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1 \
claude'
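For concreteness, the template instantiated for DeepSeek, with endpoint and model names taken from the provider configuration table:

```shell
# DeepSeek instantiation of the template above. The single quotes defer the
# key-file read to invocation time, so a missing key file only matters when
# the alias is actually run.
alias claude-deepseek='\
ANTHROPIC_BASE_URL="https://api.deepseek.com/anthropic" \
ANTHROPIC_AUTH_TOKEN="$(cat ~/.DeepSeekAZ1.api.key 2>/dev/null | tr -d '"'"'[:space:]'"'"')" \
ANTHROPIC_MODEL="deepseek-reasoner" \
ANTHROPIC_SMALL_FAST_MODEL="deepseek-chat" \
CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1 \
claude'

alias claude-deepseek    # print the stored definition to verify the quoting survived
```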

# Function pattern (Kimi — auto-starts token-refreshing proxy)
# Note: zsh requires `function name-with-hyphens { }` syntax, not `name() { }`
function claude-kimi {
kimi-proxy --check >/dev/null 2>&1 || kimi-proxy
ANTHROPIC_BASE_URL="http://127.0.0.1:18462" \
ANTHROPIC_AUTH_TOKEN="$(kimi-token 2>/dev/null)" \
ANTHROPIC_MODEL="kimi-k2.5" \
ANTHROPIC_SMALL_FAST_MODEL="kimi-k2.5" \
CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1 \
claude "$@"
}

Authentication

API Key Pattern (MiniMax, DeepSeek, GLM, OpenRouter)

API keys are stored in home directory files following the pattern ~/.{Provider}AZ1.api.key:

| Provider   | Key File                   | Key Source                            |
|------------|----------------------------|---------------------------------------|
| MiniMax    | ~/.MiniMaxAZ1.api.key      | https://platform.minimax.io           |
| DeepSeek   | ~/.DeepSeekAZ1.api.key     | https://platform.deepseek.com         |
| GLM        | ~/.GLMAZ1.api.key          | https://open.z.ai                     |
| OpenRouter | ~/.OpenRouterAZ1.api.key   | https://openrouter.ai/settings/keys   |

Security: Key files should be chmod 600 (readable and writable by the owner only). The cat ... | tr -d '[:space:]' pipeline strips any trailing newlines or whitespace from key files.
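The convention end to end, sketched with a temp file standing in for ~/.DeepSeekAZ1.api.key and a placeholder key:

```shell
# Sketch: write a placeholder key (with a stray trailing newline), lock the
# file down, and confirm the tr pipeline yields the bare key. The temp file
# stands in for ~/.DeepSeekAZ1.api.key; "sk-example" is not a real key.
KEY_FILE="$(mktemp)"
printf 'sk-example\n' > "$KEY_FILE"
chmod 600 "$KEY_FILE"
tr -d '[:space:]' < "$KEY_FILE"; echo    # sk-example, whitespace stripped
rm -f "$KEY_FILE"
```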

OAuth Pattern (Kimi)

Kimi uses OAuth device flow authentication (NOT platform API keys):

  1. Login: Run kimi CLI → triggers browser-based OAuth login
  2. Credentials: OAuth tokens stored at ~/.kimi/credentials/kimi-code.json
  3. Token refresh: kimi-token helper script (~/.local/bin/kimi-token) auto-refreshes expired tokens using the refresh_token (30-day validity)
  4. Access token: ~15 minute validity, refreshed on each claude-kimi invocation

Why OAuth? Kimi's api.kimi.com/coding endpoint is Anthropic Messages API-compatible but only accepts OAuth access tokens, not platform API keys (sk-kimi-* keys are for the separate api.moonshot.ai platform).

Token lifecycle:

kimi login
  → OAuth device flow → access_token (~15 min) + refresh_token (30 d)

claude-kimi
  → starts kimi-proxy if not already running
  → points Claude Code at ANTHROPIC_BASE_URL=http://127.0.0.1:18462

kimi-proxy (daemon on port 18462) ← Claude Code sends requests here
  → reads fresh token from ~/.kimi/credentials/kimi-code.json
  → refreshes via auth.kimi.com if <5 min remaining
  → forwards the request to api.kimi.com/coding with the fresh token
  → streams the response back to Claude Code

Token-refreshing proxy (kimi-proxy):

  • ~/.local/bin/kimi-proxy — local HTTP proxy that injects fresh OAuth tokens per-request
  • Auto-started by claude-kimi function if not already running
  • Daemon on 127.0.0.1:18462, PID file at ~/.local/run/kimi-proxy.pid
  • Supports SSE streaming for real-time responses
  • Sessions run indefinitely — no 15-minute token expiry limit
  • Stop with: kimi-proxy --stop
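The per-request refresh decision reduces to a timestamp comparison. A sketch of the "<5m remaining" rule, with hypothetical variable names (the real kimi-proxy internals are not shown in this ADR):

```shell
# Hypothetical sketch of kimi-proxy's refresh rule: refresh whenever fewer
# than 300 seconds (5 minutes) remain on the access token.
now=$(date +%s)
expires_at=$(( now + 120 ))            # pretend the token expires in 2 minutes
if [ $(( expires_at - now )) -lt 300 ]; then
  echo "refresh via auth.kimi.com"
else
  echo "reuse cached token"
fi
```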

Provider Configuration

Direct Providers (Anthropic Messages API-compatible)

| Provider | Base URL                           | Primary Model     | Fast Model    | Context | Input $/M | Output $/M |
|----------|------------------------------------|-------------------|---------------|---------|-----------|------------|
| MiniMax  | https://api.minimax.io/anthropic   | MiniMax-M2.5      | MiniMax-M2.5  | 204K    | $0.15     | $1.20      |
| DeepSeek | https://api.deepseek.com/anthropic | deepseek-reasoner | deepseek-chat | 128K    | $0.28     | $0.42      |
| Kimi     | https://api.kimi.com/coding        | kimi-k2.5         | kimi-k2.5     | 128K    | $0.60     | $2.50      |
| GLM      | https://api.z.ai/api/anthropic     | glm-4.6           | glm-4.5-air   | 128K    | $0.55     | $2.20      |

Proxy Provider (Universal Gateway)

| Provider   | Base URL                  | Default Model             | Access To   |
|------------|---------------------------|---------------------------|-------------|
| OpenRouter | https://openrouter.ai/api | anthropic/claude-sonnet-4 | 400+ models |

OpenRouter model override at invocation:

ANTHROPIC_MODEL="google/gemini-2.5-pro" claude-openrouter
ANTHROPIC_MODEL="openai/gpt-4o" claude-openrouter
ANTHROPIC_MODEL="meta-llama/llama-4-maverick" claude-openrouter

Provider Quirks

| Provider   | Quirk                                                                        | Mitigation                                                      |
|------------|------------------------------------------------------------------------------|-----------------------------------------------------------------|
| MiniMax    | Returns ThinkingBlock + TextBlock by default (no extended thinking requested) | Iterate response.content for type == "text", never assume content[0] |
| MiniMax    | Temperature must be > 0.0 (exclusive range)                                  | Guard with max(0.01, temperature)                               |
| MiniMax    | No vision/image input                                                        | Text-only tasks                                                 |
| DeepSeek   | Cache hits at $0.028/M (10x cheaper than uncached)                           | Prefer for repetitive tasks                                     |
| Kimi       | OAuth-only auth — sk-kimi-* platform keys rejected on api.kimi.com/coding    | Use kimi-token helper for OAuth access tokens                   |
| Kimi       | Access tokens expire in ~15 minutes                                          | kimi-proxy auto-refreshes per request; sessions run indefinitely |
| Kimi       | Accepts kimi-k2.5 model name but returns kimi-for-coding in response         | Parse model field accordingly                                   |
| OpenRouter | 5.5% fee on credit purchases (not per-token)                                 | Buy credits in bulk                                             |
| OpenRouter | Model names prefixed with provider/                                          | Use anthropic/claude-sonnet-4 format                            |
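The ThinkingBlock mitigation in command-line terms: a canned MiniMax-style response filtered down to its text blocks with jq (jq assumed installed; the payload is illustrative, not a real API response):

```shell
# Illustrative payload shaped like a MiniMax Messages response: a thinking
# block first, then the text block. Select only entries with type == "text".
RESP='{"content":[{"type":"thinking","thinking":"..."},{"type":"text","text":"Hello"}]}'
printf '%s' "$RESP" | jq -r '.content[] | select(.type=="text") | .text'
```

The same rule applies in any SDK: filter by block type instead of indexing content[0].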

SDK URL Construction

Critical: The Anthropic SDK appends /v1/messages to the base URL. Therefore, ANTHROPIC_BASE_URL must NOT include /v1:

ANTHROPIC_BASE_URL="https://api.kimi.com/coding"
↓ SDK appends /v1/messages
Final URL: https://api.kimi.com/coding/v1/messages ✓

ANTHROPIC_BASE_URL="https://api.kimi.com/coding/v1" ← WRONG
↓ SDK appends /v1/messages
Final URL: https://api.kimi.com/coding/v1/v1/messages ✗ (404)
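The same rule as a runnable sanity check. The join function here is a hypothetical stand-in for the SDK's behavior, not its actual code:

```shell
# Hypothetical stand-in for the SDK's URL construction: strip any trailing
# slash, then append /v1/messages verbatim.
join() { printf '%s/v1/messages\n' "${1%/}"; }
join "https://api.kimi.com/coding"       # correct base → .../coding/v1/messages
join "https://api.kimi.com/coding/v1"    # wrong base   → .../v1/v1/messages (404)
```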

Env Var Reference

| Variable                                  | Purpose                                        | Example                            |
|-------------------------------------------|------------------------------------------------|------------------------------------|
| ANTHROPIC_BASE_URL                        | Provider API endpoint                          | https://api.deepseek.com/anthropic |
| ANTHROPIC_AUTH_TOKEN                      | Provider API key (overrides ANTHROPIC_API_KEY) | Read from key file                 |
| ANTHROPIC_MODEL                           | Primary model for main tasks                   | deepseek-reasoner                  |
| ANTHROPIC_SMALL_FAST_MODEL                | Fast model for subtasks/subagents              | deepseek-chat                      |
| CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC  | Disable telemetry to Anthropic                 | 1                                  |

Consequences

Positive

  1. Zero friction — switch providers with a single command, no file edits
  2. No interference — default claude command unchanged, aliases are additive
  3. Secure — keys in separate files, ANTHROPIC_AUTH_TOKEN overrides without unsetting base key
  4. Extensible — new providers added by creating one alias + one key file
  5. Cost savings — DeepSeek is ~10x cheaper than Anthropic for many tasks
  6. Redundancy — 5 independent providers, any outage has 4 fallbacks

Negative

  1. Manual key management — users must obtain and save API keys per provider
  2. No auto-routing — user must choose provider (unlike MoE backend which routes automatically)
  3. Provider-specific quirks — ThinkingBlock, temperature ranges, model naming vary
  4. Shell-specific — aliases work in zsh/bash but not in IDE integrations or scripts

Neutral

  1. Telemetry disabled — CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1 prevents Anthropic usage tracking for non-Anthropic providers
  2. Same Claude Code binary — all aliases invoke the same claude CLI, only the backend differs
  3. OpenRouter double-hop — requests to OpenRouter add latency vs direct provider connections

Alternatives Considered

1. Claude Code settings.json Profiles

Configure multiple profiles in ~/.claude/settings.json with a --profile flag. Rejected: Claude Code doesn't support profiles. Would require upstream changes.

2. LiteLLM Proxy

Run a local LiteLLM proxy that translates between API formats. Rejected: Adds infrastructure (local proxy process), complexity, and latency. Unnecessary since target providers already support Anthropic API format.

3. Wrapper Script (claude-provider)

A shell script that reads a config file and sets env vars dynamically. Rejected: Over-engineered for 5 providers. Shell aliases are simpler, more transparent, and easier to debug.

4. Environment Module System

Use direnv or autoenv to set provider per project directory. Rejected: Project-level provider locking is too rigid. Users want to choose per-session.

Related ADRs

| ADR     | Relationship                                                             |
|---------|--------------------------------------------------------------------------|
| ADR-073 | MoE Provider Flexibility — backend multi-provider support                |
| ADR-122 | Unified LLM Component Architecture — per-provider directory structure    |
| ADR-190 | Cross-LLM Bridge Architecture — vendor-agnostic orchestration layer      |
| ADR-200 | MiniMax Provider Integration — first Anthropic-compatible third-party provider |

Implementation

Completed

  • 5 shell aliases/functions added to ~/.zshrc
  • claude-minimax tested and working (MiniMax-M2.5)
  • ThinkingBlock fix deployed (f638bb56)
  • Key file convention established (~/.{Provider}AZ1.api.key)
  • Kimi OAuth auth via kimi-token helper (~/.local/bin/kimi-token)
  • kimi-proxy token-refreshing proxy (~/.local/bin/kimi-proxy) — infinite session support
  • claude-kimi tested and working (kimi-k2.5 via proxy)

Pending (User Action)

  • Obtain DeepSeek API key → save to ~/.DeepSeekAZ1.api.key
  • Obtain OpenRouter API key → save to ~/.OpenRouterAZ1.api.key
  • Obtain GLM API key → save to ~/.GLMAZ1.api.key

Track: H.3.10 (Multi-Provider LLM Gateway) Author: Hal Casteel Updated: 2026-02-15