ADR 008: Use Kimi (Moonshot AI) for Vision Analysis
Status
Accepted
Context
The pipeline requires vision analysis for extracted video frames. Initially, we used Anthropic Claude Vision, but we want to:
- Support multiple vision providers for redundancy
- Potentially reduce costs (Kimi may offer different pricing)
- Provide flexibility for users in different regions
- Support users who may have access to one API but not another
Decision
Add support for Kimi (Moonshot AI) Vision API as an alternative to Claude Vision.
Kimi Vision Capabilities
- Model: moonshot-v1-8k-vision-preview
- Image Support: PNG, JPEG, WEBP, GIF
- Context Window: 8K tokens
- Strengths:
- Strong OCR capabilities
- Good at technical diagrams
- Competitive pricing
- Available in China/APAC regions
Implementation
class KimiVisionAnalyzer:
API_BASE = "https://api.moonshot.cn/v1"
async def analyze_frame(self, frame: Frame, context: str) -> ImageAnalysis:
# Encode image to base64
# Send to Kimi API with structured prompt
# Parse response into ImageAnalysis
Provider Selection
class VisionProvider:
@staticmethod
def create(provider: str = "auto"):
if provider == "auto":
# Auto-detect based on available API keys
if os.getenv('KIMI_API_KEY'):
return KimiVisionAnalyzer()
elif os.getenv('ANTHROPIC_API_KEY'):
return VisionAnalyzer() # Claude
Configuration
PipelineConfig(
vision_provider="auto" # "auto", "kimi", or "claude"
)
Environment Variables:
export KIMI_API_KEY="..." # For Kimi
export MOONSHOT_API_KEY="..." # Alternative name
export ANTHROPIC_API_KEY="..." # For Claude
Consequences
Positive
- ✅ Multiple provider options
- ✅ Regional availability (Kimi strong in APAC)
- ✅ Cost flexibility
- ✅ Redundancy if one service is down
- ✅ Users can choose based on existing subscriptions
Negative
- ⚠️ Two APIs to maintain
- ⚠️ Slight differences in output format
- ⚠️ Different rate limits to manage
Notes
- Kimi API docs: https://platform.moonshot.cn/docs
- Auto-detection prioritizes Kimi if both keys present
- Both providers return same ImageAnalysis structure