ADR 008: Use Kimi (Moonshot AI) for Vision Analysis

Status

Accepted

Context

The pipeline requires vision analysis for extracted video frames. Initially, we used Anthropic Claude Vision, but we want to:

Support multiple vision providers for redundancy
Potentially reduce costs (Kimi may offer different pricing)
Provide flexibility for users in different regions
Support users who may have access to one API but not another

Decision

Add support for Kimi (Moonshot AI) Vision API as an alternative to Claude Vision.

Kimi Vision Capabilities

Model: moonshot-v1-8k-vision-preview
Image Support: PNG, JPEG, WEBP, GIF
Context Window: 8K tokens
Strengths:
- Strong OCR capabilities
- Good at technical diagrams
- Competitive pricing
- Available in China/APAC regions

Implementation

class KimiVisionAnalyzer:
    API_BASE = "https://api.moonshot.cn/v1"
    
    async def analyze_frame(self, frame: Frame, context: str) -> ImageAnalysis:
        # Encode image to base64
        # Send to Kimi API with structured prompt
        # Parse response into ImageAnalysis

Provider Selection

class VisionProvider:
    @staticmethod
    def create(provider: str = "auto"):
        if provider == "auto":
            # Auto-detect based on available API keys
            if os.getenv('KIMI_API_KEY'):
                return KimiVisionAnalyzer()
            elif os.getenv('ANTHROPIC_API_KEY'):
                return VisionAnalyzer()  # Claude

Configuration

PipelineConfig(
    vision_provider="auto"  # "auto", "kimi", or "claude"
)

Environment Variables:

export KIMI_API_KEY="..."           # For Kimi
export MOONSHOT_API_KEY="..."       # Alternative name
export ANTHROPIC_API_KEY="..."      # For Claude

Consequences

Positive

✅ Multiple provider options
✅ Regional availability (Kimi strong in APAC)
✅ Cost flexibility
✅ Redundancy if one service is down
✅ Users can choose based on existing subscriptions

Negative

⚠️ Two APIs to maintain
⚠️ Slight differences in output format
⚠️ Different rate limits to manage

Notes

Kimi API docs: https://platform.moonshot.cn/docs
Auto-detection prioritizes Kimi if both keys present
Both providers return same ImageAnalysis structure

Status​

Context​

Decision​

Kimi Vision Capabilities​

Implementation​

Provider Selection​

Configuration​

Consequences​

Positive​

Negative​

Notes​