LLM-based Judge Agent (H.3.5.3).
Provides LLM-backed judge evaluations through MultiModelClient, which supplies multi-provider support with fallback and retry logic.
Features:
- Uses JudgePersona for evaluation rubrics and prompts
- Supports multiple LLM providers via MultiModelClient
- Records full provenance (model_used, token_usage, latency)
- Falls back automatically to a backup model on failure
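The fallback and provenance features listed above can be sketched roughly as follows. This is a minimal illustration, not the module's actual implementation: `Provenance`, `call_with_fallback`, and the two stub model functions are hypothetical names, and the real MultiModelClient interface is not shown in this page and may differ.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Sequence, Tuple

# Hypothetical model callable: takes a prompt, returns (text, token_usage).
ModelFn = Callable[[str], Tuple[str, Dict[str, int]]]


@dataclass
class Provenance:
    """Hypothetical record mirroring the fields named in the feature list."""
    model_used: str
    token_usage: Dict[str, int]
    latency: float  # seconds spent in the successful call


def call_with_fallback(
    prompt: str,
    models: Sequence[Tuple[str, ModelFn]],
    max_retries: int = 2,
) -> Tuple[str, Provenance]:
    """Try each (name, fn) in order, retrying before moving to the next model."""
    last_error = None
    for name, fn in models:
        for _attempt in range(max_retries):
            start = time.perf_counter()
            try:
                text, usage = fn(prompt)
            except Exception as exc:  # a real client would catch narrower errors
                last_error = exc
                continue
            latency = time.perf_counter() - start
            return text, Provenance(name, usage, latency)
    raise RuntimeError("all models failed") from last_error


# Stub models so the sketch runs without network access.
def flaky_primary(prompt: str):
    raise TimeoutError("primary unavailable")


def stable_backup(prompt: str):
    return "verdict: approve", {"prompt_tokens": 12, "completion_tokens": 3}


text, prov = call_with_fallback(
    "Evaluate the analyst votes.",
    [("primary-model", flaky_primary), ("backup-model", stable_backup)],
)
print(prov.model_used, prov.token_usage)
```

Here the primary stub always fails, so the result comes from the backup and the provenance record names the model that actually answered.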
File: llm_judge.py
Classes
LLMJudge
LLM-based judge that uses the multi-model client.
LLMJudgePanel
Panel of LLM judges using multiple personas.
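The relationship between the two classes can be pictured with the sketch below. It is an assumption-laden stand-in, not the real code: `SketchJudge` and `SketchPanel` are hypothetical names, the return shapes are invented for illustration, and the majority-vote logic replaces the LLM call so the example runs offline.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SketchJudge:
    """Hypothetical stand-in for LLMJudge: one persona bound to one primary model."""
    persona_id: str
    model: str

    def evaluate(self, document: str, votes: List[str]) -> Dict[str, str]:
        # A real judge would prompt an LLM with the persona's rubric.
        # Here the majority vote is returned so no network access is needed.
        verdict = max(sorted(set(votes)), key=votes.count)
        return {
            "persona_id": self.persona_id,
            "model_used": self.model,
            "verdict": verdict,
        }


class SketchPanel:
    """Hypothetical stand-in for LLMJudgePanel: one judge per persona."""

    def __init__(self, judges: List[SketchJudge]) -> None:
        self.judges = judges

    def evaluate(self, document: str, votes: List[str]) -> List[Dict[str, str]]:
        # Each judge evaluates the same document and votes independently.
        return [judge.evaluate(document, votes) for judge in self.judges]

    def get_judge_info(self) -> List[Dict[str, str]]:
        return [{"persona_id": j.persona_id, "model": j.model} for j in self.judges]


panel = SketchPanel([
    SketchJudge("strict_reviewer", "model-a"),
    SketchJudge("lenient_reviewer", "model-b"),
])
results = panel.evaluate("example document", ["approve", "approve", "reject"])
for result in results:
    print(result)
```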
Functions
create_llm_judge_panel(persona_ids)
Create an LLM judge panel with default configuration.
model()
Get the primary model for this judge.
evaluate(document, votes)
Evaluate analyst votes using LLM.
get_judge_info()
Get information about all judges in the panel.
verify_model_diversity()
Verify model diversity requirements.
get_family(model)
No description provided; from its name, it presumably returns the provider family of the given model.
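One plausible way a family lookup and a diversity check could fit together is sketched below. Everything here is an assumption: the prefix table, the `min_families` threshold, and the function names `sketch_get_family` and `sketch_verify_model_diversity` are hypothetical, and the documented `verify_model_diversity()` takes no arguments, so the real logic and requirements may be quite different.

```python
from typing import Iterable

# Hypothetical prefix-to-family table; the real module may classify differently.
FAMILY_PREFIXES = {
    "gpt": "openai",
    "claude": "anthropic",
    "gemini": "google",
}


def sketch_get_family(model: str) -> str:
    """Map a model identifier to a family name, or 'unknown' if no prefix matches."""
    lowered = model.lower()
    for prefix, family in FAMILY_PREFIXES.items():
        if lowered.startswith(prefix):
            return family
    return "unknown"


def sketch_verify_model_diversity(models: Iterable[str], min_families: int = 2) -> bool:
    """Return True if the models span at least min_families distinct known families."""
    families = {sketch_get_family(m) for m in models}
    families.discard("unknown")  # unrecognized models do not count toward diversity
    return len(families) >= min_families


print(sketch_verify_model_diversity(["gpt-example", "claude-example"]))
print(sketch_verify_model_diversity(["gpt-example-1", "gpt-example-2"]))
```

Grouping by family rather than by exact model name means two variants from the same provider do not count as independent judges in this sketch.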