LLM-based Judge Agent (H.3.5.3).

Provides LLM-backed judge evaluations through MultiModelClient, which supplies multi-provider support with fallback and retry logic.

Features:

  • Uses JudgePersona for evaluation rubrics and prompts
  • Supports multiple LLM providers via MultiModelClient
  • Records full provenance (model_used, token_usage, latency)
  • Falls back automatically to a backup model on failure
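
The fallback and provenance features listed above can be illustrated with a minimal sketch. It does not use the real MultiModelClient API, which is not documented on this page; `ModelCall`, `Provenance`, and `call_with_fallback` are hypothetical names, and the only fields taken from the feature list are model_used, token_usage, and latency.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical call shape: takes a prompt, returns (text, token_usage).
ModelCall = Callable[[str], Tuple[str, Dict[str, int]]]


@dataclass
class Provenance:
    """Record of which model answered and at what cost (assumed layout)."""
    model_used: str
    token_usage: Dict[str, int]
    latency: float  # seconds spent in the successful call
    failed_models: List[str] = field(default_factory=list)


def call_with_fallback(
    prompt: str, models: List[Tuple[str, ModelCall]]
) -> Tuple[str, Provenance]:
    """Try each (name, call) pair in order; return the first success."""
    failed: List[str] = []
    for name, call in models:
        start = time.perf_counter()
        try:
            text, usage = call(prompt)
        except Exception:
            # Remember the failure and move on to the next (backup) model.
            failed.append(name)
            continue
        latency = time.perf_counter() - start
        return text, Provenance(name, usage, latency, failed)
    raise RuntimeError(f"all models failed: {failed}")
```

Passing the models as an ordered list keeps the primary and backup roles explicit: the first entry is the primary, and every later entry is tried only after the ones before it raise.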

File: llm_judge.py

Classes

LLMJudge

LLM-based judge using multi-model client.

LLMJudgePanel

Panel of LLM judges using multiple personas.

Functions

create_llm_judge_panel(persona_ids)

Create an LLM judge panel with default configuration.

model()

Get the primary model for this judge.

evaluate(document, votes)

Evaluate analyst votes using LLM.
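
As a rough illustration of what evaluating analyst votes with an LLM might involve, the sketch below builds a prompt from a rubric, the document, and the votes, then parses a JSON verdict. Everything here is an assumption rather than the actual evaluate() implementation: the function name `evaluate_votes`, the `rubric` and `complete` parameters, the vote keys `analyst` and `label`, and the verdict fields are all hypothetical.

```python
import json
from typing import Callable, Dict, List


def evaluate_votes(
    document: str,
    votes: List[Dict[str, str]],
    rubric: str,
    complete: Callable[[str], str],
) -> Dict[str, object]:
    """Ask an LLM (the hypothetical `complete` callable) to judge the votes."""
    # Assumed vote shape: {"analyst": ..., "label": ...}.
    vote_lines = "\n".join(f"- {v['analyst']}: {v['label']}" for v in votes)
    prompt = (
        f"{rubric}\n\nDocument:\n{document}\n\n"
        f"Analyst votes:\n{vote_lines}\n\n"
        'Reply with JSON: {"decision": "...", "rationale": "..."}'
    )
    raw = complete(prompt)
    try:
        verdict = json.loads(raw)
    except json.JSONDecodeError:
        verdict = None
    if not isinstance(verdict, dict):
        # Keep the raw reply so a malformed answer can still be audited.
        return {"decision": "unparseable", "rationale": raw}
    return verdict
```

Injecting `complete` as a plain callable keeps the sketch independent of any provider and makes it easy to substitute a stub in tests.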

get_judge_info()

Get information about all judges in the panel.

verify_model_diversity()

Verify model diversity requirements.

get_family(model)

No description
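
The names verify_model_diversity() and get_family(model) suggest that the panel checks whether its judges span more than one model family. The page does not document either function's behavior, so the following is only a sketch of one plausible reading: the prefix table, the `family_of` and `has_model_diversity` names, and the two-family threshold are all assumptions.

```python
from typing import Dict, Iterable

# Hypothetical prefix-to-family table; the real mapping is not documented here.
FAMILY_PREFIXES: Dict[str, str] = {
    "claude": "anthropic",
    "gpt": "openai",
    "gemini": "google",
}


def family_of(model: str) -> str:
    """Map a model identifier to a provider family by name prefix."""
    name = model.lower()
    for prefix, family in FAMILY_PREFIXES.items():
        if name.startswith(prefix):
            return family
    return "unknown"


def has_model_diversity(models: Iterable[str], min_families: int = 2) -> bool:
    """Return True if the models cover at least `min_families` known families."""
    families = {family_of(m) for m in models}
    # Unrecognized models do not count toward diversity in this sketch.
    families.discard("unknown")
    return len(families) >= min_families
```

Counting families rather than model names means that two different models from the same provider would not satisfy the check under this reading.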