Judge Persona Quick Reference Card

The 4-Step Persona Creation Process

Step 1: EXTRACT Dimensions from Domain Documents

Source Documents → Stakeholders → Dimensions → Evidence

HIPAA Security Rule → Compliance Officer → Data Protection → "§164.312(a)(1)"
FDA 21 CFR Part 11 → Audit Specialist → Traceability → "§11.10(e)"
OWASP Top 10 → Security Engineer → Vulnerability Prevention → "A03:2021"

Extraction Prompt:

Given this regulatory document: [DOCUMENT]

Identify:
1. Key stakeholder roles who would evaluate compliance
2. Their primary evaluation dimensions
3. Specific evidence/requirements they would check

Output as: (Stakeholder, Dimension, Evidence Citation)
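A minimal sketch of wiring this prompt up in Python. The LLM call itself is omitted; `parse_triples` and its tuple-per-line parsing are illustrative assumptions, not a fixed output contract.

```python
import re

# Template mirroring the extraction prompt above; {document} is filled per source doc
EXTRACTION_PROMPT = """Given this regulatory document: {document}

Identify:
1. Key stakeholder roles who would evaluate compliance
2. Their primary evaluation dimensions
3. Specific evidence/requirements they would check

Output as: (Stakeholder, Dimension, Evidence Citation)"""

def parse_triples(llm_output: str) -> list[tuple[str, str, str]]:
    """Pull (Stakeholder, Dimension, Evidence) tuples from a reply,
    assuming one parenthesized triple per line."""
    triples = []
    for line in llm_output.splitlines():
        m = re.match(r"\s*\(([^,]+),\s*([^,]+),\s*(.+)\)\s*$", line)
        if m:
            triples.append(tuple(s.strip() for s in m.groups()))
    return triples
```

The evidence field is matched greedily so citations containing parentheses, such as §164.312(a)(1), survive intact.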

Step 2: CONSTRUCT the 5-Attribute Persona

Attribute            | Purpose                 | Example
Demographic          | Establish credibility   | "Dr. Sarah Chen, CISSP, 15 years security"
Evaluative Dimension | Focus area              | "Security Vulnerability Assessment"
Domain Expertise     | Knowledge scope         | "OWASP, HIPAA technical safeguards"
Psychological Traits | Evaluation style        | "Low risk tolerance, high detail"
Social Role          | Organizational position | "Guardian of system integrity"
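The five attributes map naturally onto a small data structure. This sketch (the class name and prompt wording are illustrative assumptions) turns a persona into a reusable system prompt:

```python
from dataclasses import dataclass

@dataclass
class JudgePersona:
    demographic: str           # establishes credibility
    evaluative_dimension: str  # focus area
    domain_expertise: str      # knowledge scope
    psychological_traits: str  # evaluation style
    social_role: str           # organizational position

    def to_system_prompt(self) -> str:
        """Render the five attributes as a judge system prompt."""
        return (
            f"You are {self.demographic}, acting as {self.social_role}. "
            f"You evaluate {self.evaluative_dimension}, drawing on {self.domain_expertise}. "
            f"Your evaluation style: {self.psychological_traits}."
        )

security_judge = JudgePersona(
    demographic="Dr. Sarah Chen, CISSP, 15 years security",
    evaluative_dimension="Security Vulnerability Assessment",
    domain_expertise="OWASP, HIPAA technical safeguards",
    psychological_traits="low risk tolerance, high attention to detail",
    social_role="guardian of system integrity",
)
```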

Step 3: DESIGN the Question-Specific Rubric

Rubric Structure:

1. CONTEXT: What are we evaluating?
2. DIMENSIONS: 3-5 evaluation criteria with weights
3. SCALE: 5-point with concrete examples
4. CHAIN-OF-THOUGHT: Step-by-step evaluation instructions
5. OUTPUT: JSON schema for structured response
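The five rubric parts can be expressed as a single structure passed to the judge. This is one possible shape, not a prescribed schema; the context, dimension names, and weights are placeholder values:

```python
rubric = {
    "context": "Evaluate a generated code patch for HIPAA-compliant audit logging",
    "dimensions": [  # 3-5 weighted criteria
        {"name": "security", "weight": 0.40},
        {"name": "traceability", "weight": 0.35},
        {"name": "code_quality", "weight": 0.25},
    ],
    "scale": {  # 5-point scale anchored with concrete examples
        5: "exemplary: all safeguards present, cited against requirements",
        3: "acceptable: minor gaps, no regulatory exposure",
        1: "critical failure: exploitable or non-auditable",
    },
    "chain_of_thought": [
        "Restate what the artifact is supposed to do.",
        "Check each dimension against the evidence citations.",
        "Assign a per-dimension score, then justify the overall score.",
    ],
    "output_schema": {  # JSON Schema for the structured response
        "type": "object",
        "properties": {
            "scores": {"type": "object"},
            "reasoning": {"type": "string"},
            "overall": {"type": "integer", "minimum": 1, "maximum": 5},
        },
        "required": ["scores", "reasoning", "overall"],
    },
}
```

Keeping weights normalized to 1.0 makes the weighted aggregation in the voting step straightforward.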

Scoring Best Practices:

  • ✅ Binary (Pass/Fail) - Most reliable
  • ✅ 3-point (Excellent/Acceptable/Poor) - Good balance
  • ✅ 5-point with clear rubric - Acceptable
  • ❌ 10+ point scales - Too granular for LLMs

Step 4: DIVERSIFY the Judge Panel

Model Family Diversity (Required):

Judge 1: Claude family
Judge 2: GPT family
Judge 3: DeepSeek/Mistral family

Perspective Diversity:

  • Skeptic (finds issues)
  • Advocate (seeks approval)
  • Neutral (balanced)

Expertise Diversity:

  • Security specialist
  • Compliance specialist
  • Quality specialist
  • Domain specialist
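A panel covering all three diversity axes can be captured as plain configuration. Model names here are placeholders, and `check_diversity` is an illustrative guard, not part of any framework:

```python
# Illustrative panel config: one judge per model family, perspective, and specialty
panel = [
    {"model": "claude-sonnet", "perspective": "skeptic",  "expertise": "security"},
    {"model": "gpt-4o",        "perspective": "advocate", "expertise": "compliance"},
    {"model": "deepseek-chat", "perspective": "neutral",  "expertise": "quality"},
]

def check_diversity(panel: list[dict]) -> bool:
    """Require at least three distinct model families and perspectives."""
    families = {j["model"].split("-")[0] for j in panel}
    perspectives = {j["perspective"] for j in panel}
    return len(families) >= 3 and len(perspectives) >= 3
```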

Core Judge Personas for Coditect

Persona            | Primary Focus        | Key Question
Security Architect | Vulnerabilities      | "Can this be exploited?"
Compliance Officer | Regulatory adherence | "Will this pass audit?"
Quality Engineer   | Code correctness     | "Does this work correctly?"
Domain Expert      | Business logic       | "Is this clinically accurate?"
Ethics Judge       | Safety/harm          | "Could this hurt someone?"

Voting Mechanisms

For Binary Decisions:

# 2/3 supermajority for approval (integer comparison: with the
# threshold written as 0.67, a literal 2-of-3 vote would fail)
approved = 3 * sum(votes) >= 2 * len(votes)

For Scoring:

# Weighted average with veto: any critical score (<= 1) blocks approval
total_weight = sum(judge_weights.values())
score = sum(judge_scores[j] * judge_weights[j] for j in judge_scores) / total_weight
has_veto = any(s <= 1 for s in judge_scores.values())

Debate Triggers

Trigger debate when:

  • Score variance > 1.0 point
  • Any judge gives score ≤ 1 (critical failure)
  • Pass/fail split among judges
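The three triggers combine into one check. This sketch reads "variance > 1.0 point" as score spread (max minus min), which is one reasonable interpretation; the function name and signature are assumptions:

```python
def needs_debate(scores: dict[str, float], passes: dict[str, bool]) -> bool:
    """Trigger a debate round when judges disagree or flag a critical failure."""
    vals = list(scores.values())
    spread = max(vals) - min(vals)         # score disagreement, in points
    critical = any(s <= 1 for s in vals)   # any judge flags a critical failure
    split = len(set(passes.values())) > 1  # judges disagree on pass/fail
    return spread > 1.0 or critical or split
```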

Anti-Bias Checklist

  • Randomize presentation order
  • Use diverse model families
  • Require evidence citations
  • Independent first assessment (blind to others)
  • Add debiasing instructions to prompts

Key Research Sources

  1. MAJ-EVAL (Chen et al. 2025) - Automatic persona extraction
  2. PoLL (Verga et al. 2024) - Diverse panel reduces bias 40-60%
  3. G-Eval (Liu et al. 2023) - Chain-of-thought evaluation
  4. Rubric Is All You Need (2025) - Question-specific rubrics

Keep this card handy when designing new judge personas