
ADR-023: MoE Document Classification System

Status: Accepted
Date: 2025-12-28
Deciders: Hal Casteel
Categories: Architecture, AI/ML, Document Processing


Context

CODITECT manages a diverse ecosystem of documents including agents, commands, skills, guides, references, workflows, and ADRs. Each document type has distinct characteristics and organizational requirements.

Problem Statement

  1. Manual Classification: Documents were manually categorized, leading to inconsistencies
  2. Scale: With 3,458 components across 97 submodules, manual classification is unsustainable
  3. Accuracy: Single-model classification had ~60% accuracy on ambiguous documents
  4. Confidence: No mechanism to identify low-confidence classifications requiring human review

Constraints

  • Must achieve >95% accuracy on well-formed documents
  • Must provide full audit trail for all decisions
  • Must automatically identify uncertain classifications
  • Must process documents at <100ms latency
  • Must be extensible for new document types

Decision

We will implement a Mixture of Experts (MoE) classification system with multi-tier analysis, consensus voting, and escalation handling because it provides the highest accuracy while maintaining transparency and identifying uncertain classifications.

Architecture Overview

Document → [Fast-Path Analysts] → Consensus → [Judges] → Decision
        ↓ low confidence
[Deep Analysts] → Final Decision
        ↓ still uncertain
Human Review Required

Component Structure

scripts/moe_classifier/
├── classify.py              # CLI entry point
├── batch_classify.py        # Batch processing
├── core/
│   ├── models.py            # Data models (Document, Vote, Result)
│   ├── orchestrator.py      # Main coordination logic
│   ├── consensus.py         # Voting aggregation
│   └── deep_analysts.py     # Second-tier analysis
├── analysts/                # First-tier specialist agents
│   ├── title_analyst.py
│   ├── content_analyst.py
│   ├── structure_analyst.py
│   └── frontmatter_analyst.py
└── judges/                  # Validation agents
    ├── consistency_judge.py
    └── confidence_judge.py

Document Types

Per ADR-018, valid classification targets:

| Type      | Description            | Key Signals                                    |
|-----------|------------------------|------------------------------------------------|
| agent     | AI agent definitions   | `type: agent`, "You are a...", system prompts  |
| command   | Slash commands         | `invocation: /xxx`, command usage              |
| skill     | Reusable patterns      | SKILL.md, pattern definitions                  |
| guide     | User tutorials         | Step-by-step, troubleshooting                  |
| reference | API/architecture docs  | Tables, specifications                         |
| workflow  | Process definitions    | Phases, automation steps                       |
| adr       | Architecture decisions | Context/Decision/Consequences                  |

Approval Types

from enum import Enum

class ApprovalType(Enum):
    AUTO_APPROVED = "AUTO_APPROVED"                    # High confidence, no judge needed
    JUDGE_APPROVED = "JUDGE_APPROVED"                  # Judges validated consensus
    DEEP_ANALYSIS_APPROVED = "DEEP_ANALYSIS_APPROVED"  # Deep analysts resolved
    HUMAN_REVIEW_REQUIRED = "HUMAN_REVIEW_REQUIRED"    # Needs manual review
    ESCALATED = "ESCALATED"                            # Pending escalation

Classification Pipeline

Phase 1: Fast-Path Analysis

Multiple lightweight analysts run in parallel:

  1. Title Analyst - Pattern matching on filename/title
  2. Content Analyst - Keyword and phrase detection
  3. Structure Analyst - Header patterns, document structure
  4. Frontmatter Analyst - YAML metadata extraction

Each analyst produces a vote:

from dataclasses import dataclass

@dataclass
class AnalystVote:
    agent: str           # Analyst name
    classification: str  # Predicted type
    confidence: float    # 0.0 to 1.0
    reasoning: str       # Explanation
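As a sketch of how a fast-path analyst might populate this structure, the hypothetical `TITLE_RULES` table and `title_analyst` function below are illustrative assumptions, not the shipped pattern set:

```python
from dataclasses import dataclass

@dataclass
class AnalystVote:
    agent: str
    classification: str
    confidence: float
    reasoning: str

# Illustrative substring rules; the real title analyst uses a larger set.
TITLE_RULES = {
    "adr-": ("adr", 0.9),
    "agent": ("agent", 0.8),
    "guide": ("guide", 0.7),
}

def title_analyst(filename: str) -> AnalystVote:
    """Match filename substrings against simple rules and emit a vote."""
    name = filename.lower()
    for needle, (doc_type, conf) in TITLE_RULES.items():
        if needle in name:
            return AnalystVote("title_analyst", doc_type, conf,
                               f"filename contains '{needle}'")
    # Weak fallback vote when nothing matches.
    return AnalystVote("title_analyst", "reference", 0.2, "no title match")
```

A low-confidence fallback vote (rather than no vote) keeps the analyst's weight negligible in the consensus step while still providing an audit-trail entry.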

Phase 2: Consensus Calculation

Votes are aggregated using weighted consensus:

from collections import defaultdict

# Confidence-weighted voting
scores = defaultdict(float)
for vote in votes:
    scores[vote.classification] += vote.confidence

# Normalize and select winner
winner = max(scores, key=scores.get)
confidence = scores[winner] / sum(scores.values())

Auto-Approval Thresholds:

  • Confidence >= 0.85 AND agreement >= 0.70 → AUTO_APPROVED
  • Otherwise → Run judges
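Combining the weighted vote with the auto-approval thresholds, the consensus step can be sketched as one small function. The function name and the definition of "agreement" (fraction of analysts voting for the winner) are assumptions based on the description above:

```python
from collections import defaultdict

def weighted_consensus(votes):
    """Confidence-weighted vote aggregation, per the thresholds above.

    `votes` is a list of (classification, confidence) pairs; returns
    (winner, confidence, agreement, auto_approved).
    """
    scores = defaultdict(float)
    for classification, confidence in votes:
        scores[classification] += confidence
    winner = max(scores, key=scores.get)
    confidence = scores[winner] / sum(scores.values())
    # Assumed agreement metric: fraction of analysts backing the winner.
    agreement = sum(1 for c, _ in votes if c == winner) / len(votes)
    auto_approved = confidence >= 0.85 and agreement >= 0.70
    return winner, confidence, agreement, auto_approved
```

For example, three confident "agent" votes against one weak "guide" vote yields a normalized confidence of about 0.89 and 0.75 agreement, so the document auto-approves without running the judges.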

Phase 3: Judge Validation

Judges validate the consensus decision:

  1. Consistency Judge - Checks vote agreement patterns
  2. Confidence Judge - Evaluates confidence distribution

If judges approve: JUDGE_APPROVED
If judges reject: ESCALATED to deep analysis
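A minimal consistency check in this spirit might flag consensus runs where the analysts split too evenly or a dissenter is highly confident; the rejection rule below is an illustrative assumption, not the shipped heuristic:

```python
def consistency_judge(votes, winner):
    """Approve the consensus only if the winning type holds a strict
    majority of votes and no dissenting vote is highly confident.

    `votes` is a list of (classification, confidence) pairs.
    """
    dissenters = [v for v in votes if v[0] != winner]
    majority = (len(votes) - len(dissenters)) * 2 > len(votes)
    # A single very confident dissenter is treated as a red flag.
    confident_dissent = any(conf >= 0.8 for _, conf in dissenters)
    return majority and not confident_dissent
```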

Phase 4: Deep Analysis (Escalation Handling)

Four specialized deep analysts for uncertain documents:

1. SemanticSimilarityAnalyst

Compares document against known exemplar patterns:

EXEMPLAR_PATTERNS = {
    "agent": {
        "title_patterns": ["specialist", "expert", "agent", "orchestrator"],
        "content_patterns": ["component_type: agent", "type: agent", "You are a..."],
        "structure_patterns": ["## capabilities", "## when to use"]
    },
    # ... other types
}

2. ContextualAnalyst

Analyzes directory structure and sibling files:

DIRECTORY_HINTS = {
    "agents": "agent",
    "commands": "command",
    "guides": "guide",
    "workflows": "workflow"
}

3. ContentReasoningAnalyst

Deep content analysis with structured reasoning:

  • Header pattern analysis (ADR headers, guide steps)
  • Code block density
  • Persona pattern detection ("You are a...")
  • Frontmatter type declarations

4. CrossReferenceAnalyst

Analyzes links and document relationships:

  • Link targets (agents/, commands/, etc.)
  • Related document mentions
  • Frontmatter references

Deep Analysis Consensus

from collections import defaultdict

# Combine deep analyst votes
classification_scores = defaultdict(float)
for vote in deep_votes:
    classification_scores[vote.classification] += vote.confidence

# Determine if consensus reached
consensus_reached = (
    vote_agreement >= 0.60 and
    final_confidence >= 0.65
)

if consensus_reached:
    return ApprovalType.DEEP_ANALYSIS_APPROVED
else:
    return ApprovalType.HUMAN_REVIEW_REQUIRED

Pattern Detection Strategy

Primary Signal: Frontmatter Metadata

CODITECT documents use YAML frontmatter with type declarations:

---
type: agent
component_type: agent
---

Score Boost: +0.6 for matching frontmatter type
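A sketch of how the frontmatter signal could be read follows; the regex-based extraction is an assumption (the real analyst may well use a full YAML parser), and `FRONTMATTER_BOOST` mirrors the +0.6 boost described above:

```python
import re

FRONTMATTER_BOOST = 0.6  # score boost for a matching frontmatter type

def frontmatter_type(text: str):
    """Return the declared `type:`/`component_type:` from a leading
    YAML frontmatter block, or None if no block or declaration exists."""
    match = re.match(r"^---\s*\n(.*?)\n---", text, re.DOTALL)
    if not match:
        return None
    for line in match.group(1).splitlines():
        key, _, value = line.partition(":")
        if key.strip() in ("type", "component_type"):
            return value.strip()
    return None
```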

Secondary Signals: Content Patterns

| Pattern                          | Type    | Score |
|----------------------------------|---------|-------|
| `type: agent`                    | agent   | +0.6  |
| "You are a..."                   | agent   | +0.4  |
| `invocation: /xxx`               | command | +0.4  |
| `type: guide`                    | guide   | +0.6  |
| `## Step 1`, `## Troubleshooting`| guide   | +0.3  |
| `## Context`, `## Decision`      | adr     | +0.3  |

Tertiary Signals: Context

  • Directory location (agents/, commands/, guides/)
  • Sibling file patterns
  • Filename patterns (ADR-xxx, xxx-guide)
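The tertiary signals lend themselves to simple heuristics. The sketch below combines filename and directory conventions into one hint lookup; the specific regexes and the `context_hint` helper are illustrative assumptions:

```python
import re
from pathlib import Path

# Illustrative filename conventions (ADR-xxx, xxx-guide).
FILENAME_PATTERNS = [
    (re.compile(r"^adr-\d+", re.IGNORECASE), "adr"),
    (re.compile(r"-guide$"), "guide"),
]

DIRECTORY_HINTS = {
    "agents": "agent", "commands": "command",
    "guides": "guide", "workflows": "workflow",
}

def context_hint(path: str):
    """Infer a type hint from filename and parent-directory conventions."""
    p = Path(path)
    for pattern, doc_type in FILENAME_PATTERNS:
        if pattern.search(p.stem):
            return doc_type
    # Nearest matching ancestor directory wins.
    for part in reversed(p.parent.parts):
        if part in DIRECTORY_HINTS:
            return DIRECTORY_HINTS[part]
    return None
```

Because these are weak signals, a result like this would feed into the weighted consensus as a low-confidence vote rather than deciding the classification outright.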

Usage

CLI Interface

# Single file
python classify.py docs/agents/my-agent.md

# Directory (recursive)
python classify.py docs/ -r

# With output
python classify.py docs/ -r --output results.json

# Update frontmatter with classifications
python classify.py docs/ -r --update-frontmatter

# Dry run
python classify.py docs/ -r --dry-run

Programmatic Usage

from core.orchestrator import create_default_orchestrator
from core.models import Document

orchestrator = create_default_orchestrator()
doc = Document.from_path("path/to/file.md")
result = orchestrator.classify(doc)

print(f"Type: {result.result.classification}")
print(f"Confidence: {result.result.confidence:.0%}")
print(f"Approval: {result.result.approval_type.value}")

Consequences

Positive

  • High Accuracy: 100% accuracy on documents with proper frontmatter
  • Transparency: Complete audit trail for every classification
  • Graceful Degradation: Uncertain documents flagged for human review
  • Extensible: Easy to add new document types or analysts
  • Fast: Parallel execution keeps latency low (<50ms typical)

Negative

  • Complexity: Multi-tier architecture adds maintenance overhead
  • Frontmatter Dependency: Best accuracy requires proper frontmatter
  • Memory: Loading all analysts increases memory footprint

Mitigations

  • Regular accuracy testing across document types
  • Human review queue for low-confidence classifications
  • Automated pattern retraining pipeline (future)

Compliance

This ADR is compliant with:

  • CODITECT Document Taxonomy (ADR-018)
  • Multi-Agent Orchestration Patterns (ADR-019)
  • CODITECT-CORE-STANDARDS documentation format

Changelog

| Version | Date       | Changes                                                                   |
|---------|------------|---------------------------------------------------------------------------|
| 1.0.0   | 2025-12-28 | Initial ADR with complete MoE architecture                                |
| 1.0.0   | 2025-12-28 | Pattern improvements: agents, commands, references, guides to 100% accuracy |

Author: CODITECT Core Team
Framework: CODITECT v1.7.2