
MoE Enhancement Skill

Patterns and best practices for enhancing Mixture of Experts classification systems.

When to Apply

Use this skill when:

  • Improving classification accuracy in MoE systems
  • Implementing learning loops for self-improvement
  • Adding semantic embedding capabilities
  • Calibrating confidence scores
  • Optimizing threshold values
  • Adding specialized judges or analysts

Pattern: Semantic Embedding Integration

Replace regex-based pattern matching with true semantic embeddings:

from typing import Dict, Tuple

import numpy as np
from sentence_transformers import SentenceTransformer

class SemanticEmbeddingService:
    # Exemplar text per document type; filled in per project
    EXEMPLARS: Dict[str, str] = {}

    def __init__(self, model_name: str = "all-MiniLM-L6-v2"):
        self.model = SentenceTransformer(model_name)
        self.exemplar_embeddings = self._compute_exemplars()

    def _compute_exemplars(self) -> Dict[str, np.ndarray]:
        # Pre-compute one embedding per document type from its exemplar text
        return {doc_type: self.model.encode(text)
                for doc_type, text in self.EXEMPLARS.items()}

    def classify(self, content: str) -> Tuple[str, float]:
        # Truncate long documents to bound embedding latency
        doc_embedding = self.model.encode(content[:8000])
        similarities = {}
        for doc_type, exemplar in self.exemplar_embeddings.items():
            # Cosine similarity between document and exemplar embeddings
            similarities[doc_type] = float(
                np.dot(doc_embedding, exemplar)
                / (np.linalg.norm(doc_embedding) * np.linalg.norm(exemplar)))
        best = max(similarities, key=similarities.get)
        return best, similarities[best]
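The implementation checklist below calls for a regex fallback when the embedding service is unavailable. A minimal sketch of that wrapper, assuming a hypothetical `FALLBACK_PATTERNS` table (the type names and patterns here are purely illustrative):

```python
import re
from typing import Optional, Tuple

# Hypothetical fallback patterns; a real project would tune these per doc type.
FALLBACK_PATTERNS = {
    "adr": re.compile(r"^#\s*ADR|\bDecision\b", re.IGNORECASE | re.MULTILINE),
    "runbook": re.compile(r"\bRunbook\b|\bOn-call\b", re.IGNORECASE),
}

def classify_with_fallback(content: str,
                           embedding_service=None) -> Tuple[Optional[str], float]:
    """Try semantic classification first; fall back to regex on any failure."""
    if embedding_service is not None:
        try:
            return embedding_service.classify(content)
        except Exception:
            pass  # embedding service down: fall through to regex
    for doc_type, pattern in FALLBACK_PATTERNS.items():
        if pattern.search(content):
            return doc_type, 0.5  # regex matches get deliberately low confidence
    return None, 0.0
```

Returning a low fixed confidence for regex matches keeps the fallback path from being auto-approved past a well-calibrated threshold.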

Pattern: Historical Learning Loop

Track classification outcomes and adjust analyst weights:

from collections import defaultdict
from typing import Dict

class ClassificationLearner:
    def __init__(self):
        self.history = {}                # path -> (predicted, confidence, votes); DB stand-in
        self.correct = defaultdict(int)  # per-analyst correct votes
        self.total = defaultdict(int)    # per-analyst total votes

    def record_classification(self, path, predicted, confidence, votes):
        # Store the prediction and each analyst's vote for later confirmation
        self.history[path] = (predicted, confidence, votes)

    def confirm_classification(self, path, actual_type):
        # Update per-analyst accuracy once the true type is known
        _, _, votes = self.history.get(path, (None, None, {}))
        for analyst, vote in votes.items():
            self.total[analyst] += 1
            self.correct[analyst] += vote == actual_type

    def get_analyst_weights(self) -> Dict[str, float]:
        # Weight each analyst by historical accuracy (0.5 with no data)
        return {a: self.correct[a] / self.total[a] if self.total[a] else 0.5
                for a in self.total}

Pattern: Adaptive Thresholds

Self-tune thresholds based on escalation and accuracy rates:

class AdaptiveThresholdManager:
    def __init__(self):
        self.auto_approval = 0.90      # confidence needed for auto-approval
        self.target_escalation = 0.10  # acceptable share of manual reviews
        self.escalation_rate = 0.0     # observed; updated from outcomes
        self.accuracy_rate = 1.0       # observed; updated from confirmations

    def adjust(self):
        # Too many escalations: lower the bar so more cases auto-approve
        if self.escalation_rate > self.target_escalation * 1.5:
            self.auto_approval -= 0.02
        # Accuracy slipping: raise the bar so more cases get reviewed
        elif self.accuracy_rate < 0.90:
            self.auto_approval += 0.02

Pattern: Confidence Calibration

Ensure confidence scores reflect actual accuracy:

class ConfidenceCalibrator:
    def __init__(self, bin_accuracy):
        # Observed accuracy per confidence decile, built from history
        self.bin_accuracy = bin_accuracy

    def calibrate(self, raw_confidence: float) -> float:
        # Histogram calibration: report the accuracy actually observed
        # for predictions in this confidence bin
        bin_idx = min(int(raw_confidence * 10), 9)
        return self.bin_accuracy[bin_idx]

Pattern: Specialized Judges

Add domain-specific validation:

class FrontmatterJudge(BaseJudge):
    """Validates consensus against frontmatter type declarations."""
    has_veto_authority = True

    def evaluate(self, document, consensus):
        declared = document.frontmatter.get('component_type')
        # A declared type that conflicts with consensus triggers a veto
        if declared and declared != consensus.classification:
            return JudgeDecision(approved=False, veto=True)
        return JudgeDecision(approved=True)

Implementation Checklist

  • Add semantic embedding service with fallback
  • Implement learning database and tracking
  • Create analyst accuracy tracking
  • Add dynamic weight calculation
  • Implement adaptive threshold adjustment
  • Add confidence calibration
  • Create specialized judges (frontmatter, directory, historical)
  • Integrate all modules in enhanced orchestrator
  • Add commands for calibration and learning
  • Bootstrap from existing typed documents
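The checklist's "integrate all modules" step can be sketched as a thin orchestrator that threads a document through embeddings, calibration, judges, and thresholds. The `EnhancedOrchestrator` interface below is an assumption for illustration, not the skill's mandated API:

```python
from dataclasses import dataclass

@dataclass
class Consensus:
    classification: str
    confidence: float
    escalated: bool = False

class EnhancedOrchestrator:
    """Illustrative wiring of embeddings, calibration, thresholds,
    judges, and learning; the component interfaces are assumptions."""

    def __init__(self, embeddings, calibrator, thresholds, judges, learner):
        self.embeddings = embeddings
        self.calibrator = calibrator
        self.thresholds = thresholds
        self.judges = judges
        self.learner = learner

    def classify(self, document):
        doc_type, raw_conf = self.embeddings.classify(document.content)
        # Calibrated confidence should track actual accuracy
        confidence = self.calibrator.calibrate(raw_conf)
        consensus = Consensus(doc_type, confidence)
        # Judges with veto authority can force manual review
        for judge in self.judges:
            decision = judge.evaluate(document, consensus)
            if getattr(decision, "veto", False):
                consensus.escalated = True
        # Below the auto-approval threshold, escalate as well
        if confidence < self.thresholds.auto_approval:
            consensus.escalated = True
        self.learner.record_classification(
            document.path, doc_type, confidence, votes={})
        return consensus
```

Every classification is recorded before returning, so the learning loop sees escalated and auto-approved cases alike.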

Metrics to Track

| Metric | Target | Purpose |
| --- | --- | --- |
| Auto-approval rate | ≥92% | Efficiency |
| Escalation rate | ≤8% | Minimize manual review |
| Accuracy rate | ≥95% | Quality |
| ECE (Expected Calibration Error) | ≤0.05 | Confidence reliability |
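The ECE target can be checked with a straightforward binned estimate: the mean gap between confidence and accuracy per bin, weighted by bin population. A minimal sketch, assuming deciles to match the histogram calibration pattern:

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins: int = 10) -> float:
    """ECE: mean |accuracy - confidence| per bin, weighted by bin size."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    # Assign each prediction to a confidence bin (1.0 folds into the top bin)
    bins = np.minimum((confidences * n_bins).astype(int), n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        mask = bins == b
        if mask.any():
            gap = abs(correct[mask].mean() - confidences[mask].mean())
            ece += mask.mean() * gap
    return ece
```

For example, four predictions at 0.95 confidence of which three are correct give a gap of 0.20 in the top bin, well above the 0.05 target.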

Success Output

When successful, this skill MUST output:

✅ SKILL COMPLETE: moe-enhancement

Completed:
- [x] Semantic embedding service integrated (all-MiniLM-L6-v2 or better)
- [x] Historical learning database created and tracking classifications
- [x] Analyst accuracy tracking and dynamic weight calculation implemented
- [x] Adaptive threshold adjustment based on escalation and accuracy rates
- [x] Confidence calibration with histogram-based calibration curve
- [x] Specialized judges (frontmatter, directory, historical) added

Outputs:
- scripts/moe_classifier/core/embeddings.py
- scripts/moe_classifier/core/learning.py
- scripts/moe_classifier/core/adaptive_thresholds.py
- scripts/moe_classifier/core/calibration.py
- scripts/moe_classifier/core/enhanced_orchestrator.py
- /moe-calibrate command
- /moe-learn command

Metrics:
- Auto-approval rate: ≥92%
- Escalation rate: ≤8%
- Accuracy rate: ≥95%
- ECE (Expected Calibration Error): ≤0.05

Completion Checklist

Before marking this skill as complete, verify:

  • Semantic embeddings working with cosine similarity >0.7 for correct classifications
  • Learning database recording all classifications with outcomes
  • Analyst weights updated based on historical accuracy (not static)
  • Adaptive thresholds adjusting based on escalation rate (target 10%)
  • Confidence scores calibrated (predicted confidence ≈ actual accuracy)
  • Specialized judges have veto authority when appropriate
  • Bootstrap process completed from existing typed documents
  • All 4 target metrics met (auto-approval, escalation, accuracy, ECE)

Failure Indicators

This skill has FAILED if:

  • ❌ Semantic embeddings not working (falling back to regex only)
  • ❌ Learning database empty or not tracking classifications
  • ❌ Analyst weights hardcoded (not dynamic)
  • ❌ Thresholds static (not adapting to escalation rate)
  • ❌ Confidence scores uncalibrated (e.g., 90% stated confidence but only 60% actual accuracy)
  • ❌ Specialized judges not integrated or not firing
  • ❌ Bootstrap not completed (no training data)
  • ❌ Target metrics not met (<92% auto-approval, >8% escalation, <95% accuracy)

When NOT to Use

Do NOT use this skill when:

  • MoE system is new (<100 classifications) - not enough data for learning
  • Classification is simple binary decision (use rule-based classifier)
  • No historical data available for bootstrap (need typed documents first)
  • Real-time classification required (<100ms) - embeddings add latency
  • No semantic similarity (classifications based on structure, not content)
  • Static classification rules sufficient (no need for ML)
  • Team unfamiliar with ML concepts (calibration, embeddings, confidence)

Use these alternatives instead:

  • Simple binary: Rule-based classifier with if/else
  • No history: Basic MoE with regex patterns only
  • Real-time: Lightweight regex-only classifier

Anti-Patterns (Avoid)

| Anti-Pattern | Problem | Solution |
| --- | --- | --- |
| No fallback to regex | Embedding service fails → all classifications fail | Always have a regex fallback |
| Hardcoded analyst weights | Weights don't adapt to accuracy | Use dynamic weights from historical data |
| Static thresholds | High escalation rate persists | Implement an adaptive threshold manager |
| No calibration | Confidence scores misleading | Apply histogram-based calibration |
| Ignoring specialized judges | Frontmatter type conflicts go undetected | Give judges veto authority |
| No bootstrap process | Cold start problem (no training data) | Bootstrap from existing typed documents |
| No metric tracking | Can't measure improvement | Track all 4 metrics continuously |
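The bootstrap step that avoids the cold-start problem can be sketched as a scan over already-typed documents. The frontmatter key, the `*.md` layout, and the `record_confirmed` method name are all hypothetical here:

```python
from pathlib import Path

def bootstrap_from_typed_docs(root: str, learner) -> int:
    """Seed the learning store from documents whose type is already declared.

    Assumes markdown files with a 'component_type: <type>' line near the
    top; the key name and learner.record_confirmed() are illustrative.
    """
    seeded = 0
    for path in Path(root).rglob("*.md"):
        text = path.read_text(encoding="utf-8", errors="ignore")
        for line in text.splitlines()[:20]:  # frontmatter sits at the top
            if line.strip().startswith("component_type:"):
                declared = line.split(":", 1)[1].strip()
                learner.record_confirmed(str(path), declared)
                seeded += 1
                break
    return seeded
```

Each declared type counts as a confirmed outcome, so analyst accuracy tracking starts with real data instead of an empty history.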

Principles

This skill embodies:

  • #1 Recycle Before Create - Use proven ML models (sentence-transformers) instead of custom embeddings
  • #2 Progressive Disclosure - Start with regex, add embeddings, then learning, then calibration
  • #5 Eliminate Ambiguity - Confidence calibration ensures scores match actual accuracy
  • #7 Verification Protocol - Track metrics continuously to verify improvements
  • #8 No Assumptions - Dynamic weights adapt to actual analyst performance

Full Standard: CODITECT-STANDARD-AUTOMATION.md
