# MoE Enhancement Skill

Patterns and best practices for enhancing Mixture of Experts (MoE) classification systems.
## When to Apply

Use this skill when:
- Improving classification accuracy in MoE systems
- Implementing learning loops for self-improvement
- Adding semantic embedding capabilities
- Calibrating confidence scores
- Optimizing threshold values
- Adding specialized judges or analysts
## Pattern: Semantic Embedding Integration

Replace regex-based pattern matching with true semantic embeddings:
```python
from typing import Dict, Tuple

import numpy as np
from sentence_transformers import SentenceTransformer

class SemanticEmbeddingService:
    def __init__(self, model_name: str = "all-MiniLM-L6-v2"):
        self.model = SentenceTransformer(model_name)
        # Pre-encode one representative exemplar per document type.
        self.exemplar_embeddings = self._compute_exemplars()

    def _compute_exemplars(self) -> Dict[str, np.ndarray]:
        # Encode representative sample texts from your corpus, e.g.
        # {"skill": self.model.encode(skill_exemplar_text), ...}
        raise NotImplementedError

    def classify(self, content: str) -> Tuple[str, float]:
        # Truncate long documents to keep encoding fast.
        doc_embedding = self.model.encode(content[:8000])
        similarities = {}
        for doc_type, exemplar in self.exemplar_embeddings.items():
            # Cosine similarity between document and exemplar embeddings.
            similarities[doc_type] = float(
                np.dot(doc_embedding, exemplar)
                / (np.linalg.norm(doc_embedding) * np.linalg.norm(exemplar))
            )
        best = max(similarities, key=similarities.get)
        return best, similarities[best]
```
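The implementation checklist and anti-pattern table below both call for a regex fallback when the embedding service is unavailable. A minimal sketch of that wrapper, where the keyword patterns and the `classify_with_fallback` name are illustrative, not from the source:

```python
import re

# Hypothetical keyword patterns; a real fallback would reuse the
# existing regex classifier this skill is meant to augment.
FALLBACK_PATTERNS = {
    "skill": re.compile(r"\bskill\b|\bpattern\b", re.IGNORECASE),
    "command": re.compile(r"\bcommand\b|\busage:", re.IGNORECASE),
}

def classify_with_fallback(content, embedding_service=None):
    """Prefer semantic classification; fall back to keyword counts."""
    if embedding_service is not None:
        try:
            return embedding_service.classify(content)
        except Exception:
            pass  # embedding model unavailable; use regex below
    counts = {t: len(p.findall(content)) for t, p in FALLBACK_PATTERNS.items()}
    best = max(counts, key=counts.get)
    total = sum(counts.values()) or 1
    return best, counts[best] / total
```

The fallback confidence here is just the winning type's share of keyword hits, so it should be treated as a rough score and calibrated like any other.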
## Pattern: Historical Learning Loop

Track classification outcomes and adjust analyst weights:
```python
from typing import Dict

class ClassificationLearner:
    def record_classification(self, path, predicted, confidence, votes):
        # Store the prediction in a database for future analysis.
        pass

    def confirm_classification(self, path, actual_type):
        # Update per-analyst accuracy scores against the confirmed type,
        # then recalculate dynamic weights.
        pass

    def get_analyst_weights(self) -> Dict[str, float]:
        # Return weights derived from each analyst's historical accuracy.
        pass
```
## Pattern: Adaptive Thresholds

Self-tune thresholds based on escalation and accuracy rates:
```python
class AdaptiveThresholdManager:
    def __init__(self):
        self.auto_approval = 0.90
        self.target_escalation = 0.10
        # Rolling rates, updated from the tracked metrics.
        self.escalation_rate = 0.0
        self.accuracy_rate = 1.0

    def adjust(self):
        if self.escalation_rate > self.target_escalation * 1.5:
            self.auto_approval -= 0.02  # Too many escalations: lower threshold
        elif self.accuracy_rate < 0.90:
            self.auto_approval += 0.02  # Accuracy slipping: raise threshold
```
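To see the adjustment rule in isolation, here is a self-contained variant that takes the rates as arguments and clamps the threshold; the clamp bounds (0.70 to 0.99) are assumptions, not from the source:

```python
class ThresholdTuner:
    """Standalone sketch of the adaptive-threshold rule above."""

    def __init__(self, auto_approval=0.90, target_escalation=0.10):
        self.auto_approval = auto_approval
        self.target_escalation = target_escalation

    def adjust(self, escalation_rate, accuracy_rate):
        if escalation_rate > self.target_escalation * 1.5:
            # Too many escalations: lower the bar so more auto-approve.
            self.auto_approval -= 0.02
        elif accuracy_rate < 0.90:
            # Accuracy slipping: raise the bar so fewer auto-approve.
            self.auto_approval += 0.02
        # Keep the threshold inside a sane band (assumed bounds).
        self.auto_approval = min(0.99, max(0.70, self.auto_approval))
        return self.auto_approval
```

Clamping matters because repeated one-sided adjustments can otherwise drive the threshold to a value where everything (or nothing) auto-approves.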
## Pattern: Confidence Calibration

Ensure confidence scores reflect actual accuracy:
```python
class ConfidenceCalibrator:
    def calibrate(self, raw_confidence: float) -> float:
        # Histogram-based calibration: return the observed accuracy of
        # the bin this raw confidence falls into. Clamp the index so a
        # confidence of exactly 1.0 maps into the top bin.
        bin_idx = min(int(raw_confidence * 10), 9)
        return self._get_bin_accuracy(bin_idx)
```
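The `_get_bin_accuracy` lookup needs a fitting step over past outcomes. A self-contained sketch of both halves, assuming 10 equal-width bins and falling back to the raw score for bins with no history:

```python
class HistogramCalibrator:
    """Standalone sketch: map raw confidence to its bin's observed accuracy."""

    def __init__(self, n_bins=10):
        self.n_bins = n_bins
        self.bin_accuracy = [None] * n_bins

    def fit(self, raw_confidences, was_correct):
        hits = [0] * self.n_bins
        totals = [0] * self.n_bins
        for conf, ok in zip(raw_confidences, was_correct):
            idx = min(int(conf * self.n_bins), self.n_bins - 1)
            totals[idx] += 1
            hits[idx] += 1 if ok else 0
        for i in range(self.n_bins):
            if totals[i]:
                self.bin_accuracy[i] = hits[i] / totals[i]

    def calibrate(self, raw_confidence):
        idx = min(int(raw_confidence * self.n_bins), self.n_bins - 1)
        acc = self.bin_accuracy[idx]
        # Empty bin (no history yet): return the raw score unchanged.
        return raw_confidence if acc is None else acc
```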
## Pattern: Specialized Judges

Add domain-specific validation:
```python
class FrontmatterJudge(BaseJudge):
    """Validates against frontmatter type declarations."""

    has_veto_authority = True

    def evaluate(self, document, consensus):
        declared = document.frontmatter.get('component_type')
        if declared and declared != consensus.classification:
            # An explicit frontmatter declaration overrides the consensus.
            return JudgeDecision(approved=False, veto=True)
        return JudgeDecision(approved=True)
```
## Implementation Checklist
- Add semantic embedding service with fallback
- Implement learning database and tracking
- Create analyst accuracy tracking
- Add dynamic weight calculation
- Implement adaptive threshold adjustment
- Add confidence calibration
- Create specialized judges (frontmatter, directory, historical)
- Integrate all modules in enhanced orchestrator
- Add commands for calibration and learning
- Bootstrap from existing typed documents
## Metrics to Track
| Metric | Target | Purpose |
|---|---|---|
| Auto-approval rate | ≥92% | Efficiency |
| Escalation rate | ≤8% | Minimize manual review |
| Accuracy rate | ≥95% | Quality |
| ECE (Calibration Error) | ≤0.05 | Confidence reliability |
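The ECE row above can be computed directly from logged outcomes as the weighted average gap between each bin's mean confidence and its accuracy; a minimal sketch, assuming 10 equal-width bins:

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average |mean confidence - accuracy| across bins."""
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))
    n = len(confidences)
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        avg_conf = sum(c for c, _ in bucket) / len(bucket)
        accuracy = sum(1 for _, ok in bucket if ok) / len(bucket)
        ece += (len(bucket) / n) * abs(avg_conf - accuracy)
    return ece
```

A perfectly calibrated system scores 0; the ≤0.05 target means confidence and observed accuracy stay within five points of each other on average.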
## Success Output

When successful, this skill MUST output:

```
✅ SKILL COMPLETE: moe-enhancement

Completed:
- [x] Semantic embedding service integrated (all-MiniLM-L6-v2 or better)
- [x] Historical learning database created and tracking classifications
- [x] Analyst accuracy tracking and dynamic weight calculation implemented
- [x] Adaptive threshold adjustment based on escalation and accuracy rates
- [x] Confidence calibration with histogram-based calibration curve
- [x] Specialized judges (frontmatter, directory, historical) added

Outputs:
- scripts/moe_classifier/core/embeddings.py
- scripts/moe_classifier/core/learning.py
- scripts/moe_classifier/core/adaptive_thresholds.py
- scripts/moe_classifier/core/calibration.py
- scripts/moe_classifier/core/enhanced_orchestrator.py
- /moe-calibrate command
- /moe-learn command

Metrics:
- Auto-approval rate: ≥92%
- Escalation rate: ≤8%
- Accuracy rate: ≥95%
- ECE (Expected Calibration Error): ≤0.05
```
## Completion Checklist

Before marking this skill as complete, verify:
- Semantic embeddings working with cosine similarity >0.7 for correct classifications
- Learning database recording all classifications with outcomes
- Analyst weights updated based on historical accuracy (not static)
- Adaptive thresholds adjusting based on escalation rate (target 10%)
- Confidence scores calibrated (predicted confidence ≈ actual accuracy)
- Specialized judges have veto authority when appropriate
- Bootstrap process completed from existing typed documents
- All 4 target metrics met (auto-approval, escalation, accuracy, ECE)
## Failure Indicators

This skill has FAILED if:
- ❌ Semantic embeddings not working (falling back to regex only)
- ❌ Learning database empty or not tracking classifications
- ❌ Analyst weights hardcoded (not dynamic)
- ❌ Thresholds static (not adapting to escalation rate)
- ❌ Confidence scores uncalibrated (e.g., 90% confidence but only 60% actual accuracy)
- ❌ Specialized judges not integrated or not firing
- ❌ Bootstrap not completed (no training data)
- ❌ Target metrics not met (<92% auto-approval, >8% escalation, <95% accuracy)
## When NOT to Use

Do NOT use this skill when:
- MoE system is new (<100 classifications) - not enough data for learning
- Classification is simple binary decision (use rule-based classifier)
- No historical data available for bootstrap (need typed documents first)
- Real-time classification required (<100ms) - embeddings add latency
- No semantic similarity (classifications based on structure, not content)
- Static classification rules sufficient (no need for ML)
- Team unfamiliar with ML concepts (calibration, embeddings, confidence)
Use these alternatives instead:
- Simple binary: Rule-based classifier with if/else
- No history: Basic MoE with regex patterns only
- Real-time: Lightweight regex-only classifier
## Anti-Patterns (Avoid)
| Anti-Pattern | Problem | Solution |
|---|---|---|
| No fallback to regex | Embedding service fails → all classifications fail | Always have regex fallback |
| Hardcoded analyst weights | Weights don't adapt to accuracy | Use dynamic weights from historical data |
| Static thresholds | High escalation rate persists | Implement adaptive threshold manager |
| No calibration | Confidence scores misleading | Apply histogram-based calibration |
| Ignoring specialized judges | Frontmatter type conflicts go undetected | Give judges veto authority |
| No bootstrap process | Cold start problem (no training data) | Bootstrap from existing typed documents |
| No metric tracking | Can't measure improvement | Track all 4 metrics continuously |
## Principles

This skill embodies:
- **#1 Recycle Before Create** - Use proven ML models (sentence-transformers) instead of custom embeddings
- **#2 Progressive Disclosure** - Start with regex, add embeddings, then learning, then calibration
- **#5 Eliminate Ambiguity** - Confidence calibration ensures scores match actual accuracy
- **#7 Verification Protocol** - Track metrics continuously to verify improvements
- **#8 No Assumptions** - Dynamic weights adapt to actual analyst performance
**Full Standard:** CODITECT-STANDARD-AUTOMATION.md
## Related Components

- scripts/moe_classifier/core/embeddings.py
- scripts/moe_classifier/core/learning.py
- scripts/moe_classifier/core/adaptive_thresholds.py
- scripts/moe_classifier/core/calibration.py
- scripts/moe_classifier/core/enhanced_orchestrator.py
- /moe-calibrate command
- /moe-learn command