MoE Embeddings Comparison Report
MoE Embeddings Comparison Report
Date: December 31, 2025 Status: COMPLETE Author: CODITECT Development System
Executive Summary
This report compares the MoE (Mixture of Experts) document classification system with and without semantic embeddings enabled. The --use-embeddings flag was implemented to allow users to choose between faster classification (no embeddings) and potentially more accurate classification (with embeddings).
Key Findings
| Metric | Without Embeddings | With Embeddings | Delta |
|---|---|---|---|
| Accuracy | 100% | 100% | 0% |
| Speed | 55.8 files/sec | 21.2 files/sec | -62% |
| Judge Approval | 100% | 100% | 0% |
| Overall Grade | A | A | - |
Conclusion: Both methods achieve identical accuracy on well-structured documents with proper frontmatter. Embeddings provide marginal confidence boosts (+0.6% average) at the cost of 2.6x slower processing.
Implementation Details
New CLI Flags (v4)
# Enable semantic embeddings (requires venv with sentence-transformers)
python classify.py docs/ --use-embeddings
# Disable embeddings (default, faster)
python classify.py docs/ --no-embeddings
Venv Requirement Check
When --use-embeddings is requested without the proper environment:
======================================================================
EMBEDDINGS UNAVAILABLE - sentence-transformers not installed
======================================================================
To enable semantic embeddings, activate the virtual environment:
source .venv/bin/activate
pip install sentence-transformers
Then re-run with --use-embeddings
======================================================================
How Embeddings Enhance Classification
- Pre-classification hints: Embedding similarity provides type suggestions
- Confidence blending: When embedding agrees with classification, confidence is boosted (70% base + 30% embedding)
- Memory integration: Historical patterns from context.db inform classification
- Calibration: Confidence scores are calibrated against actual accuracy
Test Results
Ground Truth Comparison
| File | Expected | Standard | With Embeddings |
|---|---|---|---|
| classify.md | command | command (59%) ✓ | command (59%) ✓ |
| SKILL.md | skill | skill (56%) ✓ | skill (56%) ✓ |
| orchestrator.md | agent | agent (95%) ✓ | agent (95%) ✓ |
| ARCHITECTURE-OVERVIEW.md | reference | reference (53%) ✓ | reference (53%) ✓ |
| WORKFLOW-GUIDE.md | guide | guide (73%) ✓ | guide (73%) ✓ |
| COMMERCE-CHECKOUT-WORKFLOWS.md | workflow | workflow (84%) ✓ | workflow (84%) ✓ |
Confidence Comparison (docs/reference/)
| File | No-Emb | With-Emb | Delta |
|---|---|---|---|
| AGENT-FRONTMATTER-THINKING-CONFIG.md | 90.8% | 90.8% | 0.0% |
| ARCHITECTURE-OVERVIEW.md | 53.3% | 53.3% | 0.0% |
| CLAUDE-4.5-GUIDE.md | 65.5% | 65.5% | 0.0% |
| COMPONENT-REFERENCE.md | 85.2% | 85.2% | 0.0% |
| MOE-ENHANCEMENT-IMPLEMENTATION-REPORT.md | 64.2% | 64.2% | 0.0% |
| MOE-ENHANCEMENT-RECOMMENDATIONS.md | 54.9% | 54.9% | 0.0% |
| MOE-SYSTEM-ANALYSIS.md | 37.0% | 42.1% | +5.1% |
| README.md | 64.8% | 64.8% | 0.0% |
| AVERAGE | 64.5% | 65.1% | +0.6% |
MoE Judges Evaluation
| Judge | Approval Rate |
|---|---|
| Frontmatter Judge | 100% |
| Directory Judge | 100% |
All classifications were approved by both judges, confirming the MoE system's accuracy.
Performance Analysis
Speed Comparison
| Mode | Processing Speed | Relative |
|---|---|---|
| Without Embeddings | 55.8 files/sec | 1.0x (baseline) |
| With Embeddings | 21.2 files/sec | 0.38x (2.6x slower) |
Resource Usage
- Without Embeddings: CPU-only, minimal memory
- With Embeddings: GPU (MPS on Apple Silicon), ~500MB model in memory
Recommendations
When to Use Embeddings (--use-embeddings)
- Low-confidence documents: Files with confidence below 60%
- New document types: Content that doesn't match existing patterns
- Quality-critical classifications: When accuracy matters more than speed
- Research/analysis: Understanding classification reasoning
When to Skip Embeddings (default)
- Batch processing: Large document sets where speed matters
- Well-structured documents: Files with proper frontmatter declarations
- CI/CD pipelines: Automated workflows needing quick feedback
- Development iteration: Rapid testing and experimentation
Technical Details
Embedding Model
- Model: all-MiniLM-L6-v2
- Source: sentence-transformers
- Device: MPS (Apple Silicon) or CPU fallback
- Exemplar Types: 10 (agent, command, skill, guide, reference, workflow, adr, hook, script, config)
Confidence Blending Formula
if embedding_suggestion == classification:
blended = (base_confidence * 0.7) + (embedding_confidence * 0.3)
result.confidence = max(base_confidence, min(blended, 1.0))
Files Modified
scripts/moe_classifier/classify.py- Added--use-embeddingsflagscripts/moe_classifier/core/enhanced_orchestrator.py- Addedclassify()method
Conclusion
The semantic embeddings feature provides a useful enhancement for edge cases where traditional pattern matching falls short. However, for most CODITECT documents that follow proper frontmatter conventions, the base MoE system already achieves excellent accuracy.
Recommendation: Use --use-embeddings selectively for low-confidence files or when investigating classification issues. The default (no embeddings) is optimal for routine batch operations.
Report Generated: December 31, 2025 MoE Version: 4.0 (with embeddings support) Test Coverage: 14 files across 6 document types