
MoE Embeddings Comparison Report

Date: December 31, 2025
Status: COMPLETE
Author: CODITECT Development System


Executive Summary

This report compares the MoE (Mixture of Experts) document classification system with and without semantic embeddings enabled. The --use-embeddings flag was implemented to allow users to choose between faster classification (no embeddings) and potentially more accurate classification (with embeddings).

Key Findings

| Metric | Without Embeddings | With Embeddings | Delta |
|---|---|---|---|
| Accuracy | 100% | 100% | 0% |
| Speed | 55.8 files/sec | 21.2 files/sec | -62% |
| Judge Approval | 100% | 100% | 0% |
| Overall Grade | A | A- | |

Conclusion: Both methods achieve identical accuracy on well-structured documents with proper frontmatter. Embeddings provide marginal confidence boosts (+0.6% average) at the cost of 2.6x slower processing.


Implementation Details

New CLI Flags (v4)

```shell
# Enable semantic embeddings (requires venv with sentence-transformers)
python classify.py docs/ --use-embeddings

# Disable embeddings (default, faster)
python classify.py docs/ --no-embeddings
```
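The on/off flag pair above can be modeled as a mutually exclusive argparse toggle. The sketch below is illustrative only and not the actual classify.py implementation; the argument names mirror the documented flags, everything else is assumed:

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser exposing the --use-embeddings / --no-embeddings pair."""
    parser = argparse.ArgumentParser(prog="classify.py")
    parser.add_argument("path", help="directory of documents to classify")
    group = parser.add_mutually_exclusive_group()
    group.add_argument("--use-embeddings", dest="use_embeddings",
                       action="store_true",
                       help="enable semantic embeddings (slower, needs sentence-transformers)")
    group.add_argument("--no-embeddings", dest="use_embeddings",
                       action="store_false",
                       help="disable embeddings (default, faster)")
    parser.set_defaults(use_embeddings=False)  # embeddings are off by default
    return parser

args = build_parser().parse_args(["docs/", "--use-embeddings"])
print(args.use_embeddings)  # True
```

Making the two flags mutually exclusive lets argparse reject contradictory invocations like `--use-embeddings --no-embeddings` instead of silently taking the last one.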

Venv Requirement Check

When --use-embeddings is requested without the proper environment:

```text
======================================================================
EMBEDDINGS UNAVAILABLE - sentence-transformers not installed
======================================================================
To enable semantic embeddings, activate the virtual environment:

    source .venv/bin/activate
    pip install sentence-transformers

Then re-run with --use-embeddings
======================================================================
```
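A check like the one producing this banner can detect the missing package without importing it, via `importlib.util.find_spec`. This is a minimal sketch of one way to do it, not the actual classify.py code:

```python
import importlib.util
import sys

def embeddings_available() -> bool:
    """True if sentence-transformers is importable in the current environment."""
    return importlib.util.find_spec("sentence_transformers") is not None

def require_embeddings() -> None:
    """Print the unavailability banner and exit if --use-embeddings cannot be honored."""
    if embeddings_available():
        return
    bar = "=" * 70
    print(bar)
    print("EMBEDDINGS UNAVAILABLE - sentence-transformers not installed")
    print(bar)
    print("To enable semantic embeddings, activate the virtual environment:")
    print("\n    source .venv/bin/activate")
    print("    pip install sentence-transformers\n")
    print("Then re-run with --use-embeddings")
    print(bar)
    sys.exit(1)
```

Using `find_spec` keeps the check cheap: it consults the import machinery without paying the cost of actually loading the (large) package.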

How Embeddings Enhance Classification

  1. Pre-classification hints: Embedding similarity provides type suggestions
  2. Confidence blending: When embedding agrees with classification, confidence is boosted (70% base + 30% embedding)
  3. Memory integration: Historical patterns from org.db (ADR-118) inform classification
  4. Calibration: Confidence scores are calibrated against actual accuracy
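The pre-classification hint in step 1 amounts to a nearest-exemplar lookup by cosine similarity. The sketch below uses made-up 3-dimensional vectors in place of the real 384-dimensional sentence-transformers embeddings; the exemplar values and function names are illustrative, not taken from the codebase:

```python
import math

# Toy exemplar embeddings per document type (real vectors are 384-dim MiniLM outputs)
EXEMPLARS = {
    "command": [0.9, 0.1, 0.0],
    "guide":   [0.1, 0.9, 0.2],
    "adr":     [0.0, 0.2, 0.9],
}

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def embedding_hint(doc_vec):
    """Return (best_type, similarity) for the exemplar closest to the document vector."""
    best_type, best_vec = max(EXEMPLARS.items(), key=lambda kv: cosine(doc_vec, kv[1]))
    return best_type, cosine(doc_vec, best_vec)

hint, score = embedding_hint([0.8, 0.2, 0.1])
print(hint)  # command
```

The similarity score doubles as the embedding confidence that feeds into the blending step described in point 2.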

Test Results

Ground Truth Comparison

| File | Expected | Standard | With Embeddings |
|---|---|---|---|
| classify.md | command | command (59%) ✓ | command (59%) ✓ |
| SKILL.md | skill | skill (56%) ✓ | skill (56%) ✓ |
| orchestrator.md | agent | agent (95%) ✓ | agent (95%) ✓ |
| ARCHITECTURE-OVERVIEW.md | reference | reference (53%) ✓ | reference (53%) ✓ |
| WORKFLOW-GUIDE.md | guide | guide (73%) ✓ | guide (73%) ✓ |
| COMMERCE-CHECKOUT-WORKFLOWS.md | workflow | workflow (84%) ✓ | workflow (84%) ✓ |

Confidence Comparison (docs/reference/)

| File | No-Emb | With-Emb | Delta |
|---|---|---|---|
| AGENT-FRONTMATTER-THINKING-CONFIG.md | 90.8% | 90.8% | 0.0% |
| ARCHITECTURE-OVERVIEW.md | 53.3% | 53.3% | 0.0% |
| CLAUDE-4.5-GUIDE.md | 65.5% | 65.5% | 0.0% |
| COMPONENT-REFERENCE.md | 85.2% | 85.2% | 0.0% |
| MOE-ENHANCEMENT-IMPLEMENTATION-REPORT.md | 64.2% | 64.2% | 0.0% |
| MOE-ENHANCEMENT-RECOMMENDATIONS.md | 54.9% | 54.9% | 0.0% |
| MOE-SYSTEM-ANALYSIS.md | 37.0% | 42.1% | +5.1% |
| README.md | 64.8% | 64.8% | 0.0% |
| AVERAGE | 64.5% | 65.1% | +0.6% |

MoE Judges Evaluation

| Judge | Approval Rate |
|---|---|
| Frontmatter Judge | 100% |
| Directory Judge | 100% |

All classifications were approved by both judges, confirming the MoE system's accuracy.


Performance Analysis

Speed Comparison

| Mode | Processing Speed | Relative |
|---|---|---|
| Without Embeddings | 55.8 files/sec | 1.0x (baseline) |
| With Embeddings | 21.2 files/sec | 0.38x (2.6x slower) |

Resource Usage

  • Without Embeddings: CPU-only, minimal memory
  • With Embeddings: GPU (MPS on Apple Silicon), ~500MB model in memory

Recommendations

When to Use Embeddings (--use-embeddings)

  1. Low-confidence documents: Files with confidence below 60%
  2. New document types: Content that doesn't match existing patterns
  3. Quality-critical classifications: When accuracy matters more than speed
  4. Research/analysis: Understanding classification reasoning

When to Skip Embeddings (default)

  1. Batch processing: Large document sets where speed matters
  2. Well-structured documents: Files with proper frontmatter declarations
  3. CI/CD pipelines: Automated workflows needing quick feedback
  4. Development iteration: Rapid testing and experimentation

Technical Details

Embedding Model

  • Model: all-MiniLM-L6-v2
  • Source: sentence-transformers
  • Device: MPS (Apple Silicon) or CPU fallback
  • Exemplar Types: 10 (agent, command, skill, guide, reference, workflow, adr, hook, script, config)

Confidence Blending Formula

```python
if embedding_suggestion == classification:
    blended = (base_confidence * 0.7) + (embedding_confidence * 0.3)
    result.confidence = max(base_confidence, min(blended, 1.0))
```
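Plugging in the MOE-SYSTEM-ANALYSIS.md row from the confidence table: a 37.0% base confidence blended with an agreeing embedding confidence of 54% (a hypothetical value back-computed from the table, not a measured one) reproduces the 42.1% result. A runnable version of the formula:

```python
def blend_confidence(base: float, embedding: float, agrees: bool) -> float:
    """Blend base and embedding confidence 70/30 when the embedding agrees
    with the classification; never lower base confidence, never exceed 1.0."""
    if not agrees:
        return base
    blended = (base * 0.7) + (embedding * 0.3)
    return max(base, min(blended, 1.0))

# 37.0% base + 54% embedding (hypothetical) -> 42.1%, matching the +5.1% delta
print(round(blend_confidence(0.37, 0.54, True), 3))  # 0.421
```

Note the `max(base_confidence, ...)` clamp means blending can only raise confidence, which is why every other row in the table shows a 0.0% delta: for those files the blended value never exceeded the base.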

Files Modified

  1. scripts/moe_classifier/classify.py - Added --use-embeddings flag
  2. scripts/moe_classifier/core/enhanced_orchestrator.py - Added classify() method

Conclusion

The semantic embeddings feature provides a useful enhancement for edge cases where traditional pattern matching falls short. However, for most CODITECT documents that follow proper frontmatter conventions, the base MoE system already achieves excellent accuracy.

Recommendation: Use --use-embeddings selectively for low-confidence files or when investigating classification issues. The default (no embeddings) is optimal for routine batch operations.


Report Generated: December 31, 2025
MoE Version: 4.0 (with embeddings support)
Test Coverage: 14 files across 6 document types