
MoE Embeddings Comparison Report

Date: December 31, 2025
Status: COMPLETE
Author: CODITECT Development System


Executive Summary

This report compares the MoE (Mixture of Experts) document classification system with and without semantic embeddings enabled. The --use-embeddings flag was implemented to allow users to choose between faster classification (no embeddings) and potentially more accurate classification (with embeddings).

Key Findings

| Metric | Without Embeddings | With Embeddings | Delta |
|---|---|---|---|
| Accuracy | 100% | 100% | 0% |
| Speed | 55.8 files/sec | 21.2 files/sec | -62% |
| Judge Approval | 100% | 100% | 0% |
| Overall Grade | A | A- | |

Conclusion: Both methods achieve identical accuracy on well-structured documents with proper frontmatter. Embeddings provide marginal confidence boosts (+0.6% average) at the cost of 2.6x slower processing.


Implementation Details

New CLI Flags (v4)

```shell
# Enable semantic embeddings (requires venv with sentence-transformers)
python classify.py docs/ --use-embeddings

# Disable embeddings (default, faster)
python classify.py docs/ --no-embeddings
```
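The on/off flag pair above can be modeled as a mutually exclusive argparse toggle. The sketch below is illustrative only and not the actual classify.py implementation; the argument names mirror the documented flags, everything else is assumed:

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser exposing the --use-embeddings / --no-embeddings pair."""
    parser = argparse.ArgumentParser(prog="classify.py")
    parser.add_argument("path", help="directory of documents to classify")
    group = parser.add_mutually_exclusive_group()
    group.add_argument("--use-embeddings", dest="use_embeddings",
                       action="store_true",
                       help="enable semantic embeddings (slower, needs sentence-transformers)")
    group.add_argument("--no-embeddings", dest="use_embeddings",
                       action="store_false",
                       help="disable embeddings (default, faster)")
    parser.set_defaults(use_embeddings=False)  # embeddings are off by default
    return parser

args = build_parser().parse_args(["docs/", "--use-embeddings"])
print(args.use_embeddings)  # True
```

Making the two flags mutually exclusive lets argparse reject contradictory invocations like `--use-embeddings --no-embeddings` instead of silently taking the last one.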

Venv Requirement Check

When --use-embeddings is requested without the proper environment:

```text
======================================================================
EMBEDDINGS UNAVAILABLE - sentence-transformers not installed
======================================================================
To enable semantic embeddings, activate the virtual environment:

    source .venv/bin/activate
    pip install sentence-transformers

Then re-run with --use-embeddings
======================================================================
```
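A check like the one producing this banner can detect the missing package without importing it, via `importlib.util.find_spec`. This is a minimal sketch of one way to do it, not the actual classify.py code:

```python
import importlib.util
import sys

def embeddings_available() -> bool:
    """True if sentence-transformers is importable in the current environment."""
    return importlib.util.find_spec("sentence_transformers") is not None

def require_embeddings() -> None:
    """Print the unavailability banner and exit if --use-embeddings cannot be honored."""
    if embeddings_available():
        return
    bar = "=" * 70
    print(bar)
    print("EMBEDDINGS UNAVAILABLE - sentence-transformers not installed")
    print(bar)
    print("To enable semantic embeddings, activate the virtual environment:")
    print("\n    source .venv/bin/activate")
    print("    pip install sentence-transformers\n")
    print("Then re-run with --use-embeddings")
    print(bar)
    sys.exit(1)
```

Using `find_spec` keeps the check cheap: it consults the import machinery without paying the cost of actually loading the (large) package.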

How Embeddings Enhance Classification

  1. Pre-classification hints: Embedding similarity provides type suggestions
  2. Confidence blending: When embedding agrees with classification, confidence is boosted (70% base + 30% embedding)
  3. Memory integration: Historical patterns from org.db (ADR-118) inform classification
  4. Calibration: Confidence scores are calibrated against actual accuracy
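The pre-classification hint in step 1 amounts to a nearest-exemplar lookup by cosine similarity. The sketch below uses made-up 3-dimensional vectors in place of the real 384-dimensional sentence-transformers embeddings; the exemplar values and function names are illustrative, not taken from the codebase:

```python
import math

# Toy exemplar embeddings per document type (real vectors are 384-dim MiniLM outputs)
EXEMPLARS = {
    "command": [0.9, 0.1, 0.0],
    "guide":   [0.1, 0.9, 0.2],
    "adr":     [0.0, 0.2, 0.9],
}

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def embedding_hint(doc_vec):
    """Return (best_type, similarity) for the exemplar closest to the document vector."""
    best_type, best_vec = max(EXEMPLARS.items(), key=lambda kv: cosine(doc_vec, kv[1]))
    return best_type, cosine(doc_vec, best_vec)

hint, score = embedding_hint([0.8, 0.2, 0.1])
print(hint)  # command
```

The similarity score doubles as the embedding confidence that feeds into the blending step described in point 2.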

Test Results

Ground Truth Comparison

| File | Expected | Standard | With Embeddings |
|---|---|---|---|
| classify.md | command | command (59%) ✓ | command (59%) ✓ |
| SKILL.md | skill | skill (56%) ✓ | skill (56%) ✓ |
| orchestrator.md | agent | agent (95%) ✓ | agent (95%) ✓ |
| ARCHITECTURE-OVERVIEW.md | reference | reference (53%) ✓ | reference (53%) ✓ |
| WORKFLOW-GUIDE.md | guide | guide (73%) ✓ | guide (73%) ✓ |
| COMMERCE-CHECKOUT-WORKFLOWS.md | workflow | workflow (84%) ✓ | workflow (84%) ✓ |

Confidence Comparison (docs/reference/)

| File | No-Emb | With-Emb | Delta |
|---|---|---|---|
| AGENT-FRONTMATTER-THINKING-CONFIG.md | 90.8% | 90.8% | 0.0% |
| ARCHITECTURE-OVERVIEW.md | 53.3% | 53.3% | 0.0% |
| CLAUDE-4.5-GUIDE.md | 65.5% | 65.5% | 0.0% |
| COMPONENT-REFERENCE.md | 85.2% | 85.2% | 0.0% |
| MOE-ENHANCEMENT-IMPLEMENTATION-REPORT.md | 64.2% | 64.2% | 0.0% |
| MOE-ENHANCEMENT-RECOMMENDATIONS.md | 54.9% | 54.9% | 0.0% |
| MOE-SYSTEM-ANALYSIS.md | 37.0% | 42.1% | +5.1% |
| README.md | 64.8% | 64.8% | 0.0% |
| AVERAGE | 64.5% | 65.1% | +0.6% |

MoE Judges Evaluation

| Judge | Approval Rate |
|---|---|
| Frontmatter Judge | 100% |
| Directory Judge | 100% |

All classifications were approved by both judges, confirming the MoE system's accuracy.


Performance Analysis

Speed Comparison

| Mode | Processing Speed | Relative |
|---|---|---|
| Without Embeddings | 55.8 files/sec | 1.0x (baseline) |
| With Embeddings | 21.2 files/sec | 0.38x (2.6x slower) |

Resource Usage

  • Without Embeddings: CPU-only, minimal memory
  • With Embeddings: GPU (MPS on Apple Silicon), ~500MB model in memory

Recommendations

When to Use Embeddings (--use-embeddings)

  1. Low-confidence documents: Files with confidence below 60%
  2. New document types: Content that doesn't match existing patterns
  3. Quality-critical classifications: When accuracy matters more than speed
  4. Research/analysis: Understanding classification reasoning

When to Skip Embeddings (default)

  1. Batch processing: Large document sets where speed matters
  2. Well-structured documents: Files with proper frontmatter declarations
  3. CI/CD pipelines: Automated workflows needing quick feedback
  4. Development iteration: Rapid testing and experimentation

Technical Details

Embedding Model

  • Model: all-MiniLM-L6-v2
  • Source: sentence-transformers
  • Device: MPS (Apple Silicon) or CPU fallback
  • Exemplar Types: 10 (agent, command, skill, guide, reference, workflow, adr, hook, script, config)

Confidence Blending Formula

```python
if embedding_suggestion == classification:
    blended = (base_confidence * 0.7) + (embedding_confidence * 0.3)
    result.confidence = max(base_confidence, min(blended, 1.0))
```
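Plugging in the MOE-SYSTEM-ANALYSIS.md row from the confidence table: a 37.0% base confidence blended with an agreeing embedding confidence of 54% (a hypothetical value back-computed from the table, not a measured one) reproduces the 42.1% result. A runnable version of the formula:

```python
def blend_confidence(base: float, embedding: float, agrees: bool) -> float:
    """Blend base and embedding confidence 70/30 when the embedding agrees
    with the classification; never lower base confidence, never exceed 1.0."""
    if not agrees:
        return base
    blended = (base * 0.7) + (embedding * 0.3)
    return max(base, min(blended, 1.0))

# 37.0% base + 54% embedding (hypothetical) -> 42.1%, matching the +5.1% delta
print(round(blend_confidence(0.37, 0.54, True), 3))  # 0.421
```

Note the `max(base_confidence, ...)` clamp means blending can only raise confidence, which is why every other row in the table shows a 0.0% delta: for those files the blended value never exceeded the base.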

Files Modified

  1. scripts/moe_classifier/classify.py - Added --use-embeddings flag
  2. scripts/moe_classifier/core/enhanced_orchestrator.py - Added classify() method

Conclusion

The semantic embeddings feature provides a useful enhancement for edge cases where traditional pattern matching falls short. However, for most CODITECT documents that follow proper frontmatter conventions, the base MoE system already achieves excellent accuracy.

Recommendation: Use --use-embeddings selectively for low-confidence files or when investigating classification issues. The default (no embeddings) is optimal for routine batch operations.


Report Generated: December 31, 2025
MoE Version: 4.0 (with embeddings support)
Test Coverage: 14 files across 6 document types