/classify - Full MoE Document Classification
Classify documents using the complete 4-phase Mixture of Experts (MoE) pipeline. Supports recursive directory analysis, automatic frontmatter updates, and comprehensive reporting.
Usage
# Classify current directory recursively (with automatic track mapping)
/classify . -r
# Classify a single file
/classify path/to/file.md
# Classify directory recursively
/classify docs/ -r
# Dry run (show what would happen without processing)
/classify docs/ -r --dry-run
# Update frontmatter with classifications
/classify docs/ -r --update-frontmatter
# Skip track mapping (default is enabled)
/classify docs/ -r --skip-track-mappings
# Save results to JSON file
/classify docs/ -r --output results.json
# Save results as CSV
/classify docs/ -r --output results.csv --format csv
# Limit processing (useful for testing)
/classify docs/ -r --limit 20
# Exclude patterns
/classify docs/ -r --exclude "test_*,*.draft.md"
# Verbose output with detailed logs
/classify docs/ -r -v
# Quiet mode (suppress progress)
/classify docs/ -r -q --output results.json
### Autonomous Mode
# Autonomous classification - iterates until 95%+ confidence
/classify docs/ -r --autonomous
# Autonomous with content signal injection
/classify docs/ -r --autonomous --fix
# Dry run autonomous mode (preview changes)
/classify docs/ -r --autonomous --fix --dry-run
### Expert Mode (NEW v2.1)
# Use Type Expert agents for deep semantic analysis
/classify docs/ -r --expert
# Expert mode with detailed reports for each file
/classify docs/ -r --expert --expert-report
# Expert mode with JSON output
/classify docs/ -r --expert -o expert-analysis.json
# Combine expert and autonomous modes
/classify docs/ -r --expert --autonomous --fix
### V4 Semantic Embeddings (NEW)
# Use semantic embeddings for improved accuracy on low-confidence files
# Requires: source .venv/bin/activate && pip install sentence-transformers
/classify docs/ -r --use-embeddings
# Default: no embeddings (faster, recommended for batch operations)
/classify docs/ -r --no-embeddings
### V3 Enhanced Modes
# Enhance frontmatter to force threshold confidence
/classify docs/ -r --enhance-frontmatter
# With custom threshold (default 95%)
/classify docs/ -r --enhance-frontmatter --threshold 90
# Use directory path as classification hint
/classify docs/workflows/ -r --respect-directory
# Force specific type for batch corrections
/classify docs/workflows/ -r --type-override workflow
# Get content enhancement suggestions
/classify docs/ -r --suggest-enhancements
# JSON output for CI/CD integration
/classify docs/ -r --json -o results.json
# Combine v3 modes for maximum accuracy
/classify docs/ -r --enhance-frontmatter --respect-directory --suggest-enhancements
Bi-Lateral TRACK ↔ SKILL Mapping (V5 Default)
The classifier now automatically maintains bi-lateral mappings between TRACKS and SKILLS for the CODITECT Experience Framework and Coditect Semantic Skill LLM Translator MCP.
What It Does
When classifying skills, the classifier automatically:
- Scans all skills for the `cef_track` frontmatter field
- Builds track-to-skill mappings from skill declarations
- Updates track files with `associated_skills` frontmatter
- Generates the track-skills index at `internal/project/track-skills-index.md`
- Validates consistency between skill → track and track → skill mappings
Skill Frontmatter (Source of Truth)
Skills declare their track membership:
---
name: my-skill
component_type: skill
cef_track: G-1 # Track G, Subtrack 1
tags:
- automation
---
Track Frontmatter (Auto-Updated)
Track files automatically list their associated skills:
---
track: G
track_name: DMS Product
associated_skills:
- my-skill
- another-skill
- third-skill
track_mapping_updated: 2026-01-29
---
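The mapping and validation steps can be sketched as follows. This is a minimal illustration, not the classifier's actual code: it assumes frontmatter has already been parsed into dictionaries, the function names `build_track_map` and `find_inconsistencies` are hypothetical, and the rule that the track letter is the part of `cef_track` before the hyphen is inferred from the `G-1  # Track G, Subtrack 1` comment.

```python
from collections import defaultdict


def build_track_map(skills):
    """Group skill names by the track letter taken from each skill's cef_track."""
    track_map = defaultdict(list)
    for skill in skills:
        cef_track = skill.get("cef_track")
        if not cef_track:
            continue  # skills without a declared track are not mapped
        track = cef_track.split("-", 1)[0]  # assumption: "G-1" -> track "G"
        track_map[track].append(skill["name"])
    return {track: sorted(names) for track, names in track_map.items()}


def find_inconsistencies(track_map, track_files):
    """Return (track, skill) pairs that appear on only one side of the mapping."""
    problems = []
    for track in sorted(set(track_map) | set(track_files)):
        declared = set(track_map.get(track, []))   # skill -> track side
        listed = set(track_files.get(track, []))   # track -> skill side
        for skill in sorted(declared ^ listed):
            problems.append((track, skill))
    return problems


skills = [
    {"name": "my-skill", "cef_track": "G-1"},
    {"name": "another-skill", "cef_track": "G-2"},
    {"name": "untracked-skill"},
]
track_map = build_track_map(skills)
print(track_map)
print(find_inconsistencies(track_map, {"G": ["my-skill", "third-skill"]}))
```

Treating the skill declarations as the source of truth means any pair reported here would be resolved by rewriting the track file rather than the skill.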
Track Mapping Output
After classification, you'll see:
📊 Track Mappings Updated:
Tracks: 9 | Skills: 375
Track files updated: 7
✅ Mappings consistent
Skip Track Mapping
To disable automatic track mapping (not recommended):
/classify skills/ -r --skip-track-mappings
Track-Skills Index
The generated index file contains:
- Complete track-to-skill mappings
- Skill-to-track reverse lookups
- Statistics and validation results
- Searchable markdown tables
Location: internal/project/track-skills-index.md
V3 Enhanced Frontmatter Mode
When --enhance-frontmatter flag is used, the classifier:
- Runs full MoE analysis to determine document type
- Checks confidence against threshold (default 95%)
- Enhances low-confidence files by adding explicit type declarations
- Forces threshold confidence via the `moe_confidence` field
Key difference from --update-frontmatter:
- `--update-frontmatter`: Records actual content confidence
- `--enhance-frontmatter`: Forces threshold with explicit type declarations
V3 Results (coditect-core docs):
Before: 86.3% at ≥95% (81/95 files)
After: 100.0% at ≥95% (95/95 files)
Enhanced: 82 files with explicit type/component_type fields
Semantic Embeddings (--use-embeddings)
V4 Feature: Uses sentence-transformers to compute semantic similarity between documents and exemplar types, providing confidence boosts when embeddings agree with classification.
When to Use --use-embeddings
| Scenario | Recommendation |
|---|---|
| Files with confidence < 60% | Use embeddings - may boost by 5-15% |
| New/unusual document types | Use embeddings - semantic analysis helps |
| Quality-critical classifications | Use embeddings - extra validation layer |
| Research/debugging | Use embeddings - understand classification reasoning |
| Batch processing (100+ files) | Skip embeddings - 2.6x slower |
| Well-structured docs with frontmatter | Skip embeddings - base MoE is accurate |
| CI/CD pipelines | Skip embeddings - speed matters |
| Development iteration | Skip embeddings - faster feedback |
Requirements
# Embeddings require sentence-transformers in the venv
source .venv/bin/activate
pip install sentence-transformers
# Then run with embeddings
python scripts/moe_classifier/classify.py docs/ --use-embeddings
Performance Impact
| Mode | Speed | Accuracy | Memory |
|---|---|---|---|
| Without Embeddings | 55 files/sec | 100%* | Low |
| With Embeddings | 21 files/sec | 100%* | +500MB |
*On well-structured documents with proper frontmatter
How Embeddings Enhance Classification
- Pre-classification hints: Embedding similarity suggests type before MoE runs
- Confidence blending: When embedding agrees, confidence = (70% base + 30% embedding)
- Only boosts, never reduces: Blending only applies if it improves confidence
- Memory integration: Historical patterns from org.db (ADR-118 Tier 2) inform classification
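The blending rule above can be written out as a small function. This is a sketch under stated assumptions: confidences are expressed in the range 0 to 1, and the name `blend_confidence` and its parameters are hypothetical rather than taken from the classifier source.

```python
def blend_confidence(base, embedding_similarity, embedding_agrees):
    """Blend base MoE confidence with embedding similarity (both in 0..1)."""
    if not embedding_agrees:
        return base  # blending applies only when the embedding agrees
    blended = 0.7 * base + 0.3 * embedding_similarity  # 70% base + 30% embedding
    return max(base, blended)  # only boosts, never reduces


# A low-confidence file whose embedding strongly agrees gets a boost.
print(round(blend_confidence(0.55, 0.90, True), 3))
# A high-confidence file with a weak embedding score keeps its base value.
print(blend_confidence(0.80, 0.40, True))
```

Because the result is the maximum of the base and blended values, a weak embedding score can never pull a good classification down.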
Directory-Aware Classification (--respect-directory)
Uses directory path as a classification hint when MoE confidence is low:
| Directory Pattern | Implied Type |
|---|---|
| `workflows/` | workflow |
| `guides/`, `getting-started/`, `training/` | guide |
| `reference/`, `architecture/` | reference |
| `adrs/` | adr |
| `agents/` | agent |
| `commands/` | command |
| `skills/` | skill |
| `hooks/` | hook |
The directory hint overrides the MoE result only when MoE confidence is below the threshold AND the directory implies a different type.
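The override rule can be sketched as below. This is an illustrative sketch, not the shipped implementation: the names `DIRECTORY_HINTS`, `directory_hint`, and `apply_directory_hint` are hypothetical, confidence is assumed to be in the range 0 to 1, and using the nearest matching parent directory is an assumption about how nested paths are resolved.

```python
from pathlib import PurePosixPath

# Directory names and implied types, copied from the table above.
DIRECTORY_HINTS = {
    "workflows": "workflow",
    "guides": "guide",
    "getting-started": "guide",
    "training": "guide",
    "reference": "reference",
    "architecture": "reference",
    "adrs": "adr",
    "agents": "agent",
    "commands": "command",
    "skills": "skill",
    "hooks": "hook",
}


def directory_hint(path):
    """Return the type implied by the nearest matching parent directory, if any."""
    for part in reversed(PurePosixPath(path).parent.parts):
        if part in DIRECTORY_HINTS:
            return DIRECTORY_HINTS[part]
    return None


def apply_directory_hint(path, moe_type, moe_confidence, threshold=0.95):
    """Override the MoE type only when confidence is low and the hint differs."""
    hint = directory_hint(path)
    if hint is not None and moe_confidence < threshold and hint != moe_type:
        return hint
    return moe_type


print(apply_directory_hint("docs/workflows/deploy.md", "guide", 0.60))
print(apply_directory_hint("docs/workflows/deploy.md", "guide", 0.97))
```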
Autonomous Mode
When --autonomous flag is used, the classifier:
- Deep analyzes each document to understand true purpose
- Fixes misclassified types in frontmatter
- Injects content signals (with `--fix`) to boost confidence
- Iterates until 95-100% confidence achieved
- Amplifies signals progressively if no improvement
- Forces full signal set at iteration 5 for guaranteed 100%
- Verifies with MoE judge panel
Target: 100% confidence, 0% human review
Iteration Strategy
| Iteration | Action | Expected Gain |
|---|---|---|
| 1 | Add missing required sections | +15-25% |
| 2 | Add type-specific content patterns | +10-15% |
| 3 | Enhance frontmatter metadata | +5-10% |
| 4 | Add cross-references and links | +3-5% |
| 5 | Force FULL signal set | → 100% |
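The iteration strategy amounts to a loop that applies one action per pass and stops as soon as the target is reached. The sketch below shows only that control flow. The action names, the `run_autonomous` function, and the toy scoring functions are hypothetical stand-ins, and the fixed 20-point gain per action is invented for the demo rather than measured.

```python
# One hypothetical action name per row of the iteration table.
ITERATION_ACTIONS = [
    "add_missing_required_sections",
    "add_type_specific_patterns",
    "enhance_frontmatter_metadata",
    "add_cross_references",
    "force_full_signal_set",
]


def run_autonomous(classify, apply_action, document, target=95):
    """Apply one action per iteration until classify() reports the target."""
    history = [classify(document)]
    for action in ITERATION_ACTIONS:
        if history[-1] >= target:
            break  # target reached, no further edits needed
        document = apply_action(document, action)
        history.append(classify(document))
    return document, history


# Toy stand-ins so the sketch runs: each applied action adds 20 points.
def toy_classify(signals):
    return min(100, 50 + 20 * len(signals))


def toy_apply(signals, action):
    return signals | {action}


final_signals, history = run_autonomous(toy_classify, toy_apply, frozenset())
print(history)
```

Passing the classifier and the edit step in as callables keeps the loop testable without touching real files, which is also how a `--dry-run` preview could reuse the same logic.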
Expert Mode (Phase 2)
When --expert flag is used, the classifier invokes Type Expert agents for deep semantic analysis:
- Analyzes vote disagreements - Identifies which analysts disagree and why
- Runs Type Expert analysis - Specialized agents for each document type (guide, reference, workflow, agent, command, adr, skill)
- Generates targeted enhancements - Contextually-appropriate content improvements (not generic templates)
- Provides reasoning audit trail - Full transparency on classification decisions
Type Expert Agents
| Expert | Document Type | Key Signals |
|---|---|---|
| `GuideExpert` | Guides | Prerequisites, steps, troubleshooting |
| `ReferenceExpert` | References | API docs, schemas, configuration tables |
| `WorkflowExpert` | Workflows | Phases, Mermaid diagrams, checklists |
| `AgentExpert` | Agents | Persona ("You are..."), capabilities, tools |
| `CommandExpert` | Commands | /invocation, parameters, usage |
| `ADRExpert` | ADRs | Context, decision, consequences, status |
| `SkillExpert` | Skills | When to use, patterns, input/output specs |
Expert Mode Flow
Expert Output
{
"recommended_type": "reference",
"confidence": 0.89,
"reasoning": "Expert 'reference' strongly confirms (conf=0.85); Expert agrees with analyst majority",
"enhancements": [
{
"signal_type": "api_reference",
"reason": "Reference docs need API documentation",
"priority": 1
}
],
"audit_trail": [
"Starting coordination for: docs/ARCHITECTURE-OVERVIEW.md",
"Vote analysis: reference (3/5)",
"Expert 'reference' says is_this_type=True, conf=0.85",
"Selected type: reference"
]
}
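A consumer of this output might reduce it to a verdict and an ordered to-do list. The sketch below assumes only the fields shown in the example above. The `summarize` function is hypothetical, and treating a lower `priority` number as more urgent is an assumption.

```python
import json

# Trimmed copy of the expert output shown above.
expert_output = json.loads("""
{
  "recommended_type": "reference",
  "confidence": 0.89,
  "enhancements": [
    {"signal_type": "api_reference",
     "reason": "Reference docs need API documentation",
     "priority": 1}
  ]
}
""")


def summarize(result):
    """Return a one-line verdict plus enhancements ordered by priority."""
    verdict = f"{result['recommended_type']} ({result['confidence']:.0%})"
    # Assumption: priority 1 is the most urgent enhancement.
    todo = sorted(result.get("enhancements", []), key=lambda e: e["priority"])
    lines = [f"{e['priority']}. {e['signal_type']}: {e['reason']}" for e in todo]
    return verdict, lines


verdict, lines = summarize(expert_output)
print(verdict)
for line in lines:
    print(line)
```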
System Prompt
⚠️ EXECUTION DIRECTIVE:
When the user invokes /classify, you MUST:
- IMMEDIATELY execute the full MoE classifier
- ALWAYS show full output with progress and results
- ALWAYS provide summary with classification metrics
DO NOT:
- Ask for confirmation - the user invoking the command IS the confirmation
- Skip execution even if it seems redundant
You are executing the full MoE document classifier for CODITECT.
Execution Command (ALWAYS USE WRAPPER):
~/.coditect/scripts/classify.sh <path> [options]
The wrapper script (classify.sh) automatically:
- Activates the Python virtual environment
- Ensures sentence-transformers embeddings are available
- Runs the classifier with all options
Alternative (manual venv activation):
source ~/.coditect/.venv/bin/activate && python3 ~/.coditect/scripts/moe_classifier/classify.py <path> [options]
Parse user arguments and map to CLI options:
| User Says | CLI Option |
|---|---|
| `-r`, `--recursive` | `-r` |
| `--dry-run` | `--dry-run` |
| `--update-frontmatter` | `--update-frontmatter` |
| `--output FILE` | `-o FILE` |
| `--format json/csv/summary` | `-f FORMAT` |
| `--limit N` | `--limit N` |
| `--exclude PATTERNS` | `--exclude PATTERNS` |
| `-v`, `--verbose` | `-v` |
| `-q`, `--quiet` | `-q` |
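The argument mapping can be sketched as a small translation step in front of the wrapper script. This is an illustration only: `build_command` and `SHORT_FORMS` are hypothetical names, and the sketch handles space-separated values (`--output FILE`) but not the `--output=FILE` form.

```python
import os
import shlex

# Wrapper path from the execution directive above.
WRAPPER = os.path.expanduser("~/.coditect/scripts/classify.sh")

# Long user-facing flags and the short CLI forms from the table above.
SHORT_FORMS = {
    "--recursive": "-r",
    "--output": "-o",
    "--format": "-f",
    "--verbose": "-v",
    "--quiet": "-q",
}


def build_command(user_args):
    """Translate a /classify argument string into an argv list for the wrapper."""
    argv = [WRAPPER]
    for token in shlex.split(user_args):  # honors quotes around patterns
        argv.append(SHORT_FORMS.get(token, token))
    return argv


print(build_command('docs/ --recursive --output results.json --exclude "test_*,*.draft.md"'))
```

Returning a list rather than a single string lets the caller hand it to `subprocess.run` without a shell, so glob characters in `--exclude` patterns are passed through unexpanded.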
4-Phase MoE Pipeline
┌─────────────────────────────────────────────────────────────────┐
│ MoE Classification Pipeline │
├─────────────────────────────────────────────────────────────────┤
│ │
│ Phase 1: ANALYSTS (6 specialists) │
│ ┌─────────┬─────────┬─────────┬─────────┬─────────┬─────────┐ │
│ │Metadata │Structural│ Content │ Pattern │Semantic │ Path │ │
│ │Analyst │ Analyst │ Analyst │ Analyst │ Analyst │ Analyst │ │
│ └────┬────┴────┬────┴────┬────┴────┬────┴────┬────┴────┬────┘ │
│ │ │ │ │ │ │ │
│ └─────────┴─────────┴────┬────┴─────────┴─────────┘ │
│ │ │
│ Phase 2: CONSENSUS ▼ │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │ Weighted voting, confidence calculation, type selection │ │
│ └─────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ Phase 3: JUDGES (if confidence < 85%) │
│ ┌─────────────┬─────────────┬─────────────┐ │
│ │ Consistency │ Quality │ Domain │ │
│ │ Judge │ Judge │ Judge │ │
│ └──────┬──────┴──────┬──────┴──────┬──────┘ │
│ └─────────────┼─────────────┘ │
│ │ │
│ Phase 4: DEEP ANALYSIS (if judges disagree) │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │ SemanticSimilarity, Contextual, ContentReasoning, CrossRef │ │
│ └─────────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────┘
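The Phase 2 consensus step (weighted voting, confidence calculation, type selection) can be sketched as follows. This is one plausible formulation, not the classifier's actual algorithm: the analyst weights, the vote values, and the choice to define confidence as the winner's share of total weighted support are all assumptions.

```python
from collections import defaultdict


def consensus(votes, weights=None):
    """Return (winning_type, confidence) from weighted analyst votes.

    votes maps analyst name -> (predicted_type, confidence in 0..1).
    weights maps analyst name -> weight; unlisted analysts count as 1.0.
    """
    weights = weights or {}
    totals = defaultdict(float)
    for analyst, (doc_type, confidence) in votes.items():
        totals[doc_type] += weights.get(analyst, 1.0) * confidence
    total = sum(totals.values())
    if total == 0:
        raise ValueError("no votes with non-zero support")
    winner = max(totals, key=totals.get)
    return winner, totals[winner] / total  # winner's share of weighted support


# Hypothetical votes from the six Phase 1 analysts.
votes = {
    "metadata": ("guide", 0.9),
    "structural": ("guide", 0.8),
    "content": ("reference", 0.6),
    "pattern": ("guide", 0.7),
    "semantic": ("reference", 0.5),
    "path": ("guide", 0.6),
}
winner, confidence = consensus(votes, weights={"metadata": 2.0})
print(winner, round(confidence, 2))
```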
Classification Types
| Type | Description | Key Signals |
|---|---|---|
| `agent` | AI agent definitions | type: agent, "You are a...", system prompts |
| `command` | Slash commands | invocation: /xxx, command patterns |
| `skill` | Reusable patterns | SKILL.md, ## When to Use This Skill |
| `guide` | User tutorials | ## Step, ## Prerequisites, ## Troubleshooting |
| `reference` | API/specs | Tables, ## API, ## Configuration |
| `workflow` | Process definitions | ## Phase, sequenceDiagram, graph TD |
| `adr` | Architecture decisions | ## Context, ## Decision, ## Consequences |
Approval Types
| Type | Confidence | Meaning |
|---|---|---|
| `AUTO_APPROVED` | ≥ 85% | High confidence, no review needed |
| `JUDGE_APPROVED` | 65-84% | Validated by consistency/quality/domain judges |
| `DEEP_ANALYSIS_APPROVED` | 45-64% | Required semantic deep analysis |
| `HUMAN_REVIEW_REQUIRED` | < 45% | Low confidence, needs manual review |
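The confidence bands in the table map directly onto a threshold lookup. The sketch below takes confidence as a percentage. The function name `approval_type` is hypothetical, and treating the bands as contiguous (so that 84.5% falls under `JUDGE_APPROVED`) is an assumption, since the table lists whole-number ranges only.

```python
def approval_type(confidence_pct):
    """Map a confidence percentage to the approval type from the table above."""
    if confidence_pct >= 85:
        return "AUTO_APPROVED"
    if confidence_pct >= 65:
        return "JUDGE_APPROVED"
    if confidence_pct >= 45:
        return "DEEP_ANALYSIS_APPROVED"
    return "HUMAN_REVIEW_REQUIRED"


for pct in (92, 70, 50, 30):
    print(pct, approval_type(pct))
```

In the real pipeline the lower bands also depend on the judge and deep-analysis phases actually running; this lookup only reproduces the thresholds.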
Examples
Analyze Entire Docs Directory
/classify docs/ -r
Output:
[██████████████████████████████████████████████████] 45/45 | Complete
======================================================================
Classification Summary
======================================================================
Processed: 45 files in 1.2s (37.5 files/sec)
Errors: 0
Approval Status:
Auto-approved: 32 (71.1%)
Judge-approved: 8 (17.8%)
Deep analysis: 3 (6.7%)
Human review: 2 (4.4%)
Classifications:
guide : 28 (62.2%)
reference : 8 (17.8%)
agent : 5 (11.1%)
command : 3 (6.7%)
adr : 1 (2.2%)
Update Frontmatter Automatically
/classify docs/ -r --update-frontmatter
Adds/updates frontmatter in each file:
---
type: guide
component_type: guide
# ... other frontmatter
---
Generate JSON Report
/classify docs/ -r --output classification-report.json --format json
Dry Run Preview
/classify docs/ -r --dry-run
Shows files that would be processed without making changes.
CLI Reference
| Option | Description |
|---|---|
| `path` | File or directory to classify (required) |
| `-r`, `--recursive` | Process directories recursively |
| `-o`, `--output FILE` | Save results to file |
| `-f`, `--format FORMAT` | Output format: json, csv, summary |
| `--update-frontmatter` | Add/update type in file frontmatter |
| `--dry-run` | Preview without processing |
| `--limit N` | Limit files processed |
| `--exclude PATTERNS` | Comma-separated exclude patterns |
| `-v`, `--verbose` | Detailed logging |
| `-q`, `--quiet` | Suppress progress output |
| `--autonomous` | Iterate until 95%+ confidence achieved |
| `--fix` | Inject content signals (with --autonomous) |
| `--expert` | Use Type Expert agents for deep analysis |
| `--expert-report` | Generate detailed expert reports (with --expert) |
| `--enhance-frontmatter` | V3: Force type declarations on low-confidence files |
| `--threshold N` | Target confidence threshold (default 95) |
| `--type-override TYPE` | V3: Force specific type for all files |
| `--respect-directory` | V3: Use directory path as classification hint |
| `--suggest-enhancements` | V3: Analyze and suggest content improvements |
| `--json` | V3: JSON output for CI/CD integration |
| `--use-embeddings` | V4: Use semantic embeddings (requires venv + sentence-transformers) |
| `--no-embeddings` | V4: Disable embeddings (default, faster) |
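The comma-separated `--exclude` value can be interpreted as a list of glob patterns. The sketch below uses Python's standard `fnmatch` module for this. Whether the classifier matches against the file name only (as here) or the full path is an assumption, and `is_excluded` is a hypothetical name.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def is_excluded(path, patterns):
    """True when the file name matches any comma-separated glob pattern."""
    name = PurePosixPath(path).name  # assumption: match on file name only
    return any(
        fnmatchcase(name, pattern.strip())
        for pattern in patterns.split(",")
        if pattern.strip()
    )


patterns = "test_*,*.draft.md"
for candidate in ("docs/test_api.md", "docs/notes.draft.md", "docs/guide.md"):
    print(candidate, is_excluded(candidate, patterns))
```

`fnmatchcase` is used instead of `fnmatch` so matching is case-sensitive on every platform.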
Related
- moe-content-classification skill - Skill patterns and usage
- moe-content-classifier agent - Agent for deep classification
- MoE Classifier Source - Core classifier implementation
Hook Integration (Known Bug)
PostToolUse auto-classification hook is configured but currently affected by Claude Code bug #15585. Manual /classify works fully.
Author: CODITECT Core Team
Framework: CODITECT v1.7.2
Classifier: MoE v4.0 with Semantic Embeddings + 13 Type Experts (coditect-core)
Version History
| Version | Date | Changes |
|---|---|---|
| v5.0.0 | 2026-01-29 | Bi-lateral TRACK ↔ SKILL mapping (default), automatic track file updates, track-skills index generation |
| v4.0.0 | 2025-12-31 | Semantic embeddings (--use-embeddings), venv detection, confidence blending |
| v3.0.0 | 2025-12-29 | Enhanced frontmatter, respect-directory, type-override, suggest-enhancements |
| v2.1.0 | 2025-12-28 | 13 Type Experts, expert mode, autonomous mode |
| v2.0.0 | 2025-12-20 | Full MoE pipeline with judges and deep analysis |
| v1.0.0 | 2025-12-15 | Initial release with basic classification |
Success Output
When classification completes successfully:
✅ COMMAND COMPLETE: /classify
Processed: N files in X.Xs
Auto-approved: M (Y%)
Confidence threshold: 95%+
Completion Checklist
Before marking complete:
- All target files classified
- Frontmatter updated (if --update-frontmatter)
- Results summary displayed
- No unhandled errors
Failure Indicators
This command has FAILED if:
- ❌ Python environment not activated
- ❌ Script not found at expected path
- ❌ Target path doesn't exist
- ❌ Permission denied on file writes
When NOT to Use
Do NOT use when:
- Files are actively being edited (may conflict)
- Running batch job on production docs without --dry-run first
- No Python venv available (embeddings mode)
Anti-Patterns (Avoid)
| Anti-Pattern | Problem | Solution |
|---|---|---|
| Skip --dry-run | Unexpected changes | Always preview first |
| Force override all | Lose valid classifications | Use selectively |
| Ignore low confidence | Poor discoverability | Use --enhance-frontmatter |
Principles
This command embodies:
- #1 Recycle, Extend, Re-Use - Enhances existing documents
- #3 Complete Execution - Full MoE pipeline
- #9 Based on Facts - Data-driven classification
Full Standard: CODITECT-STANDARD-AUTOMATION.md