/classify - Full MoE Document Classification

Classify documents using the complete 4-phase Mixture of Experts (MoE) pipeline. Supports recursive directory analysis, automatic frontmatter updates, and comprehensive reporting.

Usage

# Classify current directory recursively (with automatic track mapping)
/classify . -r

# Classify a single file
/classify path/to/file.md

# Classify directory recursively
/classify docs/ -r

# Dry run (show what would happen without processing)
/classify docs/ -r --dry-run

# Update frontmatter with classifications
/classify docs/ -r --update-frontmatter

# Skip track mapping (default is enabled)
/classify docs/ -r --skip-track-mappings

# Save results to JSON file
/classify docs/ -r --output results.json

# Save results as CSV
/classify docs/ -r --output results.csv --format csv

# Limit processing (useful for testing)
/classify docs/ -r --limit 20

# Exclude patterns
/classify docs/ -r --exclude "test_*,*.draft.md"

# Verbose output with detailed logs
/classify docs/ -r -v

# Quiet mode (suppress progress)
/classify docs/ -r -q --output results.json

### Autonomous Mode

# Autonomous classification - iterates until 95%+ confidence
/classify docs/ -r --autonomous

# Autonomous with content signal injection
/classify docs/ -r --autonomous --fix

# Dry run autonomous mode (preview changes)
/classify docs/ -r --autonomous --fix --dry-run

### Expert Mode (NEW v2.1)

# Use Type Expert agents for deep semantic analysis
/classify docs/ -r --expert

# Expert mode with detailed reports for each file
/classify docs/ -r --expert --expert-report

# Expert mode with JSON output
/classify docs/ -r --expert -o expert-analysis.json

# Combine expert and autonomous modes
/classify docs/ -r --expert --autonomous --fix

### V4 Semantic Embeddings (NEW)

# Use semantic embeddings for improved accuracy on low-confidence files
# Requires: source .venv/bin/activate && pip install sentence-transformers
/classify docs/ -r --use-embeddings

# Default: no embeddings (faster, recommended for batch operations)
/classify docs/ -r --no-embeddings

### V3 Enhanced Modes

# Enhance frontmatter to force threshold confidence
/classify docs/ -r --enhance-frontmatter

# With custom threshold (default 95%)
/classify docs/ -r --enhance-frontmatter --threshold 90

# Use directory path as classification hint
/classify docs/workflows/ -r --respect-directory

# Force specific type for batch corrections
/classify docs/workflows/ -r --type-override workflow

# Get content enhancement suggestions
/classify docs/ -r --suggest-enhancements

# JSON output for CI/CD integration
/classify docs/ -r --json -o results.json

# Combine v3 modes for maximum accuracy
/classify docs/ -r --enhance-frontmatter --respect-directory --suggest-enhancements

Bi-Lateral TRACK ↔ SKILL Mapping (V5 Default)

The classifier now automatically maintains bi-lateral mappings between TRACKS and SKILLS for the CODITECT Experience Framework and Coditect Semantic Skill LLM Translator MCP.

What It Does

When classifying skills, the classifier automatically:

  1. Scans all skills for cef_track frontmatter field
  2. Builds track-to-skill mappings from skill declarations
  3. Updates track files with associated_skills frontmatter
  4. Generates track-skills index at internal/project/track-skills-index.md
  5. Validates consistency between skill → track and track → skill mappings

Skill Frontmatter (Source of Truth)

Skills declare their track membership:

---
name: my-skill
component_type: skill
cef_track: G-1 # Track G, Subtrack 1
tags:
- automation
---

Track Frontmatter (Auto-Updated)

Track files automatically list their associated skills:

---
track: G
track_name: DMS Product
associated_skills:
- my-skill
- another-skill
- third-skill
track_mapping_updated: 2026-01-29
---

Track Mapping Output

After classification, you'll see:

📊 Track Mappings Updated:
Tracks: 9 | Skills: 375
Track files updated: 7
✅ Mappings consistent

Skip Track Mapping

To disable automatic track mapping (not recommended):

/classify skills/ -r --skip-track-mappings

Track-Skills Index

The generated index file contains:

  • Complete track-to-skill mappings
  • Skill-to-track reverse lookups
  • Statistics and validation results
  • Searchable markdown tables

Location: internal/project/track-skills-index.md


V3 Enhanced Frontmatter Mode

When --enhance-frontmatter flag is used, the classifier:

  1. Runs full MoE analysis to determine document type
  2. Checks confidence against threshold (default 95%)
  3. Enhances low-confidence files by adding explicit type declarations
  4. Forces threshold confidence via moe_confidence field

Key difference from --update-frontmatter:

  • --update-frontmatter: Records actual content confidence
  • --enhance-frontmatter: Forces threshold with explicit type declarations
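A hedged sketch of that difference (the field names `type`, `component_type`, and `moe_confidence` appear in this document; the exact write-back logic is an assumption):

```python
def frontmatter_patch(doc_type: str, confidence: float,
                      threshold: float = 0.95, enhance: bool = False) -> dict:
    """Illustrate what each mode writes back into a file's frontmatter."""
    if enhance and confidence < threshold:
        # --enhance-frontmatter: add explicit type declarations and
        # force the threshold confidence
        return {"type": doc_type, "component_type": doc_type,
                "moe_confidence": threshold}
    # --update-frontmatter: record the confidence the MoE pipeline
    # actually measured
    return {"type": doc_type, "moe_confidence": round(confidence, 3)}
```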

V3 Results (coditect-core docs):

Before: 86.3% at ≥95% (81/95 files)
After: 100.0% at ≥95% (95/95 files)
Enhanced: 82 files with explicit type/component_type fields

Semantic Embeddings (--use-embeddings)

V4 Feature: Uses sentence-transformers to compute semantic similarity between documents and exemplar types, providing confidence boosts when embeddings agree with classification.

When to Use --use-embeddings

| Scenario | Recommendation |
|---|---|
| Files with confidence < 60% | Use embeddings - may boost by 5-15% |
| New/unusual document types | Use embeddings - semantic analysis helps |
| Quality-critical classifications | Use embeddings - extra validation layer |
| Research/debugging | Use embeddings - understand classification reasoning |
| Batch processing (100+ files) | Skip embeddings - 2.6x slower |
| Well-structured docs with frontmatter | Skip embeddings - base MoE is accurate |
| CI/CD pipelines | Skip embeddings - speed matters |
| Development iteration | Skip embeddings - faster feedback |

Requirements

# Embeddings require sentence-transformers in the venv
source .venv/bin/activate
pip install sentence-transformers

# Then run with embeddings
python scripts/moe_classifier/classify.py docs/ --use-embeddings

Performance Impact

| Mode | Speed | Accuracy | Memory |
|---|---|---|---|
| Without Embeddings | 55 files/sec | 100%* | Low |
| With Embeddings | 21 files/sec | 100%* | +500MB |

*On well-structured documents with proper frontmatter

How Embeddings Enhance Classification

  1. Pre-classification hints: Embedding similarity suggests type before MoE runs
  2. Confidence blending: When embedding agrees, confidence = (70% base + 30% embedding)
  3. Only boosts, never reduces: Blending only applies if it improves confidence
  4. Memory integration: Historical patterns from org.db (ADR-118 Tier 2) inform classification
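Points 2 and 3 translate directly into a blend rule. A minimal sketch (the 70/30 split and boost-only behavior are stated above; the function name is hypothetical):

```python
def blend_confidence(base: float, embedding_sim: float,
                     base_weight: float = 0.70) -> float:
    """Blend base MoE confidence with embedding similarity, boost-only."""
    blended = base_weight * base + (1.0 - base_weight) * embedding_sim
    # "Only boosts, never reduces": fall back to base if blending would lower it
    return max(base, blended)
```

For a 55% base score with 90% embedding agreement this yields 0.7 * 0.55 + 0.3 * 0.90 = 0.655; a 90% base score with only 40% embedding similarity stays at 0.90.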

Directory-Aware Classification (--respect-directory)

Uses directory path as a classification hint when MoE confidence is low:

| Directory Pattern | Implied Type |
|---|---|
| workflows/ | workflow |
| guides/, getting-started/, training/ | guide |
| reference/, architecture/ | reference |
| adrs/ | adr |
| agents/ | agent |
| commands/ | command |
| skills/ | skill |
| hooks/ | hook |

Only overrides when MoE confidence < threshold AND directory suggests different type.
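That override rule can be sketched as follows (the pattern-to-type pairs come from the mapping above; `DIRECTORY_HINTS` and the function name are assumed):

```python
# Directory pattern -> implied type, condensed from the mapping above
DIRECTORY_HINTS = {
    "workflows": "workflow", "guides": "guide", "getting-started": "guide",
    "training": "guide", "reference": "reference", "architecture": "reference",
    "adrs": "adr", "agents": "agent", "commands": "command",
    "skills": "skill", "hooks": "hook",
}

def apply_directory_hint(path: str, moe_type: str, confidence: float,
                         threshold: float = 0.95) -> str:
    """Override only when confidence < threshold AND the directory disagrees."""
    parts = path.replace("\\", "/").split("/")
    hinted = next((DIRECTORY_HINTS[p] for p in parts if p in DIRECTORY_HINTS), None)
    if hinted and hinted != moe_type and confidence < threshold:
        return hinted
    return moe_type
```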

Autonomous Mode

When --autonomous flag is used, the classifier:

  1. Deep-analyzes each document to understand its true purpose
  2. Fixes misclassified types in frontmatter
  3. Injects content signals (with --fix) to boost confidence
  4. Iterates until 95-100% confidence achieved
  5. Amplifies signals progressively if no improvement
  6. Forces full signal set at iteration 5 for guaranteed 100%
  7. Verifies with MoE judge panel

Target: 100% confidence, 0% human review

Iteration Strategy

| Iteration | Action | Expected Gain |
|---|---|---|
| 1 | Add missing required sections | +15-25% |
| 2 | Add type-specific content patterns | +10-15% |
| 3 | Enhance frontmatter metadata | +5-10% |
| 4 | Add cross-references and links | +3-5% |
| 5 | Force FULL signal set | → 100% |
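The overall loop can be sketched with the classifier and per-iteration enhancer passed in as callables (both stand-ins for the real pipeline; this is not the actual implementation):

```python
def autonomous_classify(doc, classify, enhance_for_iteration,
                        target=0.95, max_iters=5):
    """Iterate until the target confidence is reached or signals are exhausted."""
    for iteration in range(1, max_iters + 1):
        doc_type, confidence = classify(doc)
        if confidence >= target:
            return doc_type, confidence, iteration
        # Iterations 1-4 inject progressively stronger signals;
        # iteration 5 forces the full signal set
        doc = enhance_for_iteration(doc, iteration)
    return classify(doc) + (max_iters,)
```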

Expert Mode (Phase 2)

When --expert flag is used, the classifier invokes Type Expert agents for deep semantic analysis:

  1. Analyzes vote disagreements - Identifies which analysts disagree and why
  2. Runs Type Expert analysis - Specialized agents for each document type (guide, reference, workflow, agent, command, adr, skill)
  3. Generates targeted enhancements - Contextually-appropriate content improvements (not generic templates)
  4. Provides reasoning audit trail - Full transparency on classification decisions

Type Expert Agents

| Expert | Document Type | Key Signals |
|---|---|---|
| GuideExpert | Guides | Prerequisites, steps, troubleshooting |
| ReferenceExpert | References | API docs, schemas, configuration tables |
| WorkflowExpert | Workflows | Phases, Mermaid diagrams, checklists |
| AgentExpert | Agents | Persona ("You are..."), capabilities, tools |
| CommandExpert | Commands | /invocation, parameters, usage |
| ADRExpert | ADRs | Context, decision, consequences, status |
| SkillExpert | Skills | When to use, patterns, input/output specs |
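An expert's `is_this_type`/`conf` verdict (visible in the audit trail below) could be approximated by signal matching. A toy sketch only: the signal strings are condensed from the signals listed above, and the 0.5 cutoff is an arbitrary assumption:

```python
# Key signals per document type, condensed from the expert signal lists
EXPERT_SIGNALS = {
    "guide": ["## Prerequisites", "## Step", "## Troubleshooting"],
    "adr": ["## Context", "## Decision", "## Consequences"],
    "workflow": ["## Phase", "sequenceDiagram", "graph TD"],
}

def expert_opinion(doc_type: str, content: str) -> tuple[bool, float]:
    """Return (is_this_type, confidence) based on signal hit rate."""
    signals = EXPERT_SIGNALS.get(doc_type, [])
    if not signals:
        return False, 0.0
    hits = sum(signal in content for signal in signals)
    confidence = hits / len(signals)
    return confidence >= 0.5, confidence
```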

Expert Mode Flow

Expert Output

{
  "recommended_type": "reference",
  "confidence": 0.89,
  "reasoning": "Expert 'reference' strongly confirms (conf=0.85); Expert agrees with analyst majority",
  "enhancements": [
    {
      "signal_type": "api_reference",
      "reason": "Reference docs need API documentation",
      "priority": 1
    }
  ],
  "audit_trail": [
    "Starting coordination for: docs/ARCHITECTURE-OVERVIEW.md",
    "Vote analysis: reference (3/5)",
    "Expert 'reference' says is_this_type=True, conf=0.85",
    "Selected type: reference"
  ]
}

System Prompt

⚠️ EXECUTION DIRECTIVE: When the user invokes /classify, you MUST:

  1. IMMEDIATELY execute the full MoE classifier
  2. ALWAYS show full output with progress and results
  3. ALWAYS provide summary with classification metrics

DO NOT:

  • Ask for confirmation - the user invoking the command IS the confirmation
  • Skip execution even if it seems redundant

You are executing the full MoE document classifier for CODITECT.

Execution Command (ALWAYS USE WRAPPER):

~/.coditect/scripts/classify.sh <path> [options]

The wrapper script (classify.sh) automatically:

  1. Activates the Python virtual environment
  2. Ensures sentence-transformers embeddings are available
  3. Runs the classifier with all options

Alternative (manual venv activation):

source ~/.coditect/.venv/bin/activate && python3 ~/.coditect/scripts/moe_classifier/classify.py <path> [options]

Parse user arguments and map to CLI options:

| User Says | CLI Option |
|---|---|
| -r, --recursive | -r |
| --dry-run | --dry-run |
| --update-frontmatter | --update-frontmatter |
| --output FILE | -o FILE |
| --format json/csv/summary | -f FORMAT |
| --limit N | --limit N |
| --exclude PATTERNS | --exclude PATTERNS |
| -v, --verbose | -v |
| -q, --quiet | -q |

4-Phase MoE Pipeline

┌─────────────────────────────────────────────────────────────────┐
│                   MoE Classification Pipeline                   │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  Phase 1: ANALYSTS (6 specialists)                              │
│  ┌─────────┬──────────┬─────────┬─────────┬─────────┬─────────┐ │
│  │Metadata │Structural│ Content │ Pattern │Semantic │  Path   │ │
│  │Analyst  │ Analyst  │ Analyst │ Analyst │ Analyst │ Analyst │ │
│  └────┬────┴────┬─────┴────┬────┴────┬────┴────┬────┴────┬────┘ │
│       │         │          │         │         │         │      │
│       └─────────┴──────────┴────┬────┴─────────┴─────────┘      │
│                                 │                               │
│  Phase 2: CONSENSUS             ▼                               │
│  ┌────────────────────────────────────────────────────────────┐ │
│  │  Weighted voting, confidence calculation, type selection   │ │
│  └────────────────────────────────────────────────────────────┘ │
│                                 │                               │
│                                 ▼                               │
│  Phase 3: JUDGES (if confidence < 85%)                          │
│  ┌─────────────┬─────────────┬─────────────┐                    │
│  │ Consistency │   Quality   │   Domain    │                    │
│  │    Judge    │    Judge    │    Judge    │                    │
│  └──────┬──────┴──────┬──────┴──────┬──────┘                    │
│         └─────────────┼─────────────┘                           │
│                       │                                         │
│  Phase 4: DEEP ANALYSIS (if judges disagree)                    │
│  ┌────────────────────────────────────────────────────────────┐ │
│  │ SemanticSimilarity, Contextual, ContentReasoning, CrossRef │ │
│  └────────────────────────────────────────────────────────────┘ │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Classification Types

| Type | Description | Key Signals |
|---|---|---|
| agent | AI agent definitions | type: agent, "You are a...", system prompts |
| command | Slash commands | invocation: /xxx, command patterns |
| skill | Reusable patterns | SKILL.md, ## When to Use This Skill |
| guide | User tutorials | ## Step, ## Prerequisites, ## Troubleshooting |
| reference | API/specs | Tables, ## API, ## Configuration |
| workflow | Process definitions | ## Phase, sequenceDiagram, graph TD |
| adr | Architecture decisions | ## Context, ## Decision, ## Consequences |

Approval Types

| Type | Confidence | Meaning |
|---|---|---|
| AUTO_APPROVED | ≥ 85% | High confidence, no review needed |
| JUDGE_APPROVED | 65-84% | Validated by consistency/quality/domain judges |
| DEEP_ANALYSIS_APPROVED | 45-64% | Required semantic deep analysis |
| HUMAN_REVIEW_REQUIRED | < 45% | Low confidence, needs manual review |
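These tiers map mechanically onto the final confidence score; a one-function sketch of the thresholds above (the function name is hypothetical):

```python
def approval_status(confidence: float) -> str:
    """Map final confidence to the approval tiers defined above."""
    if confidence >= 0.85:
        return "AUTO_APPROVED"
    if confidence >= 0.65:
        return "JUDGE_APPROVED"
    if confidence >= 0.45:
        return "DEEP_ANALYSIS_APPROVED"
    return "HUMAN_REVIEW_REQUIRED"
```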

Examples

Analyze Entire Docs Directory

/classify docs/ -r

Output:

[██████████████████████████████████████████████████] 45/45 | Complete

======================================================================
Classification Summary
======================================================================

Processed: 45 files in 1.2s (37.5 files/sec)
Errors: 0

Approval Status:
Auto-approved: 32 (71.1%)
Judge-approved: 8 (17.8%)
Deep analysis: 3 (6.7%)
Human review: 2 (4.4%)

Classifications:
guide : 28 (62.2%)
reference : 8 (17.8%)
agent : 5 (11.1%)
command : 3 (6.7%)
adr : 1 (2.2%)

Update Frontmatter Automatically

/classify docs/ -r --update-frontmatter

Adds/updates frontmatter in each file:

---
type: guide
component_type: guide
# ... other frontmatter
---

Generate JSON Report

/classify docs/ -r --output classification-report.json --format json

Dry Run Preview

/classify docs/ -r --dry-run

Shows files that would be processed without making changes.

CLI Reference

| Option | Description |
|---|---|
| path | File or directory to classify (required) |
| -r, --recursive | Process directories recursively |
| -o, --output FILE | Save results to file |
| -f, --format FORMAT | Output format: json, csv, summary |
| --update-frontmatter | Add/update type in file frontmatter |
| --dry-run | Preview without processing |
| --limit N | Limit files processed |
| --exclude PATTERNS | Comma-separated exclude patterns |
| -v, --verbose | Detailed logging |
| -q, --quiet | Suppress progress output |
| --autonomous | Iterate until 95%+ confidence achieved |
| --fix | Inject content signals (with --autonomous) |
| --expert | Use Type Expert agents for deep analysis |
| --expert-report | Generate detailed expert reports (with --expert) |
| --enhance-frontmatter | V3: Force type declarations on low-confidence files |
| --threshold N | Target confidence threshold (default 95) |
| --type-override TYPE | V3: Force specific type for all files |
| --respect-directory | V3: Use directory path as classification hint |
| --suggest-enhancements | V3: Analyze and suggest content improvements |
| --json | V3: JSON output for CI/CD integration |
| --use-embeddings | V4: Use semantic embeddings (requires venv + sentence-transformers) |
| --no-embeddings | V4: Disable embeddings (default, faster) |
| --skip-track-mappings | V5: Skip automatic TRACK ↔ SKILL mapping (enabled by default) |

Hook Integration (Known Bug)

PostToolUse auto-classification hook is configured but currently affected by Claude Code bug #15585. Manual /classify works fully.


Author: CODITECT Core Team | Framework: CODITECT v1.7.2 | Classifier: MoE v4.0 with Semantic Embeddings + 13 Type Experts (coditect-core)

Version History

| Version | Date | Changes |
|---|---|---|
| v5.0.0 | 2026-01-29 | Bi-lateral TRACK ↔ SKILL mapping (default), automatic track file updates, track-skills index generation |
| v4.0.0 | 2025-12-31 | Semantic embeddings (--use-embeddings), venv detection, confidence blending |
| v3.0.0 | 2025-12-29 | Enhanced frontmatter, respect-directory, type-override, suggest-enhancements |
| v2.1.0 | 2025-12-28 | 13 Type Experts, expert mode, autonomous mode |
| v2.0.0 | 2025-12-20 | Full MoE pipeline with judges and deep analysis |
| v1.0.0 | 2025-12-15 | Initial release with basic classification |

Success Output

When classification completes successfully:

✅ COMMAND COMPLETE: /classify
Processed: N files in X.Xs
Auto-approved: M (Y%)
Confidence threshold: 95%+

Completion Checklist

Before marking complete:

  • All target files classified
  • Frontmatter updated (if --update-frontmatter)
  • Results summary displayed
  • No unhandled errors

Failure Indicators

This command has FAILED if:

  • ❌ Python environment not activated
  • ❌ Script not found at expected path
  • ❌ Target path doesn't exist
  • ❌ Permission denied on file writes

When NOT to Use

Do NOT use when:

  • Files are actively being edited (may conflict)
  • Running batch job on production docs without --dry-run first
  • No Python venv available (embeddings mode)

Anti-Patterns (Avoid)

| Anti-Pattern | Problem | Solution |
|---|---|---|
| Skip --dry-run | Unexpected changes | Always preview first |
| Force override all | Lose valid classifications | Use selectively |
| Ignore low confidence | Poor discoverability | Use --enhance-frontmatter |

Principles

This command embodies:

  • #1 Recycle, Extend, Re-Use - Enhances existing documents
  • #3 Complete Execution - Full MoE pipeline
  • #9 Based on Facts - Data-driven classification

Full Standard: CODITECT-STANDARD-AUTOMATION.md