/classify - Full MoE Document Classification

Classify documents using the complete 4-phase Mixture of Experts (MoE) pipeline. Supports recursive directory analysis, automatic frontmatter updates, and comprehensive reporting.

Usage

# Classify current directory recursively (with automatic track mapping)
/classify . -r

# Classify a single file
/classify path/to/file.md

# Classify directory recursively
/classify docs/ -r

# Dry run (show what would happen without processing)
/classify docs/ -r --dry-run

# Update frontmatter with classifications
/classify docs/ -r --update-frontmatter

# Skip track mapping (default is enabled)
/classify docs/ -r --skip-track-mappings

# Save results to JSON file
/classify docs/ -r --output results.json

# Save results as CSV
/classify docs/ -r --output results.csv --format csv

# Limit processing (useful for testing)
/classify docs/ -r --limit 20

# Exclude patterns
/classify docs/ -r --exclude "test_*,*.draft.md"

# Verbose output with detailed logs
/classify docs/ -r -v

# Quiet mode (suppress progress)
/classify docs/ -r -q --output results.json

### Autonomous Mode

# Autonomous classification - iterates until 95%+ confidence
/classify docs/ -r --autonomous

# Autonomous with content signal injection
/classify docs/ -r --autonomous --fix

# Dry run autonomous mode (preview changes)
/classify docs/ -r --autonomous --fix --dry-run

### Expert Mode (NEW v2.1)

# Use Type Expert agents for deep semantic analysis
/classify docs/ -r --expert

# Expert mode with detailed reports for each file
/classify docs/ -r --expert --expert-report

# Expert mode with JSON output
/classify docs/ -r --expert -o expert-analysis.json

# Combine expert and autonomous modes
/classify docs/ -r --expert --autonomous --fix

### V4 Semantic Embeddings (NEW)

# Use semantic embeddings for improved accuracy on low-confidence files
# Requires: source .venv/bin/activate && pip install sentence-transformers
/classify docs/ -r --use-embeddings

# Default: no embeddings (faster, recommended for batch operations)
/classify docs/ -r --no-embeddings

### V3 Enhanced Modes

# Enhance frontmatter to force threshold confidence
/classify docs/ -r --enhance-frontmatter

# With custom threshold (default 95%)
/classify docs/ -r --enhance-frontmatter --threshold 90

# Use directory path as classification hint
/classify docs/workflows/ -r --respect-directory

# Force specific type for batch corrections
/classify docs/workflows/ -r --type-override workflow

# Get content enhancement suggestions
/classify docs/ -r --suggest-enhancements

# JSON output for CI/CD integration
/classify docs/ -r --json -o results.json

# Combine v3 modes for maximum accuracy
/classify docs/ -r --enhance-frontmatter --respect-directory --suggest-enhancements

Bi-Lateral TRACK ↔ SKILL Mapping (V5 Default)

The classifier now automatically maintains bi-lateral mappings between TRACKS and SKILLS for the CODITECT Experience Framework and Coditect Semantic Skill LLM Translator MCP.

What It Does

When classifying skills, the classifier automatically:

  1. Scans all skills for cef_track frontmatter field
  2. Builds track-to-skill mappings from skill declarations
  3. Updates track files with associated_skills frontmatter
  4. Generates track-skills index at internal/project/track-skills-index.md
  5. Validates consistency between skill → track and track → skill mappings

Skill Frontmatter (Source of Truth)

Skills declare their track membership:

---
name: my-skill
component_type: skill
cef_track: G-1 # Track G, Subtrack 1
tags:
- automation
---

Track Frontmatter (Auto-Updated)

Track files automatically list their associated skills:

---
track: G
track_name: DMS Product
associated_skills:
- my-skill
- another-skill
- third-skill
track_mapping_updated: 2026-01-29
---

Track Mapping Output

After classification, you'll see:

📊 Track Mappings Updated:
Tracks: 9 | Skills: 375
Track files updated: 7
✅ Mappings consistent

Skip Track Mapping

To disable automatic track mapping (not recommended):

/classify skills/ -r --skip-track-mappings

Track-Skills Index

The generated index file contains:

  • Complete track-to-skill mappings
  • Skill-to-track reverse lookups
  • Statistics and validation results
  • Searchable markdown tables

Location: internal/project/track-skills-index.md


V3 Enhanced Frontmatter Mode

When --enhance-frontmatter flag is used, the classifier:

  1. Runs full MoE analysis to determine document type
  2. Checks confidence against threshold (default 95%)
  3. Enhances low-confidence files by adding explicit type declarations
  4. Forces threshold confidence via moe_confidence field

Key difference from --update-frontmatter:

  • --update-frontmatter: Records actual content confidence
  • --enhance-frontmatter: Forces threshold with explicit type declarations
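A hedged sketch of that difference (the field names `type`, `component_type`, and `moe_confidence` appear in this document; the exact write-back logic is an assumption):

```python
def frontmatter_patch(doc_type: str, confidence: float,
                      threshold: float = 0.95, enhance: bool = False) -> dict:
    """Illustrate what each mode writes back into a file's frontmatter."""
    if enhance and confidence < threshold:
        # --enhance-frontmatter: add explicit type declarations and
        # force the threshold confidence
        return {"type": doc_type, "component_type": doc_type,
                "moe_confidence": threshold}
    # --update-frontmatter: record the confidence the MoE pipeline
    # actually measured
    return {"type": doc_type, "moe_confidence": round(confidence, 3)}
```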

V3 Results (coditect-core docs):

Before: 86.3% at ≥95% (81/95 files)
After: 100.0% at ≥95% (95/95 files)
Enhanced: 82 files with explicit type/component_type fields

Semantic Embeddings (--use-embeddings)

V4 Feature: Uses sentence-transformers to compute semantic similarity between documents and exemplar types, providing confidence boosts when embeddings agree with classification.

When to Use --use-embeddings

| Scenario | Recommendation |
|---|---|
| Files with confidence < 60% | Use embeddings - may boost by 5-15% |
| New/unusual document types | Use embeddings - semantic analysis helps |
| Quality-critical classifications | Use embeddings - extra validation layer |
| Research/debugging | Use embeddings - understand classification reasoning |
| Batch processing (100+ files) | Skip embeddings - 2.6x slower |
| Well-structured docs with frontmatter | Skip embeddings - base MoE is accurate |
| CI/CD pipelines | Skip embeddings - speed matters |
| Development iteration | Skip embeddings - faster feedback |

Requirements

# Embeddings require sentence-transformers in the venv
source .venv/bin/activate
pip install sentence-transformers

# Then run with embeddings
python scripts/moe_classifier/classify.py docs/ --use-embeddings

Performance Impact

| Mode | Speed | Accuracy | Memory |
|---|---|---|---|
| Without Embeddings | 55 files/sec | 100%* | Low |
| With Embeddings | 21 files/sec | 100%* | +500MB |

*On well-structured documents with proper frontmatter

How Embeddings Enhance Classification

  1. Pre-classification hints: Embedding similarity suggests type before MoE runs
  2. Confidence blending: When embedding agrees, confidence = (70% base + 30% embedding)
  3. Only boosts, never reduces: Blending only applies if it improves confidence
  4. Memory integration: Historical patterns from org.db (ADR-118 Tier 2) inform classification
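Points 2 and 3 translate directly into a blend rule. A minimal sketch (the 70/30 split and boost-only behavior are stated above; the function name is hypothetical):

```python
def blend_confidence(base: float, embedding_sim: float,
                     base_weight: float = 0.70) -> float:
    """Blend base MoE confidence with embedding similarity, boost-only."""
    blended = base_weight * base + (1.0 - base_weight) * embedding_sim
    # "Only boosts, never reduces": fall back to base if blending would lower it
    return max(base, blended)
```

For a 55% base score with 90% embedding agreement this yields 0.7 * 0.55 + 0.3 * 0.90 = 0.655; a 90% base score with only 40% embedding similarity stays at 0.90.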

Directory-Aware Classification (--respect-directory)

Uses directory path as a classification hint when MoE confidence is low:

| Directory Pattern | Implied Type |
|---|---|
| workflows/ | workflow |
| guides/, getting-started/, training/ | guide |
| reference/, architecture/ | reference |
| adrs/ | adr |
| agents/ | agent |
| commands/ | command |
| skills/ | skill |
| hooks/ | hook |

Only overrides when MoE confidence < threshold AND directory suggests different type.
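That override rule can be sketched as follows (the pattern-to-type pairs come from the mapping above; `DIRECTORY_HINTS` and the function name are assumed):

```python
# Directory pattern -> implied type, condensed from the mapping above
DIRECTORY_HINTS = {
    "workflows": "workflow", "guides": "guide", "getting-started": "guide",
    "training": "guide", "reference": "reference", "architecture": "reference",
    "adrs": "adr", "agents": "agent", "commands": "command",
    "skills": "skill", "hooks": "hook",
}

def apply_directory_hint(path: str, moe_type: str, confidence: float,
                         threshold: float = 0.95) -> str:
    """Override only when confidence < threshold AND the directory disagrees."""
    parts = path.replace("\\", "/").split("/")
    hinted = next((DIRECTORY_HINTS[p] for p in parts if p in DIRECTORY_HINTS), None)
    if hinted and hinted != moe_type and confidence < threshold:
        return hinted
    return moe_type
```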

Autonomous Mode

When --autonomous flag is used, the classifier:

  1. Deep-analyzes each document to understand its true purpose
  2. Fixes misclassified types in frontmatter
  3. Injects content signals (with --fix) to boost confidence
  4. Iterates until 95-100% confidence achieved
  5. Amplifies signals progressively if no improvement
  6. Forces full signal set at iteration 5 for guaranteed 100%
  7. Verifies with MoE judge panel

Target: 100% confidence, 0% human review

Iteration Strategy

| Iteration | Action | Expected Gain |
|---|---|---|
| 1 | Add missing required sections | +15-25% |
| 2 | Add type-specific content patterns | +10-15% |
| 3 | Enhance frontmatter metadata | +5-10% |
| 4 | Add cross-references and links | +3-5% |
| 5 | Force FULL signal set | → 100% |
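The overall loop can be sketched with the classifier and per-iteration enhancer passed in as callables (both stand-ins for the real pipeline; this is not the actual implementation):

```python
def autonomous_classify(doc, classify, enhance_for_iteration,
                        target=0.95, max_iters=5):
    """Iterate until the target confidence is reached or signals are exhausted."""
    for iteration in range(1, max_iters + 1):
        doc_type, confidence = classify(doc)
        if confidence >= target:
            return doc_type, confidence, iteration
        # Iterations 1-4 inject progressively stronger signals;
        # iteration 5 forces the full signal set
        doc = enhance_for_iteration(doc, iteration)
    return classify(doc) + (max_iters,)
```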

Expert Mode (Phase 2)

When --expert flag is used, the classifier invokes Type Expert agents for deep semantic analysis:

  1. Analyzes vote disagreements - Identifies which analysts disagree and why
  2. Runs Type Expert analysis - Specialized agents for each document type (guide, reference, workflow, agent, command, adr, skill)
  3. Generates targeted enhancements - Contextually-appropriate content improvements (not generic templates)
  4. Provides reasoning audit trail - Full transparency on classification decisions

Type Expert Agents

| Expert | Document Type | Key Signals |
|---|---|---|
| GuideExpert | Guides | Prerequisites, steps, troubleshooting |
| ReferenceExpert | References | API docs, schemas, configuration tables |
| WorkflowExpert | Workflows | Phases, Mermaid diagrams, checklists |
| AgentExpert | Agents | Persona ("You are..."), capabilities, tools |
| CommandExpert | Commands | /invocation, parameters, usage |
| ADRExpert | ADRs | Context, decision, consequences, status |
| SkillExpert | Skills | When to use, patterns, input/output specs |
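An expert's `is_this_type`/`conf` verdict (visible in the audit trail below) could be approximated by signal matching. A toy sketch only: the signal strings are condensed from the signals listed above, and the 0.5 cutoff is an arbitrary assumption:

```python
# Key signals per document type, condensed from the expert signal lists
EXPERT_SIGNALS = {
    "guide": ["## Prerequisites", "## Step", "## Troubleshooting"],
    "adr": ["## Context", "## Decision", "## Consequences"],
    "workflow": ["## Phase", "sequenceDiagram", "graph TD"],
}

def expert_opinion(doc_type: str, content: str) -> tuple[bool, float]:
    """Return (is_this_type, confidence) based on signal hit rate."""
    signals = EXPERT_SIGNALS.get(doc_type, [])
    if not signals:
        return False, 0.0
    hits = sum(signal in content for signal in signals)
    confidence = hits / len(signals)
    return confidence >= 0.5, confidence
```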

Expert Mode Flow

Expert Output

{
  "recommended_type": "reference",
  "confidence": 0.89,
  "reasoning": "Expert 'reference' strongly confirms (conf=0.85); Expert agrees with analyst majority",
  "enhancements": [
    {
      "signal_type": "api_reference",
      "reason": "Reference docs need API documentation",
      "priority": 1
    }
  ],
  "audit_trail": [
    "Starting coordination for: docs/ARCHITECTURE-OVERVIEW.md",
    "Vote analysis: reference (3/5)",
    "Expert 'reference' says is_this_type=True, conf=0.85",
    "Selected type: reference"
  ]
}

System Prompt

⚠️ EXECUTION DIRECTIVE: When the user invokes /classify, you MUST:

  1. IMMEDIATELY execute the full MoE classifier
  2. ALWAYS show full output with progress and results
  3. ALWAYS provide summary with classification metrics

DO NOT:

  • Ask for confirmation - the user invoking the command IS the confirmation
  • Skip execution even if it seems redundant

You are executing the full MoE document classifier for CODITECT.

Execution Command (ALWAYS USE WRAPPER):

~/.coditect/scripts/classify.sh <path> [options]

The wrapper script (classify.sh) automatically:

  1. Activates the Python virtual environment
  2. Ensures sentence-transformers embeddings are available
  3. Runs the classifier with all options

Alternative (manual venv activation):

source ~/.coditect/.venv/bin/activate && python3 ~/.coditect/scripts/moe_classifier/classify.py <path> [options]

Parse user arguments and map to CLI options:

| User Says | CLI Option |
|---|---|
| -r, --recursive | -r |
| --dry-run | --dry-run |
| --update-frontmatter | --update-frontmatter |
| --output FILE | -o FILE |
| --format json/csv/summary | -f FORMAT |
| --limit N | --limit N |
| --exclude PATTERNS | --exclude PATTERNS |
| -v, --verbose | -v |
| -q, --quiet | -q |

4-Phase MoE Pipeline

┌─────────────────────────────────────────────────────────────────┐
│                   MoE Classification Pipeline                   │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  Phase 1: ANALYSTS (6 specialists)                              │
│  ┌─────────┬──────────┬─────────┬─────────┬─────────┬─────────┐ │
│  │Metadata │Structural│ Content │ Pattern │Semantic │  Path   │ │
│  │Analyst  │ Analyst  │ Analyst │ Analyst │ Analyst │ Analyst │ │
│  └────┬────┴────┬─────┴────┬────┴────┬────┴────┬────┴────┬────┘ │
│       │         │          │         │         │         │      │
│       └─────────┴──────────┴────┬────┴─────────┴─────────┘      │
│                                 │                               │
│  Phase 2: CONSENSUS             ▼                               │
│  ┌────────────────────────────────────────────────────────────┐ │
│  │  Weighted voting, confidence calculation, type selection   │ │
│  └────────────────────────────────────────────────────────────┘ │
│                                 │                               │
│                                 ▼                               │
│  Phase 3: JUDGES (if confidence < 85%)                          │
│  ┌─────────────┬─────────────┬─────────────┐                    │
│  │ Consistency │   Quality   │   Domain    │                    │
│  │    Judge    │    Judge    │    Judge    │                    │
│  └──────┬──────┴──────┬──────┴──────┬──────┘                    │
│         └─────────────┼─────────────┘                           │
│                       │                                         │
│  Phase 4: DEEP ANALYSIS (if judges disagree)                    │
│  ┌────────────────────────────────────────────────────────────┐ │
│  │ SemanticSimilarity, Contextual, ContentReasoning, CrossRef │ │
│  └────────────────────────────────────────────────────────────┘ │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Classification Types

| Type | Description | Key Signals |
|---|---|---|
| agent | AI agent definitions | type: agent, "You are a...", system prompts |
| command | Slash commands | invocation: /xxx, command patterns |
| skill | Reusable patterns | SKILL.md, ## When to Use This Skill |
| guide | User tutorials | ## Step, ## Prerequisites, ## Troubleshooting |
| reference | API/specs | Tables, ## API, ## Configuration |
| workflow | Process definitions | ## Phase, sequenceDiagram, graph TD |
| adr | Architecture decisions | ## Context, ## Decision, ## Consequences |

Approval Types

| Type | Confidence | Meaning |
|---|---|---|
| AUTO_APPROVED | ≥ 85% | High confidence, no review needed |
| JUDGE_APPROVED | 65-84% | Validated by consistency/quality/domain judges |
| DEEP_ANALYSIS_APPROVED | 45-64% | Required semantic deep analysis |
| HUMAN_REVIEW_REQUIRED | < 45% | Low confidence, needs manual review |
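These tiers map mechanically onto the final confidence score; a one-function sketch of the thresholds above (the function name is hypothetical):

```python
def approval_status(confidence: float) -> str:
    """Map final confidence to the approval tiers defined above."""
    if confidence >= 0.85:
        return "AUTO_APPROVED"
    if confidence >= 0.65:
        return "JUDGE_APPROVED"
    if confidence >= 0.45:
        return "DEEP_ANALYSIS_APPROVED"
    return "HUMAN_REVIEW_REQUIRED"
```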

Examples

Analyze Entire Docs Directory

/classify docs/ -r

Output:

[██████████████████████████████████████████████████] 45/45 | Complete

======================================================================
Classification Summary
======================================================================

Processed: 45 files in 1.2s (37.5 files/sec)
Errors: 0

Approval Status:
Auto-approved: 32 (71.1%)
Judge-approved: 8 (17.8%)
Deep analysis: 3 (6.7%)
Human review: 2 (4.4%)

Classifications:
guide : 28 (62.2%)
reference : 8 (17.8%)
agent : 5 (11.1%)
command : 3 (6.7%)
adr : 1 (2.2%)

Update Frontmatter Automatically

/classify docs/ -r --update-frontmatter

Adds/updates frontmatter in each file:

---
type: guide
component_type: guide
# ... other frontmatter
---

Generate JSON Report

/classify docs/ -r --output classification-report.json --format json

Dry Run Preview

/classify docs/ -r --dry-run

Shows files that would be processed without making changes.

CLI Reference

| Option | Description |
|---|---|
| path | File or directory to classify (required) |
| -r, --recursive | Process directories recursively |
| -o, --output FILE | Save results to file |
| -f, --format FORMAT | Output format: json, csv, summary |
| --update-frontmatter | Add/update type in file frontmatter |
| --dry-run | Preview without processing |
| --limit N | Limit files processed |
| --exclude PATTERNS | Comma-separated exclude patterns |
| -v, --verbose | Detailed logging |
| -q, --quiet | Suppress progress output |
| --autonomous | Iterate until 95%+ confidence achieved |
| --fix | Inject content signals (with --autonomous) |
| --expert | Use Type Expert agents for deep analysis |
| --expert-report | Generate detailed expert reports (with --expert) |
| --enhance-frontmatter | V3: Force type declarations on low-confidence files |
| --threshold N | Target confidence threshold (default 95) |
| --type-override TYPE | V3: Force specific type for all files |
| --respect-directory | V3: Use directory path as classification hint |
| --suggest-enhancements | V3: Analyze and suggest content improvements |
| --json | V3: JSON output for CI/CD integration |
| --use-embeddings | V4: Use semantic embeddings (requires venv + sentence-transformers) |
| --no-embeddings | V4: Disable embeddings (default, faster) |
| --skip-track-mappings | V5: Skip automatic TRACK ↔ SKILL mapping (enabled by default) |

Hook Integration (Known Bug)

PostToolUse auto-classification hook is configured but currently affected by Claude Code bug #15585. Manual /classify works fully.


Author: CODITECT Core Team | Framework: CODITECT v1.7.2 | Classifier: MoE v4.0 with Semantic Embeddings + 13 Type Experts (coditect-core)

Version History

| Version | Date | Changes |
|---|---|---|
| v5.0.0 | 2026-01-29 | Bi-lateral TRACK ↔ SKILL mapping (default), automatic track file updates, track-skills index generation |
| v4.0.0 | 2025-12-31 | Semantic embeddings (--use-embeddings), venv detection, confidence blending |
| v3.0.0 | 2025-12-29 | Enhanced frontmatter, respect-directory, type-override, suggest-enhancements |
| v2.1.0 | 2025-12-28 | 13 Type Experts, expert mode, autonomous mode |
| v2.0.0 | 2025-12-20 | Full MoE pipeline with judges and deep analysis |
| v1.0.0 | 2025-12-15 | Initial release with basic classification |

Success Output

When classification completes successfully:

✅ COMMAND COMPLETE: /classify
Processed: N files in X.Xs
Auto-approved: M (Y%)
Confidence threshold: 95%+

Completion Checklist

Before marking complete:

  • All target files classified
  • Frontmatter updated (if --update-frontmatter)
  • Results summary displayed
  • No unhandled errors

Failure Indicators

This command has FAILED if:

  • ❌ Python environment not activated
  • ❌ Script not found at expected path
  • ❌ Target path doesn't exist
  • ❌ Permission denied on file writes

When NOT to Use

Do NOT use when:

  • Files are actively being edited (may conflict)
  • Running batch job on production docs without --dry-run first
  • No Python venv available (embeddings mode)

Anti-Patterns (Avoid)

| Anti-Pattern | Problem | Solution |
|---|---|---|
| Skip --dry-run | Unexpected changes | Always preview first |
| Force override all | Lose valid classifications | Use selectively |
| Ignore low confidence | Poor discoverability | Use --enhance-frontmatter |

Principles

This command embodies:

  • #1 Recycle, Extend, Re-Use - Enhances existing documents
  • #3 Complete Execution - Full MoE pipeline
  • #9 Based on Facts - Data-driven classification

Full Standard: CODITECT-STANDARD-AUTOMATION.md