Document Classification Patterns

Pattern definitions, signal weights, and frontmatter templates for the CODITECT MoE document classification system.

When to Use This Skill

Use this skill when:

Classifying new documents into CODITECT taxonomy
Creating new documents with proper frontmatter
Understanding classification signals and weights
Debugging classification results

Classification Categories

Per ADR-018 and ADR-023, CODITECT documents are classified into these categories:

Type	Description	Key Signals
`agent`	AI agent definitions	`type: agent`, "You are a...", system prompts
`command`	Slash commands	`invocation: /xxx`, command usage patterns
`skill`	Reusable patterns	SKILL.md, `## When to Use This Skill`
`guide`	User tutorials	Step-by-step, troubleshooting sections
`reference`	API/architecture docs	Tables, specifications, configuration
`workflow`	Process definitions	Phases, sequenceDiagram, automation steps
`adr`	Architecture decisions	Context/Decision/Consequences sections

Signal Patterns

Agent Signals

Frontmatter (weight: 0.6):

type: agent
component_type: agent

Content (weight: 0.3):

"You are a" / "You are an"
"## Capabilities"
"## When to Use"

Path (weight: 0.2):

agents/
/agents/

Title (weight: 0.15):

*-specialist.md
*-expert.md
*-agent.md
*-orchestrator.md
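
The four agent signal lists above can be checked mechanically. The following is a minimal sketch, not the CODITECT implementation: the function names `agent_signals` and `agent_score` are hypothetical, and the combination rule (sum the weights of the signal types that fire, capped at 1.0) is an assumption, since this document gives the weights but not how they are combined.

```python
import re
from fnmatch import fnmatchcase

# Weights copied from the Agent Signals lists above.
AGENT_WEIGHTS = {"frontmatter": 0.6, "content": 0.3, "path": 0.2, "title": 0.15}
AGENT_TITLE_GLOBS = ("*-specialist.md", "*-expert.md", "*-agent.md", "*-orchestrator.md")
AGENT_CONTENT_RE = re.compile(r"You are an? |^## Capabilities|^## When to Use", re.MULTILINE)


def agent_signals(path: str, frontmatter: dict, body: str) -> dict:
    """Report which of the four agent signal types fire for one document."""
    filename = path.rsplit("/", 1)[-1]
    return {
        "frontmatter": frontmatter.get("type") == "agent"
        or frontmatter.get("component_type") == "agent",
        "content": bool(AGENT_CONTENT_RE.search(body)),
        "path": "agents/" in path,
        "title": any(fnmatchcase(filename, glob) for glob in AGENT_TITLE_GLOBS),
    }


def agent_score(path: str, frontmatter: dict, body: str) -> float:
    """ASSUMPTION: add the weights of the fired signal types and cap at 1.0."""
    fired = agent_signals(path, frontmatter, body)
    total = sum(AGENT_WEIGHTS[name] for name, hit in fired.items() if hit)
    return min(1.0, round(total, 2))
```

Because the four weights add up to 1.25, some cap or normalization is needed to keep the score inside 0.0-1.0; the cap used here is only one option.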

Command Signals

Frontmatter (weight: 0.6):

type: command
component_type: command
invocation: /command-name

Content (weight: 0.3):

"invocation:"
"## Usage"
"## System Prompt"

Path (weight: 0.2):

commands/
/commands/

Skill Signals

Frontmatter (weight: 0.6):

type: skill
component_type: skill

Content (weight: 0.3):

"## When to Use This Skill"
"## Pattern"
SKILL.md filename

Path (weight: 0.2):

skills/
/skills/

Guide Signals

Frontmatter (weight: 0.6):

type: guide
component_type: guide

Content (weight: 0.3):

"## Step 1" / "## Step"
"## Prerequisites"
"## Troubleshooting"
"## Quick Start"

Path (weight: 0.2):

guides/
/guides/
docs/

Title (weight: 0.15):

*-guide.md
*-tutorial.md
*-quickstart.md

Reference Signals

Frontmatter (weight: 0.6):

type: reference
component_type: reference

Content (weight: 0.3):

"## API"
"## Configuration"
"| Parameter |"
"| Field |"

Path (weight: 0.2):

reference/
/reference/
docs/reference/

Title (weight: 0.15):

*-reference.md
*-api.md
*-spec.md

Workflow Signals

Frontmatter (weight: 0.6):

type: workflow
component_type: workflow

Content (weight: 0.3):

"## Phase"
"## Workflow"
"sequenceDiagram"
"graph TD"

Path (weight: 0.2):

workflows/
/workflows/

Title (weight: 0.15):

*-workflow.md
*-pipeline.md

ADR Signals

Frontmatter (weight: 0.6):

type: adr
component_type: adr
doc_type: adr
adr_number: 23

Content (weight: 0.3):

"## Context"
"## Decision"
"## Consequences"
"Status: Accepted"

Path (weight: 0.2):

adrs/
/adrs/
architecture/

Title (weight: 0.15):

ADR-*.md
adr-*.md

Confidence Thresholds

Confidence	Approval Type	Action
≥ 0.85	AUTO_APPROVED	Classified automatically
0.65 to < 0.85	JUDGE_APPROVED	Validated by consistency judge
0.45 to < 0.65	DEEP_ANALYSIS_APPROVED	Required deep semantic analysis
< 0.45	HUMAN_REVIEW_REQUIRED	Flagged for manual review
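
The threshold table maps directly onto a small routing function. A minimal sketch, assuming the bands are half-open so that every score in 0.0-1.0 lands in exactly one band; `approval_type` is a hypothetical name.

```python
def approval_type(confidence: float) -> str:
    """Map a confidence score to the approval type from the threshold table."""
    # NaN fails this comparison too, so it is rejected along with out-of-range values.
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence must be within 0.0-1.0, got {confidence!r}")
    if confidence >= 0.85:
        return "AUTO_APPROVED"
    if confidence >= 0.65:
        return "JUDGE_APPROVED"
    if confidence >= 0.45:
        return "DEEP_ANALYSIS_APPROVED"
    return "HUMAN_REVIEW_REQUIRED"
```

Checking the bands from the top down means each comparison only needs a lower bound, which avoids gaps between adjacent ranges.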

Frontmatter Templates

Agent Template

---
title: Agent Name Specialist
type: agent
component_type: agent
version: 1.0.0
audience: contributor
status: active
summary: Brief description of agent purpose
keywords:
- domain
- specialization
tokens: ~1500
created: 'YYYY-MM-DD'
updated: 'YYYY-MM-DD'
tags:
- agent
- domain
---

Command Template

---
title: /command-name - Brief Description
component_type: command
version: 1.0.0
invocation: /command-name [args]
audience: customer
status: active
summary: What this command does
keywords:
- automation
tokens: ~1200
created: 'YYYY-MM-DD'
updated: 'YYYY-MM-DD'
command_name: /command-name
aliases: []
usage: /command-name [options]
requires_confirmation: false
modifies_files: true
network_access: false
type: command
tags:
- command
---

Guide Template

---
title: Topic Name Guide
type: guide
component_type: guide
version: 1.0.0
audience: customer
status: active
summary: Brief description of what this guide covers
keywords:
- topic
- tutorial
tokens: ~2000
created: 'YYYY-MM-DD'
updated: 'YYYY-MM-DD'
tags:
- guide
- tutorial
---

ADR Template

---
title: 'ADR-XXX: Decision Title'
component_type: adr
type: adr
version: 1.0.0
audience: contributor
status: accepted
summary: Brief summary of the decision
keywords:
- architecture
- decision
tokens: ~2500
created: 'YYYY-MM-DD'
updated: 'YYYY-MM-DD'
doc_type: adr
adr_number: XXX
deciders:
- Name
categories:
- architecture
supersedes: []
superseded_by: null
when_to_read: When working on related topics
prerequisites: []
related_adrs: []
tags:
- architecture
- adr
---
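
A quick way to check a document against these templates is to compare its frontmatter keys with the keys each template contains. The sketch below assumes the frontmatter has already been parsed into a dict and treats every key shown in a template as expected; whether all of them are strictly required is an assumption. `missing_fields` is a hypothetical helper.

```python
# Keys that appear in all four templates above.
COMMON_FIELDS = (
    "title", "type", "component_type", "version", "audience", "status",
    "summary", "keywords", "tokens", "created", "updated", "tags",
)

# Additional keys that appear only in the Command and ADR templates.
EXTRA_FIELDS = {
    "command": (
        "invocation", "command_name", "aliases", "usage",
        "requires_confirmation", "modifies_files", "network_access",
    ),
    "adr": (
        "doc_type", "adr_number", "deciders", "categories", "supersedes",
        "superseded_by", "when_to_read", "prerequisites", "related_adrs",
    ),
}


def missing_fields(frontmatter: dict) -> list:
    """Return the template keys absent from the frontmatter, sorted by name."""
    extras = EXTRA_FIELDS.get(frontmatter.get("type"), ())
    expected = set(COMMON_FIELDS) | set(extras)
    return sorted(expected - set(frontmatter))
```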

Success Output

When successful, this skill MUST output:

✅ SKILL COMPLETE: document-classification-patterns

Completed:
- [x] Document type classified: {type}
- [x] Confidence score: {score}
- [x] Approval type: {AUTO_APPROVED|JUDGE_APPROVED|DEEP_ANALYSIS_APPROVED|HUMAN_REVIEW_REQUIRED}
- [x] Frontmatter updated with classification metadata
- [x] Signal weights applied: frontmatter (0.6), content (0.3), path (0.2), title (0.15)

Outputs:
- Document: {file_path}
- Type: {agent|command|skill|guide|reference|workflow|adr}
- Confidence: {0.0-1.0}
- Approval: {approval_type}
- Keywords: {auto_generated_keywords}
- Tags: {auto_generated_tags}

Completion Checklist

Before marking this skill as complete, verify:

Document type classified into one of 7 categories (agent, command, skill, guide, reference, workflow, adr)
Confidence score calculated (0.0-1.0) based on weighted signals
Approval type determined based on confidence threshold
Frontmatter added/updated with type, component_type, moe_confidence, moe_classified
Keywords extracted and added to frontmatter
Tags generated based on document type and content
Classification date stamped (YYYY-MM-DD format)
Signal patterns matched (frontmatter, content, path, title)
Low confidence (<0.45) flagged for human review
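
The frontmatter items in this checklist can be applied in one step. A minimal sketch, assuming the frontmatter is a parsed dict and that `moe_classified` holds the classification date; the checklist names the field but does not define its contents, so that reading is an assumption. `stamp_classification` is a hypothetical name.

```python
from datetime import date


def stamp_classification(frontmatter: dict, doc_type: str, confidence: float, today=None) -> dict:
    """Return a copy of the frontmatter with classification metadata added."""
    stamped = dict(frontmatter)  # copy so existing metadata is never mutated in place
    stamped.update(
        type=doc_type,
        component_type=doc_type,
        moe_confidence=round(confidence, 2),
        # ASSUMPTION: moe_classified stores the classification date as YYYY-MM-DD.
        moe_classified=(today or date.today()).isoformat(),
    )
    return stamped
```

Working on a copy makes it easy to compare the stamped result with the original before writing anything back to disk.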

Failure Indicators

This skill has FAILED if:

❌ Confidence score is NaN or falls outside the 0.0-1.0 range
❌ Document type is None or invalid (not in 7 categories)
❌ Frontmatter update fails or corrupts existing metadata
❌ Signal detection returns no matches (all weights = 0)
❌ Classification date format invalid
❌ Approval type does not match the threshold band for the confidence score
❌ Multiple equally strong type signals create ambiguity without resolution
❌ File path or content encoding issues prevent analysis
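
Several of these failure indicators can be caught with a simple validation pass before results are written. The sketch below covers only the type, confidence, and date checks; `validation_errors` and its error messages are hypothetical.

```python
import math
from datetime import datetime

VALID_TYPES = {"agent", "command", "skill", "guide", "reference", "workflow", "adr"}


def validation_errors(doc_type, confidence, classified_on) -> list:
    """Return a list of failure messages; an empty list means the result looks valid."""
    errors = []
    if doc_type not in VALID_TYPES:
        errors.append("document type is not one of the 7 categories")
    if (
        not isinstance(confidence, (int, float))
        or math.isnan(confidence)
        or not 0.0 <= confidence <= 1.0
    ):
        errors.append("confidence is NaN or outside 0.0-1.0")
    try:
        datetime.strptime(classified_on, "%Y-%m-%d")
    except (TypeError, ValueError):
        errors.append("classification date is not in YYYY-MM-DD format")
    return errors
```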

When NOT to Use

Do NOT use this skill when:

The document already has valid frontmatter with `type` and a high `moe_confidence` (>0.85)
The file is not a Markdown document (no .md extension)
The document is an auto-generated build artifact (no persistent classification needed)
The file is a template or example rather than an actual component
The file is binary or otherwise non-text (images, PDFs, executables)
The file is a temporary scratch file or session log
The documentation is external to the CODITECT framework

Use alternatives:

Manual frontmatter: For templates and examples
Skip classification: For build artifacts and temporary files
External tool: For non-Markdown documentation formats
Batch reclassification: When a change in standards requires re-analysis
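
Two of the exclusions above are easy to check before classification runs. A minimal sketch covering only the file-extension and existing-confidence rules; the other exclusions (build artifacts, templates, scratch files) depend on project conventions this document does not specify. `should_skip` is a hypothetical name.

```python
def should_skip(path: str, frontmatter: dict) -> bool:
    """Return True when the document should not be (re)classified."""
    # Non-Markdown files are out of scope for this skill.
    if not path.lower().endswith(".md"):
        return True
    # Keep an existing classification that has a type and confidence above 0.85.
    has_type = bool(frontmatter.get("type"))
    confidence = frontmatter.get("moe_confidence")
    return has_type and isinstance(confidence, (int, float)) and confidence > 0.85
```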

Anti-Patterns (Avoid)

Anti-Pattern	Problem	Solution
Overwriting existing high-confidence classification	Loss of validated metadata	Check existing `moe_confidence` before reclassifying
Relying solely on path signals	Misclassifies moved/reorganized files	Use multi-signal weighted approach
Ignoring frontmatter conflicts	Inconsistent metadata	Validate against existing `type` and `component_type`
Auto-approving low confidence	Poor quality classifications	Enforce threshold gates (≥0.85 for AUTO_APPROVED)
Missing content signal patterns	Incomplete classification	Check every signal type defined for the category (three or four, depending on the category)
Generic keyword extraction	Low-value metadata	Extract domain-specific keywords from content
No reclassification on content change	Stale classifications	Track content hash and retrigger on change
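
The last row's content-hash approach can be sketched with the standard library. This assumes a hash of the document body is stored somewhere alongside the classification; where it is stored is not specified here, and both function names are hypothetical.

```python
import hashlib


def content_hash(body: str) -> str:
    """Return a SHA-256 hex digest of the document body."""
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def needs_reclassification(body: str, stored_hash) -> bool:
    """True when no hash was recorded yet or the body changed since it was recorded."""
    return stored_hash is None or content_hash(body) != stored_hash
```

Hashing only the body, and not the frontmatter, avoids retriggering classification every time the classifier itself updates the metadata.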

Principles

This skill embodies the following CODITECT principles:

#5 Eliminate Ambiguity - Multi-signal weighted classification with confidence scoring
#6 Clear, Understandable, Explainable - Transparent signal weights and threshold logic
#8 No Assumptions - Explicit confidence scoring triggers human review for low confidence
Trust & Transparency - Classification metadata fully exposed in frontmatter
Factual Grounding - Signal detection based on concrete patterns (keywords, paths, structure)
MoE Architecture - Confidence thresholds route to appropriate approval paths (auto/judge/deep/human)

Version: 1.1.0 | Created: 2025-12-28 | Updated: 2026-01-04 | Author: CODITECT Core Team | Framework: CODITECT v1.7.2
