
How to Create a New Skill: Step-by-Step Guide

**Time Required:** 45-90 minutes
**Difficulty:** Intermediate to Advanced
**Prerequisites:** Understanding of progressive disclosure, basic YAML knowledge
**Output:** Production-ready skill with an opinionated best-practice workflow
**Authority:** Anthropic Skill Authoring Best Practices (January 2026)


Overview

This guide walks you through creating a production-ready skill following CODITECT standards and Anthropic's progressive disclosure architecture. Skills bundle instructions, scripts, and resources that Claude loads dynamically.

What is a Skill? (Anthropic Definition)

A skill is an opinionated best practice workflow for a recurring activity with specific examples, self-validation steps, and actionable guidance.

Essential Characteristics:

| Characteristic | Description |
|---|---|
| Opinionated | Prescribes THE way to do something, not multiple options |
| Best Practice | Encodes proven patterns and expertise |
| Workflow | Step-by-step process with checkpoints |
| Recurring Activity | Solves problems encountered repeatedly |
| Specific Examples | Concrete input/output pairs |
| Self-Validating | Built-in verification and quality gates |

What You'll Build: A complete skill with:

  • ✅ YAML frontmatter (MANDATORY per Anthropic spec)
  • ✅ 3-level progressive disclosure (metadata, instructions, resources)
  • ✅ Under 500 lines for SKILL.md body (Anthropic Jan 2026)
  • ✅ Opinionated workflow with quality gates
  • ✅ Specific examples with expected output
  • ✅ Self-validation checklists
  • ✅ Integration with agents and commands
  • ✅ Executable scripts (optional but powerful)

Key Insight: Skills use progressive disclosure to bundle UNLIMITED content with near-zero token cost for unused resources.


Step 1: Define Skill Purpose (10 minutes)

1.1 What Problem Does This Solve?

Write one sentence describing the core capability:

Example: "Extract structured data from PDFs including text, tables, and images while preserving layout"

1.2 When Should Someone Use This Skill?

List 3-5 specific scenarios:

  • Scenario 1: Extracting financial data from PDF reports
  • Scenario 2: Converting research papers to markdown
  • Scenario 3: Parsing invoices for accounting automation
  • Scenario 4: Migrating legacy PDF documentation to structured formats

1.3 What Makes This Reusable?

Skills should be:

  • Generalizable: Works across multiple use cases
  • Self-contained: Doesn't depend on project-specific code
  • Well-documented: Instructions clear enough for Claude to follow
  • Testable: Can be validated with examples

1.4 Estimate Content Size

Level 2 (Instructions):

  • Keep under 5000 tokens (~3000-4000 words)
  • If larger, split into Level 3 resources

Level 3 (Resources):

  • No limit - only loaded when referenced
  • Scripts run via Bash (only output uses tokens)

Step 2: Create Skill Directory (5 minutes)

2.1 Name Selection

Naming Convention (Anthropic Jan 2026):

Use gerund form (verb + -ing) for skill names. This clearly describes the activity or capability the skill provides.

Format: {verb-ing}-{object} (preferred) or {object}-{verb-ing}

Rules:

  • Lowercase with hyphens (kebab-case)
  • Maximum 64 characters
  • Cannot contain "anthropic" or "claude"
  • Descriptive and searchable
  • Avoid generic terms

Preferred (Gerund Form):

  • processing-pdfs
  • migrating-databases
  • reviewing-code
  • generating-documentation
  • testing-apis

Acceptable Alternatives:

  • Noun phrases: pdf-processing, database-migration
  • Action-oriented: process-pdfs, migrate-databases

Avoid:

  • pdfSkill (camelCase)
  • pdf_extract (underscore)
  • skill1 (too generic)
  • helper, utils, tools (too vague)
  • anthropic-pdf-tool (reserved word)
  • the-super-advanced-pdf-extraction-skill (too long)
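
These rules are easy to check mechanically before committing. A minimal sketch using only the Python standard library (the function name `is_valid_skill_name` is illustrative, not part of any official tooling):

```python
import re

def is_valid_skill_name(name: str) -> bool:
    """Check a skill name against the rules above:
    kebab-case, max 64 chars, no reserved words."""
    if len(name) > 64:
        return False
    if "anthropic" in name or "claude" in name:
        return False
    # Lowercase alphanumeric words separated by single hyphens
    return re.fullmatch(r"[a-z0-9]+(-[a-z0-9]+)*", name) is not None

print(is_valid_skill_name("processing-pdfs"))     # True
print(is_valid_skill_name("pdfSkill"))            # False (camelCase)
print(is_valid_skill_name("anthropic-pdf-tool"))  # False (reserved word)
```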

2.2 Create Directory Structure

cd /path/to/coditect-core
mkdir -p .coditect/skills/pdf-extraction

2.3 Decide on Progressive Disclosure Levels

Minimal (2 levels):

pdf-extraction/
└── SKILL.md # Metadata + Instructions

Standard (3 levels):

pdf-extraction/
├── SKILL.md # Metadata + Instructions
├── FORMS.md # Templates (Level 3)
└── REFERENCE.md # Detailed docs (Level 3)

Advanced (3 levels + scripts):

pdf-extraction/
├── SKILL.md # Metadata + Instructions
├── FORMS.md # Templates
├── REFERENCE.md # API reference
└── scripts/ # Executable utilities
    ├── extract_text.py
    └── parse_tables.py

Step 3: Write YAML Frontmatter (CRITICAL - 10 minutes)

3.1 MANDATORY YAML Format

⚠️ CRITICAL: Skills MUST use YAML frontmatter, NOT Markdown headers.

WRONG (Will NOT work):

# PDF Extraction Skill

**Skill Name:** pdf-extraction
**Description:** Extracts data from PDFs

CORRECT (Required format):

---
name: pdf-extraction
description: Extract structured data from PDF files including text, tables, and images while preserving layout and formatting
---

3.2 Required Fields

---
name: extracting-pdf-data
description: Extracts structured data from PDF files including text, tables, and images while preserving layout and formatting. Use when processing PDF documents, converting PDFs to other formats, or when the user mentions PDF extraction.
---

Field Specifications:

| Field | Required | Max Length | Validation |
|---|---|---|---|
| name | YES | 64 chars | Lowercase, hyphens only, no "anthropic" or "claude" |
| description | YES | 1024 chars | Non-empty, no XML tags, THIRD PERSON, highly descriptive |

3.3 Description Writing (Anthropic Jan 2026)

⚠️ CRITICAL: Always write in THIRD PERSON.

The description is injected into the system prompt. Inconsistent point-of-view causes discovery problems.

| ✅ Good (Third Person) | ❌ Avoid |
|---|---|
| "Extracts text from PDFs" | "I can help you extract text" |
| "Processes financial reports" | "You can use this to process reports" |
| "Generates unit tests" | "This skill helps with testing" |

Include BOTH what the skill does AND when to use it:

description: Extracts text and tables from PDF files, fills forms, merges documents. Use when working with PDF files or when the user mentions PDFs, forms, or document extraction.

Why good:

  • Third person throughout ("Extracts", not "I extract")
  • Lists specific capabilities: "text", "tables", "forms"
  • Includes trigger phrases: "when working with PDF files", "when the user mentions"
  • Claude can select from 100+ skills based on this description

Poor Description:

description: Helps with PDF files

Why poor:

  • Too vague
  • No specific capabilities
  • No trigger phrases
  • Claude won't know when to select this skill
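
A rough lint pass can catch most of these description problems automatically. A heuristic sketch (the checks and the `lint_description` name are illustrative, not an official validator):

```python
import re

def lint_description(desc: str) -> list:
    """Flag common description problems described above. Heuristic only."""
    problems = []
    if not desc.strip():
        problems.append("empty")
    if len(desc) > 1024:
        problems.append("over 1024 characters")
    if re.search(r"<[^>]+>", desc):
        problems.append("contains XML/HTML tags")
    if re.search(r"\b(I|me|my|you|your)\b", desc):
        problems.append("not third person")
    if "use when" not in desc.lower():
        problems.append("no 'Use when...' trigger phrase")
    return problems

good = ("Extracts text and tables from PDF files. "
        "Use when working with PDF files or when the user mentions PDFs.")
print(lint_description(good))                    # []
print(lint_description("Helps with PDF files"))  # flags the missing trigger phrase
```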

Step 4: Write Instructions (Level 2 - 20 minutes)

4.1 Structure Template

After YAML frontmatter, organize instructions like this:

---
name: pdf-extraction
description: [...]
---

# PDF Extraction

## Purpose

This skill provides comprehensive PDF data extraction capabilities for converting PDF documents into structured formats. Use this when you need to extract text, parse tables, retrieve images, or analyze PDF metadata.

## When to Use

**Use this skill for:**
- Extracting text from PDF reports or documents
- Parsing financial tables from PDF statements
- Retrieving images from PDF presentations
- Converting PDF documentation to markdown
- Analyzing PDF metadata and structure

**Do NOT use for:**
- Creating PDFs (use `pdf-generation` skill instead)
- Editing existing PDFs (use `pdf-editor` skill)
- OCR on scanned PDFs (use `ocr-processing` skill first)

## Prerequisites

- PDF file accessible on filesystem
- Python 3.8+ with pypdf2 library (for script-based extraction)
- Read permission on PDF files

## Instructions

### Step 1: Identify PDF Type

Determine the PDF structure to choose extraction method:

**Text-based PDF:**
- Searchable text
- Copy/paste works
- Use: Direct text extraction

**Scanned PDF:**
- Images of text
- Not searchable
- Use: OCR preprocessing required (see `ocr-processing` skill)

**Hybrid PDF:**
- Mix of text and images
- Use: Combined extraction approach

### Step 2: Extract Text Content

**Method 1: Direct Text Extraction (Recommended)**

```bash
# Use scripts/extract_text.py
python3 scripts/extract_text.py input.pdf output.txt
```

**Method 2: Manual Extraction**

```bash
# Pull raw text strings directly (crude fallback; does not preserve layout)
strings input.pdf | grep -v "^%" > output.txt
```

Output Format:

  • Plain text with preserved paragraphs
  • Section headers identified
  • Metadata preserved in comments

Step 3: Parse Tables

Detection:

  • Identify table boundaries by alignment patterns
  • Detect column separators
  • Extract headers

Extraction:

# Use scripts/parse_tables.py
python3 scripts/parse_tables.py input.pdf --format csv

Output Formats:

  • CSV: table_1.csv, table_2.csv, ...
  • JSON: Structured data with headers
  • Markdown: Pipe-formatted tables
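
The Markdown output format can be derived from the CSV output. A minimal sketch of that conversion using only the standard library (not one of the skill's bundled scripts):

```python
import csv
import io

def csv_to_markdown(csv_text: str) -> str:
    """Convert CSV table output into a pipe-formatted Markdown table."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, *data = rows
    lines = ["| " + " | ".join(header) + " |",
             "|" + "---|" * len(header)]
    lines += ["| " + " | ".join(row) + " |" for row in data]
    return "\n".join(lines)

print(csv_to_markdown("Header1,Header2\nValue1,Value2"))
```

For `"Header1,Header2\nValue1,Value2"` this emits a two-column pipe table with a `|---|---|` separator row.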

Step 4: Extract Images

Process:

  1. Scan PDF for embedded images
  2. Extract image data
  3. Save to numbered files

# Use scripts/extract_images.py
python3 scripts/extract_images.py input.pdf --output-dir images/

Output:

images/
├── page_1_image_1.png
├── page_1_image_2.jpg
├── page_2_image_1.png
└── ...

Step 5: Metadata Extraction

Extract PDF metadata for analysis:

# Use scripts/get_metadata.py
python3 scripts/get_metadata.py input.pdf

Metadata Includes:

  • Title, Author, Subject
  • Creation date, Modification date
  • PDF version, Page count
  • Security settings

Output Formats

Text Output

# Extracted from: document.pdf
# Pages: 1-50
# Date: 2025-12-03

[Document text here with preserved formatting]

Table Output (CSV)

Header1,Header2,Header3
Value1,Value2,Value3
Value4,Value5,Value6

Image Inventory

```json
{
  "images": [
    {
      "page": 1,
      "index": 1,
      "filename": "page_1_image_1.png",
      "format": "PNG",
      "size": "1920x1080"
    }
  ]
}
```
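
An inventory like this can be reconstructed from the `page_N_image_M.EXT` naming scheme alone. A sketch (sizes omitted since they require opening each file; `build_inventory` is illustrative, not a bundled script):

```python
import json
import re

def build_inventory(filenames):
    """Build an image inventory dict from page_N_image_M.EXT filenames."""
    images = []
    for name in filenames:
        m = re.match(r"page_(\d+)_image_(\d+)\.(\w+)$", name)
        if m:
            images.append({"page": int(m.group(1)),
                           "index": int(m.group(2)),
                           "filename": name,
                           "format": m.group(3).upper()})
    return json.dumps({"images": images}, indent=2)

print(build_inventory(["page_1_image_1.png", "page_2_image_1.jpg"]))
```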

Quality Validation

After extraction, validate results:

Text Extraction:

  • All pages processed
  • Formatting preserved
  • No garbage characters
  • Headers/footers identified

Table Extraction:

  • All tables detected
  • Headers correctly identified
  • Data rows complete
  • Alignment preserved

Image Extraction:

  • All images extracted
  • Original quality maintained
  • File formats correct
  • Naming consistent

Troubleshooting

Issue: Garbled Text

**Cause:** Encoding mismatch or font embedding issues

**Solution:**

  1. Check PDF encoding: `file -i input.pdf` (`file -I` on macOS)
  2. Try alternative extraction: use the `pdftotext` command
  3. Fall back to OCR if needed

Issue: Tables Not Detected

**Cause:** Non-standard table formatting

**Solution:**

  1. Manually identify table regions
  2. Use visual boundary detection
  3. Consider manual extraction for complex tables

Issue: Missing Images

**Cause:** Inline images or compressed format

**Solution:**

  1. Use the `pdfimages` utility as an alternative
  2. Check PDF security settings
  3. Verify extraction permissions

Advanced Usage

For complex extraction scenarios, see the Level 3 resources: FORMS.md and REFERENCE.md.

Integration

Agents:

  • document-processor - Orchestrates multi-document extraction
  • data-pipeline-engineer - Integrates extracted data into pipelines

Commands:

  • /extract-pdf - Interactive PDF extraction
  • /parse-document - General document parsing

Skills:

  • ocr-processing - Preprocess scanned PDFs
  • data-cleaning - Clean extracted text data
  • table-normalization - Standardize extracted tables

### 4.2 Line and Token Budget (Anthropic Jan 2026)

**Line Limit:** Under **500 lines** for SKILL.md body (Anthropic recommendation)
**Token Target:** Under 5000 tokens (~3000-4000 words)

**Check limits:**
```bash
# Line count
wc -l skills/processing-pdfs/SKILL.md

# Token count
python3 scripts/count-tokens.py skills/processing-pdfs/SKILL.md
```

If over limit:

  1. Move examples to EXAMPLES.md
  2. Move API details to REFERENCE.md
  3. Move templates to FORMS.md
  4. Keep only essential workflow in SKILL.md
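
If `scripts/count-tokens.py` is unavailable, a crude character-based estimate is often close enough for budget checks. A heuristic sketch (~4 characters per token for English prose is an assumption, not an exact tokenizer):

```python
def rough_token_count(text: str) -> int:
    """Rough estimate: ~4 characters per token for English prose.
    A heuristic stand-in, NOT the scripts/count-tokens.py referenced above."""
    return max(1, len(text) // 4)

body = "word " * 1000           # ~5000 characters of sample prose
print(rough_token_count(body))  # 1250
print(rough_token_count(body) < 5000)  # True: within the Level 2 budget
```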

4.3 Self-Validation (REQUIRED)

Every skill MUST include quality gates and validation checklists.

Skills are self-validating, meaning they include built-in verification that Claude can use to confirm work is complete.

Workflow Pattern with Quality Gates:

### Phase 1: Preparation

**Steps:**
1. Validate input file exists
2. Check file format is supported

**Quality Gate:**

Phase 1 Complete:

  • Input file accessible
  • Format supported
  • No permission errors

### Phase 2: Processing

**Steps:**
1. Execute main operation
2. Capture results

**Quality Gate:**

Phase 2 Complete:

  • Operation completed without errors
  • Results captured
  • No data loss

### Phase 3: Output

**Validation Checklist:**

Final Verification:

  • All expected outputs generated
  • Output format correct
  • Quality criteria met

Feedback Loop Pattern:

## Feedback Loop

1. **Execute** → Run the operation
2. **Validate** → Check against quality gate
3. **Fix** → Address any failures
4. **Repeat** → Until validation passes
5. **Proceed** → Only move forward when gates pass
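
The feedback loop above can be sketched as a small driver function. A toy illustration (the function names are illustrative; real gates would inspect actual outputs):

```python
def feedback_loop(execute, validate, fix, max_attempts=5):
    """Run the Execute -> Validate -> Fix cycle until the quality gate passes."""
    for attempt in range(1, max_attempts + 1):
        result = execute()
        failures = validate(result)
        if not failures:
            return result  # gate passed; safe to proceed
        fix(failures)
    raise RuntimeError("quality gate still failing after max attempts")

# Toy demonstration: an operation that passes validation on the third run.
state = {"runs": 0}
def execute():
    state["runs"] += 1
    return state["runs"]
def validate(result):
    return [] if result >= 3 else ["not ready"]
def fix(failures):
    pass  # a real fix step would address each listed failure

print(feedback_loop(execute, validate, fix))  # 3
```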

Step 5: Create Level 3 Resources (Optional - 30 minutes)

5.1 FORMS.md - Templates and Schemas

# PDF Extraction Forms and Templates

## Template 1: Financial Report Structure

**Use Case:** Extracting data from standard financial reports

**Expected Structure:**
```json
{
  "report_type": "quarterly_earnings",
  "company": "string",
  "period": "Q1 2025",
  "sections": {
    "executive_summary": "text",
    "financial_statements": "tables",
    "footnotes": "text"
  }
}
```

Extraction Pattern:

  1. Identify section headers
  2. Extract tables from "Financial Statements" section
  3. Preserve footnotes with reference markers

Template 2: Invoice Processing

Use Case: Automated invoice data extraction

Required Fields:

  • Invoice number
  • Date
  • Vendor info
  • Line items (table)
  • Total amount

Extraction Workflow:

# Use template-based extraction
python3 scripts/extract_invoice.py input.pdf --template invoice_standard

[Additional templates...]


### 5.2 REFERENCE.md - Detailed API Documentation

````markdown
# PDF Extraction Scripts Reference

## extract_text.py

### Synopsis

python3 scripts/extract_text.py [OPTIONS] INPUT_PDF OUTPUT_FILE


### Options

| Option | Description | Default |
|--------|-------------|---------|
| `--pages START-END` | Page range to extract | All pages |
| `--encoding CODEC` | Output text encoding | UTF-8 |
| `--preserve-layout` | Maintain spatial layout | False |
| `--include-metadata` | Add metadata header | True |

### Examples

**Extract all pages:**
```bash
python3 scripts/extract_text.py document.pdf output.txt
```

Extract specific pages:

python3 scripts/extract_text.py --pages 10-20 document.pdf output.txt

Preserve layout:

python3 scripts/extract_text.py --preserve-layout report.pdf formatted.txt

Output Format

# Extracted from: document.pdf
# Pages: 1-50
# Extraction Date: 2025-12-03 14:30:00
# Encoding: UTF-8

[Page 1]

[Document content here...]

[Page 2]

[Content continues...]

[Complete API documentation for all scripts...]
````


### 5.3 Scripts Directory

**scripts/extract_text.py:**
```python
#!/usr/bin/env python3
"""
PDF Text Extraction Script

Extracts text content from PDF files with layout preservation options.
Part of the pdf-extraction skill.
"""

import argparse
from datetime import datetime

from PyPDF2 import PdfReader


def extract_text(pdf_path, start_page=1, end_page=None, preserve_layout=False):
    """Extract text from PDF file."""
    # preserve_layout is accepted but not yet used by this extraction path
    reader = PdfReader(pdf_path)
    total_pages = len(reader.pages)

    if end_page is None:
        end_page = total_pages

    extracted_text = []
    extracted_text.append(f"# Extracted from: {pdf_path}")
    extracted_text.append(f"# Pages: {start_page}-{end_page}")
    extracted_text.append(f"# Extraction Date: {datetime.now()}")
    extracted_text.append("")

    for page_num in range(start_page - 1, min(end_page, total_pages)):
        page = reader.pages[page_num]
        text = page.extract_text()

        extracted_text.append(f"[Page {page_num + 1}]")
        extracted_text.append("")
        extracted_text.append(text)
        extracted_text.append("")

    return "\n".join(extracted_text)


def main():
    parser = argparse.ArgumentParser(description="Extract text from PDF files")
    parser.add_argument("input_pdf", help="Input PDF file")
    parser.add_argument("output_file", help="Output text file")
    parser.add_argument("--pages", help="Page range (e.g., 1-10)", default=None)
    parser.add_argument("--preserve-layout", action="store_true", help="Preserve layout")

    args = parser.parse_args()

    # Parse page range if provided
    start_page, end_page = 1, None
    if args.pages:
        parts = args.pages.split("-")
        start_page = int(parts[0])
        end_page = int(parts[1]) if len(parts) > 1 else None

    # Extract text
    text = extract_text(args.input_pdf, start_page, end_page, args.preserve_layout)

    # Write output
    with open(args.output_file, "w", encoding="utf-8") as f:
        f.write(text)

    print(f"✅ Text extracted to {args.output_file}")
    print(f"📄 Pages processed: {start_page}-{end_page or 'end'}")


if __name__ == "__main__":
    main()
```

Make executable:

chmod +x scripts/extract_text.py

Step 6: Validate Skill (10 minutes)

6.1 Automated Validation

python3 .coditect/scripts/validate-skill.py skills/pdf-extraction/

Validation Checks:

  • YAML frontmatter present
  • Required fields: name, description
  • Name matches directory
  • Description under 1024 characters
  • SKILL.md under 5000 tokens (recommended)
  • Scripts have shebang and execute permissions
  • No XML tags in description
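
Several of these checks can be expressed in a few lines. A simplified sketch (NOT the actual `validate-skill.py`; it handles only the flat `key: value` frontmatter shown in this guide):

```python
import re

def check_frontmatter(skill_md: str, dir_name: str) -> list:
    """Run simplified frontmatter checks: YAML block present,
    required fields, name/directory match, description limits."""
    errors = []
    m = re.match(r"---\n(.*?)\n---\n", skill_md, re.DOTALL)
    if not m:
        return ["missing YAML frontmatter"]
    pairs = (line.split(":", 1) for line in m.group(1).splitlines() if ":" in line)
    fields = {k.strip(): v.strip() for k, v in pairs}
    if "name" not in fields or "description" not in fields:
        errors.append("missing required field")
    if fields.get("name") != dir_name:
        errors.append("name does not match directory")
    if len(fields.get("description", "")) > 1024:
        errors.append("description over 1024 characters")
    if re.search(r"<[^>]+>", fields.get("description", "")):
        errors.append("XML tags in description")
    return errors

doc = "---\nname: pdf-extraction\ndescription: Extracts data from PDFs.\n---\n# Body\n"
print(check_frontmatter(doc, "pdf-extraction"))  # []
```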

6.2 Manual Quality Checks

Progressive Disclosure:

  • Level 1 (metadata) optimized for discovery
  • Level 2 (instructions) clear and complete
  • Level 3 (resources) properly referenced
  • Unused resources don't consume tokens

Usability:

  • Instructions can be followed without additional context
  • Examples are concrete and realistic
  • Prerequisites clearly stated
  • Integration points documented

Token Efficiency:

  • SKILL.md body under 5000 tokens
  • Extended content in Level 3 files
  • Scripts minimize output verbosity

6.3 Test Invocation

Via Agent:

Task(subagent_type="document-processor",
     prompt="Use the pdf-extraction skill to extract text from report.pdf")

Via Command:

/extract-pdf report.pdf --format text

Direct Skill Reference:

# In custom agent or prompt
"Refer to the pdf-extraction skill for detailed extraction instructions"

Step 7: Document Integration (10 minutes)

7.1 Create README.md (Optional)

# PDF Extraction Skill

## Quick Start

```bash
# Extract text
python3 scripts/extract_text.py input.pdf output.txt

# Parse tables
python3 scripts/parse_tables.py input.pdf --format csv

# Extract images
python3 scripts/extract_images.py input.pdf --output-dir images/
```

Files

  • SKILL.md - Main skill definition with extraction workflows
  • FORMS.md - Templates for common PDF formats
  • REFERENCE.md - Complete API documentation
  • scripts/ - Executable extraction utilities

Integration

**Agents:** document-processor, data-pipeline-engineer
**Commands:** /extract-pdf, /parse-document
**Skills:** ocr-processing, data-cleaning, table-normalization

Examples

See FORMS.md for template-based extraction examples.


### 7.2 Update Skill Index

Add your skill to `.coditect/skills/README.md`:

```markdown
### Document Processing

- **pdf-extraction** - Extract text, tables, and images from PDF files
  - Level 3 resources: FORMS.md, REFERENCE.md, extraction scripts
  - Use when: Processing PDF documents for data extraction
```

Step 8: Commit to Repository (5 minutes)

8.1 Git Add

cd /path/to/coditect-core
git add .coditect/skills/pdf-extraction/

8.2 Conventional Commit

git commit -m "feat(skills): Add pdf-extraction skill with progressive disclosure

- Text, table, and image extraction from PDFs
- 3-level progressive disclosure (metadata, instructions, resources)
- Executable Python scripts for automation
- Templates for financial reports and invoices
- Complete API reference documentation
- Token-optimized design (<5K tokens Level 2)

Related: #456"

8.3 Push and PR

git push origin feature/pdf-extraction-skill

PR Description:

## Summary
Production-ready PDF extraction skill with 3-level progressive disclosure.

## Skill Details
- **Name:** pdf-extraction
- **Levels:** 3 (metadata, instructions, resources)
- **Token Cost:** ~800 tokens (Level 1+2), unlimited Level 3
- **Scripts:** 4 executable utilities
- **Documentation:** FORMS.md, REFERENCE.md

## Progressive Disclosure Breakdown
- **Level 1** (Always loaded): 50 tokens - name + description
- **Level 2** (Triggered): 750 tokens - SKILL.md instructions
- **Level 3** (On-demand): ~0 tokens until accessed - FORMS, REFERENCE, scripts

## Quality Score
- [x] YAML frontmatter validated
- [x] Token budget under 5000 (Level 2: 750 tokens)
- [x] Scripts executable and tested
- [x] Integration documented
- [x] Grade A compliance

## Testing
- [x] Extract text from 50-page PDF
- [x] Parse tables from financial report
- [x] Extract images from presentation
- [x] Metadata extraction verified

Step 9: Activate Skill (Optional)

9.1 Decide on Activation

Activate immediately if:

  • Skill is tested and validated
  • Team needs it for current work
  • Dependencies are available

Wait if:

  • Skill is experimental
  • Scripts need testing in production environment
  • Team wants review period

9.2 Activation Command

python3 .coditect/scripts/update-component-activation.py activate skill pdf-extraction \
--reason "Production-ready PDF extraction with progressive disclosure and automation scripts"

9.3 Commit Activation

git add .coditect/component-activation-status.json
git commit -m "chore: Activate pdf-extraction skill"
git push origin main

Common Mistakes and Fixes

Mistake 1: Using Markdown Headers Instead of YAML

Wrong:

# PDF Extraction

**Name:** pdf-extraction
**Description:** Extracts from PDFs

Correct:

---
name: pdf-extraction
description: Extract structured data from PDF files including text, tables, and images
---

# PDF Extraction

Mistake 2: Description Too Generic

Wrong:

description: PDF skill

Correct:

description: Extract structured data from PDF files including text, tables, and images while preserving layout and formatting. Supports text extraction, table parsing, image extraction, and metadata retrieval.

Mistake 3: Exceeding Token Budget

Problem: SKILL.md is 8000 tokens

Fix:

  1. Count tokens: python3 scripts/count-tokens.py skills/pdf-extraction/SKILL.md
  2. Move extended examples to EXAMPLES.md
  3. Move API details to REFERENCE.md
  4. Keep only core workflow in SKILL.md

Mistake 4: Scripts Without Shebang

Wrong:

import sys
# No shebang line

Correct:

#!/usr/bin/env python3
import sys

Mistake 5: Not Making Scripts Executable

Problem: Permission denied when running scripts

Fix:

chmod +x scripts/*.py

Best Practices Summary (Anthropic Jan 2026)

Do:

  • ✅ Use YAML frontmatter (MANDATORY)
  • ✅ Use gerund naming (processing-pdfs not pdf-processor)
  • ✅ Write descriptions in THIRD PERSON
  • ✅ Include trigger phrases ("Use when...")
  • ✅ Keep SKILL.md under 500 lines
  • ✅ Keep SKILL.md under 5000 tokens
  • ✅ Include self-validation checklists
  • ✅ Be opinionated (prescribe THE way, not options)
  • ✅ Provide specific examples with input/output
  • ✅ Use Level 3 for detailed docs and scripts
  • ✅ Make scripts executable
  • ✅ Document integration points
  • ✅ Test skill invocation
  • ✅ Build evaluations BEFORE extensive documentation

Don't:

  • ❌ Use Markdown headers instead of YAML
  • ❌ Write vague or first-person descriptions
  • ❌ Put everything in SKILL.md (line/token waste)
  • ❌ Offer multiple approaches (be opinionated!)
  • ❌ Nest references more than one level deep
  • ❌ Forget shebang lines in scripts
  • ❌ Skip quality gates and validation
  • ❌ Create project-specific skills
  • ❌ Exceed 1024 char description limit
  • ❌ Use reserved words ("anthropic", "claude") in name
  • ❌ Use XML tags in description

Quick Reference Checklist

File Structure:

  • Directory: .coditect/skills/{skill-name}/
  • SKILL.md with YAML frontmatter
  • Level 3 resources (if needed): FORMS.md, REFERENCE.md, scripts/

YAML Frontmatter:

  • Starts with ---
  • Contains name and description
  • Name matches directory name
  • Description under 1024 characters
  • No XML tags in description

Progressive Disclosure:

  • Level 1: Optimized metadata (name, description)
  • Level 2: SKILL.md instructions under 5000 tokens
  • Level 3: Additional resources loaded on-demand

Scripts (if present):

  • Shebang line (#!/usr/bin/env python3)
  • Execute permissions (chmod +x)
  • Error handling and exit codes
  • Documentation in comments

Quality:

  • Automated validation passed
  • Token count under budget
  • Tested invocation via agent/command
  • Integration documented

Next Steps

After creating your skill:

  1. Create companion components:

  2. Integrate into workflows:

    • Update workflow documentation
    • Create example use cases
    • Train team on capabilities
  3. Monitor and iterate:

    • Track usage patterns
    • Gather feedback
    • Optimize based on real-world use

Need Help?


**Document Version:** 2.0.0
**Last Updated:** January 23, 2026
**Maintainer:** CODITECT Core Team
**Authority:** Anthropic Skill Authoring Best Practices