/skill-from-docs

Purpose

Generate a Claude Code skill from a documentation website. Implements smart scraping, categorization, pattern extraction, and AI enhancement.

Syntax

/skill-from-docs <url> [--name <skill-name>] [--max-pages <n>] [--output <path>]

Parameters

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| `url` | string | Yes | - | Documentation website URL |
| `--name` | string | No | derived | Skill name |
| `--max-pages` | integer | No | 500 | Maximum pages to scrape |
| `--output` | string | No | `~/.coditect/skills/{name}/` | Output directory |
| `--async` | flag | No | true | Enable async scraping |
| `--workers` | integer | No | 8 | Async worker count |
| `--estimate-only` | flag | No | false | Only estimate page count |
| `--skip-enhance` | flag | No | false | Skip AI enhancement |
| `--target` | string | No | claude | Target platform (claude/gemini/openai/markdown) |
| `--rate-limit` | float | No | 0.5 | Requests per second |

Examples

Basic Usage

# Generate skill from React documentation
/skill-from-docs https://react.dev/

# With custom name
/skill-from-docs https://fastapi.tiangolo.com/ --name fastapi

Advanced Usage

# Estimate pages first
/skill-from-docs https://docs.python.org/ --estimate-only

# Large documentation with limits
/skill-from-docs https://godotengine.org/docs/ --max-pages 2000 --workers 8

# Custom output location
/skill-from-docs https://flask.palletsprojects.com/ --output ~/my-skills/flask/

# For Google Gemini
/skill-from-docs https://react.dev/ --target gemini

Workflow Examples

# 1. Estimate → Scrape → Enhance → Package
/skill-from-docs https://react.dev/ --estimate-only
# Output: ~180 pages estimated

/skill-from-docs https://react.dev/
# Output: SKILL.md + references/ created

# 2. Quick iteration (skip enhancement)
/skill-from-docs https://flask.palletsprojects.com/ --skip-enhance

# 3. Multi-platform generation
/skill-from-docs https://react.dev/ --target claude
/skill-from-docs https://react.dev/ --target gemini
/skill-from-docs https://react.dev/ --target openai

Output

~/.coditect/skills/{name}/
├── SKILL.md              # AI-enhanced (300+ lines)
├── references/
│   ├── index.md          # Category index
│   ├── getting_started/  # Categorized content
│   ├── api_reference/
│   ├── tutorials/
│   └── patterns/
├── examples/
│   ├── basic/
│   └── advanced/
└── metadata.json

metadata.json records the scraping metrics:

{
  "source_url": "...",
  "pages_scraped": 180,
  "scrape_date": "2026-01-23",
  "quality_score": 8.7
}
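The metadata fields shown above can be sanity-checked with a short script. This is an illustrative sketch, not part of the command itself; `check_metadata` is a hypothetical helper, and only the four field names from the example are assumed.

```python
import json
from pathlib import Path

# Fields taken from the metadata.json example above.
REQUIRED = {"source_url", "pages_scraped", "scrape_date", "quality_score"}

def check_metadata(path: str) -> dict:
    """Load metadata.json and verify the expected fields are present."""
    meta = json.loads(Path(path).read_text())
    missing = REQUIRED - meta.keys()
    if missing:
        raise ValueError(f"metadata.json missing fields: {sorted(missing)}")
    return meta
```

Run it against the generated file (e.g. `check_metadata("~/.coditect/skills/react/metadata.json")` after expanding the home directory) to catch a truncated or partial run early.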

What Happens

  1. URL Validation: Verify URL is accessible
  2. Strategy Detection: Check for llms.txt or a sitemap; fall back to BFS crawling
  3. Page Scraping: Extract content with rate limiting
  4. Categorization: Smart category assignment
  5. Pattern Extraction: Find code patterns and examples
  6. AI Enhancement: Generate comprehensive SKILL.md
  7. Quality Check: Validate against CODITECT standards
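The strategy-detection step (2) can be sketched roughly as follows. This is an illustrative assumption about the probing order, not the command's actual implementation; `fetch` is a hypothetical injected helper so the logic is testable without network access.

```python
from urllib.parse import urljoin

def detect_strategy(base_url: str, fetch) -> str:
    """Pick a scraping strategy for a docs site.

    `fetch` is any callable that returns the response body for a URL,
    or None on a non-200 response (hypothetical helper, injected here
    instead of a real HTTP client).
    """
    # Best case: llms.txt is a curated page list, so no crawling is needed.
    if fetch(urljoin(base_url, "/llms.txt")):
        return "llms.txt"
    # Next best: a sitemap enumerates every page up front.
    if fetch(urljoin(base_url, "/sitemap.xml")):
        return "sitemap"
    # Fallback: breadth-first crawl outward from the landing page.
    return "bfs"
```

The ordering mirrors the speed claim in the success output below: a curated page list avoids crawling entirely, a sitemap avoids link discovery, and BFS is the slowest general-purpose fallback.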

Success Output

✅ COMMAND COMPLETE: /skill-from-docs

Scraping Summary:
- [x] URL: https://react.dev/
- [x] Strategy: llms.txt (10x faster)
- [x] Pages: 180/180 (100%)
- [x] Categories: 6
- [x] Patterns: 23

Output: ~/.coditect/skills/react/
- SKILL.md (342 lines, AI-enhanced)
- references/ (6 categories)
- examples/ (12 examples)

Quality Score: 8.7/10

When NOT to Use

Do NOT use this command when:

  • Documentation requires authentication (OAuth, API keys, login) - use manual download instead
  • Target site has aggressive anti-bot protection (Cloudflare, captchas) - will fail
  • Documentation is behind a paywall - not supported
  • Source is a video tutorial - not supported
  • Documentation is a single PDF file - use /pdf skill instead
  • Need to combine docs with code analysis - use skill-generator-orchestrator instead
  • Documentation is not in English - limited support
  • Quick one-off lookup - just ask Claude directly

Completion Checklist

Before marking this command as complete, verify:

  • URL validated and accessible
  • Scraping strategy determined (llms.txt, sitemap, or BFS)
  • All pages scraped within configured limits
  • Content categorized into correct directories
  • Code blocks extracted with language detection
  • Patterns identified and documented
  • SKILL.md generated (300+ lines if AI-enhanced)
  • Output directory structure matches specification
  • metadata.json created with scraping metrics

Failure Indicators

  • ❌ URL not accessible (404, 403, timeout)
  • ❌ robots.txt blocks scraping
  • ❌ Zero pages scraped
  • ❌ Rate limited by target site
  • ❌ SKILL.md under 100 lines
  • ❌ Categorization confidence below 50%
  • ❌ Output directory not created

Anti-Patterns (Avoid)

| Anti-Pattern | Problem | Solution |
| --- | --- | --- |
| No rate limiting | IP blocked, incomplete scrape | Use `--rate-limit 0.5` minimum |
| Ignoring robots.txt | Legal issues, site blocks | Command respects robots.txt by default |
| Scraping auth pages | Empty content | Use manual download + PDF analysis |
| Too many workers | Server overload, blocks | Limit `--workers` to 8 maximum |
| No max-pages limit | Hours of scraping | Set a reasonable `--max-pages` |
| Skipping estimation | Unexpectedly large sites | Run `--estimate-only` first |
| Skipping AI enhancement | Generic output | Only use `--skip-enhance` for iteration |
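The rate-limiting advice above amounts to enforcing a minimum gap between requests. A minimal limiter for 0.5 requests/second might look like this; it is a sketch of the general technique, not the command's internals:

```python
import time

class RateLimiter:
    """Block so that wait() returns at most `rate` times per second."""

    def __init__(self, rate: float = 0.5):
        self.min_interval = 1.0 / rate  # seconds between requests
        self.last_request = 0.0

    def wait(self):
        # Sleep off whatever remains of the minimum gap, then stamp the time.
        elapsed = time.monotonic() - self.last_request
        if elapsed < self.min_interval:
            time.sleep(self.min_interval - elapsed)
        self.last_request = time.monotonic()
```

Calling `limiter.wait()` before each page fetch spaces requests at least two seconds apart at the default rate, which is what keeps the scrape polite and unblocked.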

Verification

After execution, verify success:

# 1. Check output directory exists
ls -la ~/.coditect/skills/{name}/

# 2. Verify SKILL.md length (should be 300+ lines if AI-enhanced)
wc -l ~/.coditect/skills/{name}/SKILL.md

# 3. Check categorization
ls -la ~/.coditect/skills/{name}/references/

# 4. Validate metadata
cat ~/.coditect/skills/{name}/metadata.json | python3 -m json.tool

# 5. Count extracted code blocks
grep -r '```' ~/.coditect/skills/{name}/references/ | wc -l

See Also

  • Agent: doc-to-skill-converter
  • Orchestrator: skill-generator-orchestrator
  • Companion: /skill-from-repo
  • Skill: multi-source-skill-generation