/skill-from-docs
Purpose
Generate a Claude Code skill from a documentation website. Implements smart scraping, categorization, pattern extraction, and AI enhancement.
Syntax
/skill-from-docs <url> [--name <skill-name>] [--max-pages <n>] [--output <path>]
Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| url | string | Yes | - | Documentation website URL |
| --name | string | No | derived | Skill name |
| --max-pages | integer | No | 500 | Maximum pages to scrape |
| --output | string | No | ~/.coditect/skills/{name}/ | Output directory |
| --async | flag | No | true | Enable async scraping |
| --workers | integer | No | 8 | Async worker count |
| --estimate-only | flag | No | false | Only estimate page count |
| --skip-enhance | flag | No | false | Skip AI enhancement |
| --target | string | No | claude | Target platform (claude/gemini/openai/markdown) |
| --rate-limit | float | No | 0.5 | Requests per second |
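When --name is omitted the skill name is derived from the URL. The exact derivation rule is not documented here; one plausible sketch (an assumption, not the actual implementation) takes the main domain label:

```python
from urllib.parse import urlparse

def derive_name(url: str) -> str:
    """Guess a skill name from a docs URL: use the main domain label."""
    host = urlparse(url).hostname or ""
    parts = host.split(".")
    # drop common prefixes like "www" or "docs", keep the first remaining label
    while parts and parts[0] in ("www", "docs"):
        parts = parts[1:]
    return parts[0] if parts else "skill"

print(derive_name("https://react.dev/"))          # → react
print(derive_name("https://docs.python.org/3/"))  # → python
```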
Examples
Basic Usage
# Generate skill from React documentation
/skill-from-docs https://react.dev/
# With custom name
/skill-from-docs https://fastapi.tiangolo.com/ --name fastapi
Advanced Usage
# Estimate pages first
/skill-from-docs https://docs.python.org/ --estimate-only
# Large documentation with limits
/skill-from-docs https://godotengine.org/docs/ --max-pages 2000 --workers 8
# Custom output location
/skill-from-docs https://flask.palletsprojects.com/ --output ~/my-skills/flask/
# For Google Gemini
/skill-from-docs https://react.dev/ --target gemini
Workflow Examples
# 1. Estimate → Scrape → Enhance → Package
/skill-from-docs https://react.dev/ --estimate-only
# Output: ~180 pages estimated
/skill-from-docs https://react.dev/
# Output: SKILL.md + references/ created
# 2. Quick iteration (skip enhancement)
/skill-from-docs https://flask.palletsprojects.com/ --skip-enhance
# 3. Multi-platform generation
/skill-from-docs https://react.dev/ --target claude
/skill-from-docs https://react.dev/ --target gemini
/skill-from-docs https://react.dev/ --target openai
Output
~/.coditect/skills/{name}/
├── SKILL.md # AI-enhanced (300+ lines)
├── references/
│ ├── index.md # Category index
│ ├── getting_started/ # Categorized content
│ ├── api_reference/
│ ├── tutorials/
│ └── patterns/
├── examples/
│ ├── basic/
│ └── advanced/
└── metadata.json
Example metadata.json:
{
"source_url": "...",
"pages_scraped": 180,
"scrape_date": "2026-01-23",
"quality_score": 8.7
}
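The metadata.json payload can double as a quality gate. A minimal sketch, assuming only the four fields shown in the example above:

```python
import json

REQUIRED_FIELDS = {"source_url", "pages_scraped", "scrape_date", "quality_score"}

def check_metadata(raw: str) -> list[str]:
    """Return a list of problems found in a metadata.json payload."""
    data = json.loads(raw)
    problems = [f"missing field: {f}" for f in sorted(REQUIRED_FIELDS - data.keys())]
    if data.get("pages_scraped", 0) <= 0:
        problems.append("pages_scraped must be positive")
    if not 0 <= data.get("quality_score", -1) <= 10:
        problems.append("quality_score must be in [0, 10]")
    return problems

sample = ('{"source_url": "https://react.dev/", "pages_scraped": 180, '
          '"scrape_date": "2026-01-23", "quality_score": 8.7}')
print(check_metadata(sample))  # → []
```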
What Happens
- URL Validation: Verify URL is accessible
- Strategy Detection: Check for llms.txt, sitemap, or BFS
- Page Scraping: Extract content with rate limiting
- Categorization: Smart category assignment
- Pattern Extraction: Find code patterns and examples
- AI Enhancement: Generate comprehensive SKILL.md
- Quality Check: Validate against CODITECT standards
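The strategy-detection step above is a fallback chain: a curated llms.txt page list is fastest, a sitemap still avoids blind crawling, and BFS link-following is the last resort. A minimal sketch (hypothetical function, assuming the probes for llms.txt and sitemap.xml happen elsewhere):

```python
def detect_strategy(has_llms_txt: bool, has_sitemap: bool) -> str:
    """Pick the fastest available scraping strategy, falling back to BFS crawl."""
    if has_llms_txt:
        return "llms.txt"   # curated page list, no discovery needed
    if has_sitemap:
        return "sitemap"    # full URL inventory, no link-following
    return "bfs"            # breadth-first crawl from the start URL

print(detect_strategy(True, True))    # → llms.txt
print(detect_strategy(False, True))   # → sitemap
print(detect_strategy(False, False))  # → bfs
```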
Success Output
✅ COMMAND COMPLETE: /skill-from-docs
Scraping Summary:
- [x] URL: https://react.dev/
- [x] Strategy: llms.txt (10x faster)
- [x] Pages: 180/180 (100%)
- [x] Categories: 6
- [x] Patterns: 23
Output: ~/.coditect/skills/react/
- SKILL.md (342 lines, AI-enhanced)
- references/ (6 categories)
- examples/ (12 examples)
Quality Score: 8.7/10
When NOT to Use
Do NOT use this command when:
- Documentation requires authentication (OAuth, API keys, login) - use manual download instead
- Target site has aggressive anti-bot protection (Cloudflare, captchas) - will fail
- Documentation is behind a paywall - not supported
- Source is a video tutorial - not supported
- Documentation is a single PDF file - use /pdfskill instead
- Need to combine docs with code analysis - use skill-generator-orchestrator instead
- Documentation is not in English - limited support
- Quick one-off lookup - just ask Claude directly
Completion Checklist
Before marking this command as complete, verify:
- URL validated and accessible
- Scraping strategy determined (llms.txt, sitemap, or BFS)
- All pages scraped within configured limits
- Content categorized into correct directories
- Code blocks extracted with language detection
- Patterns identified and documented
- SKILL.md generated (300+ lines if AI-enhanced)
- Output directory structure matches specification
- metadata.json created with scraping metrics
Failure Indicators
- ❌ URL not accessible (404, 403, timeout)
- ❌ robots.txt blocks scraping
- ❌ Zero pages scraped
- ❌ Rate limited by target site
- ❌ SKILL.md under 100 lines
- ❌ Categorization confidence below 50%
- ❌ Output directory not created
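The robots.txt failure above can be caught before scraping starts. A minimal sketch with the standard-library urllib.robotparser, feeding rules directly instead of fetching them (the URLs and rules here are illustrative):

```python
from urllib.robotparser import RobotFileParser

rules = """
User-agent: *
Disallow: /private/
""".splitlines()

rp = RobotFileParser()
rp.parse(rules)

# can_fetch(user_agent, url) applies the parsed rules to a candidate URL
print(rp.can_fetch("*", "https://example.com/docs/intro"))    # → True
print(rp.can_fetch("*", "https://example.com/private/page"))  # → False
```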
Anti-Patterns (Avoid)
| Anti-Pattern | Problem | Solution |
|---|---|---|
| No rate limiting | IP blocked, incomplete scrape | Keep --rate-limit at 0.5 req/s or lower |
| Ignoring robots.txt | Legal issues, site blocks | Command respects robots.txt by default |
| Scraping auth pages | Empty content | Use manual download + PDF analysis |
| Too many workers | Server overload, blocks | Keep --workers at 8 or fewer |
| No max-pages limit | Hours of scraping | Set reasonable --max-pages |
| Skipping estimation | Unexpected large sites | Run --estimate-only first |
| Skipping AI enhancement | Generic output | Only use --skip-enhance for iteration |
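The --rate-limit behaviour amounts to enforcing a minimum gap between requests. A minimal sketch, assuming requests-per-second semantics as in the parameter table (not the command's actual implementation):

```python
import time

class RateLimiter:
    """Enforce at most `rate` requests per second by sleeping between calls."""

    def __init__(self, rate: float):
        self.min_interval = 1.0 / rate  # seconds between consecutive requests
        self.last = 0.0

    def wait(self) -> None:
        """Block until at least min_interval has elapsed since the last call."""
        now = time.monotonic()
        delay = self.min_interval - (now - self.last)
        if delay > 0:
            time.sleep(delay)
        self.last = time.monotonic()

limiter = RateLimiter(rate=0.5)  # default: one request every 2 seconds
```

Called before each page fetch, this keeps the scraper under the configured rate regardless of how fast individual pages download.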
Verification
After execution, verify success:
# 1. Check output directory exists
ls -la ~/.coditect/skills/{name}/
# 2. Verify SKILL.md length (should be 200+ lines)
wc -l ~/.coditect/skills/{name}/SKILL.md
# 3. Check categorization
ls -la ~/.coditect/skills/{name}/references/
# 4. Validate metadata
cat ~/.coditect/skills/{name}/metadata.json | python3 -m json.tool
# 5. Count extracted code blocks
grep -r '```' ~/.coditect/skills/{name}/references/ | wc -l
Related
- Agent: doc-to-skill-converter
- Orchestrator: skill-generator-orchestrator
- Companion: /skill-from-repo
- Skill: multi-source-skill-generation