Skip to main content

CODITECT Documentation Standards v1.0.0

Executive Summary

Purpose: Define comprehensive documentation standards for CODITECT projects ensuring clarity, consistency, maintainability, and AI-readability across all technical documentation.

Scope: All markdown files, code comments, API documentation, and inline documentation in CODITECT projects.

Enforcement: Required for all production code. Automated validation via linters and documentation-quality-agent.

Quality Target: 95/100 minimum score for all documentation files.


Table of Contents

  1. Core Principles
  2. Markdown Standards
  3. Documentation Structure
  4. File Naming Conventions
  5. YAML Frontmatter
  6. Agentic Documentation Patterns ← NEW
  7. Content Guidelines
  8. Code Documentation
  9. Progressive Disclosure
  10. Validation & Quality
  11. Automated Maintenance

Core Principles

1. Clarity Over Brevity

Write for humans first, AI second. Clear, well-structured documentation benefits both human developers and AI agents.

❌ BAD: "Use fn to proc data"
✅ GOOD: "Use the processData() function to transform raw input into structured output"
```text

### 2. Progressive Disclosure

**Essential information first, details on demand.** Organize documentation so readers can:

- Get started in <5 minutes (Quick Start)
- Understand architecture in <15 minutes (Overview)
- Deep dive as needed (Comprehensive Guides)

### 3. Single Source of Truth

**Avoid duplication.** Link to authoritative documentation instead of copying content.

```markdown
❌ BAD: Copying the same installation instructions into 5 different files
✅ GOOD: Single installation guide, linked from all relevant documents
```text

### 4. Consistency Breeds Clarity

**Use consistent terminology, structure, and formatting.** Predictable documentation is easier to navigate.

### 5. Maintainability

**Documentation that isn't maintained becomes misleading.** Automated validation ensures docs stay current with code.

---

## Markdown Standards

### Line Length

**Maximum:** 120 characters per line
**Rationale:** Readable in split-screen editors, GitHub diffs, and terminal viewers

```markdown
<!-- ❌ BAD: Line too long -->
This is an excessively long line that exceeds 120 characters and will trigger linting errors because it's harder to read in split-screen mode and doesn't wrap nicely in code reviews.

<!-- ✅ GOOD: Properly wrapped -->
This is a properly wrapped paragraph that stays within 120 characters per line.
It's easier to read in split-screen editors and displays nicely in GitHub diffs
and terminal viewers.
```text

### Headings

**ATX-style only** (use `#` not `===`)

```markdown
❌ BAD:
Heading 1
=========

❌ BAD (closed ATX):

## Heading 2 ##

✅ GOOD:

## Heading 1

## Heading 2

### Heading 3

```text

## Hierarchy

- Use `#` for document title (only one H1 per file)
- Use `##` for major sections
- Use `###` for subsections
- Avoid going deeper than `####` (restructure if needed)

## Spacing

- 1 blank line before heading
- 1 blank line after heading

```markdown
✅ GOOD:
Previous content here.

## New Section

Content starts here.
```text

### Lists

**Unordered lists:** Use `-` (not `*` or `+`)
**Ordered lists:** Use `1.` for all items (auto-numbering)
**Indentation:** 2 spaces for nested lists

```markdown
✅ GOOD (unordered):
- Item 1
- Item 2
- Nested item 2.1
- Nested item 2.2
- Item 3

✅ GOOD (ordered):
1. First step
2. Second step
3. Third step
- Sub-step A
- Sub-step B
```text

### Code Blocks

**Always specify language** for syntax highlighting

```markdown
❌ BAD:
```text

function hello() {
console.log("world");
}

```text

✅ GOOD:
```javascript
function hello() {
console.log("world");
}
```text

```text

## Supported languages

- `bash`, `sh`, `shell` - Shell scripts
- `python`, `py` - Python code
- `javascript`, `js` - JavaScript
- `typescript`, `ts` - TypeScript
- `json` - JSON data
- `yaml`, `yml` - YAML configuration
- `markdown`, `md` - Markdown examples
- `diff` - Git diffs

### Links

**Prefer relative links** for internal documentation

```markdown
❌ BAD (absolute):
[Guide](https://github.com/org/repo/blob/main/docs/guide.md)

✅ GOOD (relative):
[Guide](../docs/guide.md)
```text

**Descriptive link text** (not "click here")

```markdown
❌ BAD:
For more information, [click here](guide.md).

✅ GOOD:
See the [Installation Guide](guide.md) for detailed setup instructions.
```text

### Tables

**Align columns** for readability in source

```markdown
✅ GOOD:
| Column 1 | Column 2 | Column 3 |
| -------------- | ------------ | ---------- |
| Short | Medium | Long |
| Data | Values | Here |
```text

**Maximum columns:** 6 (use nested lists if more complex)

### Emphasis

- **Bold** for emphasis and UI elements: `**bold**`
- *Italic* for terms and variables: `*italic*`
- `Code` for inline code and paths: `` `code` ``

```markdown
✅ GOOD:
The **Settings** panel contains the *apiKey* configuration. Set it to `your-api-key`.
```text

---

## Documentation Structure

### Required Files

Every CODITECT project MUST have:

1. **README.md** - Project overview and quick start
2. **Claude.md** - AI agent context and instructions
3. **LICENSE** - Legal terms
4. **.gitignore** - Version control exclusions

### Recommended Structure

```text
project/
├── README.md # Project overview
├── CLAUDE.md # AI agent context
├── LICENSE # Legal
├── docs/ # Documentation directory
│ ├── 01-getting-started/ # Onboarding
│ │ ├── QUICK-START.md
│ │ └── INSTALLATION.md
│ ├── 02-user-guides/ # How-to guides
│ ├── 02-architecture/ # System design
│ ├── 03-api-reference/ # API docs
│ ├── 04-project-planning/ # Project management
│ ├── 05-deployment/ # Deployment guides
│ └── INDEX.md # Documentation index
├── CODITECT-CORE-STANDARDS/ # Standards directory
│ ├── README.md
│ ├── CODITECT-STANDARD-*.md
│ └── TEMPLATES/
└── CHANGELOG.md # Version history
```text

### Documentation Categories

**01-getting-started/** - New user onboarding

- Quick Start (<10 minutes to first success)
- Installation guides
- Basic tutorials

**02-user-guides/** - Task-oriented guides

- How to accomplish specific tasks
- Best practices
- Troubleshooting

**02-architecture/** - System design

- Architecture overview
- Design decisions (ADRs)
- Technical specifications

**03-api-reference/** - API documentation

- Endpoint documentation
- Parameters and responses
- Code examples

**04-project-planning/** - Project management

- Project plans
- Task lists
- Milestones and timelines

**05-deployment/** - Production deployment

- Deployment guides
- CI/CD configuration
- Operations runbooks

---

## File Naming Conventions

### Markdown Files

**Format:** `SCREAMING-KEBAB-CASE.md` for standards and major docs
**Format:** `kebab-case.md` for guides and tutorials

```markdown
✅ GOOD (standards):
CODITECT-STANDARD-DOCUMENTATION.md
PROJECT-PLAN-CORE-RESTRUCTURING.md
TASKLIST-WITH-CHECKBOXES.md

✅ GOOD (guides):
quick-start.md
installation-guide.md
api-reference.md
```text

## Prefix conventions

- `CODITECT-STANDARD-` - Framework standards
- `PROJECT-PLAN-` - Project planning documents
- `TASKLIST-` - Task tracking documents
- `HOW-TO-` - Procedural guides
- `ADR-` - Architecture Decision Records

### Version Suffixes

**Include version in filename** for versioned standards

```markdown
✅ GOOD:
CODITECT-STANDARD-DOCUMENTATION-v1.0.0.md
API-REFERENCE-v2.3.1.md
```text

---

## YAML Frontmatter

### Required Fields

All major documentation files MUST include YAML frontmatter:

```yaml
---
title: "Document Title"
version: "1.0.0"
status: "production"
created: "2025-12-06"
last_updated: "2025-12-06"
author: "Author Name"
category: "documentation"
tags:
- "tag1"
- "tag2"
---
```text

### Standard Fields

**title** (required) - Human-readable document title
**version** (required) - Semantic version (MAJOR.MINOR.PATCH)
**status** (required) - One of: `draft`, `review`, `production`, `deprecated`
**created** (required) - ISO date (YYYY-MM-DD)
**last_updated** (required) - ISO date (YYYY-MM-DD)
**author** (optional) - Primary author
**category** (required) - Document category
**tags** (optional) - Search/filter tags
**related** (optional) - Links to related documents
**applies_to** (optional) - File patterns this applies to

### Standards-Specific Fields

For CODITECT standards documents:

```yaml
---
standard: "CODITECT-STANDARD-NAME"
version: "1.0.0"
status: "production"
enforcement: "required" # or "recommended", "optional"
validation:
- "tool1"
- "tool2"
related_standards:
- "OTHER-STANDARD.md"
---
```text

### Agentic Fields (Required for Agent-Accessible Docs)

Documents intended for AI agent consumption MUST include these additional fields:

```yaml
---
# Core metadata (existing required fields)
title: "User Quick Start"
version: "1.0.0"
status: "production"
created: "2025-12-21"
last_updated: "2025-12-21"
category: "getting-started"

# AGENTIC FIELDS (NEW - Required)
audience: customer # customer | contributor | both
type: guide # guide | reference | tutorial | spec | adr | claude-md
tokens: 1200 # estimated token count (use scripts/count-tokens.py)
summary: "10-minute setup guide for new CODITECT users to install and run first project"

# AGENTIC FIELDS (Recommended)
when_to_read: "User asks about setup, installation, getting started, or first project"
keywords:
- setup
- installation
- onboarding
- first-run
- quick-start
related:
- docs/guides/COMPONENT-ACTIVATION-GUIDE.md
- docs/guides/USER-BEST-PRACTICES.md
---
```text

### Agentic Field Definitions

| Field | Required | Type | Description |
|-------|----------|------|-------------|
| `audience` | Yes | enum | Target reader: `customer`, `contributor`, or `both` |
| `type` | Yes | enum | Document type: `guide`, `reference`, `tutorial`, `spec`, `adr`, `claude-md` |
| `tokens` | Yes | integer | Estimated token count (run `scripts/count-tokens.py FILE`) |
| `summary` | Yes | string | 1-2 sentence abstract (max 200 chars). Used for search and agent routing |
| `when_to_read` | Recommended | string | Natural language description of when an agent should read this doc |
| `keywords` | Recommended | array | Semantic search terms (3-10 keywords) |
| `related` | Recommended | array | Paths to related documents |

### Token Budget Guidelines

Documents MUST stay within token budgets to preserve context:

| Document Type | Max Tokens | Action if Over |
|---------------|------------|----------------|
| CLAUDE.md (root) | 1,500 | Split into linked docs |
| CLAUDE.md (subdir) | 800 | Reference parent, minimal content |
| Quick reference | 500 | Create summary + link to full doc |
| User guide | 2,000 | Split into chapters/sections |
| Tutorial | 2,500 | Break into sequential parts |
| Spec/ADR | 3,000 | Acceptable (read rarely, high value) |
| Reference doc | 4,000 | Create index + chunked sections |

**Token Estimation:**
```bash
# Count tokens in a file
python3 scripts/count-tokens.py docs/guides/USER-QUICK-START.md

# Bulk count for directory
python3 scripts/count-tokens.py docs/ --summary
```text

---

## Agentic Documentation Patterns

### Tiered Reading Strategy

Agents should read documents in tiers to minimize context consumption:

**Tier 1: Frontmatter Only (~50 tokens)**
- Read YAML frontmatter
- Check `audience`, `type`, `summary`, `when_to_read`
- Decide: skip, read summary, or read full

**Tier 2: Summary Section (~200 tokens)**
- Read Executive Summary or first paragraph
- Sufficient for most routing decisions

**Tier 3: Full Document (varies)**
- Only when document is confirmed relevant
- Budget remaining context accordingly

### Agent Routing via Frontmatter

Agents can route to correct documentation without reading full content:

```python
# Example: Agent decides whether to read a doc
def should_read_doc(frontmatter: dict, user_query: str) -> bool:
# Quick filters (no LLM needed)
if frontmatter.get('audience') == 'contributor' and user_is_customer:
return False
if frontmatter.get('tokens', 0) > remaining_context_budget:
return False # Too large for current context

# Keyword match
keywords = frontmatter.get('keywords', [])
if any(kw in user_query.lower() for kw in keywords):
return True

# Semantic match on summary (lightweight)
summary = frontmatter.get('summary', '')
return is_semantically_relevant(summary, user_query)
```text

### Database Integration (/cxq)

Frontmatter is extracted to SQLite for instant searchability:

**Schema: `documents` table**
```sql
CREATE TABLE documents (
id INTEGER PRIMARY KEY,
path TEXT UNIQUE NOT NULL,
title TEXT,
audience TEXT,
type TEXT,
tokens INTEGER,
summary TEXT,
when_to_read TEXT,
keywords TEXT, -- JSON array
related TEXT, -- JSON array
last_updated TEXT,
indexed_at TEXT DEFAULT CURRENT_TIMESTAMP
);

CREATE VIRTUAL TABLE documents_fts USING fts5(
title, summary, when_to_read, keywords,
content='documents',
content_rowid='id'
);
```text

**Query Examples:**
```bash
# Search by keyword
/cxq --docs "authentication setup"

# Filter by audience
/cxq --docs --audience customer "deployment"

# Find docs under token budget
/cxq --docs --max-tokens 1000 "getting started"

# List all guides
/cxq --docs --type guide
```text

### Document Index Generation

Generate `config/doc-index.json` from frontmatter:

```bash
# Generate index from all docs
python3 scripts/generate-doc-index.py

# Output: config/doc-index.json (~50KB for 100+ docs)
```text

**Index Format:**
```json
{
"generated_at": "2025-12-21T12:00:00Z",
"total_documents": 105,
"total_tokens": 142500,
"documents": [
{
"path": "docs/getting-started/USER-QUICK-START.md",
"title": "User Quick Start",
"audience": "customer",
"type": "guide",
"tokens": 1200,
"summary": "10-minute setup guide for new CODITECT users",
"when_to_read": "User asks about setup, installation, getting started",
"keywords": ["setup", "installation", "onboarding"]
}
],
"by_audience": {
"customer": 65,
"contributor": 35,
"both": 5
},
"by_type": {
"guide": 40,
"reference": 25,
"tutorial": 15,
"spec": 10,
"adr": 10,
"claude-md": 5
}
}
```text

### Progressive Implementation

**Phase A: Add Frontmatter (Current)**
1. Add agentic fields to high-traffic docs first
2. Validate with `scripts/validate-frontmatter.py`
3. Agents can grep frontmatter for quick routing

**Phase B: Generate Index (Next)**
1. Create `scripts/generate-doc-index.py`
2. Generate `config/doc-index.json`
3. Agents read single index file (~50KB) instead of 100+ files

**Phase C: Database Integration (Future)**
1. Add `documents` table to context.db
2. Integrate with `/cxq` command
3. Full-text search on summaries and keywords

---

## Content Guidelines

### Structure

Every documentation file SHOULD follow this structure:

```markdown
---

## YAML frontmatter

---

## Document Title

## Executive Summary (optional for long docs)

Brief overview of the document's purpose and key takeaways.

## Table of Contents (required for >500 lines)

- Section 1
- Section 2

---

## Section 1

Content here.

## Section 2

Content here.

---

## References (if applicable)

- [External Link](https://example.com)
- [Internal Guide](../guide.md)
```text

### Writing Style

## Use active voice

```markdown
❌ PASSIVE: "The configuration is loaded by the system"
✅ ACTIVE: "The system loads the configuration"
```text

## Use present tense for current state

```markdown
❌ PAST: "This function returned the result"
✅ PRESENT: "This function returns the result"
```text

## Be concise but complete

```markdown
❌ TOO BRIEF: "Use X"
❌ TOO VERBOSE: "In order to successfully utilize the X functionality, one must first ensure that..."
✅ JUST RIGHT: "Use the X function to process input data"
```text

## Use examples liberally

Every concept SHOULD have a code example:

```markdown

## Configuration

The system uses JSON configuration files:

```json
{
"apiKey": "your-key-here",
"timeout": 5000
}
```text

Load configuration with:

```python
config = load_config("config.json")
```text

```text

### Common Sections

**Quick Start** - Get users to success in <10 minutes

```markdown

## Quick Start

1. Install dependencies:
```bash
npm install
```text

1. Configure:

```bash
cp .env.example .env

# Edit .env with your settings

```text

1. Run:

```bash
npm start
```text

Expected output: Server running on <http://localhost:3000>

```text

**Prerequisites** - What users need before starting

```markdown

## Prerequisites

- Node.js 18+
- Git 2.25+
- PostgreSQL 14+

Optional:
- Docker (for containerized development)
```text

**Troubleshooting** - Common issues and solutions

```markdown

## Troubleshooting

### Error: "ECONNREFUSED"

**Problem:** Cannot connect to database.

## Solution

1. Verify database is running: `pg_isready`
2. Check connection string in `.env`
3. Ensure firewall allows connections on port 5432
```text

---

## Code Documentation

### Python Docstrings

## Use Google-style docstrings

```python
def process_data(input_file: str, output_file: str, validate: bool = True) -> dict:
"""Process data from input file and write to output file.

Reads structured data from the input file, applies transformations,
optionally validates the output, and writes results to the output file.

Args:
input_file: Path to input JSON file
output_file: Path to output JSON file
validate: Whether to validate output before writing (default: True)

Returns:
Dictionary containing:
- records_processed: Number of records processed
- validation_errors: List of validation errors (if validate=True)

Raises:
FileNotFoundError: If input_file does not exist
ValidationError: If validation fails and validate=True

Examples:
>>> result = process_data("input.json", "output.json")
>>> print(result["records_processed"])
42
"""
pass
```text

### JavaScript/TypeScript JSDoc

```javascript
/**
* Process data from input file and write to output file.
*
* @param {string} inputFile - Path to input JSON file
* @param {string} outputFile - Path to output JSON file
* @param {boolean} [validate=true] - Whether to validate output
* @returns {Promise<{recordsProcessed: number, validationErrors: string[]}>}
* @throws {Error} If input file does not exist
*
* @example
* const result = await processData("input.json", "output.json");
* console.log(result.recordsProcessed);
*/
async function processData(inputFile, outputFile, validate = true) {
// Implementation
}
```text

### Inline Comments

## When to comment

- WHY, not WHAT (code shows what, comments explain why)
- Complex algorithms
- Non-obvious business logic
- Temporary workarounds (with TODO/FIXME)

```python
❌ BAD (obvious):

## Increment counter

counter += 1

✅ GOOD (explains why):

## Skip validation for legacy data (pre-2024) to maintain compatibility

if record.date < "2024-01-01":
validate = False
```text

## TODO/FIXME format

```python

## TODO(username): Add input validation (issue #123)

## FIXME(username): Memory leak in long-running processes (issue #456)

```text

---

## Progressive Disclosure

### Claude.md Pattern

Claude.md files MUST follow progressive disclosure:

```markdown

## Project Name - AI Agent Context

## Essential Reading (Read First)

1. [README.md](README.md) - Project overview
2. [Quick Start](docs/01-getting-started/QUICK-START.md)
3. [Architecture](docs/02-architecture/ARCHITECTURE-OVERVIEW.md)

## Project Overview

Brief description (3-5 sentences).

## Directory Structure

```text

project/
├── src/ # Source code
├── docs/ # Documentation
└── tests/ # Test suite

```text

## Common Tasks

Quick reference for frequent operations.

## For More Details

Link to comprehensive documentation directories.
```text

### README.md Pattern

README.md follows the same progressive disclosure:

```markdown

## Project Name

Brief description (1-2 sentences).

## Quick Start

```bash

## 3-5 commands to get started

```text

## Features

- Feature 1
- Feature 2
- Feature 3

## Documentation

- [Installation](docs/installation.md)
- [User Guide](docs/user-guide.md)
- [API Reference](docs/api-reference.md)

## For More Information

See [docs/](docs/) for comprehensive documentation.

```text

---

## Validation & Quality

### Automated Validation

All documentation MUST pass:

1. **markdownlint-cli2** - Markdown syntax and style
2. **yamllint** - YAML frontmatter validation
3. **link-checker** - Broken link detection
4. **spell-checker** - Spelling validation

### Quality Metrics

## Minimum standards

- ✅ 95/100 markdownlint score
- ✅ Zero broken internal links
- ✅ YAML frontmatter valid
- ✅ <5 spelling errors per 1000 words

## Grading criteria

| Score | Grade | Status |
| ------- | ------- | -------- |
| 95-100 | A | Production-ready |
| 85-94 | B | Acceptable, improvements recommended |
| 75-84 | C | Needs improvement |
| <75 | F | Must be revised |

### Manual Review Checklist

- [ ] Clear and concise writing
- [ ] Accurate and up-to-date information
- [ ] Proper code examples with expected output
- [ ] Consistent terminology throughout
- [ ] Progressive disclosure (quick start → detailed)
- [ ] All links functional
- [ ] YAML frontmatter complete
- [ ] Proper file naming convention
- [ ] Appropriate directory location

---

## Automated Maintenance

### Documentation Linting

## Run before commit

```bash

## Lint all documentation

./scripts/lint-all.sh

## Auto-fix where possible

./scripts/lint-all.sh --fix

## Lint specific file

markdownlint-cli2 docs/guide.md
```text

### Git Hooks

**Pre-commit hook** validates documentation:

```bash
#!/bin/bash

## .git/hooks/pre-commit

## Run documentation linters

markdownlint-cli2 $(git diff --cached --name-only --diff-filter=ACM | grep '\.md$')

## Check YAML frontmatter

yamllint $(git diff --cached --name-only --diff-filter=ACM | grep '\.md$')
```text

### Automated Cleanup

## Documentation cleanup workflow

```bash

## Fix common issues

./scripts/docs/fix-markdown-issues.sh

## Update last_updated dates in frontmatter

./scripts/docs/update-frontmatter-dates.py

## Generate documentation index

./scripts/docs/generate-doc-index.py

## Validate all docs

./scripts/docs/validate-all-docs.sh
```text

### CI/CD Integration

## GitHub Actions workflow

```yaml
name: Documentation Quality

on: [push, pull_request]

jobs:
lint-docs:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Lint Markdown
run: markdownlint-cli2 "**/*.md"
- name: Lint YAML
run: yamllint .
- name: Check Links
run: ./scripts/check-links.sh
```text

---

## Standards Compliance

This document complies with:

- CODITECT-STANDARD-DOCUMENTATION v1.0.0 (self-referential)
- CODITECT-STANDARD-README-MD v1.0.0
- CODITECT-STANDARD-Claude-MD v1.0.0

---

## Changelog

### v1.1.0 (2025-12-22)

- **Agentic Documentation Patterns** - New section for AI agent optimization
- Added agentic frontmatter fields: `audience`, `type`, `tokens`, `summary`, `when_to_read`, `keywords`
- Token budget guidelines by document type
- Tiered reading strategy for context preservation
- Database integration schema for `/cxq`
- Document index generation specification
- Progressive implementation roadmap (Phase A → B → C)

### v1.0.0 (2025-12-06)

- Initial release
- Comprehensive markdown standards
- Automated validation guidelines
- Progressive disclosure patterns
- Code documentation standards

---

## References

### Internal Standards

- [CODITECT-STANDARD-README-MD.md](CODITECT-STANDARD-README-MD.md)
- [CODITECT-STANDARD-Claude-MD.md](CODITECT-STANDARD-CLAUDE-MD.md)
- [README-TEMPLATE.md](TEMPLATES/README-TEMPLATE.md)

### External Resources

- [CommonMark Specification](https://commonmark.org/)
- [GitHub Flavored Markdown](https://github.github.com/gfm/)
- [markdownlint Rules](https://github.com/DavidAnson/markdownlint/blob/main/doc/Rules.md)

---

**Last Updated:** 2025-12-22
**Version:** 1.1.0
**Status:** Production
**Maintainer:** Hal Casteel