
Document Frontmatter Quality Report

Date: 2026-02-17
Task: J.20 (Document Taxonomy System)
Scope: 1,446 documents indexed with taxonomy-aware frontmatter
Database: platform.db (document_frontmatter table)


Grade Distribution

| Grade | Count | Percentage | Description |
|---|---|---|---|
| A (90-100) | 499 | 34.5% | Excellent: all key fields present |
| B (70-89) | 717 | 49.6% | Good: minor missing fields |
| C (50-69) | 172 | 11.9% | Fair: multiple missing fields |
| D (30-49) | 51 | 3.5% | Poor: major fields missing |
| F (<30) | 7 | 0.5% | Critical: external artifacts |

84.1% of documents score B or above. The remaining 15.9% (230 documents) need attention.
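The score bands and the headline percentage can be reproduced with a few lines of Python. This is only a sketch: the report does not describe how a document's score is computed, so `grade_for` (a hypothetical name) assumes nothing more than a numeric score between 0 and 100.

```python
def grade_for(score: float) -> str:
    """Map a 0-100 quality score to the letter grade bands used in this report."""
    if score >= 90:
        return "A"
    if score >= 70:
        return "B"
    if score >= 50:
        return "C"
    if score >= 30:
        return "D"
    return "F"


# Counts copied from the grade distribution table.
counts = {"A": 499, "B": 717, "C": 172, "D": 51, "F": 7}
total = sum(counts.values())

# Share of documents graded A or B, as a percentage rounded to one decimal.
share_b_or_above = round(100 * (counts["A"] + counts["B"]) / total, 1)
needs_attention = counts["C"] + counts["D"] + counts["F"]
```

Here `total` matches the 1,446 documents in scope, and `needs_attention` is the number of documents graded C, D, or F.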


Issue Frequency

| Issue | Count | % of Docs | Severity | Auto-fixable? |
|---|---|---|---|---|
| MISSING_AUDIENCE | 891 | 61.6% | Medium | Yes (directory-based inference) |
| MISSING_KEYWORDS | 878 | 60.7% | Low | Yes (content extraction) |
| MISSING_SUMMARY | 264 | 18.3% | High | Partial (AI-generated) |
| DRAFT_STATUS | 150 | 10.4% | Low | No (intentional) |
| MISSING_STATUS | 87 | 6.0% | Medium | Yes (default: active) |
| MISSING_TYPE | 64 | 4.4% | Medium | Yes (directory/filename) |
| AUTO_SUMMARY | 61 | 4.2% | Medium | Partial (need real summaries) |
| MISSING_TITLE | 58 | 4.0% | Critical | Yes (first heading / filename) |
| LOW_MOE_CONFIDENCE | 3 | 0.2% | Low | No (classifier limitation) |

Average Quality by Category

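| Category | Avg Score | Doc Count |
|---|---|---|
| Analysis | 93.4 | 43 |
| Guides | 93.2 | 73 |
| Research | 87.0 | 171 |
| Architecture | 86.2 | 363 |
| Business | 85.0 | 6 |
| Reference | 83.2 | 540 |
| Operations | 81.1 | 36 |
| Planning | 77.0 | 214 |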

Fix Plan: Priority Order

Priority 1: Missing Titles (49 core docs) — CRITICAL

Titles are displayed in the sidebar navigation, so a missing title makes navigation confusing.

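Auto-fix strategy: Read the first level-1 heading (# Heading) from the document content. The issue table lists the filename as the fallback when no heading exists.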
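A minimal sketch of this strategy, assuming Markdown documents with optional `---`-delimited frontmatter. The function names are hypothetical and not taken from the existing tooling. The filename fallback follows the "first heading / filename" note in the issue table; turning the file stem into title case is an assumption.

```python
import re
from pathlib import PurePosixPath

FENCE = "`" * 3  # Markdown code-fence marker


def strip_frontmatter(text: str) -> str:
    """Drop a leading frontmatter block delimited by '---' lines, if present."""
    if text.startswith("---"):
        end = text.find("\n---", 3)
        if end != -1:
            newline = text.find("\n", end + 1)
            return text[newline + 1:] if newline != -1 else ""
    return text


def infer_title(text: str, filename: str) -> str:
    """Return the first level-1 heading, or a title derived from the filename."""
    in_fence = False
    for line in strip_frontmatter(text).splitlines():
        # Skip fenced code so a '# comment' in a code sample is not mistaken for a heading.
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
            continue
        if in_fence:
            continue
        match = re.match(r"#\s+(.*\S)", line)
        if match:
            return match.group(1).rstrip("#").strip()
    # Assumed fallback: file stem with separators replaced by spaces, title-cased.
    stem = PurePosixPath(filename).stem
    return re.sub(r"[-_]+", " ", stem).strip().title()
```

Stripping the frontmatter first matters because a YAML comment line starting with `#` would otherwise be picked up as the title.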

Key files:

  • 12 TRACK plan files (internal/project/plans/tracks/)
  • 12 coditect-core-standards files
  • 10 PCF track files
  • 8 ADR files
  • 7 other internal docs

Priority 2: Missing or Auto-generated Summaries (258 core missing + 61 auto-generated = 319 total)

Summaries appear in search results and tooltips.

Auto-fix strategy: Extract the first non-heading paragraph, truncated to 200 characters.
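One way this extraction could look, sketched under a few assumptions: paragraphs are separated by blank lines, headings start with `#`, and truncation cuts at a word boundary so the result never exceeds the limit. The names `infer_summary` and `strip_frontmatter` are hypothetical.

```python
import re


def strip_frontmatter(text: str) -> str:
    """Drop a leading frontmatter block delimited by '---' lines, if present."""
    if text.startswith("---"):
        end = text.find("\n---", 3)
        if end != -1:
            newline = text.find("\n", end + 1)
            return text[newline + 1:] if newline != -1 else ""
    return text


def infer_summary(text: str, limit: int = 200) -> str:
    """Return the first non-heading paragraph, truncated to at most `limit` characters."""
    for block in re.split(r"\n\s*\n", strip_frontmatter(text)):
        lines = [line.strip() for line in block.splitlines()]
        # Ignore heading lines, including a heading directly above its paragraph.
        lines = [line for line in lines if line and not line.startswith("#")]
        if not lines:
            continue
        paragraph = " ".join(lines)
        if len(paragraph) <= limit:
            return paragraph
        # Assumed behavior: cut at the last space inside the limit, not mid-word.
        return paragraph[:limit].rsplit(" ", 1)[0]
    return ""
```

Summaries produced this way would still count as auto-generated, so they address the missing-summary issue but not the need for real, human-reviewed summaries.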

By directory:

  • internal/architecture/: 70 docs
  • internal/project/: 45 docs
  • templates/business-documents/: 37 docs
  • coditect-core-standards/: 12 docs
  • config/templates/: 10 docs

Priority 3: Missing Audience (891 docs)

The audience field controls badge colors in the viewer.

Auto-fix strategy: Map from directory path. The most specific matching prefix should win, so docs/reference/* overrides docs/*:

  • internal/* → "contributor"
  • docs/* → "user"
  • docs/reference/* → "technical"
  • templates/* → "technical"
  • coditect-core-standards/* → "contributor"
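The mapping above can be sketched as a longest-prefix lookup. This assumes repository-relative paths with forward slashes, and it assumes the more specific rule (docs/reference/*) is meant to take precedence over the general one (docs/*). `infer_audience` is a hypothetical name.

```python
from typing import Optional

# Directory prefixes and audiences copied from the list above.
AUDIENCE_BY_PREFIX = {
    "internal/": "contributor",
    "docs/": "user",
    "docs/reference/": "technical",
    "templates/": "technical",
    "coditect-core-standards/": "contributor",
}


def infer_audience(rel_path: str) -> Optional[str]:
    """Return the audience for the longest matching directory prefix, or None."""
    matches = [prefix for prefix in AUDIENCE_BY_PREFIX if rel_path.startswith(prefix)]
    if not matches:
        return None  # No rule applies; leave the field for manual review.
    return AUDIENCE_BY_PREFIX[max(matches, key=len)]
```

Returning `None` for unmapped paths keeps the fixer from guessing an audience for directories the plan does not cover.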

Priority 4: Missing Keywords (878 docs)

Keywords boost search relevance.

Auto-fix strategy: Build keywords from the directory name, the frontmatter type, and the first 3-5 significant words of the title.
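A rough sketch of that keyword strategy follows. The stopword list, the minimum word length, and the name `infer_keywords` are all assumptions made for illustration; the report does not define what counts as a "significant" word.

```python
import re
from pathlib import PurePosixPath
from typing import List, Optional

# Assumed stopword list; a real implementation would likely use a larger one.
STOPWORDS = {"a", "an", "and", "the", "of", "for", "to", "in", "on", "with", "by", "or"}


def infer_keywords(rel_path: str, title: str, doc_type: Optional[str] = None,
                   max_title_words: int = 5) -> List[str]:
    """Combine directory name, frontmatter type, and significant title words."""
    keywords = []
    parent = PurePosixPath(rel_path).parent.name
    if parent:
        keywords.append(parent.lower())
    if doc_type:
        keywords.append(doc_type.lower())
    words = re.findall(r"[A-Za-z0-9]+", title.lower())
    significant = [w for w in words if w not in STOPWORDS and len(w) > 2]
    keywords.extend(significant[:max_title_words])
    # Remove duplicates while preserving first-seen order.
    return list(dict.fromkeys(keywords))
```

For example, `infer_keywords("internal/architecture/adr-001.md", "The Design of the Event Bus", "adr")` yields the directory name, the type, and three title words.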

Priority 5: Missing Status (87 docs)

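Auto-fix strategy: Default to "active" for every document with no status. Documents already marked as draft are intentional (DRAFT_STATUS) and stay unchanged.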
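As a small illustration, assuming frontmatter is available as a plain dictionary, the default could be applied without touching existing values such as draft. `apply_default_status` is a hypothetical helper name.

```python
def apply_default_status(frontmatter: dict, default: str = "active") -> dict:
    """Return a copy of the frontmatter with a status set only when it is missing or empty."""
    updated = dict(frontmatter)
    if not updated.get("status"):
        updated["status"] = default
    return updated
```

Returning a copy rather than mutating the input makes it easy to diff the before and after values in a preview run.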


External Artifacts (Skip)

67 documents under analyze-new-artifacts/ and codanna/ are third-party research artifacts. Their frontmatter quality is not actionable because they come from external sources.


Recommendation

  1. Build scripts/fix-frontmatter.py to batch-fix Priorities 1-5
  2. Run with --dry-run first to preview changes
  3. Validate afterward with component-indexer.py --frontmatter-stats
  4. Re-generate publish.json to verify sidebar quality
  5. Skip external artifacts (analyze-new-artifacts/, codanna/)
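The steps above suggest a script shape along these lines. This is a hypothetical skeleton, not the actual scripts/fix-frontmatter.py: it works on an in-memory mapping of path to frontmatter, and the fixer signature, the `run` and `plan_fixes` names, and the rule that any path component named analyze-new-artifacts or codanna marks a document as external are all assumptions. Only the --dry-run flag comes from the recommendation itself.

```python
import argparse
from pathlib import PurePosixPath

# Directories treated as external artifacts (assumed to match on any path component).
EXTERNAL_DIRS = ("analyze-new-artifacts", "codanna")


def is_external(rel_path: str) -> bool:
    return any(part in EXTERNAL_DIRS for part in PurePosixPath(rel_path).parts)


def plan_fixes(docs: dict, fixers: dict) -> list:
    """Collect (path, field, value) changes for missing fields, skipping external docs.

    docs maps path -> frontmatter dict; fixers maps field -> callable(path, frontmatter).
    """
    changes = []
    for path, frontmatter in docs.items():
        if is_external(path):
            continue
        for field, fixer in fixers.items():
            if frontmatter.get(field):
                continue  # Field already present; never overwrite.
            value = fixer(path, frontmatter)
            if value is not None:
                changes.append((path, field, value))
    return changes


def run(docs: dict, fixers: dict, dry_run: bool) -> list:
    """Print each planned change and apply it unless dry_run is set."""
    changes = plan_fixes(docs, fixers)
    for path, field, value in changes:
        verb = "would set" if dry_run else "set"
        print(f"{verb} {path}: {field} = {value!r}")
        if not dry_run:
            docs[path][field] = value
    return changes


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="fix-frontmatter.py")
    parser.add_argument("--dry-run", action="store_true",
                        help="preview changes without writing them")
    return parser
```

Separating planning from applying means the dry run and the real run share the same logic, so the preview is an accurate picture of what the write pass would do.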

Estimated effort: ~1 hour automated, ~30 min review
Impact: Raises A+B from 84% to roughly 95% or higher