ADR-161: Component Quality Assurance Framework
Status: Accepted
Date: 2026-02-07
Author: Claude (Opus 4.6)
Deciders: Hal Casteel
Context
CODITECT has 3,379+ components across 7 types (agents, skills, commands, hooks, scripts, workflows, tools) but no unified QA grading system. An initial agent QA report (Feb 7, 2026) revealed critical limitations:
- Presence-only checking: Grading detected whether sections existed but not whether content was meaningful
- Single-type coverage: Only agents had a grading script; 6 other types had none
- Missing standards: Commands had no standard at all (363 commands ungoverned)
- No content quality heuristics: No specificity scoring, code completeness detection, or link validation
- No orchestration: No way to grade all component types together
Decision
Implement a layered Component QA Framework with type-specific graders, content quality heuristics, and unified orchestration.
Architecture
Layer 4: /qa command (user interface)
Layer 3: component-qa-reviewer agent (orchestration + remediation)
Layer 2: qa-grading-framework skill (patterns + methodology)
Layer 1: scripts/qa/grade-*.py (7 type-specific execution engines)
Layer 0: coditect-core-standards/ (source of truth for criteria)
Shared: scripts/qa/qa_common.py (shared utilities)
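As a rough sketch of how Layer 3 could drive Layer 1, the orchestrator can enumerate one grader invocation per component type. The `grade-<type>.py` naming follows the pattern above; the `--json` flag is an assumption, not the documented CLI.

```python
from pathlib import Path

# The seven component types from the Context section.
COMPONENT_TYPES = ["agents", "skills", "commands", "hooks",
                   "scripts", "workflows", "tools"]

def grader_commands(qa_dir: Path) -> list[list[str]]:
    """Build one grader invocation per component type (Layer 1).

    Assumes graders follow the grade-<type>.py naming pattern and accept
    a hypothetical --json flag for machine-readable output.
    """
    return [
        ["python", str(qa_dir / f"grade-{ctype}.py"), "--json"]
        for ctype in COMPONENT_TYPES
    ]
```

Each command can then be run (e.g. via `subprocess.run`) and its stdout parsed into the unified JSON schema described below.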
Grading Matrix
| Type | Cat 1 | Cat 2 | Cat 3 | Cat 4 | Cat 5 |
|---|---|---|---|---|---|
| Agents | File Format (20%) | YAML Frontmatter (40%) | Instruction Quality (30%) | Documentation (10%) | - |
| Skills | YAML Frontmatter (40%) | Progressive Disclosure (25%) | Instruction Quality (25%) | File Structure (10%) | - |
| Commands | File Format (15%) | YAML Frontmatter (35%) | Specification Quality (30%) | Documentation (20%) | - |
| Hooks | Structure (20%) | Security (30%) | Performance (20%) | Integration (15%) | Documentation (15%) |
| Scripts | Structure (20%) | CLI Interface (20%) | Security (20%) | Error Handling (20%) | Documentation (20%) |
| Workflows | Prerequisites (15%) | Step Completeness (30%) | Examples (25%) | Integration (15%) | Troubleshooting (15%) |
| Tools | Features (20%) | Usage Examples (25%) | Architecture (20%) | Setup (20%) | Troubleshooting (15%) |
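The matrix rows above combine into a single 0-100 score as a weighted sum of category scores. A minimal sketch using the Agents row (category keys are illustrative; the real graders' internal names may differ):

```python
# Weights mirror the "Agents" row of the grading matrix.
AGENT_WEIGHTS = {
    "file_format": 0.20,
    "yaml_frontmatter": 0.40,
    "instruction_quality": 0.30,
    "documentation": 0.10,
}

def weighted_score(category_scores: dict, weights: dict) -> float:
    """Combine 0..1 category scores into a 0..100 total using matrix weights."""
    assert abs(sum(weights.values()) - 1.0) < 1e-9, "weights must sum to 100%"
    return 100.0 * sum(
        weights[cat] * category_scores.get(cat, 0.0) for cat in weights
    )
```

A component scoring perfectly on frontmatter but zero elsewhere would earn 40 points, reflecting that category's 40% weight.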
Unified Grading Scale
| Grade | Score | Meaning |
|---|---|---|
| A | 90-100% | Production-ready, exemplary |
| B | 80-89% | Production-ready, minor improvements |
| C | 70-79% | Functional, moderate improvements |
| D | 60-69% | Significant improvements needed |
| F | <60% | Does not meet minimum standards |
Compliance Target: Grade B (80%) minimum within 30 days of standard publication.
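The scale above maps directly to a threshold check; a minimal sketch:

```python
def letter_grade(score: float) -> str:
    """Map a 0-100 score to the unified A-F scale."""
    if score >= 90:
        return "A"
    if score >= 80:
        return "B"
    if score >= 70:
        return "C"
    if score >= 60:
        return "D"
    return "F"

def meets_compliance_target(score: float) -> bool:
    """Grade B (80%) is the minimum compliance target."""
    return score >= 80.0
```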
Content Quality Heuristics
Beyond presence-checking, graders implement:
- Specificity Score: Ratio of domain-specific terms to generic words (flag below a 0.3 threshold)
- Code Example Quality: Distinguish runnable code (imports, function calls) from pseudocode or comment-only blocks
- Link Validation: Verify that referenced files exist on disk
- Staleness Detection: Compare `updated` frontmatter date against file mtime (flag if >90 days stale)
- Instruction Density: Imperative verbs per paragraph as an effectiveness proxy
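As an illustration of the specificity heuristic, here is one possible reading of it: the share of words drawn from a domain lexicon, flagged when below the 0.3 threshold. The word lists are placeholders; real graders presumably use richer lexicons and a more nuanced ratio.

```python
def specificity_score(text: str, domain_terms: set) -> float:
    """Fraction of words that are domain-specific (one reading of the heuristic)."""
    words = [w.strip(".,:;()").lower() for w in text.split()]
    if not words:
        return 0.0
    specific = sum(1 for w in words if w in domain_terms)
    return specific / len(words)

def is_specific_enough(text: str, domain_terms: set,
                       threshold: float = 0.3) -> bool:
    """Apply the 0.3 specificity threshold from the heuristics list."""
    return specificity_score(text, domain_terms) >= threshold
```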
JSON Output Schema
All graders produce consistent JSON:
```json
{
  "summary": {
    "total_components": 0,
    "average_score": 0.0,
    "grade_distribution": {"A": 0, "B": 0, "C": 0, "D": 0, "F": 0},
    "errors": 0
  },
  "attribute_pass_rates": {
    "attribute_name": {"passed": 0, "failed": 0, "rate": 0.0}
  },
  "components": [
    {
      "name": "component-name",
      "scores": {"A1_check_name": 1},
      "category_scores": {"A_category": 0.0},
      "total_base": 0.0,
      "grade": "A"
    }
  ],
  "errors": []
}
```
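The `summary` block can be derived from the `components` array. A minimal sketch (field names follow the schema above; the aggregation logic itself is an assumption):

```python
from collections import Counter

def build_summary(components: list, errors: list) -> dict:
    """Aggregate per-component results into the schema's summary block."""
    dist = Counter(c["grade"] for c in components)
    scores = [c["total_base"] for c in components]
    return {
        "total_components": len(components),
        "average_score": round(sum(scores) / len(scores), 2) if scores else 0.0,
        "grade_distribution": {g: dist.get(g, 0) for g in "ABCDF"},
        "errors": len(errors),
    }
```

Because every grader emits this same shape, a dashboard can merge the seven per-type reports without type-specific handling.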
Consequences
Positive
- All 3,379+ components graded with type-specific criteria
- Content quality assessed beyond presence-checking
- Unified JSON output enables dashboards and trend tracking
- Standards serve as authoritative source of truth for grading criteria
- Non-destructive: hook warns but doesn't block; `--fix` mode is opt-in
Negative
- Content quality heuristics are approximations, not perfect measures
- Initial run will surface many low-scoring components requiring remediation
- Maintenance burden: standards changes require grader script updates
Risks
- False positives: Heuristics may flag valid content as low-quality
- Mitigation: Tunable thresholds and manual override capability
References
- coditect-core-standards/coditect-standard-agents.md - Agent quality criteria
- coditect-core-standards/coditect-standard-skills.md - Skill quality criteria
- coditect-core-standards/coditect-standard-commands.md - Command quality criteria (NEW)
- coditect-core-standards/coditect-standard-hooks.md - Hook quality criteria
- coditect-core-standards/coditect-standard-scripts.md - Script quality criteria
- coditect-core-standards/coditect-standard-workflows.md - Workflow quality criteria
- docs/project-management/AGENT-QA-REPORT-2026-02-07.md - Initial agent QA report