
ADR-010: Component Capability Schema

Status: Accepted
Date: 2025-12-20
Updated: 2026-02-03 (Migrated to coditect-core per ADR-150)
Deciders: Hal Casteel, CODITECT Core Team
Categories: Architecture, Self-Awareness, Discovery


Context

CODITECT has grown to 3,458 components across 7 types:

  • 776 Agents
  • 377 Commands
  • 445 Skills
  • 581 Scripts
  • 118 Hooks
  • 1,153 Workflows
  • 8 Prompts

The Problem: No unified way for the system to understand its own capabilities.

Current state:

  • Components are markdown files on disk
  • No structured capability metadata
  • No searchable index
  • Orchestrators cannot discover "what can handle X?"
  • No learning from past usage patterns

Questions the system cannot answer:

  1. "What component handles security code review?"
  2. "When should I use council-orchestrator vs code-reviewer?"
  3. "What inputs does this component need?"
  4. "What worked before for similar tasks?"

Decision

Implement a unified Component Capability Schema that enables CODITECT self-awareness through structured metadata and indexed search.

Schema Structure

Every component will have standardized metadata across 7 dimensions:

component:
  # Identity - What is it?
  id: "council-orchestrator"
  type: "agent"
  name: "Council Orchestrator"
  version: "1.0.0"
  status: "operational"
  path: "agents/council-orchestrator.md"

  # Capabilities - What can it do?
  capabilities:
    primary: ["coordinate multi-agent review", "generate verdicts"]
    tags: [code-review, multi-agent, consensus, compliance]
    domains: [software-development, quality-assurance]
    actions: [review, evaluate, synthesize]

  # Triggers - When to use it?
  triggers:
    use_when: ["compliance audit required", "multi-perspective needed"]
    avoid_when: ["quick feedback needed", "simple change"]
    keywords: ["council review", "peer review", "consensus"]
    complexity: "high"

  # Interface - How to use it?
  interface:
    inputs: [{name: "target", type: "file_path", required: true}]
    outputs: [{name: "verdict", type: "CouncilVerdict"}]

  # Relationships - What works together?
  relationships:
    invokes: ["security-specialist", "council-chairman"]
    alternatives: [{id: "code-reviewer", difference: "faster, less thorough"}]
    complements: ["council-review skill"]

  # Quality - How reliable?
  quality:
    maturity: "production"
    confidence: 0.85
    documentation_quality: "complete"

  # Use Cases - What scenarios?
  use_cases:
    - name: "HIPAA Compliance Review"
      scenario: "Healthcare code with PHI handling"
      configuration: {compliance: ["hipaa"]}
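
A record in this shape can be checked before it is indexed. The following is a minimal sketch, not the project's actual validator: `validate_component` is a hypothetical helper, the required fields mirror the NOT NULL columns of the `components` table, the allowed values come from the schema comments plus the seven types listed in Context, and the 0 to 1 range for `confidence` is an assumption.

```python
# Hypothetical validator for a parsed component record (a plain dict).
# Allowed values follow this ADR's schema comments; "prompt" is added
# because Context lists Prompts as a component type.
COMPONENT_TYPES = {"agent", "command", "skill", "script", "hook", "workflow", "prompt"}
COMPLEXITY_LEVELS = {"low", "medium", "high", "enterprise"}
REQUIRED_IDENTITY_FIELDS = ("id", "type", "name", "path")

EXAMPLE = {
    "id": "council-orchestrator",
    "type": "agent",
    "name": "Council Orchestrator",
    "path": "agents/council-orchestrator.md",
    "triggers": {"complexity": "high"},
    "quality": {"confidence": 0.85},
}


def validate_component(component: dict) -> list[str]:
    """Return a list of problems; an empty list means the record looks valid."""
    problems = []
    for key in REQUIRED_IDENTITY_FIELDS:
        if not component.get(key):
            problems.append(f"missing required field: {key}")
    ctype = component.get("type")
    if ctype and ctype not in COMPONENT_TYPES:
        problems.append(f"unknown type: {ctype}")
    complexity = component.get("triggers", {}).get("complexity")
    if complexity is not None and complexity not in COMPLEXITY_LEVELS:
        problems.append(f"unknown complexity: {complexity}")
    # Assumption: confidence is a probability-like score in [0, 1].
    confidence = component.get("quality", {}).get("confidence")
    if confidence is not None and not 0.0 <= confidence <= 1.0:
        problems.append(f"confidence out of range: {confidence}")
    return problems


print(validate_component(EXAMPLE))
```

Returning a list of problems rather than raising on the first one fits the "graceful degradation" goal: an indexer could log the issues and still index the fields that are present.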

Storage: SQLite with FTS5 (ADR-118 Tier 1)

Database: platform.db (Tier 1 - Component metadata, regenerable)
Current Size: ~3 MB
Indexed Components: 3,458+

ADR-118 Compliance: Component index stored in platform.db (Tier 1). Component metadata is regenerable from source files.

-- Core component registry (IMPLEMENTED)
CREATE TABLE components (
    id TEXT PRIMARY KEY,
    type TEXT NOT NULL, -- agent, command, skill, script, hook, workflow, prompt
    name TEXT NOT NULL,
    version TEXT,
    status TEXT DEFAULT 'operational', -- operational, beta, deprecated, experimental
    path TEXT NOT NULL,
    category TEXT,
    subcategory TEXT,
    description TEXT, -- Extracted from component file
    complexity TEXT DEFAULT 'medium', -- low, medium, high, enterprise
    maturity TEXT DEFAULT 'production', -- experimental, beta, production, stable
    confidence REAL DEFAULT 0.5,
    documentation_quality TEXT DEFAULT 'partial',
    content_hash TEXT, -- SHA256 for incremental updates
    indexed_at TEXT,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP,
    updated_at TEXT DEFAULT CURRENT_TIMESTAMP
);

-- Component capabilities (IMPLEMENTED - 11,412 rows)
CREATE TABLE capabilities (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    component_id TEXT REFERENCES components(id) ON DELETE CASCADE,
    capability TEXT NOT NULL,
    capability_type TEXT NOT NULL -- primary, tag, domain, action
);

-- Usage triggers (IMPLEMENTED - 4,190 rows)
CREATE TABLE triggers (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    component_id TEXT REFERENCES components(id) ON DELETE CASCADE,
    trigger_type TEXT NOT NULL, -- use_when, avoid_when, keyword
    description TEXT NOT NULL
);

-- Inter-component relationships (IMPLEMENTED - 3,421 rows)
CREATE TABLE relationships (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    source_id TEXT REFERENCES components(id) ON DELETE CASCADE,
    target_id TEXT,
    relationship_type TEXT NOT NULL, -- invokes, invoked_by, alternative, complement
    notes TEXT
);

-- Usage statistics (stored in sessions.db - Tier 3)
-- See ADR-118 for storage tier details

-- Full-text search (IMPLEMENTED)
CREATE VIRTUAL TABLE component_search USING fts5(
    id, name, type, description, capabilities, triggers
);

-- B-Tree indexes for query performance
CREATE INDEX idx_components_type ON components(type);
CREATE INDEX idx_components_category ON components(category);
CREATE INDEX idx_components_status ON components(status);
CREATE INDEX idx_capabilities_component ON capabilities(component_id);
CREATE INDEX idx_capabilities_type ON capabilities(capability_type);
CREATE INDEX idx_triggers_component ON triggers(component_id);
CREATE INDEX idx_relationships_source ON relationships(source_id);
CREATE INDEX idx_relationships_target ON relationships(target_id);
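
To illustrate how one schema record might be flattened into these tables, here is a sketch using Python's standard `sqlite3` module against a trimmed copy of the schema. `open_index` and `index_component` are hypothetical names, not functions from scripts/component-indexer.py, and the choice to feed only `use_when` and `keywords` into the search row is an assumption. One SQLite detail worth noting: foreign-key enforcement is off by default, so `ON DELETE CASCADE` only takes effect after `PRAGMA foreign_keys = ON` is set on the connection.

```python
import sqlite3

# Maps YAML capability keys to the capability_type values named in the schema.
CAPABILITY_TYPES = {"primary": "primary", "tags": "tag", "domains": "domain", "actions": "action"}

# Trimmed copy of the ADR schema, enough for the demonstration.
SCHEMA = """
CREATE TABLE components (
    id TEXT PRIMARY KEY, type TEXT NOT NULL, name TEXT NOT NULL,
    path TEXT NOT NULL, description TEXT
);
CREATE TABLE capabilities (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    component_id TEXT REFERENCES components(id) ON DELETE CASCADE,
    capability TEXT NOT NULL,
    capability_type TEXT NOT NULL
);
CREATE VIRTUAL TABLE component_search USING fts5(
    id, name, type, description, capabilities, triggers
);
"""

EXAMPLE = {
    "id": "council-orchestrator", "type": "agent", "name": "Council Orchestrator",
    "path": "agents/council-orchestrator.md",
    "capabilities": {"primary": ["coordinate multi-agent review", "generate verdicts"],
                     "tags": ["code-review", "consensus"]},
    "triggers": {"use_when": ["compliance audit required"], "keywords": ["council review"]},
}


def open_index() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # required for ON DELETE CASCADE
    conn.executescript(SCHEMA)
    return conn


def index_component(conn: sqlite3.Connection, comp: dict) -> None:
    """Insert one component, its capability rows, and its search row."""
    caps = [(comp["id"], value, CAPABILITY_TYPES[key])
            for key, values in comp.get("capabilities", {}).items()
            if key in CAPABILITY_TYPES for value in values]
    trig = comp.get("triggers", {})
    trigger_text = " ".join(trig.get("use_when", []) + trig.get("keywords", []))
    with conn:  # one transaction per component
        conn.execute(
            "INSERT INTO components (id, type, name, path, description) VALUES (?, ?, ?, ?, ?)",
            (comp["id"], comp["type"], comp["name"], comp["path"], comp.get("description")))
        conn.executemany(
            "INSERT INTO capabilities (component_id, capability, capability_type) VALUES (?, ?, ?)",
            caps)
        conn.execute(
            "INSERT INTO component_search (id, name, type, description, capabilities, triggers) "
            "VALUES (?, ?, ?, ?, ?, ?)",
            (comp["id"], comp["name"], comp["type"], comp.get("description") or "",
             " ".join(c[1] for c in caps), trigger_text))


conn = open_index()
index_component(conn, EXAMPLE)
print(conn.execute("SELECT capability, capability_type FROM capabilities").fetchall())
```

Deleting the row from `components` then removes its `capabilities` rows through the cascade; the FTS5 row is a separate virtual table and would need its own delete.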

Query Interface

Integrated with /cxq and /discover commands:

# Find components by capability
/discover "security code review"

# Compare alternatives
/discover --compare council-orchestrator code-reviewer

# Find by use case
/discover --use-case "HIPAA compliance review"

# Show relationships
/discover --relationships council-orchestrator

# Via /cxq
/cxq --components "security"
/cxq --component-semantic "review code for vulnerabilities"
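
Under the hood, a call such as `/discover "security code review"` could reduce to an FTS5 `MATCH` query over `component_search`. The sketch below shows one way to do that; `to_match_query` and `discover` are hypothetical helpers, the sample rows are made up for the demonstration, and combining terms with OR is an assumption about the desired recall. Each term is wrapped in double quotes so that user input is treated as text rather than as FTS5 query syntax.

```python
import sqlite3


def to_match_query(text: str) -> str:
    """Turn free text into an FTS5 query: quoted terms joined with OR."""
    terms = text.split()
    return " OR ".join('"' + term.replace('"', '""') + '"' for term in terms)


def discover(conn: sqlite3.Connection, text: str, limit: int = 5) -> list[tuple]:
    """Return (id, type) rows, best match first, using FTS5's built-in rank."""
    query = to_match_query(text)
    if not query:
        return []
    return conn.execute(
        "SELECT id, type FROM component_search "
        "WHERE component_search MATCH ? ORDER BY rank LIMIT ?",
        (query, limit)).fetchall()


# Demonstration data (illustrative only, not taken from platform.db).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE component_search USING fts5("
             "id, name, type, description, capabilities, triggers)")
conn.executemany(
    "INSERT INTO component_search VALUES (?, ?, ?, ?, ?, ?)",
    [("council-orchestrator", "Council Orchestrator", "agent",
      "Coordinates multi-agent review and generates verdicts",
      "code-review multi-agent consensus compliance", "council review"),
     ("security-specialist", "Security Specialist", "agent",
      "Reviews code for security vulnerabilities",
      "security code review", "security audit"),
     ("devops-engineer", "DevOps Engineer", "agent",
      "Deploys services to GCP", "deployment gcp", "deploy")])

for row in discover(conn, "security code review"):
    print(row)
```

Ordering by `rank` uses FTS5's default BM25 scoring, so components matching rarer terms surface first.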

Rationale

Why Unified Schema?

| Approach | Pros | Cons |
|---|---|---|
| Per-type schemas | Type-specific fields | No cross-type discovery |
| Unified schema | Single search, relationships | Some fields not applicable |
| No schema (current) | No maintenance | No discoverability |

Decision: Unified schema with type-specific extensions. The benefits of cross-type discovery outweigh the cost of some unused fields.

Why SQLite + FTS5?

| Option | Pros | Cons |
|---|---|---|
| SQLite + FTS5 | Fast, embedded, full-text | Single-node only |
| PostgreSQL | Scalable, advanced queries | External dependency |
| Elasticsearch | Best search | Heavy, complex |
| JSON files | Simple | Slow search, no relationships |

Decision: SQLite + FTS5. Matches existing /cxq infrastructure. 3,458 components is well within SQLite's sweet spot. Can migrate later if needed.

Why Include Usage Stats?

Self-awareness requires learning from experience:

  • Which components succeed most often?
  • Which are slow/fast?
  • Which are used together?

This enables recommendations such as: "For GCP deployments, devops-engineer has an 89% success rate over 45 invocations."
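
A recommendation like this comes down to a grouped count over invocation records. The sketch below assumes a hypothetical `invocations` table with `component_id`, `task_domain`, and `succeeded` columns; this ADR does not define the usage-statistics schema (it lives in sessions.db per ADR-118), and the rows are synthetic, chosen only to reproduce the figure quoted above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table layout; the real usage-statistics schema is not given here.
conn.execute("""
    CREATE TABLE invocations (
        component_id TEXT NOT NULL,
        task_domain  TEXT NOT NULL,
        succeeded    INTEGER NOT NULL  -- 1 = success, 0 = failure
    )
""")

# Synthetic data: 40 successes and 5 failures, i.e. 45 invocations.
rows = [("devops-engineer", "gcp-deployment", 1)] * 40 \
     + [("devops-engineer", "gcp-deployment", 0)] * 5
conn.executemany("INSERT INTO invocations VALUES (?, ?, ?)", rows)


def success_summary(conn, component_id, task_domain):
    """Return (invocation_count, success_rate); the rate is None with no data."""
    total, successes = conn.execute(
        "SELECT COUNT(*), COALESCE(SUM(succeeded), 0) FROM invocations "
        "WHERE component_id = ? AND task_domain = ?",
        (component_id, task_domain)).fetchone()
    rate = successes / total if total else None
    return total, rate


total, rate = success_summary(conn, "devops-engineer", "gcp-deployment")
print(f"devops-engineer: {rate:.0%} success rate over {total} invocations")
```

Reporting the invocation count alongside the rate matters: a high rate over a handful of runs is much weaker evidence than the same rate over dozens.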


Consequences

Positive

  1. Self-Discovery - Orchestrators can find appropriate components at runtime
  2. Better Routing - Match tasks to components by capability, not just name
  3. Relationship Awareness - Know what works together
  4. Experience Learning - Improve over time based on usage
  5. Reduced Hallucination - Agents know what actually exists

Negative

  1. Maintenance Overhead - Schema must be kept in sync with components
  2. Indexing Cost - Initial extraction from 3,458+ files
  3. Schema Evolution - Changes require migration

Mitigations

| Risk | Mitigation |
|---|---|
| Schema drift | Automated indexer runs on component changes |
| Missing metadata | Graceful degradation, infer from content |
| Query complexity | Natural language → SQL translation layer |
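
For the schema-drift mitigation, the `content_hash` column allows the indexer to skip files that have not changed. The following sketch shows that idea under stated assumptions: `needs_reindex` and `record_index` are hypothetical names, the table is a trimmed copy of `components`, and the upsert syntax requires SQLite 3.24 or later.

```python
import hashlib
import sqlite3


def content_hash(text: str) -> str:
    """SHA-256 hex digest of the component file's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def needs_reindex(conn: sqlite3.Connection, component_id: str, text: str) -> bool:
    """True if the component is new or its stored hash differs from the file."""
    row = conn.execute(
        "SELECT content_hash FROM components WHERE id = ?", (component_id,)).fetchone()
    return row is None or row[0] != content_hash(text)


def record_index(conn, component_id, ctype, name, path, text) -> None:
    """Insert or update the component row with the current hash and timestamp."""
    with conn:
        conn.execute(
            "INSERT INTO components (id, type, name, path, content_hash, indexed_at) "
            "VALUES (?, ?, ?, ?, ?, CURRENT_TIMESTAMP) "
            "ON CONFLICT(id) DO UPDATE SET "
            "content_hash = excluded.content_hash, indexed_at = CURRENT_TIMESTAMP",
            (component_id, ctype, name, path, content_hash(text)))


# Trimmed copy of the components table for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE components (
        id TEXT PRIMARY KEY, type TEXT NOT NULL, name TEXT NOT NULL,
        path TEXT NOT NULL, content_hash TEXT, indexed_at TEXT
    )
""")

text_v1 = "# Council Orchestrator\nCoordinates multi-agent review."
if needs_reindex(conn, "council-orchestrator", text_v1):
    record_index(conn, "council-orchestrator", "agent", "Council Orchestrator",
                 "agents/council-orchestrator.md", text_v1)
print(needs_reindex(conn, "council-orchestrator", text_v1))
```

Because the index is regenerable (Tier 1), a full rebuild remains a valid fallback whenever the stored hashes are in doubt.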

Implementation Status

Status: Phases 1-2 Complete, Phase 3 In Progress, Phase 4 Pending

Current Statistics (2026-02-03)

| Metric | Value |
|---|---|
| Database Size | ~3 MB |
| Components Indexed | 3,458+ |
| Capabilities Extracted | 15,000+ |
| Triggers Extracted | 5,000+ |
| Relationships Mapped | 4,000+ |
| Semantic Embeddings | 100% coverage |

Implementation Artifacts

| Artifact | Location | Status |
|---|---|---|
| Component Indexer | scripts/component-indexer.py | Complete |
| Database | platform.db (Tier 1) | Populated |
| Query Interface | /cxq --components, /discover | Complete |
| Schema Spec | This ADR | Complete |

Implementation Plan

Phase 1: Schema & Indexer (Week 1) - COMPLETE

  • Finalize schema specification
  • Create component indexer script (scripts/component-indexer.py)
  • Extract capabilities from existing files
  • Build SQLite database with FTS5
  • Add B-tree indexes for performance

Phase 2: Query Interface (Week 2) - COMPLETE

  • Add /discover command
  • Implement FTS5 queries via /cxq --components
  • Build comparison views
  • Integration with /cxq
  • Semantic search with embeddings

Phase 3: Integration (Week 3) - IN PROGRESS

  • Update use-case-analyzer to use capability index
  • Add usage tracking hooks
  • Integrate with orchestrators

Phase 4: Learning Loop (Week 4) - PENDING

  • Track invocations from session history
  • Update success rates automatically
  • Surface recommendations

Alternatives Considered

1. Extend Existing Activation Status

Use component-activation-status.json for capabilities.

Rejected: File is already 65K+ tokens. Adding capabilities would make it unwieldy. Better to separate concerns.

2. Embed Capabilities in Component Files

Add YAML frontmatter to all component markdown files.

Partially Accepted: Components SHOULD have frontmatter, but we still need a searchable index. This is complementary.
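
To show how frontmatter and the index can complement each other, here is a minimal sketch of pulling frontmatter out of a component file so an indexer could read it. `split_frontmatter` is a hypothetical helper that handles only flat `key: value` lines between `---` delimiters; nested structures such as `capabilities` would need a real YAML parser, which is outside this sketch.

```python
def split_frontmatter(markdown: str) -> tuple[dict, str]:
    """Split a markdown document into (flat frontmatter dict, body text).

    Only top-level scalar `key: value` lines are parsed. Documents without a
    leading, properly closed `---` block are returned unchanged with an
    empty dict.
    """
    lines = markdown.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, markdown
    for end, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            break
    else:
        return {}, markdown  # opening delimiter was never closed

    meta = {}
    for line in lines[1:end]:
        # Skip comments, indented (nested) lines, and lines without a key.
        if ":" in line and not line.startswith((" ", "#")):
            key, _, value = line.partition(":")
            meta[key.strip()] = value.strip().strip('"')
    body = "\n".join(lines[end + 1:])
    return meta, body


doc = '---\nid: "council-orchestrator"\ntype: agent\n---\n# Council Orchestrator\n'
meta, body = split_frontmatter(doc)
print(meta)
print(body)
```

With this split, the markdown file stays the source of truth and the SQLite index remains a regenerable view over it.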

3. Use LLM for Discovery

Ask Claude to search component files on each query.

Rejected: Too slow, token-expensive, inconsistent. Structured index is more reliable.


Related ADRs

  • ADR-011 - Self-Awareness Framework (higher-level patterns)
  • ADR-012 - LLM Council Pattern (uses this schema)
  • ADR-020 - Context Extraction (complementary system)
  • ADR-021 - Context Query
  • ADR-118 - Four-Tier Database Architecture



Approved: 2025-12-20
Migrated: 2026-02-03 (ADR-150)
Review Date: 2026-06-20