ADR-010: Component Capability Schema

Status: Accepted
Date: 2025-12-20
Updated: 2026-02-03 (Migrated to coditect-core per ADR-150)
Deciders: Hal Casteel, CODITECT Core Team
Categories: Architecture, Self-Awareness, Discovery

Context

CODITECT has grown to 3,458 components across 7 types:

776 Agents
377 Commands
445 Skills
581 Scripts
118 Hooks
1,153 Workflows
8 Prompts

The Problem: The system has no unified way to understand its own capabilities.

Current state:

Components are markdown files on disk
No structured capability metadata
No searchable index
Orchestrators cannot discover "what can handle X?"
No learning from past usage patterns

Questions the system cannot answer:

"What component handles security code review?"
"When should I use council-orchestrator vs code-reviewer?"
"What inputs does this component need?"
"What worked before for similar tasks?"

Decision

Implement a unified Component Capability Schema that enables CODITECT self-awareness through structured metadata and indexed search.

Schema Structure

Every component will have standardized metadata across 7 dimensions:

component:
  # Identity - What is it?
  id: "council-orchestrator"
  type: "agent"
  name: "Council Orchestrator"
  version: "1.0.0"
  status: "operational"
  path: "agents/council-orchestrator.md"

  # Capabilities - What can it do?
  capabilities:
    primary: ["coordinate multi-agent review", "generate verdicts"]
    tags: [code-review, multi-agent, consensus, compliance]
    domains: [software-development, quality-assurance]
    actions: [review, evaluate, synthesize]

  # Triggers - When to use it?
  triggers:
    use_when: ["compliance audit required", "multi-perspective needed"]
    avoid_when: ["quick feedback needed", "simple change"]
    keywords: ["council review", "peer review", "consensus"]
    complexity: "high"

  # Interface - How to use it?
  interface:
    inputs: [{name: "target", type: "file_path", required: true}]
    outputs: [{name: "verdict", type: "CouncilVerdict"}]

  # Relationships - What works together?
  relationships:
    invokes: ["security-specialist", "council-chairman"]
    alternatives: [{id: "code-reviewer", difference: "faster, less thorough"}]
    complements: ["council-review skill"]

  # Quality - How reliable?
  quality:
    maturity: "production"
    confidence: 0.85
    documentation_quality: "complete"

  # Use Cases - What scenarios?
  use_cases:
    - name: "HIPAA Compliance Review"
      scenario: "Healthcare code with PHI handling"
      configuration: {compliance: ["hipaa"]}
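
The schema above can be checked mechanically before a component is indexed. The following is a minimal sketch, not the project's validator: `validate_component` is a hypothetical helper, the allowed values are copied from the column comments in the storage schema of this ADR (with `prompt` added because the Context section lists Prompts as a component type), and the 0 to 1 range for `confidence` is an assumption.

```python
# Hypothetical validator for the capability schema; names are illustrative.
REQUIRED_IDENTITY = ("id", "type", "name", "path")
ALLOWED_TYPES = {"agent", "command", "skill", "script", "hook", "workflow", "prompt"}
ALLOWED_STATUS = {"operational", "beta", "deprecated", "experimental"}
ALLOWED_COMPLEXITY = {"low", "medium", "high", "enterprise"}
ALLOWED_MATURITY = {"experimental", "beta", "production", "stable"}


def _check_enum(problems: list[str], label: str, value, allowed: set[str]) -> None:
    """Record a problem when an optional field holds an unknown value."""
    if value is not None and value not in allowed:
        problems.append(f"{label}={value!r} not in {sorted(allowed)}")


def validate_component(comp: dict) -> list[str]:
    """Return a list of problems; an empty list means the record looks valid."""
    problems: list[str] = []
    for field in REQUIRED_IDENTITY:
        if not comp.get(field):
            problems.append(f"missing required field: {field}")
    _check_enum(problems, "type", comp.get("type"), ALLOWED_TYPES)
    _check_enum(problems, "status", comp.get("status"), ALLOWED_STATUS)
    _check_enum(problems, "complexity",
                comp.get("triggers", {}).get("complexity"), ALLOWED_COMPLEXITY)
    quality = comp.get("quality", {})
    _check_enum(problems, "maturity", quality.get("maturity"), ALLOWED_MATURITY)
    confidence = quality.get("confidence")
    # Assumption: confidence is a probability-like score in [0, 1].
    if confidence is not None and not 0.0 <= confidence <= 1.0:
        problems.append(f"confidence={confidence} outside [0, 1]")
    return problems


# Abridged copy of the council-orchestrator example from the schema.
example = {
    "id": "council-orchestrator",
    "type": "agent",
    "name": "Council Orchestrator",
    "status": "operational",
    "path": "agents/council-orchestrator.md",
    "triggers": {"complexity": "high"},
    "quality": {"maturity": "production", "confidence": 0.85},
}
print(validate_component(example))
```

Returning a list of problems instead of raising on the first one fits the "graceful degradation" goal: an indexer could log the problems and still index whatever fields are usable.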

Storage: SQLite with FTS5 (ADR-118 Tier 1)

Database: platform.db (Tier 1 - Component metadata, regenerable)
Current Size: ~3 MB
Indexed Components: 3,458+

ADR-118 Compliance: Component index stored in platform.db (Tier 1). Component metadata is regenerable from source files.

-- Core component registry (IMPLEMENTED)
CREATE TABLE components (
    id TEXT PRIMARY KEY,
    type TEXT NOT NULL,                    -- agent, command, skill, script, hook, workflow, prompt
    name TEXT NOT NULL,
    version TEXT,
    status TEXT DEFAULT 'operational',     -- operational, beta, deprecated, experimental
    path TEXT NOT NULL,
    category TEXT,
    subcategory TEXT,
    description TEXT,                      -- Extracted from component file
    complexity TEXT DEFAULT 'medium',      -- low, medium, high, enterprise
    maturity TEXT DEFAULT 'production',    -- experimental, beta, production, stable
    confidence REAL DEFAULT 0.5,
    documentation_quality TEXT DEFAULT 'partial',
    content_hash TEXT,                     -- SHA256 for incremental updates
    indexed_at TEXT,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP,
    updated_at TEXT DEFAULT CURRENT_TIMESTAMP
);

-- Component capabilities (IMPLEMENTED - 11,412 rows)
CREATE TABLE capabilities (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    component_id TEXT REFERENCES components(id) ON DELETE CASCADE,
    capability TEXT NOT NULL,
    capability_type TEXT NOT NULL          -- primary, tag, domain, action
);

-- Usage triggers (IMPLEMENTED - 4,190 rows)
CREATE TABLE triggers (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    component_id TEXT REFERENCES components(id) ON DELETE CASCADE,
    trigger_type TEXT NOT NULL,            -- use_when, avoid_when, keyword
    description TEXT NOT NULL
);

-- Inter-component relationships (IMPLEMENTED - 3,421 rows)
CREATE TABLE relationships (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    source_id TEXT REFERENCES components(id) ON DELETE CASCADE,
    target_id TEXT,
    relationship_type TEXT NOT NULL,       -- invokes, invoked_by, alternative, complement
    notes TEXT
);

-- Usage statistics (stored in sessions.db - Tier 3)
-- See ADR-118 for storage tier details

-- Full-text search (IMPLEMENTED)
CREATE VIRTUAL TABLE component_search USING fts5(
    id, name, type, description, capabilities, triggers
);

-- B-Tree indexes for query performance
CREATE INDEX idx_components_type ON components(type);
CREATE INDEX idx_components_category ON components(category);
CREATE INDEX idx_components_status ON components(status);
CREATE INDEX idx_capabilities_component ON capabilities(component_id);
CREATE INDEX idx_capabilities_type ON capabilities(capability_type);
CREATE INDEX idx_triggers_component ON triggers(component_id);
CREATE INDEX idx_relationships_source ON relationships(source_id);
CREATE INDEX idx_relationships_target ON relationships(target_id);
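
To illustrate how the `component_search` FTS5 table might be populated and queried, here is a small self-contained sketch using Python's standard `sqlite3` module against an in-memory database. It assumes the local SQLite build includes FTS5. The `index_component` and `search` helpers, the abridged `components` table, and the sample rows (including the `agents/code-reviewer.md` path and both descriptions) are illustrative assumptions, not the behavior of `scripts/component-indexer.py`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE components (          -- abridged version of the registry table
    id TEXT PRIMARY KEY,
    type TEXT NOT NULL,
    name TEXT NOT NULL,
    path TEXT NOT NULL,
    description TEXT
);
CREATE VIRTUAL TABLE component_search USING fts5(
    id, name, type, description, capabilities, triggers
);
""")


def index_component(conn, comp_id, ctype, name, path, description,
                    capabilities, triggers):
    """Insert one component into the registry and the full-text index."""
    conn.execute(
        "INSERT INTO components (id, type, name, path, description) "
        "VALUES (?, ?, ?, ?, ?)",
        (comp_id, ctype, name, path, description),
    )
    # Capabilities and triggers are flattened to text so FTS5 can tokenize them.
    conn.execute(
        "INSERT INTO component_search "
        "(id, name, type, description, capabilities, triggers) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (comp_id, name, ctype, description,
         " ".join(capabilities), " ".join(triggers)),
    )


def search(conn, query, limit=5):
    """Return matching component ids, best match first.

    Each term is quoted so punctuation such as '-' is not parsed as
    FTS5 query syntax; quoted terms are combined with implicit AND.
    """
    terms = ['"' + t.replace('"', '""') + '"' for t in query.split()]
    if not terms:
        return []
    rows = conn.execute(
        "SELECT id FROM component_search "
        "WHERE component_search MATCH ? ORDER BY rank LIMIT ?",
        (" ".join(terms), limit),
    ).fetchall()
    return [row[0] for row in rows]


# Sample data (illustrative only).
index_component(
    conn, "council-orchestrator", "agent", "Council Orchestrator",
    "agents/council-orchestrator.md", "Coordinates multi-agent review",
    ["coordinate multi-agent review", "generate verdicts", "code-review", "compliance"],
    ["compliance audit required", "council review", "peer review"],
)
index_component(
    conn, "code-reviewer", "agent", "Code Reviewer",
    "agents/code-reviewer.md", "Fast single-pass code review",
    ["code-review"], ["quick feedback needed"],
)

print(search(conn, "compliance audit"))
```

Quoting each term trades some FTS5 expressiveness (prefix queries, `OR`, `NEAR`) for robustness against free-form user input; a production query layer might expose those operators deliberately.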

Query Interface

Integrated with /cxq and /discover commands:

# Find components by capability
/discover "security code review"

# Compare alternatives
/discover --compare council-orchestrator code-reviewer

# Find by use case
/discover --use-case "HIPAA compliance review"

# Show relationships
/discover --relationships council-orchestrator

# Via /cxq
/cxq --components "security"
/cxq --component-semantic "review code for vulnerabilities"
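
As an illustration of what a relationship lookup such as `/discover --relationships council-orchestrator` could resolve to, the sketch below queries a `relationships` table with the columns defined in this ADR. The actual implementation of `/discover` is not shown in this document, so the `relationships_for` helper and its output shape are assumptions; the sample rows mirror the `relationships` section of the schema example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Same columns as the relationships table; foreign keys are omitted so the
# sketch runs without a components table.
conn.execute("""
CREATE TABLE relationships (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    source_id TEXT,
    target_id TEXT,
    relationship_type TEXT NOT NULL,
    notes TEXT
)
""")
conn.executemany(
    "INSERT INTO relationships (source_id, target_id, relationship_type, notes) "
    "VALUES (?, ?, ?, ?)",
    [
        ("council-orchestrator", "security-specialist", "invokes", None),
        ("council-orchestrator", "council-chairman", "invokes", None),
        ("council-orchestrator", "code-reviewer", "alternative",
         "faster, less thorough"),
    ],
)


def relationships_for(conn, component_id):
    """Group a component's outgoing relationships by relationship type."""
    grouped = {}
    rows = conn.execute(
        "SELECT relationship_type, target_id, notes FROM relationships "
        "WHERE source_id = ? ORDER BY relationship_type, target_id",
        (component_id,),
    )
    for rel_type, target, notes in rows:
        grouped.setdefault(rel_type, []).append((target, notes))
    return grouped


result = relationships_for(conn, "council-orchestrator")
for rel_type, targets in result.items():
    print(rel_type, targets)
```

The `WHERE source_id = ?` filter is the access path that `idx_relationships_source` is meant to serve; a reverse lookup ("who invokes this component?") would filter on `target_id` instead.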

Rationale

Why Unified Schema?

Approach	Pros	Cons
Per-type schemas	Type-specific fields	No cross-type discovery
Unified schema	Single search, relationships	Some fields not applicable
No schema (current)	No maintenance	No discoverability

Decision: Unified schema with type-specific extensions. The benefits of cross-type discovery outweigh the cost of some unused fields.

Why SQLite + FTS5?

Option	Pros	Cons
SQLite + FTS5	Fast, embedded, full-text	Single-node only
PostgreSQL	Scalable, advanced queries	External dependency
Elasticsearch	Best search	Heavy, complex
JSON files	Simple	Slow search, no relationships

Decision: SQLite + FTS5. Matches existing /cxq infrastructure. 3,458 components is well within SQLite's sweet spot. Can migrate later if needed.

Why Include Usage Stats?

Self-awareness requires learning from experience:

Which components succeed most often?
Which are slow/fast?
Which are used together?

This enables recommendations: "For GCP deployments, devops-engineer has 89% success rate over 45 invocations."
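
The arithmetic behind such a recommendation is a per-component, per-domain success ratio. The sketch below is a hedged illustration: the `Invocation` record, the `success_rates` and `recommendation` helpers, and the sample history are all hypothetical. The history assumes 40 successes out of 45 invocations, chosen only because 40/45 ≈ 0.889 rounds to the 89% quoted above.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Invocation:
    """One recorded use of a component (hypothetical record shape)."""
    component_id: str
    task_domain: str
    succeeded: bool


def success_rates(invocations):
    """Map (component_id, task_domain) -> (success_rate, invocation_count)."""
    totals = defaultdict(lambda: [0, 0])  # [successes, total]
    for inv in invocations:
        entry = totals[(inv.component_id, inv.task_domain)]
        entry[0] += int(inv.succeeded)
        entry[1] += 1
    return {key: (wins / count, count) for key, (wins, count) in totals.items()}


def recommendation(component_id, domain, rate, count):
    """Format a recommendation in the style used in this ADR."""
    return (f"For {domain}, {component_id} has {rate:.0%} success rate "
            f"over {count} invocations.")


# Assumed sample: the first 40 of 45 invocations succeeded.
history = [Invocation("devops-engineer", "GCP deployments", i < 40)
           for i in range(45)]
rate, count = success_rates(history)[("devops-engineer", "GCP deployments")]
message = recommendation("devops-engineer", "GCP deployments", rate, count)
print(message)
```

Reporting the invocation count alongside the rate matters: a high rate over a handful of runs is much weaker evidence than the same rate over dozens.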

Consequences

Positive

Self-Discovery - Orchestrators can find appropriate components at runtime
Better Routing - Match tasks to components by capability, not just name
Relationship Awareness - Know what works together
Experience Learning - Improve over time based on usage
Reduced Hallucination - Agents know what actually exists

Negative

Maintenance Overhead - Schema must be kept in sync with components
Indexing Cost - Initial extraction from 3,458+ files
Schema Evolution - Changes require migration

Mitigations

Risk	Mitigation
Schema drift	Automated indexer runs on component changes
Missing metadata	Graceful degradation, infer from content
Query complexity	Natural language → SQL translation layer
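
The schema-drift mitigation pairs naturally with the `content_hash` column (SHA256, for incremental updates) in the `components` table. Below is a minimal sketch of how an indexer might decide which files need re-indexing. `content_hash` and `plan_reindex` are hypothetical helper names, the file paths are made up for illustration, and the real `scripts/component-indexer.py` may work differently.

```python
import hashlib


def content_hash(text: str) -> str:
    """SHA256 hex digest of a component file's content."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_reindex(files: dict[str, str],
                 indexed: dict[str, str]) -> tuple[list[str], list[str]]:
    """Compare current files against stored hashes.

    files:   path -> current file content
    indexed: path -> content_hash stored at the last indexing run
    Returns (paths to re-index, paths to delete from the index).
    """
    changed = sorted(path for path, text in files.items()
                     if indexed.get(path) != content_hash(text))
    removed = sorted(path for path in indexed if path not in files)
    return changed, removed


# Illustrative data: one edited file, one unchanged, one new, one deleted.
files = {
    "agents/a.md": "version 2",
    "agents/b.md": "unchanged",
    "agents/new.md": "brand new",
}
indexed = {
    "agents/a.md": content_hash("version 1"),
    "agents/b.md": content_hash("unchanged"),
    "agents/old.md": content_hash("deleted since last run"),
}
changed, removed = plan_reindex(files, indexed)
print(changed, removed)
```

New files fall out of the same comparison as edited ones, because a path with no stored hash never equals the freshly computed digest.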

Implementation Status

Status: Phases 1-2 Complete, Phase 3 In Progress, Phase 4 Pending

Current Statistics (2026-02-03)

Metric	Value
Database Size	~3 MB
Components Indexed	3,458+
Capabilities Extracted	15,000+
Triggers Extracted	5,000+
Relationships Mapped	4,000+
Semantic Embeddings	100% coverage

Implementation Artifacts

Artifact	Location	Status
Component Indexer	`scripts/component-indexer.py`	Complete
Database	`platform.db` (Tier 1)	Populated
Query Interface	`/cxq --components`, `/discover`	Complete
Schema Spec	This ADR	Complete

Implementation Plan

Phase 1: Schema & Indexer (Week 1) - COMPLETE

Finalize schema specification
Create component indexer script (scripts/component-indexer.py)
Extract capabilities from existing files
Build SQLite database with FTS5
Add B-tree indexes for performance

Phase 2: Query Interface (Week 2) - COMPLETE

Add /discover command
Implement FTS5 queries via /cxq --components
Build comparison views
Integration with /cxq
Semantic search with embeddings

Phase 3: Integration (Week 3) - IN PROGRESS

Update use-case-analyzer to use capability index
Add usage tracking hooks
Integrate with orchestrators

Phase 4: Learning Loop (Week 4) - PENDING

Track invocations from session history
Update success rates automatically
Surface recommendations

Alternatives Considered

1. Extend Existing Activation Status

Use component-activation-status.json for capabilities.

Rejected: File is already 65K+ tokens. Adding capabilities would make it unwieldy. Better to separate concerns.

2. Embed Capabilities in Component Files

Add YAML frontmatter to all component markdown files.

Partially Accepted: Components SHOULD have frontmatter, but we still need a searchable index. This is complementary.
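
To show how frontmatter and the index complement each other, the sketch below separates a frontmatter block from a component's markdown body so an indexer could read identity fields from it. This is a deliberately simplified illustration: `split_frontmatter` is a hypothetical helper that only understands flat `key: value` lines between `---` delimiters, and nested structures such as the `capabilities` block would need a real YAML parser.

```python
def split_frontmatter(text: str) -> tuple[dict[str, str], str]:
    """Split a markdown document into (flat frontmatter fields, body).

    Returns an empty dict and the original text when no frontmatter exists.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    # Find the closing delimiter; without one, treat the file as plain markdown.
    end = next((i for i, line in enumerate(lines[1:], start=1)
                if line.strip() == "---"), None)
    if end is None:
        return {}, text
    meta: dict[str, str] = {}
    for line in lines[1:end]:
        if ":" in line and not line.lstrip().startswith("#"):
            key, _, value = line.partition(":")
            meta[key.strip()] = value.strip().strip('"')
    body = "\n".join(lines[end + 1:])
    return meta, body


document = """---
id: council-orchestrator
type: agent
name: "Council Orchestrator"
---
# Council Orchestrator
"""
meta, body = split_frontmatter(document)
print(meta, body)
```

Falling back to an empty dict when frontmatter is absent matches the "missing metadata" mitigation: the indexer can then infer fields from the body content.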

3. Use LLM for Discovery

Ask Claude to search component files on each query.

Rejected: Too slow, token-expensive, inconsistent. Structured index is more reliable.

Related ADRs

ADR-011 - Self-Awareness Framework (higher-level patterns)
ADR-012 - LLM Council Pattern (uses this schema)
ADR-020 - Context Extraction (complementary system)
ADR-021 - Context Query
ADR-118 - Four-Tier Database Architecture

References

scripts/component-indexer.py - Component indexer
use-case-analyzer agent - Primary consumer
config/component-counts.json - Component inventory

Approved: 2025-12-20
Migrated: 2026-02-03 (ADR-150)
Review Date: 2026-06-20
