
ADR-011: Self-Awareness Framework

Status: Accepted
Date: 2025-12-20
Updated: 2026-02-03 (Migrated to coditect-core per ADR-150)
Deciders: Hal Casteel, CODITECT Core Team
Categories: Architecture, AI, Orchestration, Meta-cognition


Context

CODITECT is an agentic development framework with 3,458 components. For the system to operate autonomously, it must be self-aware - able to understand its own capabilities, make intelligent routing decisions, and learn from experience.

Current Limitations:

  1. No Capability Discovery - Orchestrators don't know what components exist or what they do
  2. No Situational Matching - Orchestrators can't determine which component fits a given task
  3. No Relationship Understanding - No knowledge of which components work together
  4. No Experience Learning - Past successes and failures are not fed back into routing
  5. Siloed Systems - Session history (/cxq), components, and workflows are disconnected

The Vision: CODITECT should be able to answer:

  • "What do I have that can help with this task?"
  • "Which option is best given these constraints?"
  • "What worked before in similar situations?"
  • "How do I combine capabilities for complex tasks?"

Decision

Implement a Self-Awareness Framework with four interconnected layers:

Layer 1: Capability Index (ADR-010)

Structured database of component capabilities, triggers, and relationships.

┌─────────────────────────────────────────────────────────────┐
│ CAPABILITY INDEX (platform.db - Tier 1) │
│ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ │
│ │Agents │ │Commands │ │Skills │ │Workflows│ │
│ │(152) │ │(133) │ │(221) │ │(1149) │ │
│ └────┬────┘ └────┬────┘ └────┬────┘ └────┬────┘ │
│ └──────────┴──────────┴──────────┴───────────────────│
│ ↓ │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ SQLite + FTS5 (Full-Text Search) │ │
│ │ - Capabilities, triggers, relationships │ │
│ │ - Use cases, quality metrics │ │
│ └─────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
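A minimal sketch of the index as an SQLite FTS5 table; the real platform.db schema is richer, and the columns and rows here are illustrative only:

```python
import sqlite3

# Minimal sketch of the capability index as an FTS5 table. The real
# platform.db schema is richer; columns and rows here are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE VIRTUAL TABLE capabilities USING fts5(
        component_id, component_type, description, triggers
    )
""")
conn.executemany(
    "INSERT INTO capabilities VALUES (?, ?, ?, ?)",
    [
        ("council-orchestrator", "agent",
         "multi-reviewer compliance and security code review",
         "use_when: compliance audit"),
        ("code-reviewer", "agent",
         "fast single-pass code review",
         "use_when: quick feedback"),
    ],
)

# FTS5 MATCH does tokenized full-text search; bm25() ranks by relevance.
rows = conn.execute(
    "SELECT component_id FROM capabilities "
    "WHERE capabilities MATCH ? ORDER BY bm25(capabilities)",
    ("compliance review",),
).fetchall()
# → [("council-orchestrator",)]
```

Because FTS5 ANDs the query terms by default, only components mentioning both "compliance" and "review" match, which is exactly the trigger-aware narrowing the index exists to provide.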

Layer 2: Experience Memory

Integration with session history to learn from past usage.

ADR-118 Compliance: Experience data is distributed across tiers:

  • org.db (Tier 2): decisions, skill_learnings, error_solutions - IRREPLACEABLE
  • sessions.db (Tier 3): messages, tool_analytics, usage stats - REGENERABLE
┌─────────────────────────────────────────────────────────────┐
│ EXPERIENCE MEMORY │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ Session History (/cxq) - sessions.db (Tier 3) │ │
│ │ - 100,000+ captured messages │ │
│ │ - Tool analytics, token economics │ │
│ └─────────────────────────────────────────────────────┘ │
│ ↓ │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ Knowledge Base - org.db (Tier 2) CRITICAL │ │
│ │ - Decisions, patterns, error solutions │ │
│ │ - Skill learnings (what worked, what didn't) │ │
│ └─────────────────────────────────────────────────────┘ │
│ ↓ │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ Usage Statistics - sessions.db (Tier 3) │ │
│ │ - Invocation counts per component │ │
│ │ - Success/failure rates │ │
│ │ - Duration metrics │ │
│ │ - Co-occurrence patterns │ │
│ └─────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
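The tier split above can be expressed as a small routing rule. The record types and database names follow the ADR-118 tiers described above; the function and type names are illustrative, not the real CODITECT API:

```python
# Illustrative routing rule for experience data, following the ADR-118
# tier split: irreplaceable learnings go to org.db (Tier 2), regenerable
# telemetry to sessions.db (Tier 3). Record types are examples only.
IRREPLACEABLE = {"decision", "skill_learning", "error_solution"}  # Tier 2

def target_db(record_type: str) -> str:
    """Route a record type to the database tier that should own it."""
    return "org.db" if record_type in IRREPLACEABLE else "sessions.db"

# target_db("error_solution") → "org.db"
# target_db("tool_analytics") → "sessions.db"
```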

Layer 3: Reasoning Engine

Decision-making logic for component selection and task routing.

┌─────────────────────────────────────────────────────────────┐
│ REASONING ENGINE │
│ │
│ Input: User Intent + Context │
│ ↓ │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ Intent Parser │ │
│ │ - Extract actions, entities, constraints │ │
│ │ - Classify domain, complexity, urgency │ │
│ └─────────────────────────────────────────────────────┘ │
│ ↓ │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ Capability Matcher │ │
│ │ - Query capability index (platform.db) │ │
│ │ - Score relevance │ │
│ │ - Check triggers (use_when, avoid_when) │ │
│ └─────────────────────────────────────────────────────┘ │
│ ↓ │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ Experience Consulter │ │
│ │ - Query past usage (sessions.db) │ │
│ │ - Query decisions/learnings (org.db) │ │
│ │ - Weight by success rate │ │
│ │ - Consider recency │ │
│ └─────────────────────────────────────────────────────┘ │
│ ↓ │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ Decision Synthesizer │ │
│ │ - Rank options │ │
│ │ - Consider relationships (what works together) │ │
│ │ - Generate execution plan │ │
│ └─────────────────────────────────────────────────────┘ │
│ ↓ │
│ Output: Recommended Component(s) + Configuration │
└─────────────────────────────────────────────────────────────┘

Layer 4: Unified Query Interface

Single entry point for self-reflection queries.

# Discovery queries
/discover "security code review" # Find by capability
/discover --use-case "HIPAA compliance" # Find by use case
/discover --compare A B # Compare alternatives

# Experience queries
/discover --what-worked "GCP deployment" # Past successes
/discover --similar-to last-task # Similar patterns

# Relationship queries
/discover --works-with council-orchestrator # Complements
/discover --instead-of code-reviewer # Alternatives

# Via /cxq
/cxq --components "security"
/cxq --component-semantic "code review"
/cxq --decisions --recent 10
/cxq --patterns --language python

Self-Reflection Patterns

Pattern 1: Task-to-Component Mapping

def find_component_for_task(task_description: str) -> ComponentRecommendation:
    # 1. Parse intent
    intent = parse_intent(task_description)
    # → {action: "review", domain: "healthcare", constraints: ["compliance"]}

    # 2. Query capability index (platform.db - Tier 1)
    candidates = query_capabilities(
        actions=intent.action,
        domains=intent.domain,
        triggers=intent.constraints,
    )
    # → [council-orchestrator, compliance-checker, code-reviewer]

    # 3. Filter by triggers
    filtered = [c for c in candidates if matches_triggers(c, intent)]
    # → [council-orchestrator, compliance-checker]

    # 4. Consult experience (org.db - Tier 2, sessions.db - Tier 3)
    ranked = rank_by_experience(filtered, intent)
    # → council-orchestrator (85% success for compliance tasks)

    # 5. Check relationships
    complementary = find_complements(ranked[0])
    # → [compliance-checker-agent, council-chairman]

    return ComponentRecommendation(
        primary=ranked[0],
        supporting=complementary,
        confidence=0.85,
        reasoning="Best match for compliance review based on triggers and 85% historical success",
    )

Pattern 2: Alternative Selection

def choose_alternative(task: str, constraints: dict) -> ComponentRecommendation:
    # User says: "Review this quickly, don't need full audit"

    # 1. Find candidates (a scored list, not a single recommendation)
    candidates = find_candidates_for_task("code review")

    # 2. Check avoid_when for each
    for c in candidates:
        if constraint_conflicts(c.avoid_when, constraints):
            c.score -= 0.3  # Penalize

    # council-orchestrator.avoid_when = "quick feedback needed" → penalized
    # code-reviewer.use_when = "quick feedback" → boosted

    # 3. Re-rank
    return rank_and_recommend(candidates)
    # → code-reviewer (better fit for "quick" constraint)

Pattern 3: Experience-Based Learning

def learn_from_execution(component_id: str, outcome: ExecutionOutcome):
    # After each component invocation:

    # 1. Update usage stats (sessions.db - Tier 3)
    update_stats(
        component_id,
        success=outcome.success,
        duration=outcome.duration_ms,
    )

    # 2. Record co-occurrences
    for other in outcome.co_invoked_components:
        record_co_occurrence(component_id, other)

    # 3. Update confidence
    new_confidence = recalculate_confidence(component_id)
    update_component(component_id, confidence=new_confidence)

    # 4. Log to session history (sessions.db - Tier 3)
    capture_to_session({
        "type": "component_execution",
        "component": component_id,
        "outcome": outcome,
        "context": outcome.task_description,
    })

    # 5. If significant learning, store in org.db (Tier 2)
    if outcome.contains_decision or outcome.error_solution:
        store_learning(org_db, outcome)

Pattern 4: Composite Task Decomposition

def decompose_complex_task(task: str) -> ExecutionPlan:
    # User says: "Launch new feature with docs, tests, and security review"

    # 1. Identify sub-tasks
    sub_tasks = extract_sub_tasks(task)
    # → ["create documentation", "run tests", "security review"]

    # 2. Find component for each
    components = [find_component_for_task(st) for st in sub_tasks]
    # → [documentation-librarian, testing-specialist, council-orchestrator]

    # 3. Check relationships for conflicts
    conflicts = find_conflicts(components)

    # 4. Determine parallelization
    parallel_groups = analyze_dependencies(components)

    # 5. Generate plan
    return ExecutionPlan(
        steps=[
            ParallelStep(["documentation-librarian", "testing-specialist"]),
            SequentialStep(["council-orchestrator"]),  # After tests pass
        ],
        estimated_duration="2 hours",
        checkpoints=["after_docs", "after_tests", "after_review"],
    )

Integration Points

With Existing Systems

System                  Integration
/cxq                    Query session history for experience patterns
use-case-analyzer       Primary consumer of self-awareness APIs
orchestrator            Uses capability matching for agent selection
workflow-orchestrator   Uses relationships for step-to-agent mapping
Component activation    Tracks which components are available

New Commands

Command     Purpose
/discover   Unified self-awareness query interface
/reflect    Show reasoning for last component selection
/learn      Manually record an outcome for learning

Auto-Trigger Integration

auto_trigger_integration:
  on_task_received:
    - capability-matcher      # Find relevant components
    - experience-consulter    # Check past patterns

  on_execution_complete:
    - usage-tracker           # Record statistics
    - learning-updater        # Update confidence scores

Rationale

Why Self-Awareness?

Without                                     With
Orchestrator asks user which agent to use   Orchestrator discovers the appropriate agent
Static routing rules                        Dynamic capability-based routing
No learning                                 Improves with usage
Siloed knowledge                            Unified discovery

Why Four Layers?

  1. Capability Index - Foundation, must exist first
  2. Experience Memory - Adds learning dimension
  3. Reasoning Engine - Intelligence layer
  4. Query Interface - User/agent interaction

Each layer builds on the previous, so the framework can be implemented incrementally.

Why Not Just LLM-Based Discovery?

Approach           Pros               Cons
LLM reads files    Flexible           Slow, expensive, inconsistent
Structured index   Fast, consistent   Requires maintenance
Hybrid             Best of both       More complexity

Decision: Structured index with LLM for natural language translation. The index provides reliable, fast discovery. LLM translates user intent to structured queries.
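The division of labor can be sketched in a few lines. The LLM step is stubbed with a keyword extractor here, and every name is illustrative rather than the real CODITECT API; the point is that only the intent-parsing half is fuzzy, while query construction stays deterministic:

```python
# Sketch of the hybrid approach: an LLM (stubbed here as a keyword
# extractor) turns free text into structured intent, and deterministic
# code builds the index query. All names are illustrative.
def parse_intent_stub(text: str) -> dict:
    """Stand-in for the LLM step: split actions from other keywords."""
    known_actions = {"review", "deploy", "test", "document"}
    words = set(text.lower().split())
    return {
        "actions": sorted(words & known_actions),
        "keywords": sorted(words - known_actions),
    }

def build_fts_query(intent: dict) -> str:
    """Deterministic step: compose an FTS5 MATCH expression."""
    return " AND ".join(intent["actions"] + intent["keywords"])

query = build_fts_query(parse_intent_stub("review HIPAA compliance"))
# → "review AND compliance AND hipaa"
```

Swapping the stub for a real LLM call changes only intent quality, never query semantics, which keeps discovery fast and reproducible.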


Consequences

Positive

  1. Autonomous Operation - Orchestrators can self-discover resources
  2. Better Quality - Right tool for the job, based on experience
  3. Reduced Errors - Know what exists, avoid hallucination
  4. Continuous Improvement - Learn from every execution
  5. Transparency - Can explain why a component was chosen

Negative

  1. Complexity - Four layers to build and maintain
  2. Cold Start - No experience data initially
  3. Index Staleness - Must keep in sync with components

Mitigations

Risk         Mitigation
Complexity   Incremental implementation, start with Layer 1
Cold start   Seed with expert knowledge, learn quickly
Staleness    Automated indexer on file changes
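The staleness mitigation reduces to comparing on-disk modification times against what the index last saw. A minimal sketch, assuming components live as files and the index remembers each file's last-indexed mtime (the function and file names are illustrative):

```python
import os
import tempfile

# Sketch of the "automated indexer on file changes" mitigation,
# assuming components are files on disk and the index stores the
# mtime it last saw for each one. Names are illustrative.
def stale_components(component_dir: str, indexed_mtimes: dict) -> list:
    """Return component files modified since they were last indexed."""
    stale = []
    for name in sorted(os.listdir(component_dir)):
        mtime = os.path.getmtime(os.path.join(component_dir, name))
        if indexed_mtimes.get(name, 0) < mtime:
            stale.append(name)
    return stale

# A changed (or never-indexed) file shows up as stale and is re-indexed.
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "security-agent.md"), "w") as f:
    f.write("...")
needs_reindex = stale_components(demo_dir, indexed_mtimes={})
# → ["security-agent.md"]
```

In practice this check would run from a filesystem watcher or a pre-query hook, so the index converges to the on-disk components without manual maintenance.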

Implementation Status

Status: Phase 1-2 Complete (2026-02-03)

Current Statistics

Layer               Status         Details
Capability Index    Complete       3,458+ components indexed in platform.db
Unified Database    Complete       ADR-118 Four-Tier architecture
Semantic Search     Complete       100% embedding coverage
Query Interface     Complete       /cxq --components, /cxq --component-semantic
Experience Memory   Schema Ready   org.db + sessions.db ready
Reasoning Engine    Pending        Phase 3

Implementation Roadmap

Phase 1: Foundation (Weeks 1-2) - COMPLETE

  • Implement Component Capability Schema (ADR-010)
  • Build component indexer (scripts/component-indexer.py)
  • Create SQLite database with FTS5 (platform.db - Tier 1)
  • Basic /discover command
  • ADR-118 Four-Tier database architecture

Phase 2: Experience (Weeks 3-4) - COMPLETE

  • Integrate with /cxq for unified search
  • Build component search APIs (search_components(), semantic_search_components())
  • Add semantic embeddings for intent-based discovery
  • Add usage tracking hooks (pending)
  • Update stats on execution (pending)

Phase 3: Reasoning (Weeks 5-6) - PENDING

  • Implement intent parser
  • Build capability matcher
  • Create decision synthesizer
  • Add explanation generation

Phase 4: Integration (Weeks 7-8) - PENDING

  • Update use-case-analyzer to use self-awareness
  • Integrate with all orchestrators
  • Add auto-trigger hooks
  • Performance optimization

Success Metrics

Metric                 Target       Measurement
Discovery accuracy     >85%         Correct component for task
Query latency          <500ms       Time to return recommendations
Learning improvement   +10%/month   Success rate increase
Coverage               100%         Components indexed
Usage adoption         >50%         Orchestrator queries using self-awareness

References

  • ADR-010 - Component Capability Schema (foundation)
  • ADR-012 - LLM Council Pattern (uses self-awareness for reviewer selection)
  • ADR-020 - Context Extraction
  • ADR-021 - Context Query (session history integration)
  • ADR-118 - Four-Tier Database Architecture

Approved: 2025-12-20 Migrated: 2026-02-03 (ADR-150) Review Date: 2026-06-20