ADR-162: Progressive Component Disclosure Architecture
Status
PROPOSED (2026-02-07)
Context
The Product Problem
CODITECT is a product framework delivered to customers. Each installation includes 3,392 components (770 agents, 431 skills, 364 commands). Claude Code loads all agent descriptions into the Task tool's subagent_type schema on every API call, consuming ~25.6k tokens -- 70% above the recommended 15k budget.
This is a product quality and scalability issue:
- Every customer session pays the full 25.6k token overhead
- Agent selection accuracy degrades catastrophically at 770 agents (estimated 20-40% accuracy vs. 90%+ at 25 agents)
- Context window available for customer work is reduced by 13% (growing to 37% by Q3 2026)
- Competitive disadvantage vs. GitHub Copilot (13 core tools), Cursor (RAG-based), and other tools using progressive disclosure
Loading Mechanism
Claude Code loads custom agents from the .claude/agents/ directory. In CODITECT:
.claude -> .coditect -> submodules/core/coditect-core/
All 770 agent .md files in coditect-core/agents/ are loaded at session start. Claude Code extracts name + description from frontmatter into the Task tool's function schema (~33 tokens/agent x 770 = ~25.6k tokens). The full agent body (~589 tokens avg) is loaded only on dispatch.
There is no configuration mechanism for selective loading. All .md files in .claude/agents/ are loaded unconditionally.
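For illustration, only the frontmatter name and description of each agent file contribute to the per-call schema cost; a hypothetical agent file (all field values invented for this example) might look like:

```markdown
---
name: kubernetes-troubleshooter
description: Diagnoses Kubernetes cluster, node, and pod failures.
tools: Read, Grep, Bash
model: sonnet
---

Full system prompt body (~589 tokens on average), loaded only on dispatch.
```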
Dual Dispatch Mechanisms
CODITECT has two parallel agent dispatch paths:
- Claude Code native: `Task(subagent_type="agent-name", prompt="...")` -- dispatched directly from the Task schema
- CODITECT proxy: `Task(subagent_type="general-purpose", prompt="...")` via invoke-agent.py -- injects the full agent system prompt
Both coexist. The proxy pattern provides flexibility but does not reduce schema cost.
Impact Quantified
| Metric | Current | At Scale (Q3 2026) |
|---|---|---|
| Agent description tokens | 25,600 | 75,000+ |
| Context consumed (of 200k) | 13% | 37% |
| Agent selection accuracy | ~20-40% (estimated) | <15% |
| Wasted cost per 1,000 API calls | $0.38 | $1.13 |
| Session cost overhead (~200 calls) | $0.08 | $0.23 |
| At 100 customers x 5 sessions/day | $38.50/day | $115/day |
Evidence
Accuracy degradation is catastrophic, not gradual:
| Tool Count | Selection Accuracy | Source |
|---|---|---|
| 10-20 | 90-95% | Berkeley BFCL v4 |
| 50 | 84-90% | Speakeasy |
| 100 | 65-75% | ToolLLM |
| 770 (CODITECT) | ~20-40% | Extrapolated |
Anthropic's own defer_loading beta improved Opus 4 accuracy from 49% to 74% (+25 percentage points) by reducing the visible tool set. GitHub explicitly reduced from 40+ to 13 core tools for better performance at scale.
Industry Consensus
Comprehensive research (30+ sources, 5 academic papers, 8 industry case studies) confirms upfront loading of all tool descriptions is an anti-pattern at scale. Progressive disclosure + semantic routing is the industry standard for systems with 100+ tools.
Full analysis: internal/analysis/component-scaling/component-token-scaling-analysis.md
Decision
Adopt a three-tier Progressive Component Disclosure architecture that reduces agent description tokens from ~25,600 to ~800 (97% reduction) while maintaining full access to all 770+ agents through MCP-based discovery.
Architecture
Tier 0: Core Agents (Always in .claude/agents/)
~25 agents | ~800 tokens | Zero latency
Built-in: Explore, Plan, Bash, general-purpose
CODITECT: senior-architect, testing-specialist, code-reviewer,
debugger, devops-engineer, database-architect, etc.
Tier 1: Track Index (Loaded via MCP Tool on Discovery)
38 track summaries | ~600 tokens when queried | <50ms latency
"Track A - Backend API: 15 agents for REST, GraphQL, DB design"
Queried by: /which command, discover_agents MCP tool, or LLM request
Tier 2: Agent Spec (Loaded on Dispatch via MCP Tool)
Individual agent | ~50-200 tokens | <100ms latency
Full description, tools, model, capabilities
Loaded only when agent is about to be dispatched
Implementation
Phase 1: Directory Restructuring (Week 1)
1a. Reorganize coditect-core/ directory:
```
coditect-core/
├── agents/              # Tier 0: ~25 core agents (loaded by Claude Code)
│   ├── senior-architect.md
│   ├── testing-specialist.md
│   ├── code-reviewer.md
│   ├── debugger.md
│   ├── devops-engineer.md
│   ├── database-architect.md
│   └── ... (~25 files total)
├── agents-extended/     # Tier 1-2: ~745 agents (NOT auto-loaded)
│   ├── enterprise/      # PCF business agents (~500)
│   ├── specialist/      # Domain specialists (~80)
│   ├── orchestrator/    # Orchestrators (~20)
│   └── other/           # Remaining agents
```
Because .claude -> coditect-core, only files in coditect-core/agents/ are loaded into the Task schema. Files in agents-extended/ are invisible to Claude Code's auto-loader.
1b. Define core agent list (~25 agents) based on session universality:
```yaml
# config/core-agents.yaml
core_agents:
  builtin:
    - Explore
    - Plan
    - Bash
    - general-purpose
  coditect:
    - senior-architect
    - testing-specialist
    - code-reviewer
    - debugger
    - devops-engineer
    - database-architect
    - security-specialist
    - backend-development
    - documentation-generation
    - ci-pipeline
    - release-manager
    - web-search-researcher
    - code-documentation
    - frontend-react-typescript-expert
    - architect-review
    - backend-api-security
    - monitoring-specialist
    - incident-responder
    - data-engineering
    - cloud-architect
```
1c. Move ~745 non-core agents from agents/ to agents-extended/.
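The move in 1c can be scripted against the core list. A hedged sketch, assuming the flat agents/ layout and the core-agents.yaml shape shown in 1b (track classification into enterprise/specialist/etc. is omitted; everything lands in other/):

```python
import shutil
from pathlib import Path

def load_core_agents(config_path: Path) -> set[str]:
    # Minimal parse of the flat list format shown in 1b (not a full YAML
    # parser); builtin entries have no .md files, so including them is harmless.
    names = set()
    for line in config_path.read_text().splitlines():
        line = line.strip()
        if line.startswith("- "):
            names.add(line[2:].strip())
    return names

def migrate(root: Path) -> list[str]:
    """Move every non-core agent file to agents-extended/other/."""
    core = load_core_agents(root / "config" / "core-agents.yaml")
    dest = root / "agents-extended" / "other"
    dest.mkdir(parents=True, exist_ok=True)
    moved = []
    for md in sorted((root / "agents").glob("*.md")):
        if md.stem not in core:
            shutil.move(str(md), dest / md.name)
            moved.append(md.stem)
    return moved
```

A dry-run mode and a reference-rewriting pass would be needed before running this against a real installation.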
1d. Update all scripts that scan agents/:
- scripts/validate-agent-structure.py -- scan both directories
- scripts/component-indexer.py -- index both directories
- config/framework-registry.json generation -- include both
- scripts/moe_classifier/ -- classify from both
1e. Verify /agent command continues to work -- it uses invoke-agent.py which reads agent files by name, so it needs updated path resolution to check agents-extended/ as fallback.
Phase 2: MCP Discovery Tools (Weeks 2-3)
2a. Add two MCP tools on the coditect-call-graph server (or new coditect-agent-router server):
```python
# Tool 1: discover_agents
def discover_agents(query: str, track: str | None = None, limit: int = 5) -> list:
    """Search the agent registry by keyword, semantic similarity, or track.

    Searches both agents/ and agents-extended/ directories.
    Returns: [{name, description, track, capabilities, tier}]
    """
    ...

# Tool 2: get_agent_spec
def get_agent_spec(agent_name: str) -> dict:
    """Load the full agent specification for dispatch.

    Returns: {name, description, tools, model, domain, full_prompt}
    """
    ...
```
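A minimal keyword-overlap ranking is enough to back discover_agents before the Phase 3 semantic upgrade. The sketch below is hypothetical (the real AgentRegistry in agent_registry.py may expose richer metadata) and scores agents by query-term overlap with name plus description:

```python
import re

def discover_agents_keyword(query: str, registry: list[dict], limit: int = 5) -> list[dict]:
    """Rank agents by query-term overlap with name + description.

    `registry` entries follow the shape returned by discover_agents:
    {name, description, track, capabilities, tier}.
    """
    terms = set(re.findall(r"[a-z0-9]+", query.lower()))

    def score(agent: dict) -> int:
        haystack = f"{agent['name']} {agent['description']}".lower()
        return len(terms & set(re.findall(r"[a-z0-9]+", haystack)))

    ranked = sorted(registry, key=score, reverse=True)
    return [a for a in ranked if score(a) > 0][:limit]
```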
2b. Discovery workflow:
1. Claude recognizes it needs a specialized agent not in the core set
2. Claude calls `discover_agents(query="kubernetes troubleshooting")`
3. The MCP tool returns the top-5 matching agents from agents-extended/
4. Claude calls `get_agent_spec("kubernetes-troubleshooter")`
5. Claude dispatches via the `Task(subagent_type="general-purpose", prompt="{full_prompt}")` proxy pattern
2c. Enhance /which command to use discover_agents MCP tool.
Phase 3: Semantic Router (Weeks 3-4)
3a. Pre-compute embeddings for all agent descriptions:
- Input: agents/*.md and agents-extended/*.md frontmatter
- Storage: agent_embeddings table in platform.db
- Model: sentence-transformers run locally (fast, no API cost)
3b. Enhance discover_agents() with cosine similarity search.
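Given embeddings precomputed per 3a (e.g. by sentence-transformers and stored in the agent_embeddings table), the similarity search itself reduces to a cosine top-k over a matrix. A sketch, assuming the embeddings are already loaded into a NumPy array:

```python
import numpy as np

def top_k_agents(query_vec: np.ndarray, names: list[str],
                 emb: np.ndarray, k: int = 5) -> list[tuple[str, float]]:
    """Cosine-similarity search over precomputed agent description embeddings.

    `emb` is an (n_agents, dim) matrix, one row per agent in `names`;
    in 3a these rows would come from agent_embeddings in platform.db.
    """
    q = query_vec / np.linalg.norm(query_vec)          # normalize query
    m = emb / np.linalg.norm(emb, axis=1, keepdims=True)  # normalize rows
    sims = m @ q                                       # cosine similarities
    order = np.argsort(-sims)[:k]                      # top-k indices
    return [(names[i], float(sims[i])) for i in order]
```

Embedding the query at request time is the only per-call model cost; the agent matrix is computed once and cached.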
Phase 4: Anthropic API Integration (When Stable)
4a. Adopt Anthropic's Tool Search Tool with defer_loading: true for core agents.
4b. This is additive -- combine with directory restructuring for maximum benefit.
Token Budget
| Tier | When Loaded | Tokens | Latency |
|---|---|---|---|
| Tier 0 (core agents in .claude/agents/) | Every API call | ~800 | 0ms |
| Tier 1 (track index via MCP) | On agent search | ~600 | <50ms |
| Tier 2 (agent spec via MCP) | On agent dispatch | ~200 | <100ms |
| Typical session (Tiers 0-2) | -- | ~1,600 | <150ms |
| Current (all 770 agents) | Every API call | ~25,600 | 0ms |
Net savings: ~24,800 tokens/call (97% reduction for Tier 0 alone)
Leveraged Existing Infrastructure
| Component | How Used |
|---|---|
| agent_registry.py | Capability-based discovery (AgentRegistry class) |
| track_registry.py | 38-track taxonomy for Tier 1 index |
| intelligent_track_mapper.py | Semantic keyword matching |
| classify.py | MoE classification confidence scores |
| validate-agent-structure.py | Ensure all agents have required frontmatter |
| component-indexer.py | Searchable component index |
| invoke-agent.py | Proxy dispatch pattern for non-core agents |
| Dynamic Capability Router skill | Intent-based routing decision tree |
Product Considerations
| Concern | Approach |
|---|---|
| Customer transparency | Core agents just work; discovery is seamless via MCP |
| Customer agent additions | Customers add to agents/ (core) or agents-extended/ (discoverable) |
| Enterprise configurability | config/core-agents.yaml is overridable per installation |
| Backward compatibility | /agent command unchanged; proxy pattern for extended agents |
| Multi-tenant scaling | Per-customer core-agents.yaml enables tenant-specific core sets |
| Plugin/marketplace agents | Plugins add to agents-extended/ with track metadata for discovery |
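A per-installation override of config/core-agents.yaml could be as small as the fragment below. This is a hypothetical sketch: the key names (coditect_extra, remove) and the additive merge semantics are invented for illustration and remain an open design point.

```yaml
# Hypothetical tenant override, merged over the shipped defaults in 1b
core_agents:
  coditect_extra:
    - payments-compliance-specialist   # customer-specific agent promoted to core
  remove:
    - data-engineering                 # unused by this tenant; stays discoverable
```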
Consequences
Positive
- 97% token reduction in system prompt (25,600 -> ~800 for Tier 0)
- Dramatic accuracy improvement -- smaller tool set means correct agent selection (per Anthropic and GitHub data)
- Scales to 10,000+ agents without system prompt growth
- Reduced cost -- $0.36/1,000 calls saved at current scale; $14K/year at 100 customers
- Leverages existing infrastructure -- MoE, track registry, agent registry all reused
- Backward compatible -- core agents work identically; proxy pattern for extended agents
- Competitive parity -- matches GitHub Copilot's proven core-tool approach
Negative
- Extra round-trip for non-core agents -- 50-150ms latency for MCP discovery+load
- Directory restructuring -- one-time migration of 745 agent files; potential for broken references
- MCP dependency -- full agent access requires MCP server running
- Maintenance burden -- core agent list must be curated as usage patterns evolve
- Embedding computation -- one-time cost for Phase 3
Neutral
- Agent file format (agents/*.md, agents-extended/*.md) unchanged
- Existing /agent command continues via the proxy pattern
- /which command enhanced but not broken
- Component counts in config/component-counts.json still reflect all agents
Alternatives Considered
| Alternative | Token Reduction | Why Not Selected |
|---|---|---|
| Prune to <100 agents | 87% | Loses capability permanently; doesn't address growth |
| Compress descriptions only | 40-46% | Insufficient; still 14k+ tokens at current scale |
| Full MCP agent server | 99% | Over-engineered for Phase 1; adds infrastructure complexity |
| Semantic routing only | 95% | Requires embedding infra upfront; directory restructure is simpler first step |
| Do nothing | 0% | 75k+ tokens by Q3 2026; catastrophic accuracy degradation |
Compliance
- ADR-003 (Agent System): Extended, not replaced. Agent definition format unchanged. New agents-extended/ directory follows the same conventions.
- ADR-010 (Component Capability Schema): Leveraged for capability-based MCP discovery.
- ADR-026 (Intent Classification): Semantic router in Phase 3 builds on intent classification.
- ADR-052 (Intent-Aware Context): Progressive disclosure is context-aware by design.
Open Questions
- Usage analytics needed: Which ~25 agents should be in the core set? Need telemetry data from real sessions (proposed: PostToolUse hook logging subagent_type).
- Customer agent placement: Should customer-added agents go to agents/ (core) or agents-extended/ (discoverable) by default?
- Anthropic API timeline: When will defer_loading and the Tool Search Tool be GA? Phase 4 depends on stability.
Review
- Architecture Review: Hal Casteel
- Implementation: Track H (Framework)
- Testing: Benchmark suite for agent selection accuracy at 770 vs. 25 agents
- Monitoring: Token counter hook for system prompt size
- Migration Plan: Script to safely move 745 agents to agents-extended/
Author: Claude (Opus 4.6) Date: 2026-02-07 (Updated)