ADR-162: Progressive Component Disclosure Architecture

Status

PROPOSED (2026-02-07)

Context

The Product Problem

CODITECT is a product framework delivered to customers. Each installation includes 3,392 components, among them 770 agents, 431 skills, and 364 commands. Claude Code loads all agent descriptions into the Task tool's subagent_type schema on every API call, consuming ~25.6k tokens -- roughly 70% above the recommended 15k budget.

This is a product quality and scalability issue:

  • Every customer session pays the full 25.6k token overhead
  • Agent selection accuracy degrades catastrophically at 770 agents (estimated 20-40% accuracy vs. 90%+ at 25 agents)
  • Context window available for customer work is reduced by 13% (growing to 37% by Q3 2026)
  • Competitive disadvantage vs. GitHub Copilot (13 core tools), Cursor (RAG-based), and other tools using progressive disclosure

Loading Mechanism

Claude Code loads custom agents from the .claude/agents/ directory. In CODITECT:

.claude -> .coditect -> submodules/core/coditect-core/

All 770 agent .md files in coditect-core/agents/ are loaded at session start. Claude Code extracts name + description from frontmatter into the Task tool's function schema (~33 tokens/agent x 770 = ~25.6k tokens). The full agent body (~589 tokens avg) is loaded only on dispatch.

There is no configuration mechanism for selective loading. All .md files in .claude/agents/ are loaded unconditionally.
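
To make the schema cost concrete, here is a minimal sketch of the name + description extraction and the per-agent token arithmetic described above. It is not Claude Code's actual loader: `parse_frontmatter`, `schema_entry`, `schema_tokens`, and the sample agent text are hypothetical, and the parser only handles flat `key: value` frontmatter lines.

```python
import re

# Matches a leading "---" ... "---" frontmatter block at the top of a file.
FRONTMATTER_RE = re.compile(r"\A---\s*\n(.*?)\n---\s*(?:\n|\Z)", re.DOTALL)


def parse_frontmatter(markdown_text: str) -> dict:
    """Return flat `key: value` pairs from a leading frontmatter block."""
    match = FRONTMATTER_RE.match(markdown_text)
    if not match:
        return {}
    fields = {}
    for line in match.group(1).splitlines():
        # Nested YAML, comments, and continuation lines are ignored.
        if ":" in line and not line.startswith((" ", "\t", "#")):
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip().strip("\"'")
    return fields


def schema_entry(markdown_text: str) -> str:
    """Build the short text that would sit in the schema for one agent."""
    fields = parse_frontmatter(markdown_text)
    return f"{fields.get('name', '')}: {fields.get('description', '')}"


def schema_tokens(agent_count: int, tokens_per_agent: int = 33) -> int:
    """Estimate schema size from the ADR's ~33 tokens/agent average."""
    return agent_count * tokens_per_agent


# Hypothetical agent file; the description text is made up for illustration.
sample = """---
name: debugger
description: Diagnoses failing tests and runtime errors
model: sonnet
---
Full agent body, loaded only on dispatch.
"""

print(schema_entry(sample))
print(schema_tokens(770))  # 25410, in line with the ~25.6k estimate
```

The rounded 33 tokens/agent average yields 25,410 tokens for 770 agents; the ~25.6k figure presumably uses an unrounded average.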

Dual Dispatch Mechanisms

CODITECT has two parallel agent dispatch paths:

  1. Claude Code native: Task(subagent_type="agent-name", prompt="...") -- directly from Task schema
  2. CODITECT proxy: Task(subagent_type="general-purpose", prompt="...") via invoke-agent.py -- injects full agent system prompt

Both coexist. The proxy pattern provides flexibility but does not reduce schema cost.

Impact Quantified

| Metric | Current | At Scale (Q3 2026) |
| --- | --- | --- |
| Agent description tokens | 25,600 | 75,000+ |
| Context consumed (of 200k) | 13% | 37% |
| Agent selection accuracy | ~20-40% (estimated) | <15% |
| Wasted cost per 1,000 API calls | $0.38 | $1.13 |
| Session cost overhead (~200 calls) | $0.08 | $0.23 |
| At 100 customers x 5 sessions/day | $38.50/day | $115/day |
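
The cost rows above roll up from the per-1,000-call figure. The sketch below reproduces that arithmetic so the numbers can be re-derived when call volume or customer count changes. The function names are hypothetical, and the per-1,000-call rate is taken from the table as given rather than derived from a token price.

```python
def session_overhead(cost_per_1000_calls: float, calls_per_session: int = 200) -> float:
    """Overhead in dollars for one session of `calls_per_session` API calls."""
    return cost_per_1000_calls * calls_per_session / 1000


def daily_overhead(
    cost_per_1000_calls: float,
    customers: int = 100,
    sessions_per_day: int = 5,
    calls_per_session: int = 200,
) -> float:
    """Overhead in dollars per day across all customers."""
    per_session = session_overhead(cost_per_1000_calls, calls_per_session)
    return per_session * customers * sessions_per_day


def context_share(tokens: int, window: int = 200_000) -> float:
    """Fraction of the context window consumed by agent descriptions."""
    return tokens / window


print(f"per session: ${session_overhead(0.38):.3f}")
print(f"per day:     ${daily_overhead(0.38):.2f}")
print(f"context:     {context_share(25_600):.1%}")
```

With the rounded $0.38 rate this gives $38.00/day; the table's $38.50/day is consistent with an unrounded rate of about $0.385 per 1,000 calls.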

Evidence

Accuracy degradation is steep, not gradual. The CODITECT row is extrapolated from the published results rather than measured:

| Tool Count | Selection Accuracy | Source |
| --- | --- | --- |
| 10-20 | 90-95% | Berkeley BFCL v4 |
| 50 | 84-90% | Speakeasy |
| 100 | 65-75% | ToolLLM |
| 770 (CODITECT) | ~20-40% | Extrapolated |

Anthropic's own defer_loading beta improved Opus 4 accuracy from 49% to 74% (+25 percentage points) by reducing the visible tool set. GitHub explicitly reduced its tool set from 40+ to 13 core tools for better performance at scale.

Industry Consensus

Comprehensive research (30+ sources, 5 academic papers, 8 industry case studies) confirms upfront loading of all tool descriptions is an anti-pattern at scale. Progressive disclosure + semantic routing is the industry standard for systems with 100+ tools.

Full analysis: internal/analysis/component-scaling/component-token-scaling-analysis.md

Decision

Adopt a three-tier Progressive Component Disclosure architecture that reduces agent description tokens from ~25,600 to ~800 (97% reduction) while maintaining full access to all 770+ agents through MCP-based discovery.

Architecture

Tier 0: Core Agents (Always in .claude/agents/)
~25 agents | ~800 tokens | Zero latency
Built-in: Explore, Plan, Bash, general-purpose
CODITECT: senior-architect, testing-specialist, code-reviewer,
debugger, devops-engineer, database-architect, etc.

Tier 1: Track Index (Loaded via MCP Tool on Discovery)
38 track summaries | ~600 tokens when queried | <50ms latency
"Track A - Backend API: 15 agents for REST, GraphQL, DB design"
Queried by: /which command, discover_agents MCP tool, or LLM request

Tier 2: Agent Spec (Loaded on Dispatch via MCP Tool)
Individual agent | ~50-200 tokens | <100ms latency
Full description, tools, model, capabilities
Loaded only when agent is about to be dispatched

Implementation

Phase 1: Directory Restructuring (Week 1)

1a. Reorganize coditect-core/ directory:

coditect-core/
├── agents/                      # Tier 0: ~25 core agents (loaded by Claude Code)
│   ├── senior-architect.md
│   ├── testing-specialist.md
│   ├── code-reviewer.md
│   ├── debugger.md
│   ├── devops-engineer.md
│   ├── database-architect.md
│   └── ... (~25 files total)
├── agents-extended/             # Tier 1-2: ~745 agents (NOT auto-loaded)
│   ├── enterprise/              # PCF business agents (~500)
│   ├── specialist/              # Domain specialists (~80)
│   ├── orchestrator/            # Orchestrators (~20)
│   └── other/                   # Remaining agents

Because .claude resolves (via .coditect) to coditect-core/, only files in coditect-core/agents/ are loaded into the Task schema. Files in agents-extended/ are invisible to Claude Code's auto-loader.

1b. Define core agent list (~25 agents) based on session universality:

# config/core-agents.yaml
core_agents:
  builtin:
    - Explore
    - Plan
    - Bash
    - general-purpose
  coditect:
    - senior-architect
    - testing-specialist
    - code-reviewer
    - debugger
    - devops-engineer
    - database-architect
    - security-specialist
    - backend-development
    - documentation-generation
    - ci-pipeline
    - release-manager
    - web-search-researcher
    - code-documentation
    - frontend-react-typescript-expert
    - architect-review
    - backend-api-security
    - monitoring-specialist
    - incident-responder
    - data-engineering
    - cloud-architect

1c. Move ~745 non-core agents from agents/ to agents-extended/.
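
Step 1c could be scripted roughly as follows. This is a sketch under stated assumptions, not the project's migration script: `plan_moves` and `apply_moves` are hypothetical names, the core set is passed in as a plain Python set rather than read from config/core-agents.yaml, and every non-core file is placed under other/ because the ADR does not specify how agents map to enterprise/, specialist/, or orchestrator/.

```python
import shutil
from pathlib import Path


def plan_moves(
    agents_dir: Path, extended_dir: Path, core_names: set[str]
) -> list[tuple[Path, Path]]:
    """List (source, destination) pairs for every non-core agent file."""
    moves = []
    for path in sorted(agents_dir.glob("*.md")):
        if path.stem not in core_names:
            # Placeholder bucket: real categorization is not defined here.
            moves.append((path, extended_dir / "other" / path.name))
    return moves


def apply_moves(moves: list[tuple[Path, Path]], dry_run: bool = True) -> int:
    """Move files, or only print the plan when dry_run is True."""
    for src, dst in moves:
        if dry_run:
            print(f"would move {src} -> {dst}")
            continue
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(src), str(dst))
    return len(moves)
```

Running with `dry_run=True` first gives a reviewable plan, which helps catch the broken references that a bulk move of ~745 files can introduce.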

1d. Update all scripts that scan agents/:

  • scripts/validate-agent-structure.py -- scan both directories
  • scripts/component-indexer.py -- index both directories
  • config/framework-registry.json generation -- include both
  • scripts/moe_classifier/ -- classify from both

1e. Verify that the /agent command continues to work. It uses invoke-agent.py, which reads agent files by name, so its path resolution must be updated to fall back to agents-extended/ when an agent is not found in agents/.
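
The fallback lookup in 1e might look like the sketch below. The function name `resolve_agent_path` is hypothetical and the actual internals of invoke-agent.py are not shown in this ADR; the sketch only assumes one `<agent-name>.md` file per agent and a recursive search of agents-extended/ because that directory has subfolders.

```python
from pathlib import Path
from typing import Optional


def resolve_agent_path(root: Path, agent_name: str) -> Optional[Path]:
    """Find an agent file by name: core directory first, then extended."""
    core = root / "agents" / f"{agent_name}.md"
    if core.is_file():
        return core

    extended_root = root / "agents-extended"
    if extended_root.is_dir():
        # agents-extended/ is nested (enterprise/, specialist/, ...),
        # so search recursively and pick a deterministic first match.
        matches = sorted(extended_root.rglob(f"{agent_name}.md"))
        if matches:
            return matches[0]
    return None
```

Checking agents/ first keeps core-agent resolution unchanged, which fits the backward-compatibility goal of this phase.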

Phase 2: MCP Discovery Tools (Weeks 2-3)

2a. Add two MCP tools on the coditect-call-graph server (or new coditect-agent-router server):

from typing import Optional

# Tool 1: discover_agents
def discover_agents(query: str, track: Optional[str] = None, limit: int = 5) -> list[dict]:
    """Search agent registry by keyword, semantic similarity, or track.

    Searches both agents/ and agents-extended/ directories.
    Returns: [{name, description, track, capabilities, tier}]
    """
    raise NotImplementedError  # interface stub only

# Tool 2: get_agent_spec
def get_agent_spec(agent_name: str) -> dict:
    """Load full agent specification for dispatch.

    Returns: {name, description, tools, model, domain, full_prompt}
    """
    raise NotImplementedError  # interface stub only
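
As an illustration of the keyword path of discover_agents, the sketch below ranks agents by word overlap between the query and each agent's name and description. The in-memory `REGISTRY`, its descriptions and track labels, and the extra `registry` parameter are assumptions made for the example; the real tool would read from the existing registries.

```python
from typing import Optional

# Hypothetical registry entries; descriptions and tracks are invented.
REGISTRY = [
    {"name": "kubernetes-troubleshooter",
     "description": "Diagnoses Kubernetes cluster and pod failures",
     "track": "devops", "tier": 2},
    {"name": "debugger",
     "description": "Diagnoses failing tests and runtime errors",
     "track": "backend", "tier": 0},
    {"name": "graphql-designer",
     "description": "Designs GraphQL schemas and resolvers",
     "track": "backend", "tier": 2},
]


def discover_agents(
    query: str,
    track: Optional[str] = None,
    limit: int = 5,
    registry: list[dict] = REGISTRY,
) -> list[dict]:
    """Return up to `limit` agents sharing at least one word with the query."""
    terms = set(query.lower().split())
    scored = []
    for agent in registry:
        if track is not None and agent["track"] != track:
            continue
        text = f"{agent['name'].replace('-', ' ')} {agent['description']}"
        overlap = len(terms & set(text.lower().split()))
        if overlap:
            scored.append((overlap, agent["name"], agent))
    # Highest overlap first; ties broken alphabetically for stable output.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [agent for _, _, agent in scored[:limit]]


print([a["name"] for a in discover_agents("kubernetes troubleshooting")])
```

Exact word overlap misses near-matches such as "troubleshooting" versus "troubleshooter", which is one motivation for the semantic search planned in Phase 3.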

2b. Discovery workflow:

  1. Claude recognizes it needs a specialized agent not in core set
  2. Claude calls discover_agents(query="kubernetes troubleshooting")
  3. MCP tool returns top-5 matching agents from agents-extended/
  4. Claude calls get_agent_spec("kubernetes-troubleshooter")
  5. Claude dispatches via Task(subagent_type="general-purpose", prompt="{full_prompt}") proxy pattern
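
Steps 2 through 5 of the workflow above can be sketched as one function that turns a discovery query into the arguments for a proxy Task call. `build_proxy_dispatch` is a hypothetical helper, the discovery and spec functions are passed in as callables so stubs can stand in for the MCP tools, and appending the task text after `full_prompt` is an assumption; the ADR only shows the prompt as `{full_prompt}`.

```python
from typing import Callable, Optional


def build_proxy_dispatch(
    query: str,
    task_prompt: str,
    discover: Callable[[str], list],
    get_spec: Callable[[str], dict],
) -> Optional[dict]:
    """Return Task arguments for the best-matching extended agent, if any."""
    candidates = discover(query)          # step 2-3: search the registry
    if not candidates:
        return None
    spec = get_spec(candidates[0]["name"])  # step 4: load the full spec
    return {                                # step 5: proxy dispatch arguments
        "subagent_type": "general-purpose",
        "prompt": f"{spec['full_prompt']}\n\n{task_prompt}",
    }


# Stubs standing in for the MCP tools; their return values are invented.
def fake_discover(query: str) -> list:
    return [{"name": "kubernetes-troubleshooter"}]


def fake_get_spec(name: str) -> dict:
    return {"name": name, "full_prompt": "You are a Kubernetes troubleshooting agent."}


call = build_proxy_dispatch(
    "kubernetes troubleshooting",
    "Pods in staging are crash-looping.",
    fake_discover,
    fake_get_spec,
)
print(call["subagent_type"])
```

Returning `None` when nothing matches leaves the caller free to fall back to a core agent.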

2c. Enhance /which command to use discover_agents MCP tool.

Phase 3: Semantic Router (Weeks 3-4)

3a. Pre-compute embeddings for all agent descriptions:

  • Input: agents/*.md and agents-extended/**/*.md frontmatter (agents-extended/ is nested, so the search must be recursive)
  • Storage: agent_embeddings table in platform.db
  • Model: sentence-transformers locally (fast, no API cost)
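
One possible shape for the agent_embeddings table is sketched below. It assumes platform.db is a SQLite database, which the ADR does not state, and the column names and JSON encoding of vectors are hypothetical. The vectors themselves would come from a sentence-transformers model; the example uses short made-up vectors so it runs without that dependency.

```python
import json
import sqlite3


def init_store(conn: sqlite3.Connection) -> None:
    """Create the embeddings table if it does not exist yet."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS agent_embeddings (
            agent_name  TEXT PRIMARY KEY,
            source_path TEXT NOT NULL,
            embedding   TEXT NOT NULL  -- JSON-encoded list of floats
        )
        """
    )


def upsert_embedding(
    conn: sqlite3.Connection, agent_name: str, source_path: str, vector: list
) -> None:
    """Insert or overwrite the stored vector for one agent."""
    conn.execute(
        "INSERT OR REPLACE INTO agent_embeddings "
        "(agent_name, source_path, embedding) VALUES (?, ?, ?)",
        (agent_name, source_path, json.dumps(vector)),
    )


def load_embeddings(conn: sqlite3.Connection) -> dict:
    """Return {agent_name: vector} for every stored agent."""
    rows = conn.execute("SELECT agent_name, embedding FROM agent_embeddings")
    return {name: json.loads(blob) for name, blob in rows}


conn = sqlite3.connect(":memory:")  # stand-in for platform.db
init_store(conn)
upsert_embedding(conn, "debugger", "agents/debugger.md", [0.1, 0.9])
upsert_embedding(conn, "debugger", "agents/debugger.md", [0.2, 0.8])
print(load_embeddings(conn))
```

Keying on agent name makes re-embedding idempotent, so the table can be refreshed whenever an agent description changes.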

3b. Enhance discover_agents() with cosine similarity search.
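
The cosine similarity ranking in 3b reduces to the sketch below, written in plain Python so it has no dependencies. `cosine_similarity` and `rank_agents` are hypothetical names, and the two-dimensional vectors are toy values rather than real sentence-transformers output.

```python
import math


def cosine_similarity(a: list, b: list) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    # A zero vector has no direction; treat it as unrelated.
    return dot / norm if norm else 0.0


def rank_agents(query_vec: list, embeddings: dict, limit: int = 5) -> list:
    """Return up to `limit` (agent_name, score) pairs, most similar first."""
    scored = [
        (cosine_similarity(query_vec, vec), name)
        for name, vec in embeddings.items()
    ]
    # Sort by descending score, then by name for stable tie-breaking.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [(name, score) for score, name in scored[:limit]]


toy_embeddings = {
    "kubernetes-troubleshooter": [1.0, 0.0],
    "graphql-designer": [0.0, 1.0],
}
print(rank_agents([1.0, 0.0], toy_embeddings, limit=1))
```

At a few hundred to a few thousand agents, a linear scan like this is typically fast enough that no vector index is needed.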

Phase 4: Anthropic API Integration (When Stable)

4a. Adopt Anthropic's Tool Search Tool with defer_loading: true for core agents.

4b. This is additive -- combine with directory restructuring for maximum benefit.

Token Budget

| Tier | When Loaded | Tokens | Latency |
| --- | --- | --- | --- |
| Tier 0 (core agents in .claude/agents/) | Every API call | ~800 | 0ms |
| Tier 1 (track index via MCP) | On agent search | ~600 | <50ms |
| Tier 2 (agent spec via MCP) | On agent dispatch | ~200 | <100ms |
| Typical session | | ~1,600 | <150ms |
| Current | | ~25,600 | 0ms |

Net savings: ~24,800 tokens/call (97% reduction for Tier 0 alone)
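
The budget figures can be checked with a few lines of arithmetic. The sketch below uses only the token counts stated in this section; the function names are hypothetical.

```python
CURRENT_TOKENS = 25_600
TIER_TOKENS = {"tier0": 800, "tier1": 600, "tier2": 200}


def per_call_savings(current: int, tier0: int) -> int:
    """Tokens no longer sent on every API call."""
    return current - tier0


def reduction_pct(current: int, new: int) -> float:
    """Percentage reduction from `current` to `new`."""
    return 100 * (current - new) / current


def session_peak(tiers: dict) -> int:
    """Tokens when all three tiers have been loaded in a session."""
    return sum(tiers.values())


print(per_call_savings(CURRENT_TOKENS, TIER_TOKENS["tier0"]))      # 24800
print(reduction_pct(CURRENT_TOKENS, TIER_TOKENS["tier0"]))         # 96.875
print(session_peak(TIER_TOKENS))                                   # 1600
print(reduction_pct(CURRENT_TOKENS, session_peak(TIER_TOKENS)))    # 93.75
```

Even when a session triggers both discovery and dispatch, the combined ~1,600 tokens remain about 94% below the current load.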

Leveraged Existing Infrastructure

| Component | How Used |
| --- | --- |
| agent_registry.py | Capability-based discovery (AgentRegistry class) |
| track_registry.py | 38-track taxonomy for Tier 1 index |
| intelligent_track_mapper.py | Semantic keyword matching |
| classify.py | MoE classification confidence scores |
| validate-agent-structure.py | Ensure all agents have required frontmatter |
| component-indexer.py | Searchable component index |
| invoke-agent.py | Proxy dispatch pattern for non-core agents |
| Dynamic Capability Router skill | Intent-based routing decision tree |

Product Considerations

| Concern | Approach |
| --- | --- |
| Customer transparency | Core agents just work; discovery is seamless via MCP |
| Customer agent additions | Customers add to agents/ (core) or agents-extended/ (discoverable) |
| Enterprise configurability | config/core-agents.yaml is overridable per installation |
| Backward compatibility | /agent command unchanged; proxy pattern for extended agents |
| Multi-tenant scaling | Per-customer core-agents.yaml enables tenant-specific core sets |
| Plugin/marketplace agents | Plugins add to agents-extended/ with track metadata for discovery |

Consequences

Positive

  • 97% token reduction in the always-loaded Task schema (25,600 -> ~800 for Tier 0)
  • Dramatic accuracy improvement -- a smaller visible agent set makes correct agent selection more likely (per Anthropic and GitHub data)
  • Scales to 10,000+ agents without growth in the always-loaded schema
  • Reduced cost -- ~$0.37/1,000 calls saved at current scale (97% of the $0.38 overhead); roughly $14K/year at 100 customers
  • Leverages existing infrastructure -- MoE, track registry, agent registry all reused
  • Backward compatible -- core agents work identically; proxy pattern for extended agents
  • Competitive parity -- matches GitHub Copilot's proven core-tool approach

Negative

  • Extra round-trip for non-core agents -- 50-150ms latency for MCP discovery+load
  • Directory restructuring -- one-time migration of 745 agent files; potential for broken references
  • MCP dependency -- full agent access requires MCP server running
  • Maintenance burden -- core agent list must be curated as usage patterns evolve
  • Embedding computation -- one-time cost for Phase 3

Neutral

  • Agent file format (agents/*.md, agents-extended/**/*.md) is unchanged
  • Existing /agent command continues via proxy pattern
  • /which command enhanced but not broken
  • Component counts in config/component-counts.json still reflect all agents

Alternatives Considered

| Alternative | Token Reduction | Why Not Selected |
| --- | --- | --- |
| Prune to <100 agents | 87% | Loses capability permanently; doesn't address growth |
| Compress descriptions only | 40-46% | Insufficient; still 14k+ tokens at current scale |
| Full MCP agent server | 99% | Over-engineered for Phase 1; adds infrastructure complexity |
| Semantic routing only | 95% | Requires embedding infra upfront; directory restructure is simpler first step |
| Do nothing | 0% | 75k+ tokens by Q3 2026; catastrophic accuracy degradation |

Compliance

  • ADR-003 (Agent System): Extended, not replaced. Agent definition format unchanged. New agents-extended/ directory follows same conventions.
  • ADR-010 (Component Capability Schema): Leveraged for capability-based MCP discovery.
  • ADR-026 (Intent Classification): Semantic router in Phase 3 builds on intent classification.
  • ADR-052 (Intent-Aware Context): Progressive disclosure is context-aware by design.

Open Questions

  1. Usage analytics needed: Which ~25 agents should be in the core set? Need telemetry data from real sessions (proposed: PostToolUse hook logging subagent_type).
  2. Customer agent placement: Should customer-added agents go to agents/ (core) or agents-extended/ (discoverable) by default?
  3. Anthropic API timeline: When will defer_loading and Tool Search Tool be GA? Phase 4 depends on stability.
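
For the telemetry proposed in question 1, a hook could append each tool event as a JSON line and a small script could then rank agents by dispatch count. The sketch below covers the ranking side. It assumes each logged event carries `tool_name` and a `tool_input` object with `subagent_type`; that payload shape and the sample log lines are assumptions to verify against the hook documentation, and the function names are hypothetical.

```python
import json
from collections import Counter
from typing import Iterable, Optional


def extract_subagent_type(event: dict) -> Optional[str]:
    """Return the dispatched agent name for Task events, else None."""
    if event.get("tool_name") != "Task":
        return None
    return event.get("tool_input", {}).get("subagent_type")


def rank_core_candidates(log_lines: Iterable[str], top_n: int = 25) -> list:
    """Count dispatches per agent and return the `top_n` most used."""
    counts = Counter()
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        agent = extract_subagent_type(json.loads(line))
        if agent:
            counts[agent] += 1
    return counts.most_common(top_n)


# Made-up log lines in the assumed JSON-lines format.
sample_log = [
    '{"tool_name": "Task", "tool_input": {"subagent_type": "debugger"}}',
    '{"tool_name": "Task", "tool_input": {"subagent_type": "debugger"}}',
    '{"tool_name": "Task", "tool_input": {"subagent_type": "code-reviewer"}}',
    '{"tool_name": "Bash", "tool_input": {"command": "ls"}}',
]
print(rank_core_candidates(sample_log, top_n=2))
```

Ranking by raw dispatch count is the simplest criterion; weighting by the number of distinct sessions that use an agent may better reflect the "session universality" goal.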

Review

  • Architecture Review: Hal Casteel
  • Implementation: Track H (Framework)
  • Testing: Benchmark suite for agent selection accuracy at 770 vs. 25 agents
  • Monitoring: Token counter hook for system prompt size
  • Migration Plan: Script to safely move 745 agents to agents-extended/

Author: Claude (Opus 4.6) Date: 2026-02-07 (Updated)