Glossary: Brainqub3 Agent Labs × CODITECT Integration
A
Agent (CODITECT)
AI agent definition within the CODITECT platform. CODITECT includes 776 agents, each invoked via the /agent command. Example: /agent senior-architect "design task".
Agent Backend
Class handling Claude Agent SDK loading, live/mock mode, streaming, and tool policy enforcement. Responsible for the runtime environment where agents execute.
Aggregation
The process of combining multiple agent outputs into a single final answer. Common methods include majority vote (most common output) and orchestrator synthesis (intelligent combination).
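Majority vote can be sketched in a few lines (the function name is illustrative, not taken from the codebase):

```python
from collections import Counter

def majority_vote(outputs: list[str]) -> str:
    """Return the most common worker output; ties break on first-seen order."""
    # Counter.most_common orders by count, then by insertion order for ties
    return Counter(outputs).most_common(1)[0][0]
```

For example, `majority_vote(["A", "B", "A"])` returns `"A"`.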
Arena
The experiment execution environment where agent runs are conducted. Provides the runtime infrastructure for both single-agent and multi-agent experiments.
Architecture Type
The coordination pattern used in a multi-agent system. Brainqub3 defines five types: Independent (parallel workers + majority vote), Centralised (workers + orchestrator synthesis), Decentralised (peer exchange + consensus vote), Hybrid (peer rounds + orchestrator synthesis), and Single-Agent System (SAS baseline).
architectures.yaml
Configuration file defining the five architecture types with their default parameters, including agent counts, tool policies, and coordination behavior.
B
BaseOrchestrator
Abstract base class for all orchestrator implementations. Defines the interface with step(), terminate(), and run() methods that concrete orchestrators must implement.
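The interface can be sketched as an `abc.ABC`; the method signatures and the default `run()` loop below are assumptions based on this description, not the actual implementation:

```python
from abc import ABC, abstractmethod

class BaseOrchestrator(ABC):
    """Sketch of the orchestrator interface: concrete subclasses
    implement step() and terminate()."""

    @abstractmethod
    def step(self) -> None:
        """Execute one coordination round."""

    @abstractmethod
    def terminate(self) -> bool:
        """Return True when the run should stop."""

    def run(self) -> None:
        """Illustrative default loop: step until terminate() signals done."""
        while not self.terminate():
            self.step()
```

A concrete orchestrator subclasses this and supplies the coordination logic; instantiating the base class directly raises `TypeError`.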
Batch
Group of runs sharing a common batch_id for comparison or elasticity analysis. Batches enable controlled experiments where one variable changes while others remain constant.
beta0 (Intercept)
Base performance level in the mixed-effects model from the paper (arXiv:2512.08296 Table 4). Represents expected performance before any predictors contribute.
C
Centralised Architecture
Multi-agent pattern where workers produce independent draft solutions, then an orchestrator synthesizes them into a final answer. No peer communication between workers.
Claude Agent SDK
Official claude-agent-sdk Python package for live agent execution. Brainqub3 wraps this SDK to provide experiment infrastructure and deterministic evaluation.
Clipping
Constraining predicted performance values to the valid range [0,1]. Applied to both P_hat (overall prediction) and individual coordination metrics.
Command (CODITECT)
Slash command within the CODITECT framework (377 total). Invoked via /command-name syntax. Examples: /orient, /session-log, /which.
Coordination Collapse
When adding more agents degrades performance due to coordination overhead exceeding the benefit. Characterized by negative elasticity (eta_n < 0) or P_MAS < P_SAS.
Coordination Efficiency (E_c)
success_rate / (turns_total / turns_sas) — measures how efficiently the multi-agent system uses turns relative to single-agent baseline. Higher values indicate better coordination.
Coordination Metrics
Five derived metrics capturing multi-agent coordination behavior: overhead_pct, message_density_c, redundancy_R, efficiency_Ec, error_amp_Ae. Computed post-run and stored in derived_metrics.json.
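Four of the five metrics depend only on run counts and can be computed directly from the formulas given in their glossary entries (redundancy_R is excluded here because it needs the worker output texts; the function name is illustrative):

```python
def coordination_metrics(turns_total: int, turns_sas: int,
                         inter_agent_messages: int,
                         success_mas: float, success_sas: float) -> dict:
    """Sketch of the derived coordination metrics; turns_total plays the
    role of turns_mas in the overhead formula."""
    overhead_pct = max(0.0, (turns_total - turns_sas) / turns_sas * 100)
    message_density_c = inter_agent_messages / (inter_agent_messages + turns_total)
    # success_rate here is the MAS run's success rate
    efficiency_Ec = success_mas / (turns_total / turns_sas)
    # Guard the SAS-perfect case to avoid division by zero
    error_amp_Ae = ((1 - success_mas) / (1 - success_sas)
                    if success_sas < 1 else float("inf"))
    return {
        "overhead_pct": overhead_pct,
        "message_density_c": message_density_c,
        "efficiency_Ec": efficiency_Ec,
        "error_amp_Ae": error_amp_Ae,
    }
```

With 12 MAS turns against a 10-turn SAS baseline, 4 inter-agent messages, and success rates 0.8 vs 0.6, this yields 20% overhead, density 0.25, efficiency ~0.67, and error amplification 0.5 (errors mitigated).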
D
Dashboard
Custom HTML web UI (Python ThreadingHTTPServer) for run comparison, scaling visualization, and scenario analysis. Located at dashboard/webapp.py. Includes tabs for Scaling Laws, Run Comparison, and Scenario Engine.
Decentralised Architecture
Multi-agent pattern where workers propose initial solutions, exchange peer messages to refine, then reach consensus via majority vote. No central orchestrator synthesis.
Delta vs SAS
P_architecture - P_SAS — the benefit or cost of a multi-agent architecture relative to the single-agent baseline. Positive values indicate MAS improvement.
Derived Metrics
Post-run computed metrics (overhead, efficiency, redundancy, etc.) stored in derived_metrics.json. Calculated from completed run data rather than measured during execution.
E
Elasticity (eta_n, eta_T)
Power-law exponents measuring how coordination metrics scale with agents (eta_n) or tools (eta_T). Estimated from batch data via log-log regression.
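A log-log regression reduces to an ordinary-least-squares slope on logged data; this sketch assumes a plain unweighted fit, whereas the real estimator may weight or robustify:

```python
import math

def estimate_elasticity(scales: list[float], metric_values: list[float]) -> float:
    """Estimate the power-law exponent eta via OLS on
    log(metric) ~ log(scale)."""
    xs = [math.log(s) for s in scales]
    ys = [math.log(v) for v in metric_values]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    # OLS slope = cov(x, y) / var(x)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den
```

For data generated by metric = c * n^0.5, the estimate recovers eta ≈ 0.5.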
Error Amplification (A_e)
(1 - success_mas) / (1 - success_sas) — measures how multi-agent coordination amplifies single-agent errors. Values > 1 indicate error amplification; < 1 indicate error mitigation.
Evaluator
Deterministic scoring function that assesses agent output quality. Must validate successfully and pass tests before any run executes (evaluator-first rule).
Evaluator-First
Hard rule: no run without validated evaluator and passing tests. Ensures all experiments have objective, reproducible scoring from the outset.
ExperimentRunner
Class managing the run lifecycle from evaluator validation through finalization. Handles agent execution, metric collection, manifest creation, and run archival.
G
Global Estimate
Fallback parameter estimate (e.g., elasticity) computed from all batches when batch-specific data is insufficient. Used in shrinkage blend for low-support scenarios.
H
Hook (CODITECT)
Event trigger within the CODITECT framework (118 total). Fires on tool use and lifecycle events. Example: task_id_validator.py hooks PreToolUse events.
Hybrid Architecture
Multi-agent pattern combining peer exchange (decentralised) with orchestrator synthesis (centralised). Workers conduct peer rounds, then orchestrator produces final answer.
Hyperparameter
Configuration setting for an architecture type (e.g., n_agents, temperature, max_turns). Defined in architectures.yaml and overridable per run.
I
I_centered
intelligence_index - 56.9 — centered intelligence signal used in the mixed-effects model. The centering constant 56.9 comes from the paper's dataset.
Independent Architecture
Multi-agent pattern where workers solve in parallel with no communication, then outputs are aggregated by majority vote. Minimal coordination overhead.
Instance
A specific task input/expected-output pair used in experiments. Identified by instance_id in the run manifest.
Intelligence Index (I)
Composite model capability signal from the Artificial Analysis Intelligence Index. Ranges roughly 0-100; higher values indicate more capable models.
Inter-Agent Message
Communication between peer agents (non-orchestrator). Counted separately from orchestrator messages. Used to compute message_density_c.
Interaction Terms
Product terms between standardised predictors in the mixed-effects model. Table 4 includes 9 interaction terms (e.g., I_centered * overhead_pct_std).
Intercept
See beta0 (Intercept).
L
Live Mode
Real API calls to Claude via the official SDK. Default execution mode. Enabled by --require-live flag or absence of --allow-mock.
M
Majority Vote
Aggregation method that selects the most common worker output as the final answer. Used in Independent and Decentralised architectures.
MAS (Multi-Agent System)
Multiple agents collaborating on a task. Contrasts with Single-Agent System (SAS) baseline.
Message Density (message_density_c)
inter_agent_messages / (inter_agent_messages + turns_total) — communication intensity metric ranging [0,1]. Higher values indicate more peer communication relative to total execution.
Mixed-Effects Model
The paper-aligned scaling model formulation from Table 4 of arXiv:2512.08296. Predicts P_hat using intelligence index, coordination metrics, and interaction terms.
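The prediction step reduces to an intercept plus coefficient-weighted standardised predictors, clipped to [0,1]. This is a minimal sketch; the coefficient keys are illustrative, not the actual Table 4 names:

```python
def predict_p_hat(coefficients: dict[str, float],
                  predictors: dict[str, float]) -> float:
    """Linear combination of standardised predictors (main effects and
    interaction terms alike), clipped to the valid probability range."""
    linear = coefficients.get("intercept", 0.0)
    linear += sum(coefficients.get(name, 0.0) * value
                  for name, value in predictors.items())
    return min(1.0, max(0.0, linear))  # clip P_hat to [0, 1]
```

Interaction terms fit the same shape: the caller supplies the product of two standardised predictors as a single named predictor.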
Mock Mode
Deterministic fallback for offline testing. Returns canned responses without API calls. Enabled by --allow-mock flag.
MoE (Mixture of Experts)
CODITECT's multi-agent classification and routing system. Analyzes tasks and routes them to the most appropriate specialist agents. Example: /which <task> finds best agent.
models.yaml
Model configuration file mapping model names to intelligence_index values and other metadata. Used by the scaling model predictor.
N
n_agents
Number of agents participating in a multi-agent architecture. Key scaling dimension (along with tool count T).
O
Orchestrator
Central coordination agent in Centralised and Hybrid architectures. Synthesizes worker outputs into final answer. Does not exist in Independent or Decentralised patterns.
Orchestrator Context
Dataclass providing execute_turn, record_inter_agent, and record_orchestrator callbacks to orchestrator implementations. Enables orchestrators to interact with the experiment runtime.
Orchestrator Synthesis
Centralised step where the orchestrator combines worker outputs into a final answer. Uses a synthesis prompt template defined in the architecture configuration.
Overhead (overhead_pct)
max(0, ((turns_mas - turns_sas) / turns_sas) * 100) — coordination overhead percentage. Measures extra turns required by MAS relative to SAS baseline.
P
P_hat
Predicted architecture performance from the mixed-effects model, clipped to [0,1]. Represents the estimated success probability for a given configuration.
P_SA (P_SAS)
Single-agent baseline performance. Acts as a task difficulty proxy: harder tasks have lower P_SA values.
paper_coefficients.json
Table 4 coefficients for the scaling model. Includes intercept, main effects, and interaction terms from arXiv:2512.08296.
Peer Exchange
Decentralised coordination round where agents share proposals and refine their solutions based on peer feedback. No central orchestrator involvement.
PredictionResult
Dataclass containing predicted_P, linear_output, transformed/standardised predictors, and per-term contributions. Returned by the scaling model predictor.
R
Redundancy (redundancy_R)
Mean pairwise cosine similarity of worker outputs (TF-IDF vectors). Measures work overlap ranging [0,1]. Higher values indicate duplicated effort.
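A simplified sketch using raw term-frequency vectors (the actual metric applies TF-IDF weighting before the cosine, e.g. via a vectorizer library):

```python
import math
from collections import Counter
from itertools import combinations

def _cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity of two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def redundancy_R(outputs: list[str]) -> float:
    """Mean pairwise cosine similarity of worker outputs."""
    vectors = [Counter(text.lower().split()) for text in outputs]
    pairs = list(combinations(vectors, 2))
    return sum(_cosine(a, b) for a, b in pairs) / len(pairs)
```

Identical outputs score 1.0; outputs with no shared terms score 0.0.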
Run
A single experiment execution (SAS or MAS) on a task instance. Produces a run manifest and derived metrics upon completion.
Run Manifest
Content-hashed JSON file that makes completed runs immutable. Contains all run configuration, results, and metadata. Stored at runs/{architecture}/{batch_id}/{run_id}.json.
S
SAS (Single-Agent System)
One agent solving a task independently. Baseline for all multi-agent comparisons. Used to compute overhead, efficiency, and error amplification.
Scaling Layer
x_hat = clamp(x_base * (n_agents/n0)^eta_n * (T/T0)^eta_T) — power-law transformation applied to coordination metrics to model scaling behavior.
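The transformation is direct to implement; the reference points n0 and T0 default to 1 here, and the clamp range defaults to [0,1], both assumptions:

```python
def scale_metric(x_base: float, n_agents: int, T: int,
                 eta_n: float, eta_T: float,
                 n0: int = 1, T0: int = 1,
                 lo: float = 0.0, hi: float = 1.0) -> float:
    """Power-law scaling of a coordination metric, clamped to [lo, hi]."""
    x_hat = x_base * (n_agents / n0) ** eta_n * (T / T0) ** eta_T
    return min(hi, max(lo, x_hat))
```

With eta_n = 0.5, quadrupling the agent count doubles the metric before clamping.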
Scaling Laws Tab
Dashboard view showing coordination metric behavior across scale (varying n_agents or tool count). Visualizes elasticity estimates and predictions.
Scenario Engine
What-if analysis tool for hypothetical scaling scenarios. Allows exploration of performance under different agent counts, models, and architectures.
Shrinkage Blend
Weighted average between batch-specific and global elasticity estimates. Used when batch-specific data has low support (few runs).
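A common shrinkage form weights the batch estimate by its support; the n/(n+k) weighting below is an illustrative assumption, not necessarily the codebase's scheme:

```python
def shrinkage_blend(batch_estimate: float, global_estimate: float,
                    n_batch_runs: int, k: float = 10.0) -> float:
    """Blend batch-specific and global elasticity estimates.
    Few runs -> lean on the global estimate; many runs -> trust the batch."""
    w = n_batch_runs / (n_batch_runs + k)
    return w * batch_estimate + (1 - w) * global_estimate
```

With zero batch runs the blend returns the global estimate unchanged; as support grows, the batch estimate dominates.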
Skill (CODITECT)
Reusable capability definition within CODITECT (445 total). Each has a SKILL.md file with frontmatter including track: and cef_track: fields.
standardisation.json
Default z-score means and standard deviations for predictor transforms. Used when batch-specific statistics are unavailable.
Success Rate
Binary or continuous measure of agent output quality. Computed by the evaluator. Primary outcome metric for all experiments.
T
T (Tool Count)
Number of tools available to agents. Key scaling dimension (along with n_agents).
Tool Policy
Allowed tools enforced at runtime via SDK permission callbacks. Not metadata-only—actively prevents unauthorized tool use during execution.
Track (CODITECT)
Project management unit within CODITECT (37 tracks: A-N for core platform, O-AA for PCF business, AB-AK for extensions). Example: Track H = Framework.
Turn
A single agent response cycle. Includes prompt, model inference, and response (text + tool calls).
TurnResult
Dataclass containing text, token_usage, token_cost_usd, and tool_calls from a single agent turn.
turns_total
Total turns across all agents in a run. Used to compute overhead and efficiency metrics.
turns_sas
Turns required by single-agent baseline. Denominator for overhead and efficiency calculations.
Z
Z-score / Standardisation
(x_i - mean_i) / std_i transformation applied to predictors before coefficient multiplication. Ensures predictors are on comparable scales.
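Applying the transform across all predictors with per-predictor (mean, std) pairs, as standardisation.json-style defaults would supply:

```python
def standardise(predictors: dict[str, float],
                stats: dict[str, tuple[float, float]]) -> dict[str, float]:
    """Z-score each predictor: (x - mean) / std, using its own statistics."""
    return {name: (value - stats[name][0]) / stats[name][1]
            for name, value in predictors.items()}
```

For example, an overhead_pct of 30 against mean 20 and std 5 standardises to z = 2.0.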
Cross-References
Architecture Type → Aggregation Method
- Independent: Majority Vote
- Centralised: Orchestrator Synthesis
- Decentralised: Majority Vote (after Peer Exchange)
- Hybrid: Orchestrator Synthesis (after Peer Exchange)
- SAS: N/A (single output)
Coordination Metrics → Formula
- overhead_pct: max(0, ((turns_mas - turns_sas) / turns_sas) * 100)
- message_density_c: inter_agent_messages / (inter_agent_messages + turns_total)
- redundancy_R: Mean pairwise cosine TF-IDF similarity
- efficiency_Ec: success_rate / (turns_total / turns_sas)
- error_amp_Ae: (1 - success_mas) / (1 - success_sas)
Elasticity → Scaling Dimension
- eta_n: Elasticity with respect to number of agents (n_agents)
- eta_T: Elasticity with respect to tool count (T)
CODITECT Components → Count
- Agents: 776
- Skills: 445
- Commands: 377
- Hooks: 118
- Tracks: 37 (A-N core, O-AA business, AB-AK extensions)
Version: 1.0
Date: 2026-02-16
Source: Brainqub3 Agent Labs codebase + arXiv:2512.08296
Integration: CODITECT Platform Framework
Authors: Claude (Sonnet 4.5) via documentation-generation agent