Glossary: Brainqub3 Agent Labs × CODITECT Integration

A

Agent (CODITECT)

AI agent definition within the CODITECT platform. CODITECT includes 776 agents, each invoked via the /agent command. Example: /agent senior-architect "design task".

Agent Backend

Class handling Claude Agent SDK loading, live/mock mode, streaming, and tool policy enforcement. Responsible for the runtime environment where agents execute.

Aggregation

The process of combining multiple agent outputs into a single final answer. Common methods include majority vote (most common output) and orchestrator synthesis (intelligent combination).
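
Majority vote can be sketched in a few lines (a hypothetical helper, not the actual Brainqub3 API):

```python
from collections import Counter

def majority_vote(outputs: list[str]) -> str:
    """Return the most common worker output; ties break by first seen."""
    counts = Counter(outputs)
    winner, _ = counts.most_common(1)[0]
    return winner

print(majority_vote(["42", "41", "42"]))  # -> 42
```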

Arena

The experiment execution environment where agent runs are conducted. Provides the runtime infrastructure for both single-agent and multi-agent experiments.

Architecture Type

The coordination pattern used in a multi-agent system. Brainqub3 defines five types: Independent (parallel workers + majority vote), Centralised (workers + orchestrator synthesis), Decentralised (peer exchange + consensus vote), Hybrid (peer rounds + orchestrator synthesis), and Single-Agent System (SAS baseline).

architectures.yaml

Configuration file defining the five architecture types with their default parameters, including agent counts, tool policies, and coordination behavior.

B

BaseOrchestrator

Abstract base class for all orchestrator implementations. Defines the interface with step(), terminate(), and run() methods that concrete orchestrators must implement.

Batch

Group of runs sharing a common batch_id for comparison or elasticity analysis. Batches enable controlled experiments where one variable changes while others remain constant.

beta0 (Intercept)

Base performance level in the mixed-effects model from the paper (arXiv:2512.08296 Table 4). Represents expected performance before any predictors contribute.

C

Centralised Architecture

Multi-agent pattern where workers produce independent draft solutions, then an orchestrator synthesizes them into a final answer. No peer communication between workers.

Claude Agent SDK

Official claude-agent-sdk Python package for live agent execution. Brainqub3 wraps this SDK to provide experiment infrastructure and deterministic evaluation.

Clipping

Constraining predicted performance values to the valid range [0,1]. Applied to both P_hat (overall prediction) and individual coordination metrics.

Command (CODITECT)

Slash command within the CODITECT framework (377 total). Invoked via /command-name syntax. Examples: /orient, /session-log, /which.

Coordination Collapse

When adding more agents degrades performance due to coordination overhead exceeding the benefit. Characterized by negative elasticity (eta_n < 0) or P_MAS < P_SAS.

Coordination Efficiency (E_c)

success_rate / (turns_total / turns_sas) — measures how efficiently the multi-agent system uses turns relative to single-agent baseline. Higher values indicate better coordination.

Coordination Metrics

Five derived metrics capturing multi-agent coordination behavior: overhead_pct, message_density_c, redundancy_R, efficiency_Ec, error_amp_Ae. Computed post-run and stored in derived_metrics.json.

D

Dashboard

Custom HTML web UI (Python ThreadingHTTPServer) for run comparison, scaling visualization, and scenario analysis. Located at dashboard/webapp.py. Includes tabs for Scaling Laws, Run Comparison, and Scenario Engine.

Decentralised Architecture

Multi-agent pattern where workers propose initial solutions, exchange peer messages to refine, then reach consensus via majority vote. No central orchestrator synthesis.

Delta vs SAS

P_architecture - P_SAS — the benefit or cost of a multi-agent architecture relative to the single-agent baseline. Positive values indicate MAS improvement.

Derived Metrics

Post-run computed metrics (overhead, efficiency, redundancy, etc.) stored in derived_metrics.json. Calculated from completed run data rather than measured during execution.

E

Elasticity (eta_n, eta_T)

Power-law exponents measuring how coordination metrics scale with agents (eta_n) or tools (eta_T). Estimated from batch data via log-log regression.
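
Under the assumed power law metric = c * n^eta, the exponent is the slope of an ordinary least-squares fit in log-log space. A minimal sketch (generic OLS, not the project's actual estimator):

```python
import math

def elasticity(ns, metric_vals):
    """Estimate the power-law exponent eta from (n, metric) pairs via
    least squares on log-transformed data: log(m) = log(c) + eta * log(n)."""
    xs = [math.log(n) for n in ns]
    ys = [math.log(m) for m in metric_vals]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

# Data generated from an exact power law metric = 2 * n**0.5 recovers eta = 0.5
print(round(elasticity([2, 4, 8], [2 * n ** 0.5 for n in [2, 4, 8]]), 3))
```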

Error Amplification (A_e)

(1 - success_mas) / (1 - success_sas) — measures how multi-agent coordination amplifies single-agent errors. Values > 1 indicate error amplification; < 1 indicate error mitigation.

Evaluator

Deterministic scoring function that assesses agent output quality. Must validate successfully and pass tests before any run executes (evaluator-first rule).

Evaluator-First

Hard rule: no run without validated evaluator and passing tests. Ensures all experiments have objective, reproducible scoring from the outset.

ExperimentRunner

Class managing the run lifecycle from evaluator validation through finalization. Handles agent execution, metric collection, manifest creation, and run archival.

G

Global Estimate

Fallback parameter estimate (e.g., elasticity) computed from all batches when batch-specific data is insufficient. Used in shrinkage blend for low-support scenarios.

H

Hook (CODITECT)

Event trigger within the CODITECT framework (118 total). Fires on tool use and lifecycle events. Example: task_id_validator.py hooks PreToolUse events.

Hybrid Architecture

Multi-agent pattern combining peer exchange (decentralised) with orchestrator synthesis (centralised). Workers conduct peer rounds, then orchestrator produces final answer.

Hyperparameter

Configuration setting for an architecture type (e.g., n_agents, temperature, max_turns). Defined in architectures.yaml and overridable per run.

I

I_centered

intelligence_index - 56.9 — centered intelligence signal used in the mixed-effects model. The centering constant 56.9 comes from the paper's dataset.

Independent Architecture

Multi-agent pattern where workers solve in parallel with no communication, then outputs are aggregated by majority vote. Minimal coordination overhead.

Instance

A specific task input/expected-output pair used in experiments. Identified by instance_id in the run manifest.

Intelligence Index (I)

Composite model capability signal from Artificial Analysis Intelligence Index. Ranges roughly 0-100, with GPT-4 Turbo at ~74 and Claude 3.7 Opus at ~87.

Inter-Agent Message

Communication between peer agents (non-orchestrator). Counted separately from orchestrator messages. Used to compute message_density_c.

Interaction Terms

Product terms between standardised predictors in the mixed-effects model. Table 4 includes 9 interaction terms (e.g., I_centered * overhead_pct_std).

Intercept

See beta0 (Intercept).

L

Live Mode

Real API calls to Claude via the official SDK. Default execution mode. Enabled by the --require-live flag or by the absence of --allow-mock.

M

Majority Vote

Aggregation method that selects the most common worker output as the final answer. Used in Independent and Decentralised architectures.

MAS (Multi-Agent System)

Multiple agents collaborating on a task. Contrasts with Single-Agent System (SAS) baseline.

Message Density (message_density_c)

inter_agent_messages / (inter_agent_messages + turns_total) — communication intensity metric ranging [0,1]. Higher values indicate more peer communication relative to total execution.

Mixed-Effects Model

The paper-aligned scaling model formulation from Table 4 of arXiv:2512.08296. Predicts P_hat using intelligence index, coordination metrics, and interaction terms.

Mock Mode

Deterministic fallback for offline testing. Returns canned responses without API calls. Enabled by --allow-mock flag.

MoE (Mixture of Experts)

CODITECT's multi-agent classification and routing system. Analyzes tasks and routes them to the most appropriate specialist agents. Example: /which <task> finds best agent.

models.yaml

Model configuration file mapping model names to intelligence_index values and other metadata. Used by the scaling model predictor.

N

n_agents

Number of agents participating in a multi-agent architecture. Key scaling dimension (along with tool count T).

O

Orchestrator

Central coordination agent in Centralised and Hybrid architectures. Synthesizes worker outputs into final answer. Does not exist in Independent or Decentralised patterns.

Orchestrator Context

Dataclass providing execute_turn, record_inter_agent, and record_orchestrator callbacks to orchestrator implementations. Enables orchestrators to interact with the experiment runtime.

Orchestrator Synthesis

Centralised step where the orchestrator combines worker outputs into a final answer. Uses a synthesis prompt template defined in the architecture configuration.

Overhead (overhead_pct)

max(0, ((turns_mas - turns_sas) / turns_sas) * 100) — coordination overhead percentage. Measures extra turns required by MAS relative to SAS baseline.

P

P_hat

Predicted architecture performance from the mixed-effects model, clipped to [0,1]. Represents the estimated success probability for a given configuration.

P_SA (P_SAS)

Single-agent baseline performance. Acts as a task difficulty proxy: harder tasks have lower P_SA values.

paper_coefficients.json

Table 4 coefficients for the scaling model. Includes intercept, main effects, and interaction terms from arXiv:2512.08296.

Peer Exchange

Decentralised coordination round where agents share proposals and refine their solutions based on peer feedback. No central orchestrator involvement.

PredictionResult

Dataclass containing predicted_P, linear_output, transformed/standardised predictors, and per-term contributions. Returned by the scaling model predictor.

R

Redundancy (redundancy_R)

Mean pairwise cosine similarity of worker outputs (TF-IDF vectors). Measures work overlap ranging [0,1]. Higher values indicate duplicated effort.
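
The mean-pairwise-cosine computation can be sketched with plain term-frequency vectors (the real metric uses TF-IDF weighting; TF keeps the sketch self-contained):

```python
import math
from collections import Counter
from itertools import combinations

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def redundancy_R(outputs: list[str]) -> float:
    """Mean pairwise cosine similarity of worker outputs."""
    vecs = [Counter(o.lower().split()) for o in outputs]
    pairs = list(combinations(vecs, 2))
    return sum(cosine(a, b) for a, b in pairs) / len(pairs)

identical = ["the answer is 42"] * 3
print(redundancy_R(identical))  # -> 1.0 (fully duplicated work)
```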

Run

A single experiment execution (SAS or MAS) on a task instance. Produces a run manifest and derived metrics upon completion.

Run Manifest

Content-hashed JSON file that makes completed runs immutable. Contains all run configuration, results, and metadata. Stored at runs/{architecture}/{batch_id}/{run_id}.json.

S

SAS (Single-Agent System)

One agent solving a task independently. Baseline for all multi-agent comparisons. Used to compute overhead, efficiency, and error amplification.

Scaling Layer

x_hat = clamp(x_base * (n_agents/n0)^eta_n * (T/T0)^eta_T) — power-law transformation applied to coordination metrics to model scaling behavior.
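
The transformation above is straightforward to sketch; the reference points n0 = 1 and T0 = 1 are illustrative defaults, not confirmed Brainqub3 values:

```python
def scale_metric(x_base, n_agents, tool_count, eta_n, eta_T,
                 n0=1, T0=1, lo=0.0, hi=1.0):
    """Power-law scaling of a coordination metric, clamped to [lo, hi]:
    x_hat = clamp(x_base * (n/n0)**eta_n * (T/T0)**eta_T)."""
    x_hat = x_base * (n_agents / n0) ** eta_n * (tool_count / T0) ** eta_T
    return max(lo, min(hi, x_hat))

# A base redundancy of 0.3 at the reference point, scaled to 4 agents
print(scale_metric(0.3, n_agents=4, tool_count=1, eta_n=0.5, eta_T=0.0))  # -> 0.6
```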

Scaling Laws Tab

Dashboard view showing coordination metric behavior across scale (varying n_agents or tool count). Visualizes elasticity estimates and predictions.

Scenario Engine

What-if analysis tool for hypothetical scaling scenarios. Allows exploration of performance under different agent counts, models, and architectures.

Shrinkage Blend

Weighted average between batch-specific and global elasticity estimates. Used when batch-specific data has low support (few runs).
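
One common form of such a blend weights the batch estimate by its support; the shrinkage constant k below is an assumption for illustration, not a documented Brainqub3 value:

```python
def shrink(batch_est, global_est, n_runs, k=10):
    """Weighted average of batch-specific and global estimates.
    The weight on the batch estimate grows with its support n_runs."""
    w = n_runs / (n_runs + k)
    return w * batch_est + (1 - w) * global_est

# With only 2 runs, the blend stays close to the global estimate
print(shrink(batch_est=0.9, global_est=0.4, n_runs=2))
```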

Skill (CODITECT)

Reusable capability definition within CODITECT (445 total). Each has a SKILL.md file with frontmatter including track: and cef_track: fields.

standardisation.json

Default z-score means and standard deviations for predictor transforms. Used when batch-specific statistics are unavailable.

Success Rate

Binary or continuous measure of agent output quality. Computed by the evaluator. Primary outcome metric for all experiments.

T

T (Tool Count)

Number of tools available to agents. Key scaling dimension (along with n_agents).

Tool Policy

Allowed tools enforced at runtime via SDK permission callbacks. Not metadata-only: the policy actively prevents unauthorized tool use during execution.

Track (CODITECT)

Project management unit within CODITECT (37 tracks: A-N for core platform, O-AA for PCF business, AB-AK for extensions). Example: Track H = Framework.

Turn

A single agent response cycle. Includes prompt, model inference, and response (text + tool calls).

TurnResult

Dataclass containing text, token_usage, token_cost_usd, and tool_calls from a single agent turn.

turns_total

Total turns across all agents in a run. Used to compute overhead and efficiency metrics.

turns_sas

Turns required by single-agent baseline. Denominator for overhead and efficiency calculations.

Z

Z-score / Standardisation

(x_i - mean_i) / std_i transformation applied to predictors before coefficient multiplication. Ensures predictors are on comparable scales.
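
The transform, with defaults shaped like they might appear in standardisation.json (the mean/std values below are hypothetical):

```python
def standardise(x, mean, std):
    """Z-score a raw predictor before multiplying by its coefficient."""
    return (x - mean) / std

# Hypothetical default statistics for one predictor
defaults = {"overhead_pct": {"mean": 25.0, "std": 10.0}}
z = standardise(40.0, **defaults["overhead_pct"])
print(z)  # -> 1.5
```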


Cross-References

Architecture Type → Aggregation Method

  • Independent: Majority Vote
  • Centralised: Orchestrator Synthesis
  • Decentralised: Majority Vote (after Peer Exchange)
  • Hybrid: Orchestrator Synthesis (after Peer Exchange)
  • SAS: N/A (single output)

Coordination Metrics → Formula

  • overhead_pct: max(0, ((turns_mas - turns_sas) / turns_sas) * 100)
  • message_density_c: inter_agent_messages / (inter_agent_messages + turns_total)
  • redundancy_R: Mean pairwise cosine TF-IDF similarity
  • efficiency_Ec: success_rate / (turns_total / turns_sas)
  • error_amp_Ae: (1 - success_mas) / (1 - success_sas)
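
The formulas above translate directly to code. A sketch computing four of the five (redundancy_R also needs the worker output texts, so it is omitted; turns_total here is the MAS run's turns_mas):

```python
def derived_metrics(turns_total, turns_sas, inter_agent_messages,
                    success_mas, success_sas):
    """Coordination metrics derived from a completed run vs. the SAS baseline."""
    return {
        "overhead_pct": max(0.0, (turns_total - turns_sas) / turns_sas * 100),
        "message_density_c": inter_agent_messages
                             / (inter_agent_messages + turns_total),
        "efficiency_Ec": success_mas / (turns_total / turns_sas),
        "error_amp_Ae": (1 - success_mas) / (1 - success_sas),
    }

m = derived_metrics(turns_total=12, turns_sas=4, inter_agent_messages=6,
                    success_mas=0.9, success_sas=0.8)
print(m["overhead_pct"])  # -> 200.0
```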

Elasticity → Scaling Dimension

  • eta_n: Elasticity with respect to number of agents (n_agents)
  • eta_T: Elasticity with respect to tool count (T)

CODITECT Components → Count

  • Agents: 776
  • Skills: 445
  • Commands: 377
  • Hooks: 118
  • Tracks: 37 (A-N core, O-AA business, AB-AK extensions)

Version: 1.0
Date: 2026-02-16
Source: Brainqub3 Agent Labs codebase + arXiv:2512.08296
Integration: CODITECT Platform Framework
Authors: Claude (Sonnet 4.5) via documentation-generation agent