ADR-003: Fail-Open vs Fail-Closed Security Gate

Status: Proposed
Date: 2026-02-18
Deciders: CODITECT Architecture Team
Source Research: ClawGuard AI Agent Security Ecosystem Evaluation (2026-02-18)

Context

The CODITECT agent security layer (ADR-001) intercepts tool calls before execution. When the security scanning subsystem itself encounters an error — pattern load failure, regex compilation error, database unavailability, timeout, unhandled exception — it must decide what to do with the tool call it was inspecting.

There are two fundamental choices:

Fail-open: Allow the tool call to proceed, log the security subsystem error, alert operators. Availability is preserved; security check is bypassed for the duration of the failure.
Fail-closed: Block the tool call, return an error to the agent, halt execution until the security subsystem recovers. Security is enforced; availability is sacrificed.

This decision has no universally correct answer — it is a context-dependent tradeoff between availability and security posture. The ClawGuard ecosystem does not address this directly:

ClawGuardian (superglue-ai) is stateless and assumes the OpenClaw host handles error recovery. Its action types (block, redact, confirm, warn, log) apply to detected threats, not to security subsystem failures.
clawguard (JaydenBeard) includes a kill switch and emergency controls for manual operator intervention, implying that the default expectation is that the monitoring layer can become unavailable without blocking agents.
ClawGuard (maxxie114) uses HMAC webhook verification with rate limiting. Failures in the external webhook pathway do not block email processing, which suggests an implicit fail-open preference.

CODITECT serves multiple tenant types with different requirements:

Developer/Pilot tenants: High availability expectations. A security scanner failure should not halt a developer's workflow. Fail-open with alerting is appropriate.
Enterprise/Compliance tenants: May operate under SOC 2, HIPAA, or similar frameworks where an unscanned tool call is itself a compliance violation. Fail-closed is required.
Autonomous loop contexts (Ralph Wiggum): Long-running unattended loops, where a scanner failure under fail-open could mean hours of unscanned execution. Their risk profile is closer to that of enterprise tenants.

The security gate must accommodate both extremes without requiring custom code per tenant.

Decision

The CODITECT security gate will fail open by default. Tenants can opt into fail-closed behavior through tenant configuration.

Default Behavior (Fail-Open)

When the security scanning subsystem encounters an error during a tool call interception:

The tool call is allowed to proceed.
A SECURITY_SCAN_FAILURE event is written to the session log with full error details (exception type, stack trace, pattern file causing failure if applicable).
A security alert is emitted to the operator notification channel (webhook, if configured).
The security subsystem attempts self-recovery on the next invocation (re-initialize pattern loader, re-compile regex, retry database connection).
The failed interception is auditable: the session log entry records that a scan was skipped, enabling post-hoc review.
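The steps above can be sketched as a small wrapper around the scanner. This is a minimal illustration, not the CODITECT hook implementation: `intercept_fail_open`, `GateResult`, and the event field names are hypothetical, the session log is modeled as a plain list, and self-recovery (step 4) is left out.

```python
import traceback
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class GateResult:
    allowed: bool   # may the tool call proceed?
    scanned: bool   # False when the scan was skipped because the scanner failed


def intercept_fail_open(
    tool_call: Dict,
    scan: Callable[[Dict], bool],     # hypothetical scanner: True = safe, may raise
    session_log: List[Dict],          # stand-in for the real session log
    alert: Callable[[Dict], None],    # stand-in for the operator webhook
) -> GateResult:
    try:
        safe = scan(tool_call)
    except Exception as exc:
        # Scanner infrastructure failed: record it, alert, and let the call through.
        event = {
            "event": "SECURITY_SCAN_FAILURE",
            "tool": tool_call.get("tool"),
            "exception_type": type(exc).__name__,
            "stack_trace": traceback.format_exc(),
            "scan_skipped": True,     # makes the bypass visible in post-hoc review
        }
        session_log.append(event)
        alert(event)
        return GateResult(allowed=True, scanned=False)
    # Scanner worked: its verdict decides, which is separate from fail behavior.
    return GateResult(allowed=safe, scanned=True)
```

The `scanned` flag keeps "allowed because it passed the scan" distinguishable from "allowed because the scan could not run", which is the distinction the audit trail depends on.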

Configurable Fail-Closed (Per Tenant)

Tenants can opt into fail-closed behavior via tenant configuration:

# config/tenants/{tenant-id}/security.yaml
security:
  fail_behavior: closed       # "open" (default) | "closed"
  fail_closed_scope:
    - pre-tool-call            # Apply fail-closed at pre-tool-call hook
    - pre-agent-start          # Apply fail-closed at agent initialization
  fail_closed_exemptions:
    - tool: read_file          # Never fail-closed on read-only tools
    - tool: list_directory
  alert_on_scan_failure: true  # Always true when fail_behavior is "closed"
  scan_timeout_ms: 500         # Max time allowed for security scan before considered failure
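
One way to read the configuration above at decision time is sketched below. The function name is hypothetical, and the config is passed as an already-parsed dictionary to avoid assuming a particular YAML loader. Two behaviors are assumptions rather than decisions recorded in this ADR: a hook not listed in `fail_closed_scope` stays fail-open, and a missing `fail_closed_scope` is treated as covering every hook.

```python
from typing import Any, Dict


def should_block_on_scan_failure(security: Dict[str, Any], hook: str, tool: str) -> bool:
    """Return True if a scan failure at `hook` for `tool` must block the call."""
    # Platform default is fail-open, so anything but an explicit "closed" never blocks.
    if security.get("fail_behavior", "open") != "closed":
        return False

    # Assumption: an omitted scope means fail-closed applies to all hooks.
    scope = security.get("fail_closed_scope")
    if scope is not None and hook not in scope:
        return False

    # Exempted tools (e.g. read-only ones) keep fail-open behavior.
    exempt = {entry["tool"] for entry in security.get("fail_closed_exemptions", [])}
    return tool not in exempt


tenant_security = {
    "fail_behavior": "closed",
    "fail_closed_scope": ["pre-tool-call", "pre-agent-start"],
    "fail_closed_exemptions": [{"tool": "read_file"}, {"tool": "list_directory"}],
}
```

With this tenant configuration, a scanner failure blocks `write_file` at the `pre-tool-call` hook but lets `read_file` through.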

Ralph Wiggum Autonomous Loop Override

When an agent is running inside a Ralph Wiggum autonomous loop (/ralph-loop start), the security gate escalates to fail-closed by default regardless of tenant configuration, unless the loop was explicitly started with --security-fail-open. Rationale: unattended autonomous agents represent the highest-risk execution context, and a security scan failure in an unattended session cannot be remediated by a human operator in real time.
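
The precedence between tenant configuration and the loop override can be sketched as a single function. The function and parameter names are hypothetical. `loop_active` stands for the result of `ralph_loop_context.is_active()`, and `loop_fail_open_flag` for whether the loop was started with `--security-fail-open`. The sketch assumes the flag only suppresses the escalation, so the tenant's own setting still applies. If the flag is instead meant to force fail-open, the flagged branch would return "open".

```python
def effective_fail_behavior(
    tenant_behavior: str,        # "open" or "closed", from tenant configuration
    loop_active: bool,           # is the agent inside an autonomous loop?
    loop_fail_open_flag: bool,   # was the loop started with --security-fail-open?
) -> str:
    if loop_active and not loop_fail_open_flag:
        # Unattended execution: escalate regardless of tenant configuration.
        return "closed"
    # Interactive session, or escalation explicitly waived: tenant setting applies.
    return tenant_behavior
```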

Scan Timeout

All security scans are subject to a configurable timeout (default: 500ms). A scan that exceeds the timeout is treated as a scan failure and follows the fail-open/fail-closed policy. The 500ms default is chosen to be well above the expected P99 scan latency for regex matching against the full pattern library, while being imperceptibly short relative to tool execution time (most tools complete in seconds to minutes).
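
A minimal sketch of the timeout using `concurrent.futures`, under the assumption that the scanner is a plain callable. `ScanFailure` and `scan_with_timeout` are hypothetical names. Note that `Future.result(timeout=...)` only stops waiting: a scan that has already started keeps running in its worker thread, so this bounds the agent's wait rather than the scanner's CPU use.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

_executor = ThreadPoolExecutor(max_workers=4)


class ScanFailure(Exception):
    """Raised for any scanner error, including timeouts (hypothetical type)."""


def scan_with_timeout(scan, tool_call, timeout_ms=500):
    future = _executor.submit(scan, tool_call)
    try:
        return future.result(timeout=timeout_ms / 1000)
    except FutureTimeout as exc:
        future.cancel()  # only helps if the scan had not started yet
        raise ScanFailure(f"scan exceeded {timeout_ms} ms") from exc
    except Exception as exc:
        # Errors raised inside the scanner surface here and are treated alike.
        raise ScanFailure(f"scan raised {type(exc).__name__}") from exc


def slow_scan(tool_call):
    time.sleep(0.3)  # simulates a scan that overruns its budget
    return True
```

The caller catches `ScanFailure` and applies the fail-open or fail-closed policy, so timeouts and other scanner errors share one code path.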

Consequences

Positive

Developer experience preserved: The default fail-open behavior ensures that a misconfigured pattern file or a transient regex timeout does not block a developer's active session.
Compliance-ready: Enterprise tenants can opt into fail-closed behavior with granular scope control (which hooks, which tool exemptions).
Autonomous loop safety: The automatic escalation to fail-closed for Ralph Wiggum loops closes the highest-risk gap without requiring per-loop configuration.
Audit completeness: Every scan failure is logged, regardless of fail behavior. Operators have a complete record of when scans were bypassed and why.
Timeout prevents DoS: A pathological regex (catastrophic backtracking) cannot cause a security scan to block an agent indefinitely, provided the timeout mechanism can actually abandon the running scan.

Negative

Fail-open default is a security gap: During a security subsystem failure in the default configuration, tool calls execute without inspection. A sophisticated attacker who can induce a scan failure gains uninspected execution. Mitigation: the alert channel notifies operators as soon as a failure occurs, if a notification webhook is configured, and every skipped scan is recorded in the session log for later review. Both measures detect the gap; neither prevents it.
Complexity of per-tenant configuration: Two tenants with different fail behaviors create an operational asymmetry. Support teams must understand that security behavior varies by tenant.
Autonomous loop exception adds a special case: The Ralph Wiggum escalation rule is a behavioral exception that must be documented, tested, and maintained. If the loop detection logic fails, an autonomous agent might run under the wrong fail policy.

Neutral

The JaydenBeard kill switch model (manual operator intervention) remains available as an operational escape valve in all configurations. If the security layer becomes persistently broken, an operator can invoke the kill switch to halt all agent activity while the issue is resolved — this is distinct from the automated fail behavior.

Alternatives Considered

Alternative A: Always Fail-Closed

Approach: Block all tool calls when the security scanner encounters any error. No configuration, no exceptions.

Rejected because:

A misconfigured or temporarily unavailable security scanner halts all agent activity platform-wide.
A single invalid regex in a YAML pattern file (e.g., a typo) would take down the entire CODITECT agent runtime until a hotfix is deployed.
Developer and pilot tenants — the majority of current CODITECT users — would experience unacceptable availability impacts from routine pattern library maintenance operations.
The ClawGuard ecosystem itself does not implement always-fail-closed; even ClawGuardian's most aggressive action (block) applies to detected threats, not to scanner failures.

Alternative B: Always Fail-Open

Approach: Always allow tool calls through on any scanner failure. Log and alert, but never block due to scanner error.

Rejected because:

Enterprise and compliance tenants operating under SOC 2, HIPAA, PCI-DSS, or similar frameworks cannot accept uninspected tool calls as a normal operating state.
Ralph Wiggum autonomous loops operating unattended for hours with a failed security scanner represent an unacceptable risk profile.
Purely fail-open provides no security guarantee for high-risk tenants, making the security layer theater rather than substance for those use cases.

Alternative C: Circuit Breaker Pattern

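Approach: Implement a circuit breaker that tracks consecutive scan failures. After N consecutive failures the circuit opens: the scanner is bypassed and calls fail open for a cooldown period, after which the breaker attempts recovery. Transitions: closed (normal) → open (failing) → half-open (testing recovery). Here "open" and "closed" describe the breaker's circuit state, not the fail-open/fail-closed policy.

Not selected as primary (may be implemented as an enhancement):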

The circuit breaker pattern is well-suited for transient failures (database unavailability, network timeout) but adds implementation complexity.
The simpler fail-open/fail-closed with timeout covers the majority of practical failure modes.
Circuit breaker is noted as a future enhancement in the implementation notes below, particularly for the Ralph Wiggum context.
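
For reference, the state machine described in this alternative can be sketched in a few lines. This is an illustration of the general pattern, not a design for `circuit_breaker.py`: the class name, parameters, and defaults are hypothetical, and the clock is injectable so the transitions can be tested without waiting.

```python
import time


class ScanCircuitBreaker:
    def __init__(self, failure_threshold=3, cooldown_s=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.cooldown_s = cooldown_s
        self._clock = clock
        self._failures = 0        # consecutive scan failures
        self._opened_at = None    # time the circuit last opened, None while closed

    @property
    def state(self) -> str:
        if self._opened_at is None:
            return "closed"       # normal operation, scans run
        if self._clock() - self._opened_at >= self.cooldown_s:
            return "half-open"    # cooldown over, one trial scan is allowed
        return "open"             # scanner is bypassed

    def allow_scan(self) -> bool:
        return self.state != "open"

    def record_success(self) -> None:
        self._failures = 0
        self._opened_at = None    # close the circuit

    def record_failure(self) -> None:
        self._failures += 1
        # A failed trial scan reopens immediately; otherwise wait for the threshold.
        if self.state == "half-open" or self._failures >= self.failure_threshold:
            self._opened_at = self._clock()
```

While `allow_scan()` returns False, the caller skips the scanner and applies the configured fail policy directly, which avoids paying the scan timeout on every call during an outage.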

Alternative D: Fail-Open with Human Confirmation Gate

Approach: On scanner failure, pause the tool call and present a human confirmation prompt before proceeding. Agent execution is suspended until a human approves.

Rejected because:

Requires a human operator to be present and responsive, which is incompatible with Ralph Wiggum autonomous loops by design.
Adds latency to all tool calls during a failure window.
The confirm action (from ClawGuardian's model) is appropriate for security policy decisions (e.g., "this looks like a sensitive operation, confirm?") but not for scanner infrastructure failures.

Implementation Notes

Configuration field: security.fail_behavior in config/tenants/{tenant-id}/security.yaml
Default value: open (platform-level default in config/defaults/security.yaml)
Hook implementation: hooks/security-pre-tool.json and hooks/security-pre-agent.json must catch all exceptions from the security layer and apply the configured fail policy before re-raising or proceeding.
Logging: SECURITY_SCAN_FAILURE event schema must be defined in config/schemas/security-events.yaml and written to sessions.db (not org.db — scan failures are operational events, not decisions).
Autonomous loop detection: scripts/core/ralph_wiggum/ provides the loop context. Security hooks should query ralph_loop_context.is_active() to determine escalation.
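Scan timeout: Implemented via concurrent.futures.ThreadPoolExecutor with the timeout argument of Future.result(), or via signal.setitimer() for synchronous scan paths (signal.alarm() accepts only whole seconds, so it cannot express a 500 ms limit). Caveats: Future.result(timeout=...) stops waiting but does not terminate the worker thread, and the signal-based approach works only on Unix and only in the main thread.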
Future enhancement: Circuit breaker implementation at scripts/core/security/circuit_breaker.py.

References

JaydenBeard kill switch: analyze-new-artifacts/clawguard-ai-agent-security/ — bin/clawguard.js, routes/gateway.js
ClawGuardian action types: patterns/ and hooks/before-tool-call.ts (block/redact/confirm/warn/log)
Research context: analyze-new-artifacts/clawguard-ai-agent-security/artifacts/research-context.json
Ralph Wiggum: scripts/core/ralph_wiggum/ | docs/guides/RALPH-WIGGUM-GUIDE.md
Related ADRs: ADR-001 (Security Layer Architecture), ADR-004 (Risk Scoring)
