/browser - Agentic Browser Control
Control a browser directly from your CODITECT session: navigate to pages, interact with elements, take screenshots, and extract data. Elements are targeted through snapshot refs, which reduces token usage by 93% compared to raw HTML.
Usage
# Navigation
/browser navigate "https://example.com"
/browser back
/browser forward
/browser reload
# Snapshot (get interactive elements with refs)
/browser snapshot # Full page snapshot
/browser snapshot -i # Interactive elements only (recommended)
/browser snapshot -s "#main" # Scoped to CSS selector
# Interaction (use @eN refs from snapshot)
/browser click @e1
/browser fill @e2 "search query"
/browser type @e3 "Hello" # Types character by character
/browser select @e4 "option"
/browser hover @e5
# Screenshots
/browser screenshot # Full page
/browser screenshot /tmp/page.png # Save to path
/browser screenshot --element @e3 # Element only
# Data extraction
/browser gettext @e1 # Get element text
/browser eval "document.title" # Run JavaScript
/browser url # Get current URL
/browser title # Get page title
# Session management
/browser launch # Start new browser
/browser launch --headed # Visible browser window
/browser close # Close browser
/browser session list # List active sessions
# State persistence
/browser state save /tmp/auth.json # Save cookies/localStorage
/browser state load /tmp/auth.json # Restore saved state
System Prompt
EXECUTION DIRECTIVE:
When the user invokes /browser, you MUST:
- Check daemon status — Verify agent-browser daemon is running
- Auto-launch if needed — Start daemon on first use (headless by default)
- Execute the action — Run the appropriate agent-browser command
- Display results — Show snapshot output, screenshot path, or extracted data
- Cache state — Store latest snapshot in context for /cxq queries
Daemon Management:
# Check if daemon is running
SOCKET_PATH="${XDG_RUNTIME_DIR:-$HOME/.agent-browser}/default.sock"
if [ ! -S "$SOCKET_PATH" ]; then
  # Auto-launch daemon
  agent-browser launch
fi
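The check above launches the daemon but does not confirm that the socket exists afterwards. The following is a minimal sketch of a readiness poll, assuming `agent-browser launch` may return before the socket is created (the source does not say either way). The helper name `wait_for_socket` and its timeout argument are hypothetical and not part of agent-browser.

```shell
# Hypothetical helper: poll until a Unix socket exists or the timeout expires.
# Usage: wait_for_socket <socket-path> [timeout-seconds]
wait_for_socket() {
  local path="$1"
  local timeout="${2:-5}"
  local waited=0
  while [ ! -S "$path" ]; do
    if [ "$waited" -ge "$timeout" ]; then
      return 1  # socket never appeared within the timeout
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}

# Possible use after the auto-launch step (left as a comment so that the
# sketch does not require agent-browser to be installed):
#   agent-browser launch
#   wait_for_socket "$SOCKET_PATH" 10 || echo "daemon did not start" >&2
```

Polling once per second keeps the sketch dependency-free; a shorter interval would need a `sleep` that accepts fractional seconds.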
Action Dispatch:
| Action | agent-browser Command |
|---|---|
| navigate <url> | agent-browser navigate "<url>" |
| snapshot [-i] [-c] [-d N] [-s sel] | agent-browser snapshot [flags] |
| click <ref> | agent-browser click <ref> |
| fill <ref> <value> | agent-browser fill <ref> "<value>" |
| type <ref> <text> | agent-browser type <ref> "<text>" |
| screenshot [path] | agent-browser screenshot [path] |
| gettext <ref> | agent-browser gettext <ref> |
| eval <script> | agent-browser eval "<script>" |
| url | agent-browser url |
| title | agent-browser title |
| back | agent-browser back |
| forward | agent-browser forward |
| reload | agent-browser reload |
| launch [--headed] | agent-browser launch [--headed] |
| close | agent-browser close |
| state save <path> | agent-browser state_save "<path>" |
| state load <path> | agent-browser launch --state "<path>" |
| session list | List active daemon sockets |
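The table above can be read as a dispatch function. The sketch below is a dry run: it prints the agent-browser command for a given action instead of executing it, so the mappings can be checked without a running daemon. The function name `browser_dispatch` and its exit codes are hypothetical; the command strings are copied from the table. `session list` is omitted because the table maps it to listing sockets rather than to an agent-browser command.

```shell
# Dry-run dispatcher (hypothetical helper): print the agent-browser command
# that corresponds to a /browser action. Output is for display only and is
# not re-quoted for safe pasting into a shell.
browser_dispatch() {
  if [ "$#" -eq 0 ]; then
    echo "usage: browser_dispatch <action> [args...]" >&2
    return 2
  fi
  local action="$1"
  shift
  case "$action" in
    navigate|eval)
      printf 'agent-browser %s "%s"\n' "$action" "$1" ;;
    fill|type)
      printf 'agent-browser %s %s "%s"\n' "$action" "$1" "$2" ;;
    click|gettext)
      printf 'agent-browser %s %s\n' "$action" "$1" ;;
    snapshot|screenshot|launch)
      # Optional flags or path are passed through unchanged.
      if [ "$#" -gt 0 ]; then
        printf 'agent-browser %s %s\n' "$action" "$*"
      else
        printf 'agent-browser %s\n' "$action"
      fi ;;
    url|title|back|forward|reload|close)
      printf 'agent-browser %s\n' "$action" ;;
    state)
      case "$1" in
        save) printf 'agent-browser state_save "%s"\n' "$2" ;;
        load) printf 'agent-browser launch --state "%s"\n' "$2" ;;
        *) echo "unknown state subcommand: $1" >&2; return 2 ;;
      esac ;;
    *)
      echo "unknown action: $action" >&2
      return 2 ;;
  esac
}
```

For example, `browser_dispatch state load /tmp/auth.json` prints `agent-browser launch --state "/tmp/auth.json"`, which makes visible that loading state goes through `launch` rather than a dedicated subcommand.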
Snapshot Display:
When displaying snapshot output, format as a readable tree:
Page: https://example.com - "Example Page"
Interactive Elements:
@e1 textbox "Email"
@e2 textbox "Password"
@e3 button "Sign In"
@e4 link "Forgot password?"
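A script that consumes this display may want to look up a ref by role and label instead of hard-coding `@e3`. The following is a minimal sketch, assuming the text format shown above (one element per line: ref, role, quoted name). The helper name `find_ref` is hypothetical, and whether raw agent-browser output uses this same layout is not confirmed by the source.

```shell
# Hypothetical helper: print the ref (@eN) of the first element whose role and
# name match. Reads snapshot display text on stdin; exits 1 if nothing matches.
# Usage: find_ref <role> <name> < snapshot.txt
find_ref() {
  awk -v role="$1" -v name="$2" '
    $1 ~ /^@e[0-9]+$/ && $2 == role {
      # Rebuild the label from field 3 onward, then strip the outer quotes.
      label = $3
      for (i = 4; i <= NF; i++) label = label " " $i
      gsub(/^"|"$/, "", label)
      if (label == name) { print $1; found = 1; exit }
    }
    END { exit found ? 0 : 1 }
  '
}
```

Given the display above saved to a file, `find_ref button "Sign In" < snapshot.txt` prints `@e3`. Runs of spaces inside a label are collapsed to one space by the field splitting, which is acceptable for a sketch but worth knowing.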
Options
| Option | Description |
|---|---|
| <action> | Browser action to perform (required) |
| -i | Interactive elements only (for snapshot) |
| -c | Compact mode (for snapshot) |
| -d N | Depth limit (for snapshot) |
| -s <selector> | CSS scope (for snapshot) |
| --headed | Show browser window (for launch) |
| --session <name> | Named session (default: "default") |
| --engine <name> | Browser engine: chromium, firefox, webkit |
| --json | Output raw JSON response |
Examples
Quick Page Inspection
/browser navigate "https://example.com"
/browser snapshot -i
Fill and Submit Form
/browser navigate "https://login.example.com"
/browser snapshot -i
/browser fill @e1 "user@example.com"
/browser fill @e2 "password"
/browser click @e3
Screenshot for Debugging
/browser navigate "https://app.example.com/dashboard"
/browser screenshot /tmp/dashboard.png
Extract Page Data
/browser navigate "https://example.com/pricing"
/browser eval "JSON.stringify([...document.querySelectorAll('.plan')].map(p => ({name: p.querySelector('h2').textContent, price: p.querySelector('.price').textContent})))"
Success Output
Browser: navigated to https://example.com
Snapshot: 12 interactive elements found (use @e1-@e12)
Screenshot: saved to /tmp/screenshot.png
Related
- Command: /browser-extract — Content extraction to markdown/JSON
- Command: /browser-research — Multi-page agentic extraction
- Agent: coditect-browser-agent
- Skill: browser-automation-patterns
- Skill: browser-content-extraction
- Hook: browser-auto-launch.py
- Hook: browser-screenshot-on-error.py
Principles
This command embodies:
- #3 Complete Execution — Auto-launches daemon, executes action, displays results
- #6 Clear, Understandable — Element refs simplify targeting
- #1 Self-Provisioning — Daemon auto-starts on first use
Command Version: 1.0.0 | Created: 2026-02-08 | Author: CODITECT Core Team