Web Browser Skill (agent-browser)

Use agent-browser for web automation. It runs a headless Chromium instance by default and exposes a CLI optimized for AI agents.

Full command reference: agent-browser --help

Installation

```
npm install -g agent-browser
agent-browser install              # Download Chromium
# Linux only:
agent-browser install --with-deps  # Install system deps
```
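The installation steps can be wrapped in a small script for provisioning. This is a minimal sketch: `install_agent_browser` is a hypothetical helper name, and the optional OS argument is an assumption added so the Linux branch can be exercised on any machine.

```shell
#!/bin/sh
# Sketch: run the install steps listed above, adding system deps on Linux.
# install_agent_browser is a hypothetical helper, not part of the CLI.
install_agent_browser() {
    # Optional first argument overrides OS detection (defaults to uname -s).
    os="${1:-$(uname -s)}"

    npm install -g agent-browser || return 1
    agent-browser install || return 1          # Download Chromium

    if [ "$os" = "Linux" ]; then
        agent-browser install --with-deps      # Install system deps
    fi
}
```

Call `install_agent_browser` with no arguments to let it detect the platform.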

Core Workflow (recommended)

Open a page
```
agent-browser open https://example.com
```

Get a snapshot (refs)

```
agent-browser snapshot -i        # Interactive elements only
# or JSON for machine parsing
agent-browser snapshot -i --json
```

Interact using refs

```
agent-browser click @e2
agent-browser fill @e3 "test@example.com"
agent-browser get text @e1
```

Re-snapshot after changes
```
agent-browser snapshot -i --json
```

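Refs (@e1, @e2, …) identify the exact elements listed in a snapshot, so an agent can act on them without guessing selectors. Take a new snapshot after the page changes, because refs from an earlier snapshot may no longer match.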
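The "interact, then re-snapshot" habit can be folded into one helper so a fresh set of refs is always printed after each action. This is a sketch under assumptions: `act_and_resnapshot` is a hypothetical name, and the refs in the usage comments are placeholders that must come from your own snapshot.

```shell
#!/bin/sh
# Sketch: run one agent-browser interaction, then print a fresh snapshot.
# act_and_resnapshot is a hypothetical helper, not part of the CLI.
act_and_resnapshot() {
    # "$@" is the interaction, e.g.: click @e2   or   fill @e3 "text"
    agent-browser "$@" || return 1
    # Refs may have changed, so emit a new snapshot for the next step.
    agent-browser snapshot -i --json
}

# Usage (refs are placeholders taken from a prior snapshot):
#   agent-browser open https://example.com
#   agent-browser snapshot -i --json
#   act_and_resnapshot fill @e3 "test@example.com"
#   act_and_resnapshot click @e2
```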

Common Commands

```
agent-browser open <url>            # Navigate (alias: goto)
agent-browser snapshot              # Accessibility tree with refs
agent-browser click <sel|@ref>
agent-browser fill <sel|@ref> <text>
agent-browser type <sel|@ref> <text>
agent-browser press <key>           # e.g. Enter, Tab, Control+a
agent-browser get text <sel|@ref>
agent-browser screenshot [path]     # Use --full for full page
agent-browser close                 # Close browser
```
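Several of these commands combine naturally into a one-shot task. The sketch below opens a page, saves a screenshot, and closes the browser even if the screenshot step fails. `capture_page` and the default file name `page.png` are hypothetical choices for illustration.

```shell
#!/bin/sh
# Sketch: open a URL, save a screenshot, and always close the browser.
# capture_page and the page.png default are hypothetical, not CLI features.
capture_page() {
    url="$1"
    out="${2:-page.png}"

    agent-browser open "$url" || return 1
    agent-browser screenshot "$out"
    status=$?                 # remember whether the screenshot succeeded
    agent-browser close       # clean up regardless of the result
    return "$status"
}
```

For example, `capture_page https://example.com shot.png` would leave the image at `shot.png`, assuming the page loads.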

Semantic Finders (optional)

```
agent-browser find role button click --name "Submit"
agent-browser find label "Email" fill "test@test.com"
```
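Finders let a script act without taking a snapshot first. As a rough sketch, the two finder calls above can be chained into a form submission. `submit_email` is a hypothetical helper, and it assumes the page really has a field labeled "Email" and a button named "Submit".

```shell
#!/bin/sh
# Sketch: fill an email field and press the submit button via finders.
# submit_email is hypothetical; the label and button name are assumptions
# about the target page.
submit_email() {
    agent-browser find label "Email" fill "$1" || return 1
    agent-browser find role button click --name "Submit"
}
```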

Helpful Options

Headed mode (visible browser):

```
agent-browser open https://example.com --headed
```

Persistent profile (cookies/logins):

```
agent-browser --profile ~/.myapp-profile open https://example.com
```

Isolated sessions:

```
agent-browser --session agent1 open https://example.com
```
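When several agents share one machine, repeating `--session` on every call is easy to forget. A thin wrapper, sketched here with the hypothetical name `in_session`, keeps each command tied to its session. The session name `agent2` in the usage comments is illustrative.

```shell
#!/bin/sh
# Sketch: run any agent-browser command inside a named session.
# in_session is a hypothetical helper, not part of the CLI.
in_session() {
    session="$1"
    shift                      # the remaining arguments are the command
    agent-browser --session "$session" "$@"
}

# Usage (session names are illustrative):
#   in_session agent1 open https://example.com
#   in_session agent2 open https://example.com
#   in_session agent1 snapshot -i
```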

Agent-friendly JSON output:

```
agent-browser snapshot -i --json
agent-browser get text @e1 --json
```

Local files (file://):

```
agent-browser --allow-file-access open file:///path/to/page.html
```

When to Use

Use this skill whenever the agent needs to browse the web, inspect pages, click buttons, fill forms, or capture screenshots.
