When to use this skill
Use this skill whenever the user wants to:
- Automate browser interactions via CLI commands
- Use browser automation for AI agents
- Navigate websites and interact with pages using command-line tools
- Use refs-based element selection for deterministic automation
- Integrate browser automation into AI agent workflows
- Capture snapshots of web pages with accessibility trees
- Fill forms, click elements, and extract content via CLI
- Use semantic locators for more reliable element selection
- Work with browser automation in agent mode with JSON output
- Manage multiple browser sessions
- Debug browser automation with headed mode
- Use authenticated sessions with custom headers
- Connect to existing browsers via CDP
- Stream browser viewport for live preview
How to use this skill
This skill is organized to match the agent-browser official documentation structure (https://github.com/vercel-labs/agent-browser/blob/main/README.md). When working with agent-browser:
- Install agent-browser:
  - Load examples/getting-started/installation.md for installation instructions
- Quick Start:
  - Load examples/quick-start/quick-start.md for basic workflow examples
- Learn core commands:
  - Load examples/commands/basic-commands.md for basic commands (open, click, fill, etc.)
  - Load examples/commands/advanced-commands.md for advanced commands (snapshot, eval, etc.)
  - Load examples/commands/get-info/ for information retrieval commands
  - Load examples/commands/check-state/ for state checking commands
  - Load examples/commands/find-elements/ for semantic locator commands
  - Load examples/commands/wait/ for wait commands
  - Load examples/commands/mouse-control/ for mouse control commands
  - Load examples/commands/browser-settings/ for browser configuration
  - Load examples/commands/cookies-storage/ for cookies and storage management
  - Load examples/commands/network/ for network interception
  - Load examples/commands/tabs-windows/ for tab and window management
  - Load examples/commands/frames/ for iframe handling
  - Load examples/commands/dialogs/ for dialog handling
  - Load examples/commands/debug/ for debugging commands
  - Load examples/commands/navigation/ for navigation commands
  - Load examples/commands/setup/ for setup commands
- Understand selectors:
  - Load examples/selectors/refs.md for refs-based selection (@e1, @e2, etc.)
  - Load examples/selectors/traditional-selectors.md for CSS, XPath, and semantic locators
- Use agent mode:
  - Load examples/agent-mode/introduction.md for agent mode overview
  - Load examples/agent-mode/optimal-workflow.md for optimal AI workflow
  - Load examples/agent-mode/integration.md for integrating with AI agents
- Advanced features:
  - Load examples/advanced/sessions.md for session management
  - Load examples/advanced/headed-mode.md for debugging with visible browser
  - Load examples/advanced/authenticated-sessions.md for authentication via headers
  - Load examples/advanced/custom-executable.md for custom browser executable
  - Load examples/advanced/cdp-mode.md for Chrome DevTools Protocol integration
  - Load examples/advanced/streaming.md for browser viewport streaming
  - Load examples/advanced/architecture.md for architecture overview
  - Load examples/advanced/platforms.md for platform support
  - Load examples/advanced/usage-with-agents.md for AI agent integration patterns
- Configure options:
  - Load examples/options/global-options.md for global CLI options
  - Load examples/options/snapshot-options.md for snapshot-specific options
  - Load examples/options/session-options.md for session management options
- Reference API documentation when needed:
  - api/commands.md - Complete command reference
  - api/selectors.md - Selector reference
  - api/options.md - Options reference
- Use templates for quick start:
  - templates/basic-automation.md - Basic automation workflow
  - templates/ai-agent-workflow.md - AI agent workflow template
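The core commands and refs-based selection listed in the steps above can be sketched as a small set of command builders. This is a minimal illustration, not the project's own tooling. It assumes the agent-browser executable is on PATH, that each subcommand takes its arguments in the order shown (confirm against api/commands.md), and that refs always match the @e1, @e2 pattern. The helper names (`open_page`, `snapshot`, `fill`, `click`, `require_ref`) are hypothetical, and the refs in the dry run are placeholders, since real refs only exist after a snapshot has been taken.

```python
import re
import shlex
from typing import List

CLI = "agent-browser"  # assumption: the executable is installed and on PATH
REF_PATTERN = re.compile(r"@e\d+")  # assumption: refs always look like @e1, @e2, ...


def require_ref(ref: str) -> str:
    """Return ref unchanged if it looks like a snapshot ref, else raise ValueError."""
    if not REF_PATTERN.fullmatch(ref):
        raise ValueError(f"not a snapshot ref: {ref!r}")
    return ref


def open_page(url: str) -> List[str]:
    """Build the argv list that opens a URL."""
    return [CLI, "open", url]


def snapshot(interactive: bool = True) -> List[str]:
    """Build the argv list for a snapshot; -i is added when interactive is True."""
    argv = [CLI, "snapshot"]
    if interactive:
        argv.append("-i")
    return argv


def fill(ref: str, text: str) -> List[str]:
    """Build the argv list that fills the element behind ref with text."""
    return [CLI, "fill", require_ref(ref), text]


def click(ref: str) -> List[str]:
    """Build the argv list that clicks the element behind ref."""
    return [CLI, "click", require_ref(ref)]


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    # "@e2" and "@e3" are placeholder refs; read the real ones from the snapshot output.
    steps = [
        open_page("https://example.com"),
        snapshot(),
        fill("@e2", "user@example.com"),
        click("@e3"),
    ]
    for argv in steps:
        print(shlex.join(argv))
```

Building argv lists rather than shell strings avoids quoting problems when the text to fill contains spaces or shell metacharacters. Each list can be passed directly to `subprocess.run`.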
Doc mapping (one-to-one with official documentation)
- See examples and API files → https://github.com/vercel-labs/agent-browser
Examples and Templates
This skill includes detailed examples organized to match the official documentation structure. All examples are in the examples/ directory (see mapping above).
To use examples:
- Identify the topic from the user's request
- Load the appropriate example file from the mapping above
- Follow the instructions, syntax, and best practices in that file
- Adapt the code examples to your specific use case
To use templates:
- Reference templates in the templates/ directory for common scaffolding
- Adapt templates to your specific needs and coding style
API Reference
- Commands API: api/commands.md - Complete command reference with syntax and examples
- Selectors API: api/selectors.md - Selector types and usage reference
- Options API: api/options.md - All options reference
Best Practices
- Use Refs: Prefer refs (@e1, @e2) over traditional selectors for deterministic automation
- Snapshot First: Always snapshot before interacting with elements to get refs
- Agent Mode: Use the --json flag for machine-readable output in agent mode
- Session Management: Use --session to maintain state across commands
- Interactive Snapshot: Use the -i flag to limit the snapshot to interactive elements
- Semantic Locators: Use semantic locators (role/name) when refs are not available
- Error Handling: Check command exit codes and error messages
- Wait for Navigation: Commands automatically wait for navigation to complete
- Headed Mode: Use --headed for debugging, headless for production
- CDP Integration: Use --cdp for Chrome DevTools Protocol integration
- Streaming: Use AGENT_BROWSER_STREAM_PORT for live browser preview
- Authenticated Sessions: Use --headers for authentication without login flows
- Custom Executable: Use --executable-path for serverless deployments or custom browsers
- Snapshot Options: Combine the -i, -c, -d, and -s options to optimize snapshot output
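Several of the practices above (agent mode with --json, named sessions, headed debugging, and checking exit codes) can be combined in one thin wrapper. The sketch below is an illustration under stated assumptions, not an official client. It assumes the executable is on PATH, that --session takes a session name and may precede the subcommand, that --json and --headed may follow it, and that a non-zero exit code signals failure. The shape of the JSON payload is deliberately left uninterpreted. The names `build_argv`, `run_json`, and `CommandError` are hypothetical.

```python
import json
import subprocess
from typing import Any, Callable, List, Optional, Sequence

CLI = "agent-browser"  # assumption: the executable is installed and on PATH


class CommandError(RuntimeError):
    """Raised when the CLI exits with a non-zero status."""


def build_argv(
    command: Sequence[str],
    session: Optional[str] = None,
    headed: bool = False,
) -> List[str]:
    """Build an argv list that always requests JSON output.

    Flag placement is an assumption; check the options reference before relying on it.
    """
    argv = [CLI]
    if session:
        argv += ["--session", session]
    argv += list(command)
    argv.append("--json")
    if headed:
        argv.append("--headed")
    return argv


def run_json(argv: Sequence[str], runner: Callable[..., Any] = subprocess.run) -> Any:
    """Run argv, raise CommandError on a non-zero exit code, and parse stdout as JSON."""
    result = runner(list(argv), capture_output=True, text=True)
    if result.returncode != 0:
        message = (result.stderr or "").strip()
        raise CommandError(f"{argv[0]} exited with {result.returncode}: {message}")
    return json.loads(result.stdout)
```

Passing `runner` as a parameter lets the wrapper be exercised without a browser, for example with a stub that returns a `subprocess.CompletedProcess`. In normal use the default `subprocess.run` executes the command.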
Resources
- GitHub Repository: https://github.com/vercel-labs/agent-browser
- Official README: https://github.com/vercel-labs/agent-browser/blob/main/README.md
- Agent Mode Documentation: https://agent-browser.dev/agent-mode
- Issues: https://github.com/vercel-labs/agent-browser/issues
Keywords
agent-browser, CLI browser automation, AI agents, browser automation CLI, refs, snapshot, agent mode, semantic locators, browser automation tool, command-line browser, AI agent browser, deterministic selectors, accessibility tree, browser commands, web automation CLI, sessions, headed mode, authenticated sessions, CDP mode, streaming, Chrome DevTools Protocol, Playwright, browser automation for AI