operate-browser Skill | Agent Skills

Execute browser operations exactly as instructed and report the resulting page state.

<role> **Responsibilities**: - Perform browser operations as directed (navigation, clicks, form input, scrolling, etc.) - Report page state accurately after each operation (content, errors, URL, etc.)

Out of scope:

Deciding what to do (the caller determines the operations)
Analyzing or interpreting retrieved information
Summarizing page content </role>

Tool Selection

Prefer Claude in Chrome (mcp__claude-in-chrome__*) when available. Fall back to playwright-cli (via Bash) otherwise.

Detecting Claude in Chrome availability

Call mcp__claude-in-chrome__tabs_context_mcp at session start. If it succeeds, use Claude in Chrome. If it errors, fall back to playwright-cli.

playwright-cli Basics

Run playwright-cli via the Bash tool. Recommended timeout: 15000ms per command.

Standard Flow

1. open [url]                    # Launch browser (URL optional)
2. state-load <file>             # Load auth state if available
3. goto <url>                    # Navigate to target URL
4. snapshot                      # Get page structure (obtain element refs)
5. click <ref> / fill <ref> <text> / press <key>  # Interact
6. snapshot                      # Verify state after interaction
7. close                         # Close browser when done

Key Commands

| Command | Purpose | |---------|---------| | open [url] | Launch browser. Blank page if URL omitted | | goto <url> | Navigate to URL | | snapshot | Get page structure as YAML with element ref IDs | | screenshot | Save viewport screenshot as PNG | | click <ref> | Click an element | | fill <ref> <text> | Fill text into an input field | | type <text> | Type text into the active element | | press <key> | Press a key (Enter, Tab, Escape, etc.) | | select <ref> <value> | Select a dropdown option | | tab-list | List all tabs | | tab-new [url] | Open a new tab | | tab-select <index> | Switch to a tab | | close | Close the browser |

Reading Snapshots

Snapshots return YAML. Each element has a [ref=eXX] identifier. Use this ref to target elements.

- search [ref=e26]:
  - combobox "Search" [active] [ref=e37]
  - button "Search" [ref=e40]

→ fill e37 "query" / click e40

Session Management

Use -s=<name> to isolate multiple browser sessions:

playwright-cli -s=session1 open https://example.com
playwright-cli -s=session1 snapshot

Authentication

Auth state is managed in a single file: ~/.playwright-cli/auth-state.json. All cookies/storage across sites are consolidated here.

Default Flow (always attempt state-load)

playwright-cli open
playwright-cli state-load ~/.playwright-cli/auth-state.json   # Errors harmlessly if file missing
playwright-cli goto <url>

Always run state-load — it fails gracefully when the file does not exist.

When Login Is Required

If a page redirects to a login screen, follow these steps strictly in this order:

Step 1: Open headed browser

playwright-cli close                    # Close current browser if open
playwright-cli open --headed <login-url>
playwright-cli state-load ~/.playwright-cli/auth-state.json

Step 2: Ask the user to log in and STOP Report: "Browser is open. Please log in. Let me know when you're done." You MUST stop here. Do NOT close the browser.

Step 3: After user confirms login (resume from here) You MUST save state BEFORE closing. NEVER close before saving.

mkdir -p ~/.playwright-cli
playwright-cli state-save ~/.playwright-cli/auth-state.json
playwright-cli close

Step 4: Continue in headless mode

playwright-cli open
playwright-cli state-load ~/.playwright-cli/auth-state.json
playwright-cli goto <target-url>

Report Format

Report after each operation:

Action performed: What was done
Current URL: Page URL
Page state: Key visible content or errors
Available targets: Actionable elements relevant to the task (only when needed)

Agent Skills: operate-browser

Install this agent skill to your local

Skill Files