Stably CLI Assistant
AI-assisted Playwright test management: plan, create, run, fix, and maintain tests via CLI.
Pre-flight
Always run stably --version first. If not found, install with npm install -g stably or use npx stably. Requires Node.js 20+, Playwright, and a Stably account.
Warning: Do NOT run
stablywith no arguments — it launches interactive chat mode that requires human input and will hang an AI agent.
Command Reference
| Intent | Command |
|--------|---------|
| Interactive AI chat (human only) | stably (no args) — do NOT invoke from an AI agent |
| Plan test coverage | stably plan |
| Plan specific area | stably plan "focus on checkout" |
| Generate test from prompt | stably create "description" |
| Generate test from branch diff | stably create (no prompt) |
| Run tests | stably test |
| Run tests with remote env | stably --env staging test |
| Fix failing tests | stably fix [runId] |
| Initialize project | stably init |
| Install browsers | stably install (use --with-deps chromium in CI) |
| List remote environments | stably env list |
| Inspect env variables | stably env inspect <name> |
| Auth | stably login / logout / whoami |
| Verify app behavior | stably verify "description" |
| Verify with URL | stably verify "description" --url http://localhost:3000 |
| List recent test runs | stably runs list |
| View run details | stably runs view <runId> |
| Flaky test analytics | stably analytics flaky |
| Failure analytics | stably analytics failures |
| Update CLI | stably upgrade [--check] |
Global Options
| Option | Description |
|--------|-------------|
| --cwd <path> / -C | Change working directory |
| --env <name> | Load vars from remote Stably environment |
| --env-file <path> | Load vars from local file (repeatable) |
| --browser <type> | Browser type: local or cloud (default: local). Also settable via STABLY_CLOUD_BROWSER=1 env var |
| --verbose / -v | Verbose logging |
| --no-telemetry | Disable telemetry |
Env var precedence (highest → lowest): Stably auth (STABLY_API_KEY) → process.env → --env-file → --env
Core Commands
stably plan [prompt...]
Discovers coverage gaps by exploring the codebase and generates test.fixme() skeleton files with priority tags (@p0–@p3). Without a prompt, analyzes the entire app; with a prompt, scopes discovery to that area.
stably plan # full autonomous discovery
stably plan "focus on checkout and auth" # scoped plan
stably plan "plan tests for features in this PR" # PR-scoped
Outputs one spec file per group in the repo's existing test directory. Each file contains test.describe() flows with test.fixme() scenarios. Existing files with real test() implementations are never modified. Safe to re-run (idempotent, max 200 flows per invocation).
Typical workflow: stably plan → team reviews/adjusts priorities → stably create "implement all @p0 tests" → stably test → stably fix
Also writes key project discoveries (auth patterns, framework conventions) to STABLY.md for future agent runs.
stably create [prompt...]
Generates tests from a prompt (quotes optional) or infers from branch diff when no prompt given.
--output <dir>— override output directory (auto-detected fromplaywright.config.tstestDiror common dirs liketests/,e2e/)--browser <type>— usecloudfor cloud browser execution (default:local)
stably create "test the checkout flow"
stably create # infer from diff
stably create "test registration" --output ./e2e
stably create "implement all @p0 tests" # implement planned test.fixme() skeletons
stably create --browser cloud "test login" # use cloud browser
Prompt tips: be specific about user flows, UI elements, auth requirements, and error states.
stably test
Runs Playwright tests with Stably reporter. Auto-enables --trace=on. All Playwright CLI options pass through:
stably test --headed --project=chromium --workers=4
stably test --grep="login" tests/login.spec.ts
stably --env staging test --headed
stably fix [runId]
Fixes failing tests using AI analysis of traces, screenshots, logs, and DOM state.
Run ID resolution: explicit arg → CI env (the CLI maps GitHub's run ID to the corresponding Stably run) → .stably/last-run.json (warns if >24h old). Requires git repo.
stably fix # auto-detect last run
stably fix abc123 # explicit run ID
Typical workflow: stably test → (failures?) → stably fix → stably test
stably verify <prompt...>
Verifies app behavior against a natural-language description without generating test files.
-u, --url <url>— target URL (otherwise auto-detected)--max-budget <dollars>— max budget in USD (default: 5)--no-interactive— disable interactive prompts--browser <type>— usecloudfor cloud browser execution (default:local)
Exit codes: 0 = PASS, 1 = FAIL, 2 = INCONCLUSIVE.
stably verify "users can sign up with email"
stably verify "checkout flow works" --url http://localhost:3000
stably verify "login page loads" --no-interactive
Agent note: The default $5 budget is sufficient for most verifications. Avoid increasing --max-budget without explicit user approval.
stably runs list [options]
Lists recent test runs for the current project.
-b, --branch <name>— filter by git branch-n, --limit <number>— max results (default 20, max 100)--after <runId>/--before <runId>— cursor-based pagination by run ID--source <source>— filter by source (local,ci,web)-s, --status <status>— filter by status:queued,running,passed,failed,timedout,cancelled,interrupted--suite <name>— filter by test suite--trigger <trigger>— filter by trigger type:manual,scheduled,ui,api,github_action,suite_run--json— output as JSON (preferred for AI agents)
stably runs list # recent runs
stably runs list --status failed # find failed runs
stably runs list --branch main --limit 5 # recent runs on main
stably runs list --json # machine-readable output
Agent note: Always use --json for machine-readable output when parsing run data programmatically.
stably runs view <runId> [options]
Shows details for a specific test run including metadata, issues with root causes, and individual test results.
--json— output as JSON (preferred for AI agents)
stably runs view abc123
stably runs view abc123 --json
stably analytics flaky [options]
Shows the most flaky tests ranked by flaky rate over a configurable time window.
--days <n>— look-back window in days (1–90, default: 7)-b, --branch <name>— filter by branch name-n, --limit <n>— max rows returned (1–100, default: 10)--json— output as JSON (preferred for AI agents)
stably analytics flaky # flaky tests, last 7 days
stably analytics flaky --days 30 --limit 20 # wider window, more results
stably analytics flaky --branch main --json # machine-readable, main only
stably analytics failures [options]
Shows the most failing tests ranked by failure rate over a configurable time window.
--days <n>— look-back window in days (1–90, default: 7)-b, --branch <name>— filter by branch name-n, --limit <n>— max rows returned (1–100, default: 10)--json— output as JSON (preferred for AI agents)
stably analytics failures # failing tests, last 7 days
stably analytics failures --days 14 # last 2 weeks
stably analytics failures --branch main --json # machine-readable, main only
Agent note: Always use --json for machine-readable output when parsing analytics data programmatically.
Remote Environments
stably env list — list environments in current project.
stably env inspect <name> — show variable names/metadata (values never printed).
Use --env for team-shared dashboard variables; --env-file for local .env files. Both combine (--env wins).
Long-Running Commands (AI Agents)
stably plan, stably create, stably fix, and stably verify are AI-powered and can take several minutes.
| Agent | Configuration |
|-------|--------------|
| Claude Code | timeout: 600000 (preferred — retains command output), or run_in_background: true when parallel work is needed |
| Cursor | block_until_ms: 900000 (default 30s is too short) |
stably test duration depends on suite size — use the same timeout for large suites. All other commands complete in seconds.
If a command times out or fails: retry once with --verbose for diagnostics. AI-powered commands are idempotent — retrying is safe. If failures persist, check stably whoami (auth) and network connectivity.
For general long-running command patterns (dev servers, watchers), see bash-commands skill.
Configuration
Required Env Vars
# NEVER hardcode real values — use .env files (gitignored) or CI secrets
STABLY_API_KEY=your_key # from https://auth.stably.ai/org/api_keys/
STABLY_PROJECT_ID=your_id # from dashboard
Set via .env file, --env-file, or --env (remote).
Playwright Config
stably test auto-enables tracing. Set trace: 'on' in config too for direct npx playwright test runs:
import { defineConfig, stablyReporter } from '@stablyai/playwright-test';
export default defineConfig({
use: { trace: 'on' },
reporter: [
['list'],
stablyReporter({
apiKey: process.env.STABLY_API_KEY,
projectId: process.env.STABLY_PROJECT_ID,
}),
],
});
CI/CD: GitHub Actions
Important CI patterns
- Use
continue-on-error: trueon test, fix, and PR steps so the pipeline can run fix/PR steps after failures — but always add a final step that fails the workflow if tests failed. Withoutcontinue-on-erroron the fix/PR steps, a failure there blocks downstream steps.
Self-healing test pipeline
name: E2E Tests with Auto-Fix
on: [push, pull_request]
jobs:
test:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
with: { node-version: '20' }
- run: npm ci
- name: Install browsers
run: npx stably install
env:
STABLY_API_KEY: ${{ secrets.STABLY_API_KEY }}
STABLY_PROJECT_ID: ${{ secrets.STABLY_PROJECT_ID }}
- name: Run tests
id: test
continue-on-error: true
run: npx stably --env staging test
env:
STABLY_API_KEY: ${{ secrets.STABLY_API_KEY }}
STABLY_PROJECT_ID: ${{ secrets.STABLY_PROJECT_ID }}
- name: Auto-fix failures
if: steps.test.outcome == 'failure'
continue-on-error: true
run: npx stably --env staging fix
env:
STABLY_API_KEY: ${{ secrets.STABLY_API_KEY }}
STABLY_PROJECT_ID: ${{ secrets.STABLY_PROJECT_ID }}
- name: Create PR with fixes
if: steps.test.outcome == 'failure'
continue-on-error: true
run: |
if [ -n "$(git status --porcelain)" ]; then
BRANCH="stably-fix/$(date +%Y%m%d-%H%M%S)"
git config user.name "github-actions[bot]"
git config user.email "github-actions[bot]@users.noreply.github.com"
git checkout -b "$BRANCH"
git add -A
git commit -m "fix: auto-repair failing tests"
git push origin "$BRANCH"
gh pr create \
--title "fix: auto-repair failing tests" \
--body "Automated PR from Stably Fix after test failures in run #${{ github.run_number }}." \
--base "${{ github.ref_name }}" \
--head "$BRANCH"
fi
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
- name: Fail if tests failed
if: steps.test.outcome == 'failure'
run: |
echo "::error::Tests failed"
exit 1
Troubleshooting
| Problem | Solution |
|---------|----------|
| "Not authenticated" | stably login |
| API key not recognized | stably whoami to verify |
| Tests in wrong directory | stably create "..." --output ./tests/e2e |
| Missing browser | stably install |
| Traces not uploading | Set trace: 'on' in playwright.config.ts |
| "Run ID not found" | Run stably test first, then stably fix |
Links
Docs · CLI Quickstart · Dashboard · API Keys