UAT Testing Skill | Agent Skills

UAT Testing

End-to-end UAT workflow: analyze → generate test cases → set up environment → execute → report.

Workflow Overview

Phase 1: Discovery
  ├─ Read spec/requirements (if provided)
  ├─ Analyze git branch diff against base
  └─ Identify what was built/changed

Phase 2: Environment Setup
  ├─ Determine target: local or production/staging?
  ├─ Local path:
  │   ├─ Check for .env.local → ask user if missing
  │   └─ Start dev server (npm run dev or equivalent)
  ├─ Prod/staging path:
  │   └─ Get application URL, verify reachable
  └─ Ask user for test account credentials (both paths)

Phase 3: Test Case Generation
  ├─ Write uat-test-cases.md (see references/test-case-template.md)
  ├─ Cover: happy path, errors, persistence, auth, responsive
  └─ Present to user for review before execution

Phase 4: Test Execution
  ├─ Start dev server (framework-appropriate command)
  ├─ Authenticate with test account via browser
  ├─ Execute test cases with Playwright MCP tools
  ├─ Take screenshots as evidence
  └─ If a test fails → document the failure, continue testing

Phase 5: Reporting
  ├─ Write uat-results.md (see references/results-template.md)
  ├─ Include: summary table, detailed results, screenshots, fixes
  └─ Present overall PASS/FAIL verdict

Phase 1: Discovery

Finding the Spec

Look for feature context in this priority order:

User-provided spec: If user gives a path to a spec, requirements doc, or thread — read it first
Spec directory: Check specs/[branch-name]/spec.md or specs/[feature-id]/spec.md
Tasks/todo: Check tasks/todo.md or similar for what was planned
PR description: If a PR exists, read it via gh pr view

Analyzing Branch Changes

# What branch are we on?
git branch --show-current

# What changed vs base branch?
git log main..HEAD --oneline
git diff main..HEAD --stat

# What files were modified?
git diff main..HEAD --name-only

Read the changed files to understand what was built. Focus on:

New components or pages (user-facing features)
Modified API routes or backend logic (behavior changes)
Schema changes (data model additions)
Config changes (new env vars needed)

Output

Summarize findings to the user:

"This branch adds [feature] based on [spec]"
"Key changes: [list of modified areas]"
"I'll generate test cases covering [scope]"

Phase 2: Environment Setup

Determine Target Environment

Ask the user (or infer from context) whether this is local or production testing:

Are we testing locally (I'll start the dev server) or against a deployed environment (provide the URL)?

| Aspect | Local | Production / Staging | |--------|-------|----------------------| | Server | Start with npm run dev | Already running at provided URL | | Env vars | Need .env.local with all keys | N/A — app is already configured | | Test data | May need seeding or setup | Uses existing real/staging data | | Auth | Test account on local auth provider | Test account on prod/staging auth | | Base URL | http://localhost:3000 (or configured port) | https://app.example.com |

Local Environment Path

Check Environment Variables

# Check if env file exists
ls -la .env.local .env 2>/dev/null

If .env.local does not exist, ask the user:

I don't see a .env.local file. Do you have an environment file or credentials I should use? Common needs:

Database connection (e.g., Convex URL/deploy key)

Auth provider keys (e.g., Clerk publishable/secret key)

API keys (e.g., AI model keys, external service keys)

Please provide the file path or paste the required variables.

Start Dev Server

Detect the framework and start the dev server:

| Signal | Framework | Start Command | |--------|-----------|---------------| | next.config.* | Next.js | npm run dev | | vite.config.* | Vite | npm run dev | | angular.json | Angular | ng serve | | nuxt.config.* | Nuxt | npm run dev | | package.json scripts | Generic | Read dev script |

If the project needs multiple services (e.g., Next.js + Convex), start all of them.

Verify the server is reachable before proceeding:

curl -s -o /dev/null -w "%{http_code}" http://localhost:3000

Production / Staging Environment Path

No server startup or env file needed. Ask the user for:

Please provide:

Application URL (e.g., https://app.example.com)

Any test data considerations (is there a staging org/business to use?)

Verify the URL is reachable:

curl -s -o /dev/null -w "%{http_code}" https://app.example.com

Request Test Account (Both Environments)

Always ask before attempting to authenticate:

For UAT testing I need a test account to log in. Most auth systems require email verification, so I can't create a new account.

Please provide:

Login URL (if not the default /sign-in)

Email/username for the test account

Password

Any 2FA/MFA steps I should be aware of

Which business/org/team to select after login (if applicable)

Phase 3: Test Case Generation

Read references/test-case-template.md for the full template format.

Generation Rules

Map spec to test cases: Each user story or requirement gets at least one test case
Prioritize: Critical (P1) = core happy path; High (P2) = error handling; Medium (P3) = edge cases
Be specific: Each step should be a concrete action ("Type 'hello' in the input field"), not vague ("Test the input")
Include expected data: Use actual values from the seeded test data or spec examples
Cover persistence: Always include a "reload page and verify" test case for stateful features
Cover auth boundaries: If the feature has role-based access, test with different roles

Writing the File

Write uat-test-cases.md to the spec directory (e.g., specs/[feature-id]/uat-test-cases.md) or to the project root if no spec directory exists.

Present the test cases to the user for review before executing. Ask:

I've generated [N] test cases in [file path]. Review them and let me know:

Any test cases to add or remove?

Any priority adjustments?

Ready to proceed with execution?

Phase 4: Test Execution

Playwright MCP Setup

Load the Playwright tools via ToolSearch before starting:

ToolSearch: "playwright browser"

Key tools: browser_navigate, browser_snapshot, browser_click, browser_fill_form, browser_type, browser_press_key, browser_take_screenshot, browser_wait_for, browser_console_messages, browser_evaluate.

See references/agent-guide.md for the full tool reference and auth handling patterns.

Execution Pattern

For each test case:

Navigate to the target page
Wait for page load (browser_wait_for or snapshot check)
Perform the test actions (click, type, submit)
Wait for the expected result
Verify via browser_snapshot (check accessibility tree for expected text/elements)
Take browser_take_screenshot for evidence
Check browser_console_messages for JS errors
Record PASS or FAIL with details

Failure Handling

When a test fails:

Document exactly what happened (expected vs actual)
Take a screenshot of the failure state
Check console for errors (browser_console_messages)
Continue testing other cases — do not stop on first failure
If the fix is obvious and small (< 5 lines), ask user if you should fix it. If yes, fix, re-verify, and document the fix in the results report

Screenshot Naming Convention

Save screenshots to the project root with descriptive names:

uat-tc001-invoice-card.png
uat-tc002-cashflow-dashboard.png
uat-tc003-fail-missing-button.png

Phase 5: Reporting

Read references/results-template.md for the full report format.

Writing the Report

Write uat-results.md alongside the test cases file.

Include:

Summary table: PASS/FAIL/BLOCKED/NOT TESTED counts
Per-test results: What happened for each test case with specific details
Fixes applied: Any code changes made during testing, with root cause analysis
Component status: Build pass/fail for each modified component
Screenshots: Reference all captured screenshots
Remaining issues: Known problems that weren't fixed
Overall verdict: PASS (all critical tests pass) or FAIL (any critical test fails)

Verdict Rules

PASS: All Critical (P1) and High (P2) test cases pass. Medium/Low failures are documented but don't block.
FAIL: Any Critical (P1) test case fails, OR more than 50% of High (P2) cases fail.
PARTIAL: Critical tests pass but significant High (P2) failures exist. Merging is a judgment call.

References

Test case format: See references/test-case-template.md
Results report format: See references/results-template.md
Agent setup and Playwright tools: See references/agent-guide.md

Agent Skills: UAT Testing

Install this agent skill to your local

Skill Files