Test Executor Skill
Purpose
Execute tests in adaptive loops, analyze the results, generate structured failure reports, and iterate on fixes until all tests pass. Works with any testing framework or project structure by discovering the project's test setup rather than assuming one.
When to Use This Skill
Use this skill when:
- Test plan is ready (from `test-plan-generator` or manual creation)
- Ready to execute tests for implemented features
- Need to validate implementation correctness
- Want systematic test execution with detailed failure reporting
- Need to iterate on test failures with clear diagnostics
- Running tests as part of implementation validation
Test Execution Workflow
Phase 1: Preparation
1. Read Test Plan
   - Locate test-plan.md or equivalent
   - Identify tests to execute
   - Group tests by type (E2E, API, unit, integration, performance)
   - Determine execution order
2. Discover Project Test Setup
   - Identify the testing framework
   - Find test commands
   - Locate test files
   - Check for test configuration
3. Ensure Services Are Running
   - Identify required services (backend, frontend, database, etc.)
   - Check whether services are running
   - Start services if needed
   - Verify service health
Phase 2: Test Execution Loop
For each test in plan:
1. Execute Test
   - Run the appropriate test command
   - Capture output (stdout, stderr)
   - Record execution time
   - Determine pass/fail status
2. Analyze Results
   - Parse test output
   - Extract error messages
   - Identify failure causes
   - Categorize failure type
3. If Test Passes:
   - Mark the test as complete in the plan: `- [x]`
   - Continue to the next test
4. If Test Fails:
   - Generate a failure report
   - Add it to test-failures.md
   - Continue to the next test (or stop if the failure is critical)
5. Update Progress
   - Track: X/Y tests passed
   - Update the test plan with results
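The heart of this loop is running a command, timing it, and capturing everything needed for later analysis. A minimal sketch in Python, assuming a hypothetical execute_test helper (the command string, timeout, and result shape are illustrative, not part of the bundled scripts):

import subprocess
import time

def execute_test(command: str, timeout: int = 300) -> dict:
    """Run one test command, capture its output, and time it."""
    started = time.time()
    try:
        proc = subprocess.run(command, shell=True, capture_output=True, text=True, timeout=timeout)
        status = "pass" if proc.returncode == 0 else "fail"
        stdout, stderr = proc.stdout, proc.stderr
    except subprocess.TimeoutExpired:
        status, stdout, stderr = "fail", "", f"Timeout after {timeout}s waiting for test command"
    return {
        "command": command,
        "status": status,
        "duration_s": round(time.time() - started, 2),
        "stdout": stdout,
        "stderr": stderr,
    }

The returned dictionary feeds directly into the analysis and reporting steps below.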
Phase 3: Report Generation
1. Create Test Failure Report
   - Document failed tests
   - Include error messages and stack traces
   - Identify probable causes
   - Suggest fixes
   - Save as test-failures.md
2. Summary Statistics
   - Total tests run
   - Passed / Failed / Skipped
   - Execution time
   - Success rate
Adaptive Test Discovery
Discovering Test Framework
Backend/Unit Tests:
# Look for config files and dependencies
package.json → "jest", "mocha", "vitest"
requirements.txt → "pytest", "unittest"
*.csproj → MSTest, xUnit, NUnit
Cargo.toml → built-in Rust tests
go.mod → built-in Go tests
Frontend/E2E Tests:
# Look for E2E frameworks
package.json → "playwright", "cypress", "puppeteer"
Check for test directories: e2e/, tests/, __tests__/
API Tests:
# Look for API test patterns
*.http files (REST Client)
*.test.ts with fetch/axios calls
curl commands in scripts
Postman collections
Discovering Test Commands
Strategy:
1. Check package.json scripts:
   { "scripts": { "test": "jest", "test:e2e": "playwright test", "test:unit": "vitest run" } }
2. Check Makefile:
   test:
       pytest tests/
   test-e2e:
       npm run test:e2e
3. Check CI/CD config:
   .github/workflows/*.yml, .gitlab-ci.yml, azure-pipelines.yml
4. Try common patterns:
   npm test, dotnet test, pytest, go test ./..., cargo test, make test
5. Ask the user if uncertain: "How do you run tests in this project?"
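The strategy above can be automated as a best-effort lookup. A minimal sketch, assuming a Python helper; the function name and fallback list are illustrative, not part of the bundled scripts:

import json
import os

COMMON_COMMANDS = ["npm test", "dotnet test", "pytest", "go test ./...", "cargo test", "make test"]

def discover_test_commands(project_dir: str) -> list:
    """Best-effort lookup of project test commands, in priority order."""
    commands = []
    pkg_path = os.path.join(project_dir, "package.json")
    if os.path.exists(pkg_path):
        with open(pkg_path) as f:
            scripts = json.load(f).get("scripts", {})
        commands += ["npm run " + name for name in scripts if name.startswith("test")]
    makefile = os.path.join(project_dir, "Makefile")
    if os.path.exists(makefile):
        with open(makefile) as f:
            targets = [line.split(":", 1)[0] for line in f if line.startswith("test") and ":" in line]
        commands += ["make " + t for t in targets]
    # Fall back to common per-ecosystem commands for the executor to probe one by one.
    commands += [c for c in COMMON_COMMANDS if c not in commands]
    return commands

Commands found earlier in the list are more likely to be project-specific; the common patterns at the end are probes of last resort before asking the user.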
Test Types and Execution
1. E2E (End-to-End) Tests
Characteristics:
- Test complete user flows
- Require running frontend + backend
- Use browser automation (Playwright, Cypress, etc.)
- Typically slowest tests
Execution with MCP Playwright:
# If using Playwright via MCP
# Tests are executed through MCP Playwright tools
# Navigate, click, fill forms, assert results
Execution with npm:
npm run test:e2e
# or
npx playwright test
npx cypress run
Services Required:
- Frontend dev server (e.g., http://localhost:5174)
- Backend API server (e.g., http://localhost:5001)
- Database (e.g., PostgreSQL on :5432)
Example Test from Plan:
- [ ] E2E: User can submit form and see confirmation
Execution:
- Ensure all services running
- Run E2E test command
- Parse output for pass/fail
- Capture screenshots if failed
2. API Tests
Characteristics:
- Test backend endpoints directly
- Don't require frontend
- Use HTTP requests (curl, httpie, fetch, etc.)
- Faster than E2E tests
Execution with curl:
# Example from test plan:
# "Test POST /api/forms creates form in database"
curl -X POST http://localhost:5001/api/forms \
-H "Authorization: Bearer ${TOKEN}" \
-H "Content-Type: application/json" \
-d '{"title":"Test Form","description":"Test"}'
# Check response status code
# Verify response body
# Query database to confirm creation
Execution with test framework:
# If project has API test suite
npm run test:api
dotnet test --filter Category=API
pytest tests/api/
Services Required:
- Backend API server
- Database
3. Unit Tests
Characteristics:
- Test individual functions/components
- No external dependencies
- Fast execution
- Typically the most numerous test type
Execution:
# JavaScript/TypeScript
npm test
npm run test:unit
jest
vitest run
# .NET
dotnet test
dotnet test --filter Category=Unit
# Python
pytest tests/unit/
python -m pytest
# Go
go test ./...
# Rust
cargo test
Services Required:
- None (unit tests are isolated)
4. Integration Tests
Characteristics:
- Test component interactions
- May require database or external services
- Medium execution speed
Execution:
# Similar to unit tests but may need services
dotnet test --filter Category=Integration
pytest tests/integration/
npm run test:integration
Services Required:
- Database (often)
- External APIs (sometimes)
5. Performance Tests
Characteristics:
- Test response time, throughput, resource usage
- Require production-like environment
- Generate metrics
Execution:
# Load testing tools
# ab (Apache Bench), wrk, k6, JMeter
# Example: 100 requests, 10 concurrent
ab -n 100 -c 10 http://localhost:5001/api/forms
# Parse output for:
# - Requests per second
# - Response times (mean, median, p95, p99)
# - Failures
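As a rough illustration, the headline numbers can be pulled out of ab's textual output with a few regexes. This is a sketch that assumes the output format of recent Apache Bench versions; other load tools (wrk, k6) format their summaries differently:

import re

def parse_ab_output(output: str) -> dict:
    """Extract headline metrics from Apache Bench (ab) textual output."""
    metrics = {}
    rps = re.search(r"Requests per second:\s+([\d.]+)", output)
    failed = re.search(r"Failed requests:\s+(\d+)", output)
    p95 = re.search(r"95%\s+(\d+)", output)
    if rps:
        metrics["requests_per_second"] = float(rps.group(1))
    if failed:
        metrics["failed_requests"] = int(failed.group(1))
    if p95:
        metrics["p95_ms"] = int(p95.group(1))
    return metrics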
Service Management
Starting Required Services
Adaptive Service Detection:
1. Frontend:
   # Detection: package.json with "dev" script
   # Start: npm run dev (background)
   # Health check: curl http://localhost:5174
2. Backend:
   # .NET: dotnet run (background)
   # Node: npm start or node server.js
   # Python: python app.py or flask run
   # Go: go run main.go
3. Database:
   # Detection: docker-compose.yml
   # Start: docker-compose up -d postgres
   # Health check: pg_isready or curl health endpoint
Start Services Script (template in bundled resources):
#!/bin/bash
# scripts/start_services.sh (customizable)
echo "Starting services..."
# Start database
docker-compose up -d postgres
sleep 2
# Start backend
cd backend && dotnet run &
BACKEND_PID=$!
sleep 5
# Start frontend
npm run dev &
FRONTEND_PID=$!
sleep 3
echo "Services started"
echo "Backend PID: $BACKEND_PID"
echo "Frontend PID: $FRONTEND_PID"
Checking Service Health
# Backend health check
curl http://localhost:5001/health || echo "Backend not ready"
# Frontend health check
curl http://localhost:5174 || echo "Frontend not ready"
# Database health check
docker exec postgres pg_isready -U user -d db
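Instead of fixed sleeps, the executor can poll each health endpoint until it responds or a deadline passes. A minimal sketch, assuming HTTP health endpoints like the URLs above; the deadline and poll interval are arbitrary choices:

import time
import urllib.error
import urllib.request

def wait_for_service(url: str, deadline_s: int = 60) -> bool:
    """Poll a URL until it returns any HTTP response or the deadline expires."""
    start = time.time()
    while time.time() - start < deadline_s:
        try:
            with urllib.request.urlopen(url, timeout=2):
                return True
        except urllib.error.HTTPError:
            return True  # The server answered (even 4xx/5xx means the process is up)
        except (urllib.error.URLError, OSError):
            time.sleep(1)  # Connection refused or not ready yet; retry
    return False

# Example: wait_for_service("http://localhost:5001/health")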
Test Output Parsing
Parse Strategy
Different testing frameworks have different output formats. Parse adaptively:
Jest/Vitest Output:
PASS tests/unit/EmailService.test.ts
✓ sends email successfully (45ms)
✓ handles errors gracefully (12ms)
Test Suites: 1 passed, 1 total
Tests: 2 passed, 2 total
Parsing:
- Look for PASS or FAIL
- Extract test names and times
- Count passed/failed
dotnet test Output:
Passed! - Failed: 0, Passed: 10, Skipped: 0, Total: 10, Duration: 2 s
Parsing:
- Extract counts: Failed, Passed, Skipped
- Extract duration
pytest Output:
====== 5 passed, 2 failed in 3.42s ======
Parsing:
- Extract passed/failed counts
- Extract duration
Script: scripts/parse_test_output.py (bundled) can parse common formats.
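The bundled script handles the details; conceptually, the summary lines above reduce to counts with a few regexes. A rough sketch (not the bundled implementation), keyed to the example outputs shown above:

import re

def parse_summary(output: str) -> dict:
    """Pull passed/failed counts out of pytest, dotnet test, or Jest/Vitest summaries."""
    # pytest: "====== 5 passed, 2 failed in 3.42s ======"
    m = re.search(r"(\d+) passed(?:, (\d+) failed)? in ([\d.]+)s", output)
    if m:
        return {"passed": int(m.group(1)), "failed": int(m.group(2) or 0), "duration_s": float(m.group(3))}
    # dotnet test: "Failed: 0, Passed: 10, Skipped: 0, Total: 10, Duration: 2 s"
    m = re.search(r"Failed:\s*(\d+), Passed:\s*(\d+), Skipped:\s*(\d+)", output)
    if m:
        return {"failed": int(m.group(1)), "passed": int(m.group(2)), "skipped": int(m.group(3))}
    # Jest/Vitest: "Tests: 2 passed, 2 total"
    m = re.search(r"Tests:\s*(?:(\d+) failed, )?(\d+) passed, (\d+) total", output)
    if m:
        return {"failed": int(m.group(1) or 0), "passed": int(m.group(2)), "total": int(m.group(3))}
    return {}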
Failure Analysis
Categorizing Failures
1. Timeout Errors
- Symptom: "Timeout waiting for...", "Request timeout"
- Probable Cause: Service not started, slow response
- Suggested Fix: Check services running, increase timeout
2. Assertion Failures
- Symptom: "Expected X but got Y"
- Probable Cause: Logic error, incorrect test expectations
- Suggested Fix: Debug logic, verify test correctness
3. Connection Errors
- Symptom: "Connection refused", "Cannot connect to..."
- Probable Cause: Service not running, wrong port/URL
- Suggested Fix: Start services, check configuration
4. Authentication Errors
- Symptom: "401 Unauthorized", "403 Forbidden"
- Probable Cause: Missing/invalid token, expired credentials
- Suggested Fix: Verify auth flow, refresh tokens
5. Data Errors
- Symptom: "Null reference", "Undefined property"
- Probable Cause: Missing data, incorrect data shape
- Suggested Fix: Check data fixtures, verify API responses
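A keyword-based classifier is usually enough to bucket failures into the categories above. A minimal sketch; the keyword lists are illustrative starting points, not an exhaustive taxonomy:

FAILURE_CATEGORIES = {
    "timeout": ["timeout waiting for", "request timeout", "timed out"],
    "assertion": ["expected", "assertionerror", "assert"],
    "connection": ["connection refused", "cannot connect", "econnrefused"],
    "authentication": ["401", "unauthorized", "403", "forbidden"],
    "data": ["null reference", "undefined property", "cannot read propert"],
}

def categorize_failure(error_message: str) -> str:
    """Map an error message onto one of the failure categories above."""
    lowered = error_message.lower()
    for category, keywords in FAILURE_CATEGORIES.items():
        if any(keyword in lowered for keyword in keywords):
            return category
    return "unknown"

The category feeds the "Failure Type" field of the report and helps the test-fixer skill pick a starting point.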
Failure Report Format
test-failures.md Template
# Test Failure Report
**Date:** [Date]
**Execution:** [Run #X]
## Summary
- **Total Tests:** X
- **Passed:** Y
- **Failed:** Z
- **Success Rate:** Y/X %
---
## Failed Test #1: [Test Name]
**Test File:** `path/to/test.spec.ts`
**Failure Type:** [Timeout / Assertion / Connection / etc.]
**Error Message:**
[Full error message and stack trace]
**Probable Cause:**
[Analysis of why test failed]
**Suggested Fix:**
[Specific actions to fix]
**Related Code:**
- `src/path/to/component.ts:line`
- `backend/path/to/service.cs:line`
---
## Failed Test #2: [Test Name]
[Same structure as above]
---
## Next Steps
1. [Action 1 to fix failures]
2. [Action 2 to fix failures]
3. Re-run tests after fixes
---
**Report for:** `test-fixer` skill
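Given structured results from the execution loop, filling in this template is mechanical. A sketch, assuming the hypothetical result dictionaries produced by the execute_test and categorize_failure helpers sketched earlier:

def write_failure_report(results: list, path: str = "test-failures.md") -> None:
    """Render failed test results into the test-failures.md template."""
    failed = [r for r in results if r["status"] == "fail"]
    lines = ["# Test Failure Report", "", "## Summary",
             f"- **Total Tests:** {len(results)}",
             f"- **Passed:** {len(results) - len(failed)}",
             f"- **Failed:** {len(failed)}", ""]
    for i, r in enumerate(failed, start=1):
        lines += [f"## Failed Test #{i}: {r.get('name', r['command'])}",
                  f"**Failure Type:** {r.get('category', 'unknown')}",
                  "**Error Message:**", r.get("stderr", ""), "---"]
    with open(path, "w") as f:
        f.write("\n".join(lines))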
Iteration Strategy
Test-Driven Iteration
- Run all tests → Generate failure report
- Use `test-fixer` skill → Fix failures
- Re-run failed tests → Verify fixes
- Repeat until all pass
Batch vs Individual
Batch Execution (default):
- Run all tests at once
- Generate one consolidated report
- More efficient
Individual Execution:
- Run one test at a time
- Generate report per test
- Use when tests interfere with each other
Stop on First Failure
Optional strategy for critical tests:
- Run tests sequentially
- Stop on first failure
- Fix before continuing
- Use for smoke tests or critical path tests
Tips for Effective Test Execution
- Start Services First: Always verify services are running
- Run in Order: Execute tests in logical order (unit → integration → E2E)
- Parse Carefully: Extract meaningful error messages
- Categorize Failures: Identify failure patterns
- Detailed Reports: Provide enough info for test-fixer skill
- Track Progress: Update test plan with results
- Retry Flaky Tests: Some tests may be flaky; retry them once before reporting a failure (see the sketch after this list)
- Isolate Failures: Determine if failures are related
- Performance Baseline: Track test execution time
- Clean State: Ensure clean database/state between test runs
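One way to implement the flaky-test retry, building on the hypothetical execute_test helper sketched earlier; the single-retry default is an assumption, not a project convention:

def run_with_retry(command: str, retries: int = 1) -> dict:
    """Re-run a failing test up to `retries` times and flag it as flaky if it then passes."""
    result = execute_test(command)
    attempts = 0
    while result["status"] == "fail" and attempts < retries:
        attempts += 1
        result = execute_test(command)
    result["flaky"] = result["status"] == "pass" and attempts > 0
    return result

Tests flagged as flaky should still be called out in the report, since they often hide timing or state-isolation problems.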
Bundled Resources
- `scripts/parse_test_output.py` - Parse test output into a structured format
- `scripts/start_services.sh` - Template for starting project services
- `references/test-report-template.md` - Template for failure reports
- `references/test-execution-patterns.md` - Execution patterns by test type