Testing
This skill provides comprehensive testing capabilities including test strategy, automation setup, Test-Driven Development (TDD), test writing best practices, coverage analysis, CI/CD integration, and web application testing with Playwright.
When to Use This Skill
- When setting up test infrastructure for a project
- When creating test strategies and test plans
- When writing unit, integration, or E2E tests
- When implementing TDD/test-first development
- When analyzing test coverage and quality
- When integrating tests into CI/CD pipelines
- When testing web applications with Playwright
- When debugging test failures or improving test reliability
- When writing test fixtures, mock data, or factory functions
- When mocking external dependencies (APIs, databases, file systems)
- When organizing test file structure and test suites
- When testing async code, Promises, or event-driven behavior
- When implementing snapshot tests for UI components
- When configuring test coverage thresholds
What This Skill Does
- Test Strategy: Designs comprehensive testing strategies (unit, integration, E2E)
- Test Automation: Sets up test frameworks and automation tools
- TDD Methodology: Implements Test-Driven Development workflows (Red-Green-Refactor)
- Test Writing: Writes focused, maintainable tests with proper patterns
- Coverage Analysis: Analyzes and improves test coverage
- CI/CD Integration: Integrates tests into continuous integration pipelines
- Web App Testing: Tests web applications using Playwright
- Test Quality: Improves test reliability and maintainability
Test Strategy
Test Pyramid
Recommended Distribution:
- Unit Tests: 70% - Fast, isolated, test individual functions
- Integration Tests: 20% - Test component interactions
- E2E Tests: 10% - Test complete user workflows
Test Types:
- Functional tests (happy path, edge cases, error handling)
- Non-functional tests (performance, security, accessibility)
- Regression tests (prevent breaking changes)
- Smoke tests (critical path verification)
Framework Selection
JavaScript/TypeScript:
- Jest, Vitest, Mocha for unit/integration
- Playwright, Cypress for E2E
- React Testing Library for component testing
Python:
- pytest for unit/integration
- Selenium, Playwright for E2E
- unittest for standard library testing
Java:
- JUnit for unit tests
- TestNG for integration
- Selenium for E2E
Go:
- Built-in testing package
- Testify for assertions
Rust:
- Built-in test framework
- cargo test for running tests
Test-Driven Development (TDD)
TDD is a design technique, not just a testing technique. It produces better-designed, more maintainable code through small, disciplined steps.
Core Principle
Write tests before code. Always. TDD forces you to think about:
- What behavior do I need?
- How will I know it works?
- What's the simplest implementation?
The Three Laws (Never Violate)
- Write NO production code without a failing test first
- Write only enough test to demonstrate one failure
- Write only enough code to pass that test
Red-Green-Refactor Cycle
Phase 1: RED - Write Failing Test
- Write ONE test that defines desired behavior
- Run test - verify it FAILS
- Verify it fails for the RIGHT reason (a failed assertion, not a syntax error)
- DO NOT write implementation yet
Phase 2: GREEN - Minimal Implementation
- Write MINIMAL code to make test pass
- Resist the urge to add extra features
- Run test - verify it PASSES
- If test still fails, fix implementation (not test)
Phase 3: REFACTOR - Clean Code
- Remove code duplication (DRY)
- Improve naming for clarity
- Extract complex logic into functions
- Run ALL tests - must stay green throughout
- Check test coverage on changed lines
After REFACTOR, start a new RED phase for the next behavior.
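One pass through the cycle might look like the sketch below. The `slugify` function and its expected behavior are hypothetical, chosen only to illustrate the order of work: the test is written first, and the implementation is added only after the test has been seen to fail.

```python
import re


# RED: written first. While slugify() does not exist yet, running this
# test fails, which confirms the test can actually detect the gap.
def test_slugify_lowercases_and_joins_words_with_hyphens():
    assert slugify("Hello World") == "hello-world"


# GREEN: a small implementation that makes the test above pass.
# REFACTOR: with the test green, naming and structure can be cleaned up
# while re-running the test after each change.
def slugify(title: str) -> str:
    """Hypothetical example: turn a title into a URL-friendly slug."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)
```

A test runner such as pytest would collect the `test_` function automatically. The next behavior (for example, handling punctuation) would start with a new failing test.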
Test Writing Patterns
Arrange-Act-Assert (AAA)
Structure:
- Arrange: Set up test data and conditions
- Act: Execute the code being tested
- Assert: Verify the expected outcome
Example:
// Assumes userService is created in test setup (e.g., a beforeEach hook)
describe('UserService', () => {
  it('should create user with valid data', async () => {
    // Arrange
    const userData = { email: 'test@example.com', name: 'Test User' };

    // Act
    const result = await userService.createUser(userData);

    // Assert
    expect(result).toHaveProperty('id');
    expect(result.email).toBe(userData.email);
  });
});
Given-When-Then (BDD Style)
Structure:
- Given: Initial context/preconditions
- When: Action/event that triggers behavior
- Then: Expected outcome
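A minimal sketch of the Given-When-Then structure, assuming a hypothetical `Cart` class that exists only for this illustration. The comments mark the three phases; the shape is the same as Arrange-Act-Assert, phrased in terms of behavior.

```python
from dataclasses import dataclass, field


@dataclass
class Cart:
    """Hypothetical cart used only to illustrate the pattern."""

    items: dict = field(default_factory=dict)

    def add(self, sku: str, quantity: int = 1) -> None:
        # Accumulate quantity under a single entry per SKU.
        self.items[sku] = self.items.get(sku, 0) + quantity

    def total_quantity(self) -> int:
        return sum(self.items.values())


def test_adding_same_item_again_increases_its_quantity():
    # Given a cart that already holds one unit of an item
    cart = Cart()
    cart.add("sku-1")

    # When the same item is added again
    cart.add("sku-1", quantity=2)

    # Then the cart holds three units under a single entry
    assert cart.items == {"sku-1": 3}
    assert cart.total_quantity() == 3
```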
Test Organization
File Structure:
project/
├── src/
│   └── components/
│       └── User.jsx
├── tests/
│   ├── unit/
│   │   └── User.test.jsx
│   ├── integration/
│   │   └── UserAPI.test.js
│   └── e2e/
│       └── user-flow.spec.js
├── jest.config.js
└── playwright.config.js
Coverage Analysis
Coverage Goals
Recommended Thresholds:
- Lines: 80%+
- Functions: 80%+
- Branches: 80%+
- Statements: 80%+
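The threshold check itself is simple arithmetic. The sketch below assumes coverage percentages have already been extracted from a report into a plain mapping. The function name and the input shape are hypothetical, not part of any coverage tool.

```python
from typing import Mapping

# Thresholds from the list above, in percent.
THRESHOLDS = {
    "lines": 80.0,
    "functions": 80.0,
    "branches": 80.0,
    "statements": 80.0,
}


def metrics_below_threshold(
    measured: Mapping[str, float],
    thresholds: Mapping[str, float] = THRESHOLDS,
) -> list:
    """Return the sorted names of metrics that fall below their threshold.

    A metric missing from `measured` is treated as 0% so that it fails
    loudly instead of being silently skipped.
    """
    return sorted(
        name
        for name, minimum in thresholds.items()
        if measured.get(name, 0.0) < minimum
    )
```

In practice most coverage tools can enforce thresholds themselves (for example, Jest's `coverageThreshold` option or coverage.py's `fail_under` setting), so a helper like this is mainly useful for custom reporting.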
Critical Paths:
- Always aim for 100% coverage on critical business logic
- Authentication and authorization
- Payment processing
- Data validation
Coverage Gaps
Common Gaps:
- Error handling paths
- Edge cases
- Boundary conditions
- Integration points
Improvement Strategies:
- Identify untested code paths
- Add tests for error scenarios
- Test edge cases and boundaries
- Increase integration test coverage
CI/CD Integration
Test Pipeline
Stages:
- Unit Tests: Fast feedback, run on every commit
- Integration Tests: Run on pull requests
- E2E Tests: Run before merging to main
- Performance Tests: Run on main branch
Quality Gates:
- All tests must pass
- Coverage must meet threshold
- No critical security issues
- Performance benchmarks met
Web Application Testing with Playwright
Helper Scripts
This skill includes Python helper scripts in scripts/:
- with_server.py - Manages server lifecycle (supports multiple servers). Always run with --help first to see usage.

# Single server
python scripts/with_server.py --server "npm run dev" --port 5173 -- python your_automation.py

# Multiple servers (e.g., backend + frontend)
python scripts/with_server.py \
  --server "cd backend && python server.py" --port 3000 \
  --server "cd frontend && npm run dev" --port 5173 \
  -- python your_automation.py
Decision Tree: Choosing Your Approach
User task → Is it static HTML?
    ├─ Yes → Read HTML file directly to identify selectors
    │         ├─ Success → Write Playwright script using selectors
    │         └─ Fails/Incomplete → Treat as dynamic (below)
    │
    └─ No (dynamic webapp) → Is the server already running?
        ├─ No → Run: python scripts/with_server.py --help
        │        Then use the helper + write simplified Playwright script
        │
        └─ Yes → Reconnaissance-then-action:
            1. Navigate and wait for networkidle
            2. Take screenshot or inspect DOM
            3. Identify selectors from rendered state
            4. Execute actions with discovered selectors
Playwright Best Practices
- Use bundled scripts as black boxes - Use --help to see usage, then invoke directly
- Use sync_playwright() for synchronous scripts
- Always close the browser when done
- Use descriptive selectors: text=, role=, CSS selectors, or IDs
- Add appropriate waits: page.wait_for_selector() or page.wait_for_timeout()
- CRITICAL: Wait for page.wait_for_load_state('networkidle') before inspection on dynamic apps
Example: Basic Playwright Script
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch(headless=True)
    page = browser.new_page()
    page.goto('http://localhost:5173')
    page.wait_for_load_state('networkidle')  # CRITICAL: Wait for JS to execute
    # ... your automation logic
    browser.close()
Examples
See examples/ directory for:
- element_discovery.py - Discovering buttons, links, and inputs on a page
- static_html_automation.py - Using file:// URLs for local HTML
- console_logging.py - Capturing console logs during automation
Reference Files
For detailed testing patterns and workflows, load reference files as needed:
- references/framework_workflows.md - Framework-specific TDD workflows and examples for Python (pytest), JavaScript (Jest, Vitest), Java (JUnit), Go, Rust
- references/test_patterns.md - Common test patterns, test organization, naming conventions, test doubles (mocks, stubs, spies), parametrization, and anti-patterns
- references/webapp_testing.md - Web application testing patterns, Playwright best practices, and E2E testing strategies
- references/TESTING_REPORT.template.md - Test quality report template with coverage metrics, audit findings, and recommendations
When working with a specific framework, or when you need detailed patterns, load the appropriate reference file.
Best Practices
Test Quality
- Isolation: Tests should be independent and runnable in any order
- Deterministic: Tests should produce consistent results
- Fast: Unit tests should run quickly (< 100ms each)
- Clear: Test names should describe what they test
- Maintainable: Tests should be easy to update when code changes
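Determinism often comes down to keeping hidden inputs such as the current time out of the code under test. The sketch below assumes a hypothetical `is_expired` function and shows one way to do that: the clock is passed in as a parameter, so the test can pin it to a fixed instant.

```python
from datetime import datetime, timedelta, timezone
from typing import Callable


def utc_now() -> datetime:
    """Default clock: the real current time in UTC."""
    return datetime.now(timezone.utc)


def is_expired(
    issued_at: datetime,
    ttl: timedelta,
    now: Callable[[], datetime] = utc_now,
) -> bool:
    """Hypothetical example: True once `ttl` has elapsed since `issued_at`."""
    return now() - issued_at >= ttl


def test_token_expires_exactly_at_ttl():
    # A fixed clock makes the result identical on every run.
    fixed_now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
    issued_at = fixed_now - timedelta(minutes=30)

    assert is_expired(issued_at, timedelta(minutes=30), now=lambda: fixed_now)
    assert not is_expired(issued_at, timedelta(minutes=31), now=lambda: fixed_now)
```

Because the test builds all of its own inputs and shares no state, it can also run in any order relative to other tests.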
TDD Best Practices
- One Behavior Per Test: Each test verifies ONE behavior
- Descriptive Names: Test names describe the behavior being tested
- Independent Tests: Tests don't depend on each other
- Fast Tests: Mock external dependencies to keep tests fast
- Clear Assertions: Assertions clearly show what's being verified
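As an illustration of mocking an external dependency, the sketch below uses the standard library's `unittest.mock.Mock` in place of a slow network client. The `fetch_display_name` function and the `get_user` method are hypothetical names invented for this example.

```python
from unittest.mock import Mock


def fetch_display_name(client, user_id: int) -> str:
    """Hypothetical example: `client.get_user` stands in for a slow HTTP call."""
    user = client.get_user(user_id)
    return user["name"].strip().title()


def test_fetch_display_name_formats_the_name():
    # Replace the external client with a mock so no network call is made.
    client = Mock()
    client.get_user.return_value = {"name": "  ada lovelace "}

    # One behavior under test: the name is trimmed and title-cased.
    assert fetch_display_name(client, 42) == "Ada Lovelace"

    # The mock also records how the dependency was called.
    client.get_user.assert_called_once_with(42)
```

The test stays fast and independent because the only collaborator is created inside the test itself.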
Common Mistakes to Avoid
- ❌ Writing multiple tests at once (write one test at a time)
- ❌ Skipping refactor phase (always refactor after green)
- ❌ Implementation before test (delete code and start with test)
- ❌ Over-engineering in GREEN (simplest thing that passes)
- ❌ Writing test that passes immediately (must fail first)
Test Maintenance
- Review and update tests when requirements change
- Remove obsolete tests
- Refactor tests to reduce duplication
- Keep test data factories up to date
- Monitor test execution time
Integration with Other Skills
- debugging: Use when tests fail unexpectedly
- code-review: TDD produces code that's easier to review
- dead-code-removal: Tests help identify unused code
- performance: Use for performance testing strategies
Meta-Principle
TDD is a DESIGN technique, not just a testing technique.
The cycle never changes: RED → GREEN → REFACTOR → Repeat
Writing tests first forces you to think about:
- What behavior do I need?
- How will I know it works?
- What's the simplest implementation?
This produces better-designed, more maintainable code.