# Test Strategy

## Overview

A test strategy defines how to approach testing for a project, balancing thoroughness with efficiency. A well-designed strategy ensures critical functionality is covered while avoiding over-testing trivial code. This skill covers the test pyramid, coverage metrics, test categorization, and integration with CI/CD pipelines.

## Instructions

### 1. Design the Test Pyramid
Structure tests in layers with appropriate ratios:

```
        /\
       /  \          E2E Tests (5-10%)
      /----\         - Critical user journeys
     /      \        - Cross-system integration
    /--------\       Integration Tests (15-25%)
   /          \      - API contracts
  /------------\     - Database interactions
 /              \    - Service boundaries
/----------------\   Unit Tests (65-80%)
                     - Business logic
                     - Pure functions
                     - Edge cases
```

**Recommended Ratios:**

- Unit tests: 65-80% of the test suite
- Integration tests: 15-25%
- E2E tests: 5-10%
### 2. Set Coverage Goals

**Coverage Targets by Component Type:**

| Component Type | Line Coverage | Branch Coverage | Notes                           |
| -------------- | ------------- | --------------- | ------------------------------- |
| Business Logic | 90%+          | 85%+            | Critical paths fully covered    |
| API Handlers   | 80%+          | 75%+            | All endpoints tested            |
| Utilities      | 95%+          | 90%+            | Pure functions easily testable  |
| UI Components  | 70%+          | 60%+            | Focus on behavior over markup   |
| Infrastructure | 60%+          | 50%+            | Integration tests preferred     |
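These targets are most useful when the tooling enforces them. A sketch of encoding the table as Jest `coverageThreshold` entries; the directory paths are assumptions about project layout, not prescribed structure:

```javascript
// jest.config.js (excerpt) — per-component coverage gates
module.exports = {
  coverageThreshold: {
    // Hypothetical paths mapping to the component types above
    "./src/domain/": { lines: 90, branches: 85 },
    "./src/api/": { lines: 80, branches: 75 },
    "./src/utils/": { lines: 95, branches: 90 },
    "./src/components/": { lines: 70, branches: 60 },
  },
};
```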
**Coverage Anti-patterns to Avoid:**
- Chasing 100% coverage for coverage's sake
- Testing getters/setters without logic
- Testing framework or library code
- Writing tests that don't verify behavior
### 3. Decide What to Test vs. What Not to Test

**Always Test:**

- Business logic and domain rules
- Input validation and error handling
- Security-sensitive operations
- Data transformations
- State transitions
- Edge cases and boundary conditions
- Regression scenarios from bug fixes

**Consider Not Testing:**

- Simple pass-through functions
- Framework-generated code
- Third-party library internals
- Trivial getters/setters
- Configuration constants
- Logging statements (unless critical)
**Test Smell Detection:**

```typescript
// BAD: Testing trivial code
test("getter returns value", () => {
  const user = new User("John");
  expect(user.getName()).toBe("John");
});

// GOOD: Testing meaningful behavior
test("user cannot change name to empty string", () => {
  const user = new User("John");
  expect(() => user.setName("")).toThrow(ValidationError);
});
```
### 4. Categorize and Organize Tests

**Directory Structure:**

```
tests/
├── unit/
│   ├── services/
│   ├── models/
│   └── utils/
├── integration/
│   ├── api/
│   ├── database/
│   └── external-services/
├── e2e/
│   ├── flows/
│   └── pages/
├── fixtures/
│   ├── factories/
│   └── mocks/
└── helpers/
    ├── setup.ts
    └── assertions.ts
```
**Test Tagging System:**

```typescript
// Jest example with tags
describe("[unit][fast] UserService", () => {});
describe("[integration][slow] DatabaseRepository", () => {});
describe("[e2e][critical] CheckoutFlow", () => {});

// Run specific categories (Jest matches on describe/test names):
//   npm test -- --testNamePattern="\[unit\]"
//   npm test -- --testNamePattern="\[critical\]"
```
**Naming Conventions:**

```
[ComponentName].[scenario].[expected_result].test.ts

Examples:
UserService.createUser.returnsNewUser.test.ts
PaymentProcessor.invalidCard.throwsPaymentError.test.ts
```
### 5. Integrate with CI/CD

**Pipeline Stage Configuration:**

```yaml
# .github/workflows/test.yml
name: Test Pipeline

on: [push, pull_request]

jobs:
  unit-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run Unit Tests
        run: npm test -- --testNamePattern="\[unit\]" --coverage
      - name: Upload Coverage
        uses: codecov/codecov-action@v3

  integration-tests:
    runs-on: ubuntu-latest
    needs: unit-tests
    services:
      postgres:
        image: postgres:15
        env:
          POSTGRES_PASSWORD: test
    steps:
      - uses: actions/checkout@v4
      - name: Run Integration Tests
        run: npm test -- --testNamePattern="\[integration\]"

  e2e-tests:
    runs-on: ubuntu-latest
    needs: integration-tests
    steps:
      - uses: actions/checkout@v4
      - name: Run E2E Tests
        run: npm run test:e2e
```
**CI Test Optimization:**
- Run unit tests first (fast feedback)
- Parallelize test suites
- Cache dependencies and build artifacts
- Use test splitting for large suites
- Fail fast on critical tests
### 6. Risk-Based Test Prioritization

**Risk Matrix for Prioritization:**

| Impact ↓ / Likelihood → | Low             | Medium          | High            |
| ----------------------- | --------------- | --------------- | --------------- |
| High                    | Medium Priority | High Priority   | Critical        |
| Medium                  | Low Priority    | Medium Priority | High Priority   |
| Low                     | Skip/Manual     | Low Priority    | Medium Priority |
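The matrix maps directly onto a small lookup table in code. A minimal sketch; the `Level` and `Priority` names are illustrative, not from any particular library:

```typescript
type Level = "low" | "medium" | "high";
type Priority = "critical" | "high" | "medium" | "low" | "skip";

// Rows = impact, columns = likelihood, mirroring the matrix above
const RISK_MATRIX: Record<Level, Record<Level, Priority>> = {
  high:   { low: "medium", medium: "high",   high: "critical" },
  medium: { low: "low",    medium: "medium", high: "high" },
  low:    { low: "skip",   medium: "low",    high: "medium" },
};

function testPriority(impact: Level, likelihood: Level): Priority {
  return RISK_MATRIX[impact][likelihood];
}

// Example: high impact, medium likelihood → "high"
console.log(testPriority("high", "medium"));
```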
**Risk Factors to Consider:**

- **Business Impact**: Revenue, user trust, legal compliance
- **Complexity**: Code complexity, integration points
- **Change Frequency**: Actively developed areas
- **Historical Bugs**: Components with bug history
- **Dependencies**: Critical external services
**Prioritized Test Categories:**

- **Critical (P0)**: Run on every commit
  - Authentication/authorization
  - Payment processing
  - Data integrity
- **High (P1)**: Run on PR merge
  - Core business workflows
  - API contract tests
- **Medium (P2)**: Run nightly
  - Edge cases
  - Performance tests
- **Low (P3)**: Run weekly
  - Backward compatibility
  - Deprecated feature coverage
### 7. Domain-Specific Testing Strategies

#### API Testing Strategy
**Test Layers:**

- **Contract Tests (P0)** (see the sketch after this list)
  - Request/response schema validation
  - HTTP status codes for all endpoints
  - Error response formats
  - Authentication/authorization rules
- **Business Logic Tests (P0)**
  - Valid input processing
  - Business rule enforcement
  - State transitions via API calls
- **Integration Tests (P1)**
  - Database operations via API
  - External service integration
  - Transaction rollback scenarios
- **Performance Tests (P2)**
  - Response time under load
  - Concurrent request handling
  - Rate limiting behavior
**API Test Organization:**

```
tests/api/
├── contracts/     # Schema validation tests
├── endpoints/     # Per-endpoint behavior tests
├── auth/          # Authentication flows
├── integration/   # Cross-service scenarios
└── performance/   # Load and stress tests
```
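To make the contract layer concrete, here is a hedged sketch of a contract test, assuming `supertest` for HTTP assertions and `zod` for schema validation; the `/users/:id` endpoint, the app import path, and the `UserResponse` shape are hypothetical:

```typescript
import request from "supertest";
import { z } from "zod";
import { app } from "../src/app"; // hypothetical app export

// Hypothetical response contract for GET /users/:id
const UserResponse = z.object({
  id: z.string(),
  email: z.string().email(),
  role: z.enum(["user", "admin"]),
});

test("[contract] GET /users/:id returns a valid user payload", async () => {
  const res = await request(app).get("/users/user-1").expect(200);
  // parse() throws (failing the test) if the body drifts from the contract
  UserResponse.parse(res.body);
});

test("[contract] GET /users/:id returns 404 for unknown ids", async () => {
  await request(app).get("/users/does-not-exist").expect(404);
});
```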
#### Data Pipeline Testing Strategy

**Test Focus Areas:**

- **Data Quality Tests (P0)**
  - Schema validation at each stage
  - Data type correctness
  - Null/missing value handling
  - Duplicate detection
- **Transformation Tests (P0)**
  - Input → output correctness
  - Edge case handling
  - Data loss detection
  - Aggregation accuracy
- **Integration Tests (P1)**
  - Source extraction correctness
  - Sink loading verification
  - Idempotency checks
  - Failure recovery
- **Performance Tests (P2)**
  - Processing throughput
  - Memory usage with large datasets
  - Partition handling
**Data Pipeline Test Pattern:**

```python
def test_user_data_transformation():
    # Arrange: Create test input data
    raw_input = create_test_dataset(
        rows=1000,
        include_nulls=True,
        include_duplicates=True,
    )

    # Act: Run transformation
    result = transform_user_data(raw_input)

    # Assert: Verify output quality
    assert_no_nulls(result, required_fields=["user_id", "email"])
    assert_no_duplicates(result, key="user_id")
    assert_schema_matches(result, UserSchema)
    assert len(result) == expected_output_count(raw_input)
```
#### ML Model Testing Strategy

**Test Layers:**

- **Data Validation Tests (P0)**
  - Feature schema validation
  - Label distribution checks
  - Data leakage detection
  - Train/test split correctness
- **Model Behavior Tests (P0)**
  - Prediction on known examples
  - Invariance tests (e.g., case-insensitive text)
  - Directional expectation tests
  - Boundary condition handling
- **Model Quality Tests (P1)**
  - Accuracy/precision/recall thresholds
  - Fairness metrics across groups
  - Performance on edge cases
  - Regression detection (vs. baseline)
- **Integration Tests (P1)**
  - Model loading and serving
  - Prediction API contract
  - Feature engineering pipeline
  - Model versioning
**ML Test Example:**

```python
def test_sentiment_model_invariance():
    """Model should be case-insensitive."""
    model = load_sentiment_model()
    test_cases = [
        ("This is GREAT!", "This is great!"),
        ("TERRIBLE service", "terrible service"),
    ]
    for text1, text2 in test_cases:
        pred1 = model.predict(text1)
        pred2 = model.predict(text2)
        assert pred1 == pred2, f"Case sensitivity detected: {text1} vs {text2}"
```
#### Infrastructure Testing Strategy

**Test Focus:**

- **Infrastructure-as-Code Tests (P0)**
  - Syntax validation (`terraform validate`)
  - Security policy checks
  - Resource naming conventions
  - Cost estimation validation
- **Deployment Tests (P1)** (see the smoke-test sketch below)
  - Smoke tests post-deployment
  - Health check endpoints
  - Configuration validation
  - Rollback procedures
- **Resilience Tests (P2)**
  - Service restart handling
  - Network partition recovery
  - Resource exhaustion scenarios
  - Chaos engineering tests
- **Observability Tests (P1)**
  - Metrics collection verification
  - Log aggregation correctness
  - Alert rule validation
  - Dashboard functionality
**Infrastructure Test Pattern:**

```hcl
# Terraform test example
run "verify_security_group_rules" {
  command = plan

  assert {
    # Check every CIDR block, not just the first one
    condition = length([
      for rule in aws_security_group.main.ingress : rule
      if contains(rule.cidr_blocks, "0.0.0.0/0")
    ]) == 0
    error_message = "Security group should not allow ingress from 0.0.0.0/0"
  }
}
```
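The deployment-test layer often amounts to a few post-deploy smoke checks. A minimal sketch, assuming Jest as the runner, Node 18+ (built-in `fetch`), and a hypothetical `/healthz` endpoint on the deployed service:

```typescript
// Post-deployment smoke tests (SMOKE_BASE_URL and /healthz are hypothetical)
const BASE_URL = process.env.SMOKE_BASE_URL ?? "https://staging.example.com";

test("[smoke] health check endpoint responds", async () => {
  const res = await fetch(`${BASE_URL}/healthz`);
  expect(res.status).toBe(200);

  const body = await res.json();
  // Assumed health payload shape; adjust to your service
  expect(body.status).toBe("ok");
});

test("[smoke] app serves its landing page", async () => {
  const res = await fetch(BASE_URL);
  expect(res.status).toBe(200);
});
```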
### 8. Flaky Test Diagnosis and Prevention

**Common Causes of Flakiness:**

| Cause                    | Symptoms                          | Solution                                   |
| ------------------------ | --------------------------------- | ------------------------------------------ |
| Race conditions          | Fails intermittently on timing    | Add proper synchronization                 |
| Async operations         | Fails with "element not found"    | Use explicit waits, not sleeps             |
| Shared state             | Fails when run with other tests   | Isolate test data, reset state             |
| External dependencies    | Fails when service unavailable    | Mock external calls, use test doubles      |
| Time-dependent logic     | Fails at specific times/dates     | Inject time, use fake clocks               |
| Resource cleanup         | Fails after certain test order    | Ensure teardown always runs                |
| Nondeterministic data    | Fails with random data variations | Use fixed seeds, deterministic generators  |
| Environment differences  | Fails in CI but passes locally    | Containerize test environment              |
| Insufficient timeouts    | Fails under load/slow machines    | Make timeouts configurable                 |
| Parallel execution races | Fails only when parallelized      | Use unique identifiers per test            |
**Flaky Test Diagnosis Workflow:**

```
1. Reproduce Locally
   ├─ Run test 100 times: for i in {1..100}; do npm test -- TestName || break; done
   ├─ Run with different seeds: npm test -- --seed=$RANDOM
   └─ Run in parallel: npm test -- --maxWorkers=4

2. Identify Pattern
   ├─ Always fails at same point? → Logic bug, not flaky
   ├─ Fails under load? → Timing/resource issue
   ├─ Fails with other tests? → Shared state pollution
   └─ Fails on specific data? → Data-dependent bug

3. Instrument Test
   ├─ Add verbose logging
   ├─ Capture timing information
   ├─ Record test environment state
   └─ Save failure artifacts (screenshots, logs)

4. Fix Root Cause
   ├─ Eliminate race conditions
   ├─ Add proper synchronization
   ├─ Isolate test state
   └─ Mock external dependencies

5. Verify Fix
   ├─ Run fixed test 1000 times
   ├─ Run in CI 10 times
   └─ Monitor over 1 week
```
**Flaky Test Prevention Checklist:**
- [ ] Tests use deterministic test data (fixed seeds, no random())
- [ ] Async operations use explicit waits (not setTimeout/sleep)
- [ ] Tests create unique resources (UUIDs in names/IDs)
- [ ] Cleanup always runs (try/finally, afterEach hooks)
- [ ] No hardcoded timing assumptions (sleep(100) is a code smell)
- [ ] External services are mocked or use test doubles
- [ ] Time-dependent logic uses injected/fake clocks
- [ ] Tests do not depend on execution order
- [ ] Shared state is reset between tests
- [ ] Test environment is reproducible (containerized)
**Example: Fixing a Flaky Test**

```typescript
// FLAKY: Race condition with async operation
test("user profile loads", async () => {
  renderUserProfile(userId);
  // Race: profile might not be loaded yet
  expect(screen.getByText("John Doe")).toBeInTheDocument();
});

// FIXED: Proper async handling
test("user profile loads", async () => {
  renderUserProfile(userId);
  // Wait for the async operation to complete
  const userName = await screen.findByText("John Doe");
  expect(userName).toBeInTheDocument();
});

// FLAKY: Shared state pollution
test("creates user with default role", () => {
  const user = createUser({ name: "Alice" });
  expect(user.role).toBe("user"); // Fails if a previous test modified the default
});

// FIXED: Isolated state
test("creates user with default role", () => {
  resetDefaultRole(); // Ensure clean state
  const user = createUser({ name: "Alice" });
  expect(user.role).toBe("user");
});

// FLAKY: Time-dependent logic
test("expires session after 1 hour", () => {
  const session = createSession();
  // Flaky: depends on the current time at assertion
  expect(session.expiresAt).toBe(Date.now() + 3600000);
});

// FIXED: Inject the time dependency
test("expires session after 1 hour", () => {
  const mockClock = installFakeClock();
  mockClock.setTime(new Date("2024-01-01T12:00:00Z"));
  const session = createSession();
  expect(session.expiresAt).toBe(new Date("2024-01-01T13:00:00Z").getTime());
  mockClock.uninstall();
});
```
### 9. Test Infrastructure Architecture

**Test Environment Management:**

```yaml
# docker-compose.test.yml
version: "3.8"

services:
  test-db:
    image: postgres:15
    environment:
      POSTGRES_DB: test_db
      POSTGRES_USER: test_user
      POSTGRES_PASSWORD: test_pass
    ports:
      - "5433:5432"
    tmpfs:
      - /var/lib/postgresql/data # In-memory for speed

  test-redis:
    image: redis:7-alpine
    ports:
      - "6380:6379"

  test-app:
    build: .
    environment:
      DATABASE_URL: postgres://test_user:test_pass@test-db:5432/test_db
      REDIS_URL: redis://test-redis:6379
    depends_on:
      - test-db
      - test-redis
```
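Something must verify those containers are actually accepting connections before the suite starts. A sketch of a Jest global setup that polls Postgres, assuming the `pg` client library and the host port (5433) mapped above; the file path is hypothetical:

```typescript
// tests/setup/wait-for-services.ts (hypothetical path)
import { Client } from "pg";

async function waitForPostgres(retries = 30): Promise<void> {
  for (let attempt = 1; attempt <= retries; attempt++) {
    const client = new Client({
      host: "localhost",
      port: 5433, // host port from docker-compose.test.yml
      user: "test_user",
      password: "test_pass",
      database: "test_db",
    });
    try {
      await client.connect();
      await client.end();
      return; // Postgres is ready
    } catch {
      // Not ready yet; back off briefly and retry
      await new Promise((resolve) => setTimeout(resolve, 1000));
    }
  }
  throw new Error("Postgres did not become ready in time");
}

// Jest globalSetup entry point
export default async function globalSetup(): Promise<void> {
  await waitForPostgres();
}
```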
**Test Data Management:**

```typescript
// Factory pattern for test data
class UserFactory {
  private sequence = 0;

  create(overrides?: Partial<User>): User {
    const seq = this.sequence++; // One sequence number per user keeps id/email/name consistent
    return {
      id: overrides?.id ?? `user-${seq}`,
      email: overrides?.email ?? `user${seq}@test.com`,
      name: overrides?.name ?? `Test User ${seq}`,
      role: overrides?.role ?? "user",
      createdAt: overrides?.createdAt ?? new Date(),
    };
  }

  createBatch(count: number, overrides?: Partial<User>): User[] {
    return Array.from({ length: count }, () => this.create(overrides));
  }
}

// Usage ensures unique data per test
test("user search works", () => {
  const factory = new UserFactory();
  const users = factory.createBatch(10);
  // Each test gets unique users, no conflicts
});
```
**Test Parallelization Strategy:**
| Strategy | When to Use | Configuration |
| -------------------- | ------------------------------------ | -------------------------------------------- |
| File-level parallel | Tests in different files independent | Jest: --maxWorkers=4 |
| Database per worker | Tests need database isolation | Postgres: Create schema per worker |
| Test sharding | CI with multiple machines | Split tests by shard: --shard=1/4 |
| Test prioritization | Want fast feedback | Run fast tests first, slow tests in parallel |
| Smart test selection | Only run affected tests | Use dependency graph to select changed tests |
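The "Database per worker" row can be made concrete: Jest sets a `JEST_WORKER_ID` environment variable (1-based) in every worker, which can route each worker to its own schema. A sketch assuming Postgres and the `pg` client; the schema-naming convention is an assumption:

```typescript
import { Client } from "pg";

// Each Jest worker gets its own schema, so parallel tests never
// touch the same tables. JEST_WORKER_ID is set by Jest itself.
const workerId = process.env.JEST_WORKER_ID ?? "1";
const schema = `test_worker_${workerId}`;

export async function connectForWorker(): Promise<Client> {
  const client = new Client({
    connectionString: process.env.DATABASE_URL,
  });
  await client.connect();
  await client.query(`CREATE SCHEMA IF NOT EXISTS ${schema}`);
  // Unqualified table names now resolve inside this worker's schema
  await client.query(`SET search_path TO ${schema}`);
  return client;
}
```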
**Example: Parallel Test Configuration**

```javascript
// jest.config.js with parallel optimization
module.exports = {
  maxWorkers: process.env.CI ? "50%" : "75%", // Conservative in CI
  testTimeout: 30000, // Longer timeout for CI machines

  // Run fast tests first (see the sequencer sketch below)
  testSequencer: "./custom-sequencer.js",

  // Database isolation per worker
  globalSetup: "./tests/setup/create-test-dbs.js",
  globalTeardown: "./tests/setup/drop-test-dbs.js",
};

// Note: sharding is a CLI flag rather than a config key. In CI, run e.g.:
//   jest --shard=$CI_NODE_INDEX/$CI_NODE_TOTAL
```
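The `custom-sequencer.js` referenced above might look like the following sketch, built on Jest's `@jest/test-sequencer` API; the `/unit/` path convention is an assumption carried over from the directory layout in step 4:

```javascript
// custom-sequencer.js — run unit tests before slower suites
const Sequencer = require("@jest/test-sequencer").default;

class FastFirstSequencer extends Sequencer {
  sort(tests) {
    // Assumed convention: unit tests live under tests/unit/
    const rank = (test) => (test.path.includes("/unit/") ? 0 : 1);
    return [...tests].sort((a, b) => rank(a) - rank(b));
  }
}

module.exports = FastFirstSequencer;
```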
**Test Optimization Techniques:**

- **Reduce Test Startup Time**
  - Cache compiled code
  - Lazy-load test dependencies
  - Use in-memory databases for unit tests
- **Optimize Test Execution**
  - Batch database operations
  - Reuse expensive fixtures (connections, containers)
  - Skip unnecessary setup for focused tests
- **Parallelize Safely**
  - Unique identifiers per test (UUIDs)
  - Separate database schemas per worker
  - Avoid shared file system access
- **Smart Test Selection** (commands below)
  - Run only affected tests during development
  - Use coverage mapping to determine affected tests
  - Cache test results for unchanged code
```bash
# Run only tests affected by changes
npm test -- --changedSince=origin/main

# Run tests for a specific module and its dependents
npm test -- --selectProjects=user-service --testPathPattern=user

# Watch mode with smart re-running
npm test -- --watch --changedSince=HEAD
```
## Best Practices

- **Test Behavior, Not Implementation**
  - Tests should verify outcomes, not internal mechanics
  - Refactoring should not break tests if behavior is unchanged
- **Keep Tests Independent**
  - No shared mutable state between tests
  - Each test sets up its own context
  - Tests can run in any order
- **Use Test Doubles Appropriately** (see the sketch after this list)
  - Stubs for providing test data
  - Mocks for verifying interactions
  - Fakes for complex dependencies
  - Real implementations when feasible
- **Maintain Test Quality**
  - Apply the same code quality standards to tests
  - Refactor test code for readability
  - Remove obsolete tests promptly
- **Fast Feedback Loop**
  - Optimize for quick local test runs
  - Use watch mode during development
  - Prioritize fast tests in CI
- **Document Test Intent**
  - Clear test names describe behavior
  - Add comments for non-obvious setup
  - Link tests to requirements/tickets
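To illustrate the stub/mock distinction, a short Jest sketch; `EmailClient`, `UserService`, and the tax-rate stub are hypothetical names for the pattern, not from the codebase:

```typescript
// Stub: supplies canned data; the test never inspects how it was called
const taxRateStub = { lookup: async (_region: string) => 0.2 };

// Mock: records calls so the test can verify the interaction itself
interface EmailClient {
  send(to: string, subject: string): Promise<void>;
}

class UserService {
  constructor(private email: EmailClient) {}
  async register(address: string): Promise<void> {
    await this.email.send(address, "Welcome!");
  }
}

test("registration sends a welcome email", async () => {
  const email: EmailClient = { send: jest.fn().mockResolvedValue(undefined) };
  const service = new UserService(email);

  await service.register("alice@test.com");

  // Interaction verification is what makes this a mock rather than a stub
  expect(email.send).toHaveBeenCalledWith("alice@test.com", "Welcome!");
});
```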
## Examples

### Example: Feature Test Strategy Document
```markdown
# Feature: User Registration

## Risk Assessment

- Business Impact: HIGH (user acquisition)
- Complexity: MEDIUM (email validation, password rules)
- Change Frequency: LOW (stable feature)

## Test Coverage Plan

### Unit Tests (P0)

- [ ] Email format validation
- [ ] Password strength requirements
- [ ] Username uniqueness check logic
- [ ] Profile data sanitization

### Integration Tests (P1)

- [ ] Database user creation
- [ ] Email service integration
- [ ] Duplicate email handling

### E2E Tests (P0)

- [ ] Happy path: complete registration flow
- [ ] Error path: duplicate email shows error

## Coverage Targets

- Line coverage: 85%
- Branch coverage: 80%
- Critical paths: 100%
```
### Example: Test Organization Configuration
```javascript
// jest.config.js
module.exports = {
  projects: [
    {
      displayName: "unit",
      testMatch: ["<rootDir>/tests/unit/**/*.test.ts"],
      setupFilesAfterEnv: ["<rootDir>/tests/helpers/unit-setup.ts"],
    },
    {
      displayName: "integration",
      testMatch: ["<rootDir>/tests/integration/**/*.test.ts"],
      setupFilesAfterEnv: ["<rootDir>/tests/helpers/integration-setup.ts"],
      globalSetup: "<rootDir>/tests/helpers/db-setup.ts",
      globalTeardown: "<rootDir>/tests/helpers/db-teardown.ts",
    },
  ],
  coverageThreshold: {
    global: {
      branches: 75,
      functions: 80,
      lines: 80,
      statements: 80,
    },
    "./src/services/": {
      branches: 90,
      lines: 90,
    },
  },
};
```
### Example: Risk-Based Test Selection Script

```typescript
// scripts/select-tests.ts
interface TestFile {
  path: string;
  priority: "P0" | "P1" | "P2" | "P3";
  tags: string[];
}

function selectTestsForPipeline(
  context: "commit" | "pr" | "nightly" | "weekly",
): TestFile[] {
  // getTestManifest() is assumed to return the project's full TestFile list
  const allTests = getTestManifest();
  const priorityMap = {
    commit: ["P0"],
    pr: ["P0", "P1"],
    nightly: ["P0", "P1", "P2"],
    weekly: ["P0", "P1", "P2", "P3"],
  };
  return allTests.filter((test) =>
    priorityMap[context].includes(test.priority),
  );
}
```