Agent Skills with tag: agent-testing

18 skills match this tag. Use tags to discover related Agent Skills and explore similar workflows.

code-reviewer

Use when explicitly asked to run the code-reviewer subagent or when another skill requires the code-reviewer agent card.

code-review, agent-testing, code-quality, code-analysis
troykelly
1

sf-ai-agentforce-testing


agent-testing, ai-agents, automation, sf-ai-agentforce
Jag Valaiyapathy
214

claude-extensibility

Claude Code extensibility: agents, skills, output styles. Capabilities: create/update/delete agents and skills, YAML frontmatter, system prompts, tool/model selection, resumable agents, CLI-defined agents. Actions: create, edit, delete, optimize, test extensions. Keywords: agent, skill, output-style, SKILL.md, subagent, Task tool, progressive disclosure. Use when: creating agents/skills, editing extensions, configuring tool access, choosing models, testing activation.

agent, extension-development, yaml-frontmatter, cli
samhvw8
2
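
For claude-extensibility, here is a minimal sketch (in Python) of the SKILL.md layout this kind of skill manages, assuming the documented name/description frontmatter fields; the skill name, description text, and paths are placeholders, not the skill's actual output.

```python
from pathlib import Path
from textwrap import dedent

# Hypothetical skill name and location; Claude Code reads project skills
# from .claude/skills/<skill-name>/SKILL.md (user-level skills live under
# ~/.claude/skills/).
skill_dir = Path(".claude/skills/example-skill")
skill_dir.mkdir(parents=True, exist_ok=True)

# Minimal SKILL.md: YAML frontmatter (name and description drive activation
# matching) followed by the markdown instructions the agent loads on demand.
skill_md = dedent("""\
    ---
    name: example-skill
    description: Use when the user asks for a minimal example skill.
    ---

    # Example skill

    Step-by-step instructions for the agent go here.
    """)

(skill_dir / "SKILL.md").write_text(skill_md)
```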

spawning-agents-on-the-command-line

Use when subagents need to delegate work but can't use the Task tool, or when you need to test skills in an isolated context - spawns Claude instances via CLI backgrounding with JSON responses

spawning, cli, background-processing, json
snits
12
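
For spawning-agents-on-the-command-line, a minimal sketch of backgrounding a Claude instance from Python, assuming the claude CLI's -p (non-interactive print mode) and --output-format json flags; the prompt and the exact shape of the JSON reply are illustrative.

```python
import json
import subprocess

# Spawn a headless Claude instance in the background via the CLI.
# Assumes the `claude` binary is on PATH and supports -p and
# --output-format json; check `claude --help` for your version.
proc = subprocess.Popen(
    ["claude", "-p", "Summarize the README in one sentence.",
     "--output-format", "json"],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    text=True,
)

# Do other work while the subagent runs, then collect its JSON reply.
stdout, stderr = proc.communicate(timeout=300)
if proc.returncode != 0:
    raise RuntimeError(f"claude exited with {proc.returncode}: {stderr}")

response = json.loads(stdout)
print(response.get("result", response))  # field name may vary by CLI version
```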

testing-skills-with-subagents

Use when creating or editing skills, before deployment, to verify they work under pressure and resist rationalization - applies RED-GREEN-REFACTOR cycle to process documentation by running baseline without skill, writing to address failures, iterating to close loopholes

red-green-refactor, agent-testing, process-documentation, stress-testing
NickCrew
52

testing-skills-with-subagents

Use when creating or editing skills, before deployment, to verify they work under pressure and resist rationalization - applies RED-GREEN-REFACTOR cycle to process documentation by running baseline without skill, writing to address failures, iterating to close loopholes

red-green-refactor, process-documentation, agent-testing, test-automation
galihcitta
3

determinism

Use when verifying outcomes with code instead of LLM judgment, versioning prompts with hashes, or ensuring reproducible agent behavior. Load for any critical verification. Scripts return boolean exit codes, not subjective assessments. Prompts use semantic versioning with SHA256 validation.

reproducibility, versioning, agent-testing, determinism
ingpoc
5
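
For determinism, a minimal sketch of SHA256-pinned prompt versioning that exits with a boolean code instead of a subjective assessment; the prompts/review.md and prompts/review.lock paths and the lock-file format are hypothetical.

```python
import hashlib
import sys
from pathlib import Path

# Hypothetical layout: the prompt text lives in prompts/review.md and the
# approved digest in prompts/review.lock as "<semver> <sha256>".
prompt_path = Path("prompts/review.md")
lock_path = Path("prompts/review.lock")

actual = hashlib.sha256(prompt_path.read_bytes()).hexdigest()
version, expected = lock_path.read_text().split()

# Deterministic verification: a boolean exit code, not an LLM judgment.
if actual == expected:
    print(f"OK: prompt matches pinned version {version}")
    sys.exit(0)
else:
    print(f"FAIL: prompt has drifted from pinned version {version}")
    sys.exit(1)
```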

Convex Agents Debugging

Troubleshoots agent behavior, logs LLM interactions, and inspects database state. Use this when responses are unexpected, to understand context the LLM receives, or to diagnose data issues.

logs, troubleshooting, agent-testing, large-language-models
Sstobo
81

Convex Agents Playground

Sets up a web UI for testing, debugging, and developing agents without code. Use this to manually test agents, browse conversation history, and verify behavior in real-time.

agent-testing, autonomous-agent, web-ui, debugging
Sstobo
81

testing-workflows-with-subagents

Use when creating or editing commands, orchestrator prompts, or workflow documentation before deployment - applies RED-GREEN-REFACTOR to test instruction clarity by finding real execution failures, creating test scenarios, and verifying fixes with subagents

red-green-refactor, agent-testing, agent-orchestration, test-case-generation
arittr
6

self-test-skill-invocation

Use when the user asks to "test skill invocation framework" or mentions "canary skill test". This is a self-test skill that verifies the test framework correctly loads and invokes skills.

skill-invocation, agent-testing, testing-patterns, test-automation
konflux-ci
64

pydantic-ai-testing

Test PydanticAI agents using TestModel, FunctionModel, VCR cassettes, and inline snapshots. Use when writing unit tests, mocking LLM responses, or recording API interactions.

pydantic-ai, unit-testing, agent-testing, mocking-strategies
existential-birds
61
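
For pydantic-ai-testing, a minimal sketch of a TestModel-backed unit test using PydanticAI's model-override pattern; the model string, system prompt, and assertion are illustrative.

```python
from pydantic_ai import Agent
from pydantic_ai.models.test import TestModel

# Illustrative agent; the model string and system prompt are placeholders.
agent = Agent("openai:gpt-4o", system_prompt="Answer tersely.")


def test_agent_runs_without_calling_a_real_llm():
    # TestModel substitutes a deterministic stub for the real provider,
    # so the test needs no API key and no network access.
    with agent.override(model=TestModel()):
        result = agent.run_sync("What is 2 + 2?")
    # Older pydantic-ai releases expose the text as result.data instead.
    assert result.output
```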

llm-artifacts-detection

Detects common LLM coding agent artifacts in codebases. Identifies test quality issues, dead code, over-abstraction, and verbose LLM style patterns. Use when cleaning up AI-generated code or reviewing for agent-introduced cruft.

agent-testing, code-smell, ai-detection, dead-code
existential-birds
61
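
For llm-artifacts-detection, a hedged sketch of the kind of heuristic scan such a skill might run over a codebase; the patterns below are hypothetical examples of verbose-LLM-style markers, not the skill's actual checks.

```python
import re
import sys
from pathlib import Path

# Hypothetical heuristics for typical agent-introduced cruft; illustrative only.
PATTERNS = {
    "restating comment": re.compile(r"#\s*This (function|method|class) (is responsible for|will)"),
    "empty placeholder test": re.compile(r"def test_\w+\([^)]*\):\n\s+pass\b"),
    "leftover agent TODO": re.compile(r"#\s*TODO: implement (this|the rest)", re.IGNORECASE),
}

findings = 0
root = Path(sys.argv[1] if len(sys.argv) > 1 else ".")
for path in root.rglob("*.py"):
    text = path.read_text(errors="ignore")
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            line = text.count("\n", 0, match.start()) + 1
            print(f"{path}:{line}: {label}")
            findings += 1

sys.exit(1 if findings else 0)
```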

pydantic-ai-common-pitfalls

Avoid common mistakes and debug issues in PydanticAI agents. Use when encountering errors, unexpected behavior, or when reviewing agent implementations.

pydantic-ai, agent-testing, error-handling, troubleshooting
existential-birds
61

evaluation

Build evaluation frameworks for agent systems. Use when testing agent performance, validating context engineering choices, or measuring improvements over time.

agent-testing, quality-metrics, performance-tracking, context-engineering
muratcankoylan
142
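
For evaluation, a minimal sketch of a code-checked pass-rate harness for tracking agent performance over time; run_agent and the two test cases are stand-ins for the system and suite actually under evaluation.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class EvalCase:
    prompt: str
    check: Callable[[str], bool]  # code-based check, not an LLM judgment


# Stand-in for the system under test; replace with a real agent call.
def run_agent(prompt: str) -> str:
    return "42"


CASES = [
    EvalCase("What is 6 * 7?", lambda out: "42" in out),
    EvalCase("Name a prime below 5.", lambda out: any(p in out for p in ("2", "3"))),
]

passed = sum(case.check(run_agent(case.prompt)) for case in CASES)
print(f"pass rate: {passed}/{len(CASES)} = {passed / len(CASES):.0%}")
```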

writing-skills

Use when creating new skills, editing existing skills, or verifying skills work - applies TDD to documentation by testing with subagents before writing

skill-authoring, agent-testing, test-driven-development, documentation
withzombies
161

sf-ai-agentforce-testing


salesforce, agent-testing, ai-models, test-automation
Jag Valaiyapathy
213

evaluation

This skill should be used when the user asks to "evaluate agent performance", "build test framework", "measure agent quality", "create evaluation rubrics", or mentions LLM-as-judge, multi-dimensional evaluation, agent testing, or quality gates for agent pipelines.

autonomous-agent, agent-testing, evaluation-framework, quality-gates
muratcankoylan
5,808463
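
For this LLM-as-judge flavor of evaluation, a hedged sketch of multi-dimensional rubric scoring behind a quality gate; the dimensions, weights, threshold, and judge_score stub are hypothetical stand-ins for whatever LLM-as-judge call the framework makes.

```python
# Hypothetical rubric scorer: judge_score would wrap an LLM-as-judge call
# returning a 0..1 score per dimension; here it is a fixed stub.
def judge_score(dimension: str, transcript: str) -> float:
    return {"correctness": 0.9, "groundedness": 0.8, "tone": 1.0}[dimension]


RUBRIC = {"correctness": 0.5, "groundedness": 0.3, "tone": 0.2}  # weights sum to 1
QUALITY_GATE = 0.85  # hypothetical release threshold

transcript = "...agent conversation under evaluation..."
overall = sum(weight * judge_score(dim, transcript) for dim, weight in RUBRIC.items())

print(f"overall score: {overall:.2f}")
if overall < QUALITY_GATE:
    raise SystemExit(f"quality gate failed: {overall:.2f} < {QUALITY_GATE}")
```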