check | Agent Skills

llm_evaluation

Implement comprehensive evaluation strategies for LLM applications using automated metrics, human feedback, and benchmarking. Use when testing LLM performance, measuring AI application quality, or establishing evaluation frameworks.

accuracy, agents, algorithms, artificial

vuralserhat86

4212

skill_evaluator

Evaluates agent skills against Anthropic's best practices. Use when asked to review, evaluate, assess, or audit a skill for quality. Analyzes SKILL.md structure, naming conventions, description quality, and content organization, and identifies anti-patterns. Produces actionable improvement recommendations.

architecture, audit, automation, best

vuralserhat86

4212