Agent Skills with tag: quality-metrics

27 skills match this tag. Use tags to discover related Agent Skills and explore similar workflows.

diagram

Generate publication-quality technical diagrams using Nano Banana Pro (Gemini 3 Pro Image) with AI-powered quality review. Smart iteration regenerates only when quality falls below a threshold.

diagram-generation · generative-ai · gemini-3-pro · publication-quality
flight505
0
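
A rough sketch of the threshold-gated loop the diagram entry above describes, assuming hypothetical generate_diagram and score_quality helpers; the 0.8 threshold and attempt cap are illustrative placeholders, not the skill's actual defaults.

```python
import random

def generate_diagram(prompt: str) -> bytes:
    """Stand-in for the image-model call (hypothetical)."""
    return prompt.encode()

def score_quality(image: bytes) -> float:
    """Stand-in for the AI quality reviewer, returning 0..1 (hypothetical)."""
    return random.random()

def generate_until_good(prompt: str, threshold: float = 0.8, max_attempts: int = 3):
    """Regenerate only while quality stays below the threshold."""
    best, best_score = None, -1.0
    for _ in range(max_attempts):
        image = generate_diagram(prompt)
        score = score_quality(image)
        if score > best_score:
            best, best_score = image, score
        if score >= threshold:  # good enough: stop iterating
            break
    return best, best_score
```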

evaluation-metrics

LLM evaluation frameworks, benchmarks, and quality metrics for production systems.

llm-evaluation · evaluation-framework · benchmarks · quality-metrics
pluginagentmarketplace
1

llm-evaluation

llm · evaluation-pipelines · evaluation-framework · quality-metrics
phrazzld
21

tech-debt-tracker

Automated technical debt identification, tracking, and prioritization system.

technical-debt · static-analysis · code-smell · quality-metrics
benreceveur
31

model-evaluation

Evaluates machine learning models for performance, fairness, and reliability using appropriate metrics and validation techniques. Trigger keywords: model evaluation, metrics, accuracy, precision, recall, F1, ROC, AUC, cross-validation, ML testing.

machine-learning · evaluation-pipelines · quality-metrics · cross-validation
cosmix
3
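
As an illustration of the metrics this entry names, here is a minimal scikit-learn sketch; the toy dataset and logistic-regression model are assumptions for demonstration, not part of the skill.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_validate

# Toy binary-classification problem (illustrative only).
X, y = make_classification(n_samples=500, random_state=0)
model = LogisticRegression(max_iter=1000)

# 5-fold cross-validation over the headline metrics: accuracy,
# precision, recall, F1, and ROC AUC.
scores = cross_validate(
    model, X, y, cv=5,
    scoring=["accuracy", "precision", "recall", "f1", "roc_auc"],
)
for name, vals in scores.items():
    if name.startswith("test_"):
        print(f"{name}: {vals.mean():.3f} ± {vals.std():.3f}")
```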

course-description-analyzer

Analyzes or creates course descriptions for intelligent textbooks, checking required elements for completeness (title, audience, prerequisites, topics, Bloom's Taxonomy outcomes) and providing quality scores with improvement suggestions. Use this skill when course descriptions in /docs/course-description.md need validation or creation for learning graph generation.

educational-content · content-guidelines · quality-metrics · bloom-taxonomy
dmccreary
111
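
A completeness check of the kind this entry describes fits in a few lines; the required element names follow the entry's own list, but the 0-100 scoring rule is an assumption rather than the skill's real rubric.

```python
# Required elements follow the entry above; the scoring rule is assumed.
REQUIRED = ["title", "audience", "prerequisites", "topics", "outcomes"]

def completeness_score(description: dict) -> tuple[float, list[str]]:
    """Return a 0-100 score plus the list of missing required elements."""
    missing = [key for key in REQUIRED if not description.get(key)]
    score = 100 * (len(REQUIRED) - len(missing)) / len(REQUIRED)
    return score, missing

score, missing = completeness_score(
    {"title": "Graph Databases", "topics": ["RDF", "Cypher"]}
)
print(score, missing)  # 40.0 ['audience', 'prerequisites', 'outcomes']
```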

learning-graph-generator

Generates a comprehensive learning graph from a course description, including 200 concepts with dependencies, taxonomy categorization, and quality validation reports. Use this when the user wants to create a structured knowledge graph for educational content.

knowledge-graph · educational-content · ontology-design · quality-metrics
dmccreary
111
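
One validation step such a generator needs is confirming that the concept-dependency graph is acyclic, so a valid learning order exists. A standard-library sketch with illustrative concept names:

```python
from graphlib import CycleError, TopologicalSorter

# Each concept maps to the concepts it depends on (illustrative names).
deps = {
    "joins": {"select-queries"},
    "select-queries": {"tables"},
    "tables": set(),
}

try:
    order = list(TopologicalSorter(deps).static_order())
    print("valid learning order:", order)
except CycleError as err:
    print("dependency cycle found:", err.args[1])
```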

skill-performance-profiler

Analyzes skill usage patterns across conversations to track token consumption, identify heavy vs. lightweight skills, measure invocation frequency, detect co-occurrence patterns, and suggest consolidation opportunities. Use when the user asks to analyze skill performance, optimize skill usage, identify token-heavy skills, find consolidation opportunities, or review skill metrics.

performance-profiling · token-optimization · conversation-analysis · quality-metrics
Exploration-labs
72
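
The aggregations this profiler describes (token totals, invocation counts, co-occurrence) reduce to counter arithmetic over a usage log. A rough sketch, assuming a hypothetical record shape of conversation / skill / tokens:

```python
from collections import Counter
from itertools import combinations

# Hypothetical usage log; the record shape is assumed for illustration.
logs = [
    {"conversation": "c1", "skill": "llm-judge", "tokens": 1200},
    {"conversation": "c1", "skill": "code-analysis", "tokens": 300},
    {"conversation": "c2", "skill": "llm-judge", "tokens": 1100},
]

tokens = Counter()       # total tokens per skill: heavy vs. lightweight
invocations = Counter()  # invocation frequency per skill
per_convo = {}           # skills seen together in one conversation
for rec in logs:
    tokens[rec["skill"]] += rec["tokens"]
    invocations[rec["skill"]] += 1
    per_convo.setdefault(rec["conversation"], set()).add(rec["skill"])

# Skill pairs that fire in the same conversation suggest
# consolidation candidates.
cooccurrence = Counter(
    pair
    for skills in per_convo.values()
    for pair in combinations(sorted(skills), 2)
)
print(tokens.most_common())
print(cooccurrence.most_common(1))
```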

analyzing-code

Analyzes code statistics by language for project insight, CI/CD metrics, or pre-refactoring baselines. Use this skill when you need to understand project composition, measure change impact, or generate CI/CD metrics.

codebase-analysis · ci-cd · quality-metrics
iota9star
5
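
Per-language line counting, the core of this kind of analysis, fits in a short standard-library sketch; the extension-to-language map is a small illustrative subset.

```python
from collections import Counter
from pathlib import Path

LANGS = {".py": "Python", ".ts": "TypeScript", ".rs": "Rust"}  # illustrative subset

def lines_by_language(root: str) -> Counter:
    """Count physical lines per language under a project root."""
    stats = Counter()
    for path in Path(root).rglob("*"):
        lang = LANGS.get(path.suffix)
        if lang and path.is_file():
            with path.open(errors="ignore") as handle:
                stats[lang] += sum(1 for _ in handle)
    return stats

print(lines_by_language("."))
```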

review-feedback-schema

Schema for tracking code review outcomes to enable feedback-driven skill improvement. Use when logging review results or analyzing review quality.

structured-logging · schema-management · quality-metrics · code-review
existential-birds
61
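
A review-outcome record of the kind this schema tracks might look like the sketch below; the field names are inferred from the description, not the skill's published schema.

```python
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone

@dataclass
class ReviewOutcome:
    """Hypothetical record shape; field names are assumptions."""
    review_id: str
    skill: str                    # which review skill produced the finding
    finding: str                  # what the review flagged
    accepted: bool                # whether the author acted on it
    false_positive: bool = False  # flagged, but wrong
    logged_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

print(asdict(ReviewOutcome("r-1", "llm-judge", "unused import", accepted=True)))
```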

review-skill-improver

Analyzes feedback logs to identify patterns and suggest improvements to review skills. Use when you have accumulated feedback data and want to improve review accuracy.

logs · analytics · feedback-analysis · quality-metrics
existential-birds
61

llm-judge

LLM-as-judge methodology for comparing code implementations across repositories. Scores implementations on functionality, security, test quality, overengineering, and dead code using weighted rubrics. Used by /beagle:llm-judge command.

code-review · quality-metrics · llm-evaluation · llm-integration
existential-birds
61
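
The weighted-rubric scoring named here reduces to a dot product of per-dimension judge scores and weights. A minimal sketch; the dimensions follow the entry, but the weights and the 1-5 scale are illustrative assumptions.

```python
# Dimensions follow the entry above; weights and the 1-5 scale are
# illustrative assumptions, not the skill's actual rubric.
WEIGHTS = {
    "functionality": 0.35,
    "security": 0.25,
    "test_quality": 0.20,
    "overengineering": 0.10,  # scored so higher means less overengineering
    "dead_code": 0.10,        # scored so higher means less dead code
}

def weighted_score(judge_scores: dict[str, float]) -> float:
    """Combine per-dimension judge scores (1-5) into one weighted score."""
    assert abs(sum(WEIGHTS.values()) - 1.0) < 1e-9
    return sum(WEIGHTS[dim] * judge_scores[dim] for dim in WEIGHTS)

print(weighted_score({
    "functionality": 4, "security": 5, "test_quality": 3,
    "overengineering": 4, "dead_code": 5,
}))  # 4.15
```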

validation-standards

Tool usage requirements, failure patterns, consistency checks, and validation methodologies for Claude Code operations.

validation · standards-compliance · tool-integration · quality-metrics
bejranonda
1111

code-analysis

Provides methodologies, metrics, and best practices for analyzing code structure, complexity, and quality.

code-quality · quality-metrics · static · code-smell
bejranonda
1111

meta-prompt-engineering

Use when prompts produce inconsistent or unreliable outputs, need explicit structure and constraints, require safety guardrails or quality checks, involve multi-step reasoning that needs decomposition, or need domain expertise encoded; also triggers when the user mentions improving prompts, prompt templates, structured prompts, prompt optimization, reliable AI outputs, or prompt patterns.

prompt-engineering · prompt-refinement · chain-of-thought · quality-metrics
lyndonkl
82

context-engineering

Master context engineering for AI features: the skill that separates AI products that work from ones that hallucinate. Use when speccing new AI features, diagnosing underperforming ones, or running quality checks before shipping. Helps PMs define what context the AI needs, where to get it, and what to do when it fails. Based on the 4D Context Canvas framework.

context-engineering · feature-specification · llm-integration · quality-metrics
breethomas
82

satisfaction-feedback

Handles user satisfaction feedback. When the user replies "satisfied" or "not satisfied", updates the FAQ usage count or records a BADCASE. Trigger words: satisfied / not satisfied / solved / unsolved / thanks.

feedback · quality-metrics · intent-tracking · conversation-analysis
Harryoung
151
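
The update rule the (translated) description implies is small: a satisfied reply bumps the FAQ hit counter, a dissatisfied one logs a BADCASE. A hypothetical sketch; the trigger-word sets and record fields are assumptions.

```python
SATISFIED = {"satisfied", "solved", "thanks"}  # assumed English triggers
DISSATISFIED = {"not satisfied", "unsolved"}

def handle_feedback(reply: str, faq_id: str, faq_counts: dict, badcases: list):
    """Bump the FAQ usage count or record a BADCASE, per the reply."""
    text = reply.strip().lower()
    if text in SATISFIED:
        faq_counts[faq_id] = faq_counts.get(faq_id, 0) + 1
    elif text in DISSATISFIED:
        badcases.append({"faq": faq_id, "reply": reply})

counts, bad = {}, []
handle_feedback("satisfied", "faq-42", counts, bad)
handle_feedback("unsolved", "faq-42", counts, bad)
print(counts, bad)  # {'faq-42': 1} [{'faq': 'faq-42', 'reply': 'unsolved'}]
```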

evaluation

Build evaluation frameworks for agent systems. Use when testing agent performance, validating context engineering choices, or measuring improvements over time.

agent-testing · quality-metrics · performance-tracking · context-engineering
muratcankoylan
142

Page 1 of 2 · 27 results