Agent Skills: Eval Harness

Eval-driven development (EDD) — define pass/fail criteria before coding, measure with pass@k metrics. Use when defining completion criteria or measuring agent reliability.

ID: xbklairith/kisune/eval-harness

Install this agent skill locally:

pnpm dlx add-skill https://github.com/xbklairith/kisune/tree/HEAD/dev-workflow/skills/eval-harness

Skill Files

dev-workflow/skills/eval-harness/SKILL.md


Eval Harness

Formal evaluation framework implementing eval-driven development (EDD) — treating evals as unit tests for AI development.

When to Activate

  • Setting up eval-driven development for AI workflows
  • Defining pass/fail criteria for task completion
  • Measuring agent reliability with pass@k metrics
  • Creating regression test suites for prompt/agent changes

Philosophy

  • Define expected behavior BEFORE implementation
  • Run evals continuously during development
  • Track regressions with each change
  • Use pass@k metrics for reliability measurement

Eval Types

Capability Evals

Test if Claude can do something new:

[CAPABILITY EVAL: feature-name]
Task: What Claude should accomplish
Success Criteria:
  - [ ] Criterion 1
  - [ ] Criterion 2
  - [ ] Criterion 3

Regression Evals

Ensure changes don't break existing functionality:

[REGRESSION EVAL: feature-name]
Baseline: SHA or checkpoint
Tests:
  - existing-test-1: PASS/FAIL
  - existing-test-2: PASS/FAIL
Result: X/Y passed

Grader Types

| Type | When | How |
|------|------|-----|
| Code grader | Deterministic checks | `<test-command> && echo PASS` |
| Rule grader | Regex/schema constraints | Pattern matching |
| Model grader | Open-ended quality | LLM-as-judge rubric |
| Human grader | Ambiguous outputs | Manual review |
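The first two grader types can be sketched in a few lines; the function names here are illustrative, not part of the skill:

```python
import re
import subprocess

def code_grade(test_command: str) -> bool:
    """Code grader: deterministic PASS/FAIL from a test command's exit code."""
    result = subprocess.run(test_command, shell=True, capture_output=True)
    return result.returncode == 0

def rule_grade(output: str, pattern: str) -> bool:
    """Rule grader: PASS if the output satisfies a regex/schema constraint."""
    return re.search(pattern, output) is not None

# Example rule: require an ISO-8601 date somewhere in the output
print(rule_grade("Deployed on 2024-05-01", r"\d{4}-\d{2}-\d{2}"))  # True
```

Model and human graders are harder to sketch deterministically, which is exactly why code graders are preferred where possible.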

pass@k Metrics

  • pass@1: First attempt success rate
  • pass@3: Success within 3 attempts (practical reliability)
  • pass^3: All 3 trials succeed (stability test)
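Under the informal definitions above (success within k attempts, not the probabilistic estimator), these metrics reduce to `any`/`all` over trial outcomes; a minimal sketch with illustrative helper names:

```python
def pass_at_k(trials: list[bool], k: int) -> bool:
    """pass@k: success if any of the first k attempts passed."""
    return any(trials[:k])

def pass_all_k(trials: list[bool], k: int) -> bool:
    """pass^k: success only if all of the first k attempts passed."""
    return all(trials[:k])

# One eval per row, three attempts each
runs = [[True, True, True], [False, True, True], [False, False, False]]
pass_1 = sum(pass_at_k(r, 1) for r in runs) / len(runs)   # first-attempt rate
pass_3 = sum(pass_at_k(r, 3) for r in runs) / len(runs)   # within-3-attempts rate
stab_3 = sum(pass_all_k(r, 3) for r in runs) / len(runs)  # all-3-attempts rate
print(pass_1, pass_3, stab_3)
```

Note that pass@3 is always >= pass@1, and pass^3 is always <= pass@1, which is why the stability metric is the stricter release gate.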

Thresholds

  • Capability evals: pass@3 >= 90%
  • Regression evals: pass^3 = 100% for release-critical paths

Workflow

1. Define (Before Coding)

## EVAL: add-authentication

Capability Evals:
- [ ] User can register with email/password
- [ ] Invalid credentials rejected
- [ ] Sessions persist across reloads

Regression Evals:
- [ ] Public routes still accessible
- [ ] API responses unchanged

Success: pass@3 >= 90%, regression pass^3 = 100%

2. Implement

Write code to pass defined evals.

3. Evaluate

Run evals, record PASS/FAIL.

4. Report

EVAL REPORT: add-authentication
Capability: 3/3 passed (pass@3: 100%)
Regression: 3/3 passed (pass^3: 100%)
Status: SHIP IT

Anti-Patterns

  • Overfitting prompts to known eval examples
  • Measuring only happy-path outputs
  • Ignoring cost/latency while chasing pass rates
  • Allowing flaky graders in release gates

Best Practices

  1. Define evals BEFORE coding
  2. Run evals frequently
  3. Track pass@k over time
  4. Use code graders when possible (deterministic > probabilistic)
  5. Human review for security (never fully automate)
  6. Keep evals fast
  7. Version evals with code