Agent Skills: eval-driven-dev
Add instrumentation, build golden datasets, write eval-based tests, run them, root-cause failures, and iterate to ensure your Python LLM application works correctly. Use this skill whenever a user is developing, testing, QA-ing, evaluating, or benchmarking a Python project that calls an LLM: to verify that the application works correctly, catch regressions after prompt changes, fix unexpected behavior, or validate output quality before shipping.
Uncategorized
ID: github/awesome-copilot/eval-driven-dev
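To make the workflow in the description concrete, here is a minimal sketch of an eval-based test run against a golden dataset. It is not taken from the skill itself: `GoldenCase`, `fake_llm_app`, `score_case`, and `run_eval` are hypothetical names invented for illustration, the "application" is a canned stub rather than a real LLM call, and the phrase-matching scorer is just one simple choice of evaluator.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class GoldenCase:
    """One hand-verified prompt paired with phrases the output must contain."""
    prompt: str
    required_phrases: list[str]


def fake_llm_app(prompt: str) -> str:
    """Hypothetical stand-in for the application under test (no real LLM call)."""
    canned = {
        "What is the capital of France?": "The capital of France is Paris.",
        "What is 2 + 2?": "2 + 2 equals 4.",
    }
    return canned.get(prompt, "I don't know.")


def score_case(app: Callable[[str], str], case: GoldenCase) -> float:
    """Return the fraction of required phrases found in the app's output."""
    if not case.required_phrases:
        return 1.0  # nothing to check, so the case trivially passes
    output = app(case.prompt).lower()
    hits = sum(phrase.lower() in output for phrase in case.required_phrases)
    return hits / len(case.required_phrases)


def run_eval(
    app: Callable[[str], str],
    dataset: list[GoldenCase],
    threshold: float = 1.0,
) -> list[GoldenCase]:
    """Return the cases that score below the threshold (the ones to root-cause)."""
    return [case for case in dataset if score_case(app, case) < threshold]


# A tiny golden dataset; a real one would be built from traced application runs.
GOLDEN_DATASET = [
    GoldenCase("What is the capital of France?", ["Paris"]),
    GoldenCase("What is 2 + 2?", ["4"]),
]

failures = run_eval(fake_llm_app, GOLDEN_DATASET)
```

Rerunning `run_eval` after each prompt change is what surfaces regressions: any case that used to pass and now appears in `failures` is a candidate for root-cause analysis.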