Agent Skills: logprob-prefill-analysis

Reproduces the full prefill sensitivity analysis pipeline for reward hacking indicators. Use when evaluating how susceptible model checkpoints are to exploit-eliciting prefills, computing token-based trajectories, or comparing logprob vs token-count as predictors of exploitability.

Category: Uncategorized
ID: aiskillstore/marketplace/logprob-prefill-analysis

Install this agent skill locally:

pnpm dlx add-skill https://github.com/aiskillstore/marketplace/logprob-prefill-analysis

