Agent Skills: logprob-prefill-analysis
Reproduces the full prefill sensitivity analysis pipeline for reward hacking indicators. Use when evaluating how susceptible model checkpoints are to exploit-eliciting prefills, computing token-based trajectories, or comparing logprob vs token-count as predictors of exploitability.
Uncategorized
ID: aiskillstore/marketplace/logprob-prefill-analysis