Agent Skills: knowledge-distillation
Compress large language models using knowledge distillation from teacher to student models. Use when deploying smaller models with retained performance, transferring GPT-4 capabilities to open-source models, or reducing inference costs. Covers temperature scaling, soft targets, reverse KLD, logit distillation, and MiniLLM training strategies.
Category: Uncategorized
ID: davila7/claude-code-templates/knowledge-distillation
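As a rough illustration of the losses this skill covers (soft targets with temperature scaling, logit distillation, and the reverse KLD objective used in MiniLLM-style training), here is a minimal PyTorch sketch. It is not taken from the skill's files; function names, shapes, and default temperatures are illustrative assumptions.

```python
# Hedged sketch of two distillation objectives: forward KD with temperature
# scaling (Hinton-style soft targets) and reverse KLD (MiniLLM-style).
# All names and tensor shapes here are illustrative, not from the skill files.
import torch
import torch.nn.functional as F

def forward_kd_loss(student_logits, teacher_logits, temperature=2.0):
    """Logit distillation: KL(teacher^T || student^T), scaled by T^2 so the
    gradient magnitude stays comparable to a hard-label cross-entropy term."""
    t = temperature
    soft_targets = F.softmax(teacher_logits / t, dim=-1)          # softened teacher distribution
    student_log_probs = F.log_softmax(student_logits / t, dim=-1)  # softened student log-probs
    return F.kl_div(student_log_probs, soft_targets, reduction="batchmean") * (t ** 2)

def reverse_kld_loss(student_logits, teacher_logits, temperature=1.0):
    """Reverse KLD: KL(student || teacher), the mode-seeking objective used by
    MiniLLM so the student does not spread mass over low-probability teacher modes."""
    t = temperature
    student_log_probs = F.log_softmax(student_logits / t, dim=-1)
    teacher_log_probs = F.log_softmax(teacher_logits / t, dim=-1)
    student_probs = student_log_probs.exp()
    # sum_x p_s(x) * (log p_s(x) - log p_t(x)), averaged over the batch
    return (student_probs * (student_log_probs - teacher_log_probs)).sum(-1).mean()

if __name__ == "__main__":
    # Toy next-token logits: batch of 4 positions over a 32k vocabulary.
    vocab_size = 32000
    student_logits = torch.randn(4, vocab_size, requires_grad=True)
    teacher_logits = torch.randn(4, vocab_size)
    loss = forward_kd_loss(student_logits, teacher_logits, temperature=2.0) \
         + reverse_kld_loss(student_logits, teacher_logits)
    loss.backward()
```

In practice the forward-KD term is usually mixed with a standard cross-entropy loss on ground-truth labels, and MiniLLM combines the reverse KLD with policy-gradient-style corrections during sequence-level training; the sketch above only shows the per-token loss terms.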