Agent Skills: megatron-memory-estimator
Estimate GPU memory usage for Megatron-based MoE (Mixture of Experts) and dense models. Use when users need to (1) estimate memory from HuggingFace model configs (DeepSeek-V3, Qwen, etc.), (2) plan GPU resource allocation for training, (3) compare different parallelism strategies (TP/PP/EP/CP), (4) determine if a model fits in available GPU memory, or (5) optimize training configurations for memory efficiency.
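As a rough illustration of the kind of estimate described above, here is a minimal sketch of the static (weights, gradients, optimizer state) memory per GPU for a dense model. It is not the skill's own code or interface. The function name `static_bytes_per_gpu` and its parameters are hypothetical. It assumes bf16 parameters with fp32 gradients and Adam, using the per-parameter byte counts given in the Megatron-LM distributed optimizer documentation (18 bytes without the distributed optimizer, 6 + 12/d with it, where d is the data-parallel size). It also assumes parameters shard evenly across TP and PP ranks, and it ignores activations, MoE expert parallelism, and framework overhead, which a real estimator must account for.

```python
def static_bytes_per_gpu(num_params, tp=1, pp=1, dp=1, distributed_optimizer=True):
    """Rough static memory (bytes) per GPU for a dense model.

    Hypothetical helper for illustration only. Assumes bf16 params,
    fp32 grads, Adam, and perfectly even sharding across TP and PP.
    Activations and temporary buffers are NOT included.
    """
    if min(tp, pp, dp) < 1:
        raise ValueError("tp, pp and dp must all be >= 1")

    # Parameters held by one GPU after tensor and pipeline sharding.
    params_per_gpu = num_params / (tp * pp)

    if distributed_optimizer:
        # 2 (bf16 param) + 4 (fp32 grad) stay replicated;
        # 12 bytes of optimizer state are sharded across dp ranks.
        bytes_per_param = 6 + 12 / dp
    else:
        # 2 (bf16 param) + 4 (fp32 grad) + 12 (fp32 master + Adam moments).
        bytes_per_param = 18

    return params_per_gpu * bytes_per_param


def fits(num_params, gpu_mem_gib, **parallelism):
    """True if the static footprint alone fits in gpu_mem_gib GiB."""
    return static_bytes_per_gpu(num_params, **parallelism) <= gpu_mem_gib * 2**30


if __name__ == "__main__":
    # Compare two strategies for a 7B-parameter dense model.
    for cfg in ({"tp": 2, "pp": 1, "dp": 4}, {"tp": 1, "pp": 1, "dp": 8}):
        gib = static_bytes_per_gpu(7e9, **cfg) / 2**30
        print(cfg, f"{gib:.1f} GiB static")
```

Because activation memory is left out, a result that fits under this sketch is only a lower bound; comparing TP/PP/EP/CP strategies in practice also requires the activation and communication-buffer terms.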
Category: Uncategorized
ID: yzlnew/infra-skills/megatron-memory-estimator