Agent Skills: KSIM-RL Skill

RL training library for humanoid locomotion and manipulation built on MuJoCo and JAX. Provides PPO, AMP, and custom task abstractions for sim-to-real robotics policy training.

ID: plurigrid/asi/ksim-rl

Install this agent skill locally:

pnpm dlx add-skill https://github.com/plurigrid/asi/tree/HEAD/plugins/asi/skills/ksim-rl

plugins/asi/skills/ksim-rl/SKILL.md

KSIM-RL Skill

Trit: -1 (MINUS - analysis/verification)
Color: #3A2F9E (Deep Purple)
URI: skill://ksim-rl#3A2F9E

Overview

KSIM is K-Scale Labs' reinforcement learning library for humanoid robot locomotion and manipulation. It is built on MuJoCo for physics simulation and on JAX for hardware-accelerated training.

Core Architecture

┌─────────────────────────────────────────────────────────────────┐
│                        KSIM ARCHITECTURE                        │
├─────────────────────────────────────────────────────────────────┤
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────────┐  │
│  │  RLTask     │  │  PPOTask    │  │  AMPTask                │  │
│  │  (abstract) │──│  (PPO impl) │──│  (Adversarial Motion)   │  │
│  └─────────────┘  └─────────────┘  └─────────────────────────┘  │
│         │                                                        │
│         ▼                                                        │
│  ┌─────────────────────────────────────────────────────────────┐ │
│  │                    PhysicsEngine                             │ │
│  │  ┌───────────────┐  ┌───────────────────────────────┐       │ │
│  │  │ MujocoEngine  │  │ MjxEngine (JAX-accelerated)   │       │ │
│  │  └───────────────┘  └───────────────────────────────┘       │ │
│  └─────────────────────────────────────────────────────────────┘ │
│         │                                                        │
│         ▼                                                        │
│  ┌─────────────────────────────────────────────────────────────┐ │
│  │  Environment Components                                      │ │
│  │  • Actuators: Position, Velocity, Torque control            │ │
│  │  • Observations: Joint states, IMU, local view              │ │
│  │  • Rewards: Velocity tracking, gait, energy, stability      │ │
│  │  • Terminations: Fall detection, boundary violations        │ │
│  └─────────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
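The Rewards and Terminations entries in the diagram can be read as a weighted sum of per-step terms plus a set of episode-ending predicates. The sketch below is plain Python rather than KSIM code: `State`, `velocity_tracking`, `energy_penalty`, `fell_over`, and the 0.4 m height threshold are hypothetical names and values chosen only for illustration.

```python
from __future__ import annotations

import math
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class State:
    """Hypothetical minimal robot state used only in this sketch."""

    forward_velocity: float  # m/s
    base_height: float  # m
    joint_torques: tuple[float, ...]  # N*m


RewardFn = Callable[[State], float]


def velocity_tracking(target: float) -> RewardFn:
    # Gaussian-shaped term: 1.0 at the target velocity, decaying with error.
    return lambda s: math.exp(-((s.forward_velocity - target) ** 2))


def energy_penalty(s: State) -> float:
    # Sum of squared torques as a crude proxy for energy use.
    return sum(t * t for t in s.joint_torques)


def total_reward(state: State, terms: Sequence[tuple[float, RewardFn]]) -> float:
    # Weighted sum of independent terms; a negative scale acts as a penalty.
    return sum(scale * fn(state) for scale, fn in terms)


def fell_over(state: State, min_height: float = 0.4) -> bool:
    # End the episode when the base drops below a height threshold.
    return state.base_height < min_height


terms = [(1.0, velocity_tracking(target=1.0)), (-0.01, energy_penalty)]
state = State(forward_velocity=1.0, base_height=0.9, joint_torques=(2.0, -1.0))
reward = total_reward(state, terms)
done = fell_over(state)
```

Keeping each term as an independent function with its own scale is what makes the reward composable: terms can be added, removed, or reweighted without touching the others.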

Key Features

  • JAX-Accelerated: Uses MJX for parallel environment simulation on GPU/TPU
  • PPO Training: Proximal Policy Optimization with configurable hyperparameters
  • AMP Support: Adversarial Motion Priors for realistic humanoid locomotion
  • Modular Rewards: Composable reward functions for gait, velocity, energy
  • Domain Randomization: Built-in randomizers for sim-to-real transfer
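For readers unfamiliar with PPO, the core of the method is the clipped surrogate objective. The following is a minimal pure-Python sketch of that standard objective, not KSIM's implementation: KSIM presumably computes its loss in vectorized JAX and may add value and entropy terms, and `ppo_clip_loss` and its parameter names are assumptions made for this example.

```python
import math
from typing import Sequence


def ppo_clip_loss(
    new_log_probs: Sequence[float],
    old_log_probs: Sequence[float],
    advantages: Sequence[float],
    clip_eps: float = 0.2,
) -> float:
    """Return the negated clipped surrogate objective, averaged over samples."""
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        ratio = math.exp(new_lp - old_lp)  # pi_new(a|s) / pi_old(a|s)
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Pessimistic bound: take the smaller of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    return -total / len(advantages)


loss = ppo_clip_loss(
    new_log_probs=[math.log(1.5), math.log(0.5)],
    old_log_probs=[0.0, 0.0],
    advantages=[1.0, -1.0],
)
```

In this example the first sample's ratio of 1.5 is clipped to 1.2 and the second sample's ratio of 0.5 is clipped to 0.8, so neither sample can push the policy further than the clip range allows in a single update.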

API Usage

import ksim
from ksim import PPOTask

# Define a custom task by subclassing PPOTask
class KBotWalkingTask(PPOTask):
    model_path = "kbot.mjcf"  # MJCF model of the robot

    # Observations fed to the policy
    observations = [
        ksim.JointPosition(),
        ksim.JointVelocity(),
        ksim.IMUAngularVelocity(),
        ksim.BaseOrientation(),
    ]

    # Reward terms, each weighted by its scale
    rewards = [
        ksim.LinearVelocityReward(scale=1.0),
        ksim.GaitPhaseReward(scale=0.5),
        ksim.EnergyPenalty(scale=-0.01),  # negative scale penalizes energy use
    ]

    # Actuators
    actuators = [
        ksim.PositionActuator(
            joint_name=".*",  # regex: apply to every joint
            kp=100.0,
            kd=10.0,
            action_scale=0.5,
        )
    ]

# Train
task = KBotWalkingTask()
task.run_training(
    num_envs=4096,
    num_steps=1_000_000,
    learning_rate=3e-4,
)
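The `kp`, `kd`, and `action_scale` arguments above suggest a PD position controller. A minimal sketch of that control law follows, assuming the policy action is scaled and added to a default joint position to form the target. Whether KSIM's `PositionActuator` computes torque exactly this way is an assumption, and `pd_torque` and `default_pos` are hypothetical names.

```python
def pd_torque(
    action: float,
    joint_pos: float,
    joint_vel: float,
    default_pos: float = 0.0,
    kp: float = 100.0,
    kd: float = 10.0,
    action_scale: float = 0.5,
) -> float:
    """PD law: torque = kp * (target - q) - kd * q_dot."""
    # The policy outputs a normalized action; scale it into a joint-angle offset.
    target = default_pos + action_scale * action
    return kp * (target - joint_pos) - kd * joint_vel


torque = pd_torque(action=1.0, joint_pos=0.25, joint_vel=0.5)
```

With these gains, a smaller `action_scale` limits how far a single action can move the target from the default pose, which tends to keep early, untrained policies from commanding extreme joint angles.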

GF(3) Triads

This skill participates in the following triads. A triad is balanced when its trits sum to 0 in GF(3), that is, modulo 3:

ksim-rl (-1) ⊗ kos-firmware (+1) ⊗ mujoco-scenes (0) = 0 ✓
ksim-rl (-1) ⊗ kos-firmware (+1) ⊗ urdf2mjcf (-1) = -1 (needs balancing)
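A balance check can be sketched in a few lines, assuming (as the `= 0 ✓` line suggests) that a triad is balanced when its trits sum to 0 modulo 3. The trit for urdf2mjcf is taken as -1, as listed under Related Skills. `triad_sum` and `is_balanced` are hypothetical helper names.

```python
from typing import Mapping


def triad_sum(trits: Mapping[str, int]) -> int:
    """Sum trits in GF(3), returning a representative in {-1, 0, 1}."""
    residue = sum(trits.values()) % 3  # Python's % yields 0, 1, or 2 here
    return residue - 3 if residue == 2 else residue


def is_balanced(trits: Mapping[str, int]) -> bool:
    return triad_sum(trits) == 0


balanced = {"ksim-rl": -1, "kos-firmware": +1, "mujoco-scenes": 0}
unbalanced = {"ksim-rl": -1, "kos-firmware": +1, "urdf2mjcf": -1}
```

Under this reading, the second triad would balance if its third member carried a trit of 0 rather than -1.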

Key Contributors

  • codekansas (Ben Bolte): Core architecture, PPO, rewards
  • b-vm: Randomizers, disturbances, policy training
  • carlosdp: Adaptive KL, action scaling
  • WT-MM: Visualization, markers

Related Skills

  • kos-firmware (+1): Robot firmware and gRPC services
  • mujoco-scenes (0): Scene composition for MuJoCo
  • evla-vla (-1): Vision-language-action models
  • urdf2mjcf (-1): URDF to MJCF conversion
  • ktune-sim2real (-1): Servo tuning for sim2real

References

@misc{ksim2024,
  title={K-Sim: RL Training for Humanoid Locomotion},
  author={K-Scale Labs},
  year={2024},
  url={https://github.com/kscalelabs/ksim}
}