RL Locomotion | Tharit Sinsunthorn

This page presents my research and development of a reinforcement learning-based locomotion policy for quadruped robots. The main goal is adaptable, efficient movement over rough terrain in low-gravity environments such as the lunar surface.

Simulations of the learned locomotion policy on a Unitree Go2 robot in a low-gravity, rough-terrain environment. The robot's gait (blue arrow) follows the commanded velocity (green arrow).

This project uses NVIDIA’s Isaac Lab to train a locomotion policy for a quadruped robot with the Proximal Policy Optimization (PPO) algorithm. As the simulations show, the robot learns a stable gait (blue arrow) that follows the commanded velocity (green arrow), enabling it to traverse challenging terrain under low gravity.
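To make the two ideas above concrete, here is a minimal sketch of a velocity-tracking reward and of the clipped surrogate objective that PPO maximizes. It is an illustration only, not the training code used in this project: the function names, the exponential reward shape, and the default values `std=0.5` and `clip_eps=0.2` are assumptions chosen for the example, and it uses plain Python rather than Isaac Lab.

```python
import math


def velocity_tracking_reward(commanded_xy, actual_xy, std=0.5):
    """Reward in (0, 1] that peaks when actual velocity equals the command.

    Hypothetical reward shape: exp(-squared_error / std**2).
    `std` controls how quickly the reward decays with tracking error.
    """
    squared_error = sum((c - a) ** 2 for c, a in zip(commanded_xy, actual_xy))
    return math.exp(-squared_error / std**2)


def ppo_clipped_objective(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective over a batch of samples.

    For each sample, the probability ratio between the new and old policy
    is clipped to [1 - clip_eps, 1 + clip_eps], and the more pessimistic of
    the clipped and unclipped terms is kept.
    """
    terms = []
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        ratio = math.exp(new_lp - old_lp)
        clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        terms.append(min(ratio * adv, clipped_ratio * adv))
    return sum(terms) / len(terms)


if __name__ == "__main__":
    # Perfect tracking gives the maximum reward; any error lowers it.
    print(velocity_tracking_reward((1.0, 0.0), (1.0, 0.0)))
    print(velocity_tracking_reward((1.0, 0.0), (0.6, 0.1)))

    # A large policy change on a positive-advantage sample is clipped.
    print(ppo_clipped_objective([0.0], [-1.0], [1.0]))
```

In a real training setup, the reward would typically be one of several weighted terms (for example, penalties on energy use or body orientation), and the objective would be computed on batched tensors and optimized by gradient ascent.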