How to apply the APG algorithm in a Brax pipeline environment for an SMPLH-based MJCF?
Hello there, thanks for amazing work on Brax!
I'm a PhD student in computer vision and machine learning. My current research is motion generation using SMPLH (SMPL + MANO) based humanoid robots. This text is going to be a bit long, so I'm sorry in advance.
I want to use APG to train a policy that closely mimics motion capture data such as AMASS (an SMPLH-based motion dataset). Of course, there are other algorithms for motion generation, but I wanted to train a policy quickly using APG, without GAN-based rewards or PPO.
First, I'd like to describe my background and the problem I'm facing. Before using Brax, I used Isaac Gym as the physics simulator and built an environment for SMPLH-based robots on top of it. I applied the AMP algorithm, which uses GAN-based rewards, to the SFU data from AMASS in this environment and successfully learned policies that mimic the SFU motion dataset. However, when I loaded the MJCF file used by Isaac Gym into Brax with the Positional backend and ran the APG algorithm, it failed to mimic the motion data.
I tried a few things but couldn't fix it. Here is what I observed when using Brax with the SMPLH-based robot: the visualization of the APG policy differs from the original data, and the robot appears not to move at all.
- [Original data visualization]
- [APG policy visualization]
Second, here's what I tried in order to resolve the issue. The MJCF file that was used in Isaac Gym caused NaN values when training APG, but the NaN values went away after I applied the first of the following changes.
- Add “<custom> ~ </custom>” provided by https://github.com/google/brax/blob/main/brax/envs/assets/humanoid.xml into MJCF.
- Adjust joint’s stiffness, damping, and armature.
- Set gear value of actuator to 1.
- Adjust sys.actuator.ctrlrange and sys.actuator.forcerange.
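To make the first and third changes concrete, here is a minimal sketch of how they could be scripted. It uses tiny placeholder XML strings rather than the real humanoid.xml or my SMPLH MJCF; the parameter name, joint names, and gear values are hypothetical, and `patch_mjcf` is just an illustrative helper, not part of Brax.

```python
import copy
import xml.etree.ElementTree as ET

# Placeholder stand-in for brax/envs/assets/humanoid.xml (NOT its real contents).
REFERENCE_MJCF = """
<mujoco model="reference">
  <custom>
    <numeric name="example_param" data="1.0"/>
  </custom>
</mujoco>
"""

# Placeholder stand-in for the SMPLH MJCF; joint names and gears are hypothetical.
ROBOT_MJCF = """
<mujoco model="robot">
  <worldbody/>
  <actuator>
    <motor name="L_Hip_x" joint="L_Hip_x" gear="500"/>
    <motor name="L_Hip_y" joint="L_Hip_y" gear="500"/>
  </actuator>
</mujoco>
"""


def patch_mjcf(robot_xml: str, reference_xml: str) -> str:
    """Copy the <custom> block from the reference and set every actuator gear to 1."""
    robot = ET.fromstring(robot_xml)
    reference = ET.fromstring(reference_xml)

    # Change 1: copy <custom> ... </custom> if the robot file has none yet.
    custom = reference.find("custom")
    if custom is not None and robot.find("custom") is None:
        robot.append(copy.deepcopy(custom))

    # Change 3: set the gear attribute of every actuator element to 1.
    for actuator in robot.iterfind("actuator/*"):
        actuator.set("gear", "1")

    return ET.tostring(robot, encoding="unicode")


patched = patch_mjcf(ROBOT_MJCF, REFERENCE_MJCF)
print(patched)
```

The sketch only edits the XML text; it does not check whether the copied values are appropriate for a different robot.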
I used an MJCF file for the SMPLH robot built with SMPLSim, which works in Isaac Gym and MuJoCo. sys.actuator.ctrlrange, sys.actuator.forcerange, and the position and velocity gains for PD control were based on the parameters used in SMPLSim. I'm not an expert in robotics, so I may have adjusted them incorrectly. However, when I adjusted these parameters, as well as the stiffness, damping, and armature, I got the same behavior. I also tried MJX as a backend, but training the APG model was slow and it did not help any more than the Positional backend did. I had no issues running the APG tutorial with the robot dog MJCF, so I think something is wrong with my MJCF settings or my environment.
Any advice or hints on how to fix this? Also, are there any references or comments on what the values set inside “<custom> ~ </custom>” in https://github.com/google/brax/blob/main/brax/envs/assets/humanoid.xml mean and how they affect the simulation?
If the MJCF file I used is causing the problem, I thought I could try a config file suitable for Brax.v1 instead. However, when I run the Mujoco converter provided by Brax.v1.tools to convert the MJCF file to a config file for Brax.v1, I get the following error. How can I solve this problem?
I have also attached the implementation I used below. Note that this code depends on Jax CUDA and the Pytorch CPU build.
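For reference, by PD control I mean the usual form: torque = kp * (target - position) - kd * velocity, clipped to the force range. Below is a minimal plain-Python sketch with made-up gains and limits; these are not the SMPLSim values, and `pd_torques` is an illustrative helper rather than code from my environment.

```python
def pd_torques(q, qd, q_target, kp, kd, force_limit):
    """Per-joint PD torque, clipped to [-force_limit, +force_limit]."""
    torques = []
    for qi, qdi, ti, kpi, kdi, fi in zip(q, qd, q_target, kp, kd, force_limit):
        tau = kpi * (ti - qi) - kdi * qdi  # proportional term minus damping term
        torques.append(max(-fi, min(fi, tau)))  # clip to the force range
    return torques


# Two hypothetical joints: the first saturates at its limit, the second does not.
tau = pd_torques(
    q=[0.0, 0.5],
    qd=[0.0, 1.0],
    q_target=[0.1, 0.5],
    kp=[100.0, 100.0],
    kd=[10.0, 10.0],
    force_limit=[5.0, 50.0],
)
print(tau)  # [5.0, -10.0]
```

Where a torque is clipped at its limit, its derivative with respect to the target is zero, so I wonder whether ranges that are too tight could matter for a gradient-based method like APG.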
The files in the compressed file have the following meaning.
- Motion file: 0005_Jogging001_poses_smplh.npy
- MJCF file for SMPLH robot: smplh_humanoid_smplsim_brax.xml
- Environment for SMPLH robot: smplh_humanoid_env.py
- APG training: run_apg_algorithm.py
- Convert MJCF to config: convert_mjcf_to_config.py
I've bolded the questions I'm curious about. Thank you for reading this long text.