[Question] Training performance comparison between IsaacLab and IsaacGym
Hi all,
I am wondering whether anyone has closely compared the performance of policies trained in IsaacLab and IsaacGym.
I have a decent locomotion policy that works well in IsaacGym, and I am trying to move the codebase to IsaacLab.
I observe that the training performance in IsaacLab is worse than in IsaacGym.
You might say that I introduced a bug while porting the code, but I have checked the configs, the order of operations, and so on multiple times (i.e. PPO settings, reward weights, PhysX settings, etc.).
I agree that there might still be some bugs that I haven't identified, but for the sake of the question, let's assume the code logic is exactly the same.
Has anyone observed that their policies perform worse in IsaacLab than in IsaacGym?
I have some training plots regarding this matter: purple is IsaacGym, and green is IsaacLab.
In IsaacLab, the mean reward is lower and the value function loss is much higher and noisier than in IsaacGym.
Based on my visual inspection, in IsaacLab the robot tends to reach its torque limit quite frequently and terminate early, resulting in an unstable policy and a noisy value function.
Has anyone observed a similar issue, and can you suggest a possible explanation and resolution (other than a bug I might have accidentally introduced)?
One of my hypotheses is that IsaacLab and IsaacGym use different PhysX versions and that the dynamics (or contact solver) have changed. I'm wondering whether there are hidden bugs or mistakes on the IsaacLab side, or whether it is just a fine-tuning issue.
Thank you!
When I last compared it for ANYmal locomotion, things looked pretty similar. However, this was some time ago, and we don't have regular integration tests for this at the moment.
The value function jumping is the most concerning to me. Are you using one of the environments that we provide, or is this your custom environment?
Hi, I'm using a custom environment for a humanoid (not the Unitree robots).
But the custom environment settings are the same in both IsaacGym and IsaacLab.
So, I'm concerned.
Are there any major changes that I need to be aware of? Basically, I have a few questions:
- Is the physics simulation in PhysX 4.0 (IsaacGym) and PhysX 5.0 (IsaacLab) the same?
- Other people have raised questions as well; for instance, root_lin_vel_w is not the base frame's velocity but rather the CoM's velocity.
- Do PPO settings or reward weights have to be changed as we change the simulator?
Some evidence shows that the training performance in IsaacLab is worse than in IsaacGym. What could cause this difference?
I don't know if it comes from just fine-tuning issues based on simulator change, or if IsaacLab has some inherent bug or some fundamental changes that need to be dealt with.
Thank you.
- IsaacGym uses PhysX 5.0, and IsaacLab uses PhysX 5.4 - There have been differences/updates made to some of the internals in the solver. However, the difference should be mild.
- From what I know, this has also always been the case in IsaacGym. We have it documented here more appropriately.
- Usually, yes, but not so much in this case since it still uses PhysX. My experience is limited to the robots I have been working with frequently, and for them, I did not need to modify the rewards or PPO parameters. For some manipulation environments, we did have to tune PPO slightly (mainly by increasing the discount factor from 0.99 to 0.995).
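For intuition on the discount-factor change mentioned above: a common rule of thumb is that a discount factor gamma corresponds to an effective horizon of roughly 1 / (1 - gamma) steps. The sketch below only evaluates that rule of thumb; the helper name is made up for illustration and is not part of IsaacLab or any PPO library.

```python
def effective_horizon(gamma: float) -> float:
    """Rule-of-thumb horizon (in steps) implied by a discount factor: 1 / (1 - gamma)."""
    if not 0.0 <= gamma < 1.0:
        raise ValueError("gamma must be in [0, 1)")
    return 1.0 / (1.0 - gamma)


# Compare the two discount factors mentioned in the reply above.
for gamma in (0.99, 0.995):
    print(f"gamma={gamma}: ~{effective_horizon(gamma):.0f} steps")
```

Under this rule of thumb, going from 0.99 to 0.995 roughly doubles the number of future steps that contribute meaningfully to the return.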
There might be slight differences, but they shouldn't lead to catastrophic performance differences. Some things to note/check:
- Your logic for resets, computation of observations, rewards, and MDP signals may have followed a different order than what we now do in IsaacLab.
- Your asset may have different collider properties. In IsaacLab or Omniverse, the collider offsets are set to defaults that PhysX auto-computes later based on geometry. However, in IsaacGym, this was typically set manually.
- Some seed differences: it could be that there are some different internal renderer or physics engine calls that change the "randomness" slightly between the two simulators.
It is easier to comment on your implementation differences with something to reproduce on our side. But I would expect things to not match 100% (though they should be fairly close).
Reiterating my concern: The spikes in your value function appear to happen at frequent intervals, which probably means resets. Something might be off there w.r.t. the MDP signals.
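One way to check whether the spikes really are regularly spaced is to measure the spacing between outliers in the logged loss curve and compare it with the reset or event interval. The following is a minimal sketch, assuming the value-function loss has been exported as a plain array; the function name and the median/MAD threshold are illustrative choices, not something taken from IsaacLab or from the plots in this thread.

```python
import numpy as np


def spike_spacing(values, k=5.0):
    """Return indices of outlier entries and the gaps between consecutive outliers.

    An entry counts as a spike if it exceeds median + k * MAD (median absolute deviation).
    """
    values = np.asarray(values, dtype=float)
    median = np.median(values)
    mad = np.median(np.abs(values - median))
    threshold = median + k * max(mad, 1e-12)  # guard against a zero MAD
    spike_idx = np.flatnonzero(values > threshold)
    return spike_idx, np.diff(spike_idx)


# Synthetic example: a smooth curve with an artificial spike every 25 entries.
loss = 1.0 + 0.01 * np.sin(np.arange(200))
loss[::25] += 10.0
idx, gaps = spike_spacing(loss)
print("spike indices:", idx)
print("gaps between spikes:", gaps)
```

If the gaps in a real log are nearly constant, converting them to simulated seconds and comparing against episode length or event intervals would be a natural next step.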
This turns out to be the 'push_robot' problem:
push_robot = EventTerm(
    func=mdp.push_by_setting_velocity,
    mode="interval",
    interval_range_s=(2.5, 2.5),
    params={"velocity_range": {"x": (-0.5, 0.5), "y": (-0.5, 0.5)}},
)
If I change this to the following (setting interval_range_s to an actual range):
push_robot = EventTerm(
    func=mdp.push_by_setting_velocity,
    mode="interval",
    interval_range_s=(2.0, 3.0),
    params={"velocity_range": {"x": (-0.5, 0.5), "y": (-0.5, 0.5)}},
)
then the value function loss is much lower.
I still don't know why this happens or whether it is crucial.
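One difference between the two configs that holds regardless of the simulator: with a degenerate range such as (2.5, 2.5), per-environment timers that start together stay in lockstep, while a real range spreads the pushes over time. The toy model below illustrates only that effect. It is a sketch under stated assumptions (one countdown timer per environment, resampled from the range after each push, no episode resets), not IsaacLab's event manager, and whether this accounts for the value-loss difference is not established here.

```python
import numpy as np


def count_pushes_per_step(interval_range_s, num_envs=1024, step_dt=0.02,
                          num_steps=1000, seed=0):
    """Toy model: count how many environments are pushed at each step.

    Each environment has a countdown timer sampled uniformly from interval_range_s;
    when it expires, the environment is "pushed" and the timer is resampled.
    """
    rng = np.random.default_rng(seed)
    lo, hi = interval_range_s
    time_left = rng.uniform(lo, hi, size=num_envs)
    pushes = np.zeros(num_steps, dtype=np.int64)
    for step in range(num_steps):
        time_left -= step_dt
        due = time_left <= 1e-6  # small tolerance for floating-point drift
        pushes[step] = due.sum()
        time_left[due] = rng.uniform(lo, hi, size=int(due.sum()))
    return pushes


fixed = count_pushes_per_step((2.5, 2.5))
ranged = count_pushes_per_step((2.0, 3.0))
print("fixed interval:  max envs pushed in one step =", fixed.max())
print("ranged interval: max envs pushed in one step =", ranged.max())
```

In this toy model the fixed interval pushes every environment in the same step, so any push-induced terminations would also arrive in bursts; the sampled range spreads them across many steps.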
Hello, @Mayankm96
I have a serious(?) concern about whether the push_by_setting_velocity function, or possibly any 'interval' event, is functioning correctly.
As I mentioned in this post, the training performance in IsaacLab seems to be worse than in IsaacGym, and I found the following interesting phenomenon.
The function that pushes the robot in IsaacLab seems to degrade training performance more than its counterpart in IsaacGym. (Specifically, I'm using: IsaacLab/source/extensions/omni.isaac.lab/omni/isaac/lab/envs/mdp/events/push_by_setting_velocity)
To verify this, I conducted two comparison experiments (see the graphs):
- Remove push_robot in both IsaacLab (Green) and IsaacGym (Red).
- Add push_function (2.5-second interval, [-0.5 m/s, 0.5 m/s] xy root velocity) in both IsaacLab (Blue) and IsaacGym (Pink).
As you can see, the Blue curve (IsaacLab with push_function) tends to have much worse performance. However, if I remove the push_function, the performance matches IsaacGym (see the Green curve).
You may argue that a 2.5-second interval is too short, but even if I use 10 s or 15 s, it still shows performance degradation (of course, it converges to the "without push_function" result as I use a longer interval).
This made me suspect that something related to the push_by_setting_velocity function or the 'interval' event might have an issue. However, when I looked into the code, it seems functionally the same as the one in IsaacGym.
Here are my questions:
- Do you have any idea why this happens? If you think this is a bug, I would appreciate it if you could investigate.
- Another hypothesis of mine is that asset.write_root_velocity_to_sim(vel_w, env_ids=env_ids) might have an issue.
- Or the apply function in event_manager.py might have an issue?
- Or maybe the implementation in IsaacGym was wrong and the one in IsaacLab is correct, and I just have to fine-tune.
There seem to be many possible explanations, but I hope you can give me some guidance here!
Thank you!
@hojae-io
Before we investigate, could you specify the Isaac Lab commit and Isaac Sim version you are using again? It is easier for us to debug if it is the latest release of both (Isaac Lab 1.2 and Isaac Sim 4.2).
Hi, the versions I'm using are: Isaac-Sim 4.1 / IsaacLab 1.1.0
But I believe the code for push_by_setting_velocity and the 'interval' event hasn't changed since then.
Thanks!
Thank you for following up. The team fixed push_by_setting_velocity recently. In general, there shouldn't be drastic differences between Isaac Gym and Isaac Lab.