[Question] Training speed of Orbit compared to IsaacGym
Hi all,
I was using IsaacGym for my humanoid learning before and have recently moved to Orbit.
One thing that has been bothering me for a while is that the training speed in Orbit is about 40% slower than in IsaacGym.
Here are the specifications I used for comparing the training speed of Orbit and IsaacGym (legged_gym from RSL). Robot type: Humanoid, using the same URDF on both sides.
| Parameters | Orbit | IsaacGym |
|---|---|---|
| num_steps_per_env | 24 | 24 |
| sim_dt | 0.001 [s] | 0.001 [s] |
| decimation | 10 | 10 |
| collection time / iteration | 1.38 [s] | 0.85 [s] |
Although I won't list every parameter here, I double-checked that almost all params (PhysX settings, runner settings, etc.) are the same.
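(For reference, a minimal sketch of where these matched settings live on each side. It assumes the standard legged_gym `LeggedRobotCfg` layout and Orbit's `RLTaskEnvCfg`/`SimulationCfg`; the `Humanoid*Cfg` class names are just placeholders for my own configs.)

```python
# Sketch only: where the matched parameters are set on each side.
# (Assumes the standard legged_gym LeggedRobotCfg layout and Orbit's
#  RLTaskEnvCfg/SimulationCfg; the Humanoid*Cfg names are placeholders.)

# --- IsaacGym / legged_gym side ---
from legged_gym.envs.base.legged_robot_config import LeggedRobotCfg

class HumanoidCfg(LeggedRobotCfg):
    class sim(LeggedRobotCfg.sim):
        dt = 0.001                 # physics step [s]

    class control(LeggedRobotCfg.control):
        decimation = 10            # 10 physics steps per policy step

# --- Orbit side ---
from omni.isaac.orbit.envs import RLTaskEnvCfg
from omni.isaac.orbit.sim import SimulationCfg
from omni.isaac.orbit.utils import configclass

@configclass
class HumanoidEnvCfg(RLTaskEnvCfg):
    # scene, observations, actions, rewards, etc. omitted in this sketch
    decimation: int = 10                          # matches control.decimation above
    sim: SimulationCfg = SimulationCfg(dt=0.001)  # matches sim.dt above
```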
While tracking down where the difference originates, I ended up at the simulator update code inside the decimation for-loop.
Here's Orbit's simulator update code in rl_task_env.py:
```python
for _ in range(self.cfg.decimation):
    # set actions into buffers
    self.action_manager.apply_action()  # Line 1
    # set actions into simulator
    self.scene.write_data_to_sim()  # Line 2
    # simulate
    self.sim.step(render=False)  # Line 3
    # update buffers at sim dt
    self.scene.update(dt=self.physics_dt)  # Line 4
```
Here's IsaacGym's simulator update code in legged_robot.py:
```python
for _ in range(self.cfg.control.decimation):
    self.torques = self._compute_torques(self.actions).view(self.torques.shape)  # Line 1
    self.gym.set_dof_actuation_force_tensor(self.sim, gymtorch.unwrap_tensor(self.torques))  # Line 2
    self.gym.simulate(self.sim)  # Line 3
    if self.device == 'cpu':
        self.gym.fetch_results(self.sim, True)
    self.gym.refresh_dof_state_tensor(self.sim)  # Line 4
```
As you can see, Lines 1-4 correspond functionally between the two loops.
So I measured the execution time of each line for a line-by-line comparison, and here are the results:
| Line | Orbit [s] | IsaacGym [s] |
|---|---|---|
| Line 1 | 0.00027 | 0.00186 |
| Line 2 | 0.00417 | 0.00076 |
| Line 3 | 0.03805 | 0.03389 |
| Line 4 | 0.01720 | 0.00065 |
| Total | 0.05983 | 0.03725 |
As you can see, multiplying the difference in total time by num_steps_per_env ((0.05983 - 0.03725) x 24 = 0.542 s) accounts for almost all of the gap in collection time (1.38 - 0.85 = 0.53 s), i.e. it explains why Orbit is roughly 40% slower than IsaacGym in my case.
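(For reference, here is a minimal sketch of how per-line timings like these can be collected, assuming a CUDA device so that the GPU has to be synchronized before reading the clock; otherwise the asynchronous kernel launches make the per-line numbers meaningless.)

```python
import time

import torch


def timed(label: str, fn, *args, **kwargs):
    """Run `fn` once, print how long it took, and return its result."""
    torch.cuda.synchronize()
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    torch.cuda.synchronize()
    print(f"{label}: {time.perf_counter() - start:.5f} s")
    return result


# Usage inside the Orbit decimation loop shown above:
#   timed("Line 1", self.action_manager.apply_action)
#   timed("Line 2", self.scene.write_data_to_sim)
#   timed("Line 3", self.sim.step, render=False)
#   timed("Line 4", self.scene.update, dt=self.physics_dt)
```

Note that forcing a synchronization attributes any still-queued GPU work to the line currently being timed, so individual line timings can shift around, but the totals remain comparable between the two frameworks.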
Here are my questions:
- Is Orbit slower than IsaacGym by design? If so, 40% seems like quite a slowdown, and I'm wondering if this will be improved in the future.
- If not (if the training speed should be more or less the same), I'm wondering what I am missing here. From my understanding, Line 3 and Line 4 are the ones that have slowed down significantly: Line 3 is related to the PhysX update, and Line 4 is related to `self.scene`, where we update the robot articulation and the sensors. I would appreciate it if you could point out any important parameter setting I might be missing to make Orbit faster.
Thank you in advance for the help! Best,
I have noticed the same thing. When running legged_gym with the default ANYmal-C rough-terrain task, the iteration time is around 1.1 seconds, while the same task in Isaac Lab is at least 2 seconds per iteration or more. According to the printout and my summaries, most of the increase is in the collection time, not the learning time, so I also suspect it is an issue with Orbit/Isaac Lab. Note that I was running in a containerized cluster environment, so I know I had the same hardware and software as the default config. As far as I can tell, the tasks are the same in legged_gym and in Isaac Lab, so there shouldn't be an increase like this. This nearly doubles the training time, so I would really appreciate it if this could be investigated.
Yes, some slowdown compared to legged_gym is expected, as there are changes in the environment's implementation. Off the top of my head, these are as follows:
- Using a Warp-based ray-caster sensor instead of a height-field-based lookup: the ray-cast sensor allows patterns other than just a height scan, which is generally more useful. However, the ray-cast operation is more expensive than a simple tensor look-up (see the sketch after this list for one way to switch it off when profiling).
- Modularization in the manager-based environment: to make it easier to configure the environment in various ways, we decomposed it into several managers. This does add some overhead to the environment step (~5% when I last investigated), but it shouldn't be the main factor in the slowdown.
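If you want to check how much of the gap comes from the ray caster, a rough sketch of switching it off is below. It assumes your scene names the sensor `height_scanner` and the observation term `height_scan`, as the Orbit locomotion examples do, and `HumanoidRoughEnvCfg` is a placeholder for your own rough-terrain env config.

```python
# Placeholder import: replace with your actual env config class.
from my_tasks.humanoid_rough_env_cfg import HumanoidRoughEnvCfg

env_cfg = HumanoidRoughEnvCfg()
# remove the ray-cast height scanner from the scene ...
env_cfg.scene.height_scanner = None
# ... and drop the observation term that reads it, so the observation
# manager does not try to query the now-missing sensor
env_cfg.observations.policy.height_scan = None
```

Without the height scan the policy of course loses its terrain information, so treat this as a profiling experiment rather than a drop-in replacement.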
Recently, we updated the framework to use lazy-tensors, so we hope that reduces some of the performance gaps you see.
Interesting, thank you for the timely response. Would you recommend I disable the ray casting calculations somehow? Or is that not possible?