
[Bug Report] USD matrix4d Error

Open Atticlmr opened this issue 8 months ago • 9 comments

Describe the bug

While training, I got this warning:

2025-03-27 12:58:57 [89,017ms] [Warning] [omni.usd] Warning (secondary thread): in Orthonormalize at line 495 of /builds/omniverse/usd-ci/USD/pxr/base/gf/matrix4d.cpp -- OrthogonalizeBasis did not converge, matrix may not be orthonormal.

Steps to reproduce

My code is available at https://github.com/Atticlmr/Quadcopter_Lab. I run it with this command:

./isaaclab.sh -p scripts/reinforcement_learning/skrl/train.py \
    --task=Isaac-Quadcopterend2end-Direct-v0 \
    --headless \
    --enable_cameras \
    --num_envs 80

2025-03-27 13:13:05 [41,203ms] [Warning] [omni.usd] Warning (secondary thread): in Orthonormalize at line 495 of /builds/omniverse/usd-ci/USD/pxr/base/gf/matrix4d.cpp -- OrthogonalizeBasis did not converge, matrix may not be orthonormal.

(The same warning is printed 6 times with an identical timestamp.)


System Info

Describe the characteristic of your environment:

  • Commit: [e.g. 8f3b9ca]
  • Isaac Sim Version: [4.5]
  • OS: [e.g. Ubuntu 22.04]
  • GPU: [e.g. RTX 2080ti]
  • CUDA: [e.g. 12.4]
  • GPU Driver: [e.g. 560, this can be seen by using nvidia-smi command.]

Atticlmr avatar Mar 27 '25 13:03 Atticlmr
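
For readers unfamiliar with the warning reported above: it concerns the 3×3 rotation part of a 4×4 transform, whose rows should be unit length and mutually perpendicular. The following is a minimal sketch in plain Python of what "orthonormal" means here. The helper `is_orthonormal` and its tolerance are illustrative assumptions, not part of USD, Isaac Sim, or Isaac Lab, and the thread does not confirm what USD checks internally.

```python
import math


def is_orthonormal(rot, tol=1e-6):
    """Return True if the 3x3 matrix `rot` (a list of 3 rows) has finite,
    unit-length, mutually perpendicular rows.

    Note: this does not check the determinant sign, so a reflection
    would also pass. It is an illustration, not USD's algorithm.
    """
    # Any NaN or inf entry means the matrix cannot be a valid rotation.
    if not all(math.isfinite(v) for row in rot for v in row):
        return False
    for i in range(3):
        for j in range(i, 3):
            dot = sum(a * b for a, b in zip(rot[i], rot[j]))
            expected = 1.0 if i == j else 0.0
            if abs(dot - expected) > tol:
                return False
    return True


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
skewed = [[1.0, 0.5, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
with_nan = [[float("nan"), 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

print(is_orthonormal(identity))  # True
print(is_orthonormal(skewed))    # False: first row is not unit length
print(is_orthonormal(with_nan))  # False: contains NaN
```

A matrix containing NaN fails any such check, which is one plausible way for the warning to appear once simulation state becomes invalid.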

This warning occurs approximately once every 22 training episodes.

Atticlmr avatar Mar 27 '25 13:03 Atticlmr

I changed the tiled camera to a regular camera. It looks like a camera pose calculation failure due to a non-orthonormal transform matrix:

2025-03-28 11:47:13 [35,664ms] [Warning] [omni.hydra.scene_delegate.plugin] Calling getBypassRenderSkelMeshProcessing for prim /Visuals/Command/goal_position.proto0_mesh_id0 that has not been populated

0%|▎ | 24/20000 [00:03<1:03:38, 5.23it/s]

2025-03-28 11:47:18 [40,091ms] [Warning] [omni.usd] Warning: in Orthonormalize at line 495 of /builds/omniverse/usd-ci/USD/pxr/base/gf/matrix4d.cpp -- OrthogonalizeBasis did not converge, matrix may not be orthonormal.

2025-03-28 11:47:18 [40,091ms] [Warning] [omni.usd] Warning (secondary thread): in Orthonormalize at line 495 of /builds/omniverse/usd-ci/USD/pxr/base/gf/matrix4d.cpp -- OrthogonalizeBasis did not converge, matrix may not be orthonormal.

(The secondary-thread warning is printed 9 times with an identical timestamp.)

/home/li/miniconda3/envs/env_isaaclab/lib/python3.10/site-packages/isaacsim/extscache/omni.kit.pip_archive-0.0.0+d02c707b.lx64.cp310/pip_prebundle/numpy/linalg/linalg.py:2180: RuntimeWarning: invalid value encountered in det
  r = _umath_linalg.det(a, signature=signature)
0%|▎ | 24/20000 [00:03<51:09, 6.51it/s]
Error executing job with overrides: []
Traceback (most recent call last):
  File "/home/li/Desktop/Quadcopter_Lab/source/isaaclab_tasks/isaaclab_tasks/utils/hydra.py", line 101, in hydra_main
    func(env_cfg, agent_cfg, *args, **kwargs)
  File "/home/li/Desktop/Quadcopter_Lab/scripts/reinforcement_learning/skrl/train.py", line 207, in main
    runner.run()
  File "/home/li/miniconda3/envs/env_isaaclab/lib/python3.10/site-packages/skrl/utils/runner/torch/runner.py", line 496, in run
    self._trainer.train()
  File "/home/li/miniconda3/envs/env_isaaclab/lib/python3.10/site-packages/skrl/trainers/torch/sequential.py", line 86, in train
    self.single_agent_train()
  File "/home/li/miniconda3/envs/env_isaaclab/lib/python3.10/site-packages/skrl/trainers/torch/base.py", line 196, in single_agent_train
    next_states, rewards, terminated, truncated, infos = self.env.step(actions)
  File "/home/li/miniconda3/envs/env_isaaclab/lib/python3.10/site-packages/skrl/envs/wrappers/torch/isaaclab_envs.py", line 62, in step
    observations, reward, terminated, truncated, self._info = self._env.step(actions)
  File "/home/li/miniconda3/envs/env_isaaclab/lib/python3.10/site-packages/gymnasium/wrappers/common.py", line 393, in step
    return super().step(action)
  File "/home/li/miniconda3/envs/env_isaaclab/lib/python3.10/site-packages/gymnasium/core.py", line 327, in step
    return self.env.step(action)
  File "/home/li/Desktop/Quadcopter_Lab/source/isaaclab/isaaclab/envs/direct_rl_env.py", line 363, in step
    self.reward_buf = self._get_rewards()
  File "/home/li/Desktop/Quadcopter_Lab/source/isaaclab_tasks/isaaclab_tasks/direct/quadcopterend2end/quadcopterend2end_env.py", line 384, in _get_rewards
    depth_image = self._camera.data.output["depth"] # (num_envs, H, W)
  File "/home/li/Desktop/Quadcopter_Lab/source/isaaclab/isaaclab/sensors/camera/camera.py", line 180, in data
    self._update_outdated_buffers()
  File "/home/li/Desktop/Quadcopter_Lab/source/isaaclab/isaaclab/sensors/sensor_base.py", line 289, in _update_outdated_buffers
    self._update_buffers_impl(outdated_env_ids)
  File "/home/li/Desktop/Quadcopter_Lab/source/isaaclab/isaaclab/sensors/camera/camera.py", line 496, in _update_buffers_impl
    self._update_poses(env_ids)
  File "/home/li/Desktop/Quadcopter_Lab/source/isaaclab/isaaclab/sensors/camera/camera.py", line 617, in _update_poses
    poses, quat = self._view.get_world_poses(env_ids)
  File "/home/li/miniconda3/envs/env_isaaclab/lib/python3.10/site-packages/isaacsim/exts/isaacsim.core.prims/isaacsim/core/prims/impl/xform_prim.py", line 705, in get_world_poses
    positions[write_idx], orientations[write_idx] = get_world_pose(self._prim_paths[i])
  File "/home/li/miniconda3/envs/env_isaaclab/lib/python3.10/site-packages/isaacsim/exts/isaacsim.core.utils/isaacsim/core/utils/xforms.py", line 173, in get_world_pose
    r = Rotation.from_matrix(result_transform[:3, :3])
  File "_rotation.pyx", line 1144, in scipy.spatial.transform._rotation.Rotation.from_matrix
  File "/home/li/miniconda3/envs/env_isaaclab/lib/python3.10/site-packages/isaacsim/extscache/omni.kit.pip_archive-0.0.0+d02c707b.lx64.cp310/pip_prebundle/numpy/linalg/linalg.py", line 1681, in svd
    u, s, vh = gufunc(a, signature=signature, extobj=extobj)
  File "/home/li/miniconda3/envs/env_isaaclab/lib/python3.10/site-packages/isaacsim/extscache/omni.kit.pip_archive-0.0.0+d02c707b.lx64.cp310/pip_prebundle/numpy/linalg/linalg.py", line 121, in _raise_linalgerror_svd_nonconvergence
    raise LinAlgError("SVD did not converge")
numpy.linalg.LinAlgError: SVD did not converge

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.

2025-03-28 11:47:18 [40,229ms] [Warning] [omni.graph.core.plugin] Could not find category 'Replicator:Annotators' for removal

(The line above is printed multiple times with an identical timestamp.)

2025-03-28 11:47:18 [40,229ms] [Warning] [omni.graph.core.plugin] Could not find category 'Replicator:Core' for removal
2025-03-28 11:47:18 [40,232ms] [Warning] [omni.graph.core.plugin] Could not find category 'animation' for removal
2025-03-28 11:47:18 [40,438ms] [Warning] [carb] Client omni.syntheticdata.plugin Failed to acquire interface [omni::graph::core::INode v4.10] while unloading all plugins

Atticlmr avatar Mar 28 '25 11:03 Atticlmr
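
The traceback above ends in `Rotation.from_matrix` and is preceded by a NumPy `invalid value encountered in det` warning, which is consistent with NaN or infinite entries in the camera transform (the thread does not confirm the root cause). A minimal sketch, assuming the per-environment transforms are available as nested lists, of how one might locate the affected environments before the conversion is attempted. `non_finite_indices` and `identity4` are hypothetical helpers, not Isaac Sim or Isaac Lab APIs.

```python
import math


def non_finite_indices(transforms):
    """Return indices of 4x4 transforms (nested lists) that contain NaN or inf."""
    bad = []
    for idx, mat in enumerate(transforms):
        if any(not math.isfinite(v) for row in mat for v in row):
            bad.append(idx)
    return bad


def identity4():
    """Build a 4x4 identity matrix as nested lists."""
    return [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]


# Three hypothetical per-environment camera transforms; the second is corrupted.
transforms = [identity4(), identity4(), identity4()]
transforms[1][0][0] = float("nan")

bad_envs = non_finite_indices(transforms)
print(bad_envs)  # [1]
```

Knowing which environments are affected makes it easier to tell whether the bad poses come from a few diverged robots or from the whole simulation.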

This problem occurs when training with skrl. With RSL-RL there is no bug.

Atticlmr avatar Mar 28 '25 12:03 Atticlmr

Thank you for posting this. Is your training session crashing? This may be related to NaNs being caught up in the process. The team will investigate.

RandomOakForest avatar Apr 01 '25 16:04 RandomOakForest

Following up, this seems to be the case of SKRL landing in a bad state. Are you resetting the camera poses?

RandomOakForest avatar Apr 01 '25 21:04 RandomOakForest

> Following up, this seems to be the case of SKRL landing in a bad state. Are you resetting the camera poses?

I haven't reset the camera poses. The camera poses are set as follows in the code:

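# Imports assumed from the Isaac Lab 2.x package layout.
import isaaclab.sim as sim_utils
from isaaclab.sensors import TiledCameraCfg

# Defined as an attribute of the environment config class.
tiled_camera: TiledCameraCfg = TiledCameraCfg(
    prim_path="/World/envs/env_.*/Robot/body/camera",
    offset=TiledCameraCfg.OffsetCfg(pos=(0.0, 0.0, 0.01), rot=(0.5, -0.5, 0.5, -0.5), convention="ros"),
    data_types=["depth"],
    spawn=sim_utils.PinholeCameraCfg(
        focal_length=24.0, focus_distance=400.0, horizontal_aperture=20.955, clipping_range=(0.1, 20.0)
    ),
    width=80,
    height=80,
)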

Atticlmr avatar Apr 02 '25 07:04 Atticlmr
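
As a quick sanity check on the configuration above, the offset quaternion `rot=(0.5, -0.5, 0.5, -0.5)` has unit norm, which suggests the static camera offset is a valid rotation and probably not the source of the non-orthonormal matrix. A small sketch of that check follows; `quat_norm` is an illustrative helper, not an Isaac Lab function.

```python
import math


def quat_norm(q):
    """Euclidean norm of a quaternion given as a 4-tuple."""
    return math.sqrt(sum(c * c for c in q))


# Offset rotation copied from the TiledCameraCfg in the thread.
rot = (0.5, -0.5, 0.5, -0.5)
norm = quat_norm(rot)
print(norm)  # 1.0
```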

> Thank you for posting this. Is your training session crashing? This may be related to NaNs being caught up in the process. The team will investigate.

The training always runs, but after visualization it turns out that the camera only sees a completely dark scene. The data obtained by the camera is meaningless.

Atticlmr avatar Apr 02 '25 07:04 Atticlmr

I'm facing the same problem with skrl. Is there a way to solve this?

AndreaRossetto avatar Apr 10 '25 12:04 AndreaRossetto

> I'm facing the same problem with skrl, there is a way to solve this?

  1. Use rsl_rl or sb3 for training; they do not show the problem.
  2. Do not use the script /scripts/reinforcement_learning/skrl/train.py. Instead, write your own Python script that trains with skrl's PPO agent directly.

Either option above avoids this problem.

Atticlmr avatar Apr 21 '25 08:04 Atticlmr

I get the same issue with rsl_rl when using multi-GPU training and tiled cameras in my task. It happens randomly after 2500 iterations; the latest occurrence so far was at about 5100 iterations, so it is very hard to reproduce. I am on Ubuntu 20.04, Isaac Sim 4.5, with 8 × 4090 cards. Running the same task on a single card has not produced the error so far.

Alkrick avatar May 23 '25 03:05 Alkrick

> Thank you for posting this. Is your training session crashing? This may be related to NaNs being caught up in the process. The team will investigate.

Hi, I'm also running into this issue, and I am also using skrl. After some episodes, my reward, loss, and other metrics all become NaN, at the same time as this warning appears. I am wondering whether this is caused by exploding gradients or parameters, or whether it is just a skrl problem.

willingxxia avatar May 25 '25 22:05 willingxxia
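
One way to narrow down the question in the comment above is to record when the first NaN appears in the logged metrics and compare that step with the time of the first warning. A minimal sketch in plain Python; `first_non_finite_step` is a hypothetical helper and the reward values are made up for illustration. For tensors, `torch.isfinite(x).all()` is the usual equivalent check.

```python
import math


def first_non_finite_step(history):
    """Return the first step index whose logged value is NaN or inf, else None."""
    for step, value in enumerate(history):
        if not math.isfinite(value):
            return step
    return None


# Hypothetical per-step mean rewards; values are illustrative only.
reward_history = [0.12, 0.15, 0.11, float("nan"), float("nan")]
print(first_non_finite_step(reward_history))  # 3
```

If the NaNs in the policy outputs show up before the first warning, that would point towards the training side rather than the renderer, but the thread does not establish either order.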

I suspect this may be due to the actions from the policies putting the robots into an undesired state, which leads to transformation calculation errors for the camera transforms. It might be helpful to try clipping the ranges of the actions to be a bit more conservative.

kellyguo11 avatar Jul 07 '25 23:07 kellyguo11
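
The action-clipping suggestion above can be sketched as follows. This is a plain-Python illustration; the bounds of [-1.0, 1.0] and the helper `clip_actions` are assumptions, not values or functions from Isaac Lab. In an actual environment the same idea would typically be applied to the action tensor, for example with `torch.clamp`, before the actions are converted to thrust and moment commands.

```python
def clip_actions(actions, low=-1.0, high=1.0):
    """Clamp every action component into [low, high].

    NaN inputs are not handled here; they should be caught separately.
    """
    return [max(low, min(high, a)) for a in actions]


# Hypothetical raw policy outputs for one environment.
raw = [0.3, -2.5, 3.7, 1.0]
clipped = clip_actions(raw)
print(clipped)  # [0.3, -1.0, 1.0, 1.0]
```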

The bug seems to be caused by my system not having enough performance to support 100 envs with cameras running simultaneously. After reducing the number of parallel envs, there have been no issues.

Atticlmr avatar Aug 14 '25 04:08 Atticlmr

I am also running into this issue. I even tried writing my own training script instead of using the provided one, but the issue remains: I get the warning and the training stops. With sb3, there is no issue. I am using Isaac Sim 5 and Isaac Lab 2.2.1.

[Warning] [omni.usd] Warning (secondary thread): in Orthonormalize at line 494 of /builds/omniverse/usd-ci/USD/pxr/base/gf/matrix4d.cpp -- OrthogonalizeBasis did not converge, matrix may not be orthonormal.

nitesh-subedi avatar Sep 03 '25 20:09 nitesh-subedi

Hello, I'm encountering the same issue after running training with SKRL for a while. It doesn't seem to matter whether I use 25 or 500 environments, the following warning keeps appearing repeatedly:

2025-09-29T13:10:12Z [1,238,990ms] [Warning] [omni.usd] Warning (secondary thread): in Orthonormalize at line 494 of /builds/omniverse/usd-ci/USD/pxr/base/gf/matrix4d.cpp -- OrthogonalizeBasis did not converge, matrix may not be orthonormal.

(The same warning is printed 7 times with an identical timestamp.)

shahizat avatar Sep 29 '25 13:09 shahizat

This bug is caused by a mismatch between the Isaac Sim version and the Isaac Lab version.

Atticlmr avatar Nov 05 '25 17:11 Atticlmr