[Question] Running isaacsim on a Slurm compute node

Open writingindy opened this issue 9 months ago • 3 comments

Question

Hi,

I'm a system administrator for an HPC cluster. Users have reached out to me because they want to run isaacsim/isaaclab on the cluster's compute nodes, but I've been running into issues setting it up. I'm testing this under my own account on the cluster to work out the steps, so that users can simply repeat them to get it installed.

I followed the instructions here, but when I try to run isaacsim it fails with Segmentation fault (core dumped).

Some details about the compute node:

  • the compute node is an LXD container running Ubuntu 22.04
  • GLIBC 2.35
  • CUDA 12.8
  • NVIDIA driver version: 570.124.06

I think one possible issue is that I don't have the Vulkan driver installed. Do I need to install the Vulkan SDK as well? We do not have Singularity on the cluster, but we do have Docker as an option for users to run containers.

I can provide more details if necessary, thank you!
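
One way to check whether a Vulkan driver is visible on the compute node is to look for ICD manifests, the JSON files the Vulkan loader reads to locate driver libraries. The following is a minimal sketch, not an official Isaac Sim diagnostic: the directory list reflects the conventional Linux loader search paths, the function names are made up for illustration, and the "nvidia" substring match is a heuristic. It also prints `NVIDIA_DRIVER_CAPABILITIES`, on the assumption that a Docker setup would use the NVIDIA Container Toolkit, which typically needs `graphics` (or `all`) in that variable before it exposes the Vulkan libraries.

```python
import json
import os
from pathlib import Path

# Conventional Linux locations for Vulkan ICD manifests (assumed, not exhaustive).
DEFAULT_ICD_DIRS = [
    "/etc/vulkan/icd.d",
    "/usr/share/vulkan/icd.d",
    "/usr/local/share/vulkan/icd.d",
]


def find_icd_manifests(icd_dirs=DEFAULT_ICD_DIRS):
    """Return (manifest_path, library_path) pairs for every readable ICD manifest."""
    found = []
    for directory in icd_dirs:
        root = Path(directory)
        if not root.is_dir():
            continue  # directory missing on this node
        for manifest in sorted(root.glob("*.json")):
            try:
                data = json.loads(manifest.read_text())
            except (OSError, ValueError):
                continue  # unreadable or malformed manifest; skip it
            icd = data.get("ICD", {}) if isinstance(data, dict) else {}
            library = icd.get("library_path") if isinstance(icd, dict) else None
            if library:
                found.append((str(manifest), library))
    return found


def has_nvidia_icd(manifests):
    """Heuristic: True if a manifest name or its library path mentions 'nvidia'."""
    return any(
        "nvidia" in Path(path).name.lower() or "nvidia" in library.lower()
        for path, library in manifests
    )


if __name__ == "__main__":
    manifests = find_icd_manifests()
    for path, library in manifests:
        print(f"{path} -> {library}")
    print("NVIDIA ICD found:", has_nvidia_icd(manifests))
    print(
        "NVIDIA_DRIVER_CAPABILITIES =",
        os.environ.get("NVIDIA_DRIVER_CAPABILITIES", "<unset>"),
    )
```

If no NVIDIA manifest shows up inside the container while one exists on the host, the container is probably not being given the driver's graphics libraries, which would be consistent with a Vulkan initialization failure.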

writingindy avatar Apr 14 '25 20:04 writingindy

Thank you for posting this. It is likely an Ubuntu library conflict. Have you tried following the instructions here instead?

RandomOakForest avatar Apr 15 '25 13:04 RandomOakForest

Following the instructions to install Isaac Sim, I ran both ./isaac-sim.selector.sh and ./isaac-sim.sh. The first command gets to the point where it says app ready, but it logs errors that seem to point to driver issues:

2025-04-15 17:00:46 [2,594ms] [Error] [carb.graphics-vulkan.plugin] VkResult: ERROR_INCOMPATIBLE_DRIVER
2025-04-15 17:00:46 [2,594ms] [Error] [carb.graphics-vulkan.plugin] vkCreateInstance failed. Vulkan 1.1 is not supported, or your driver requires an update.
2025-04-15 17:00:46 [2,594ms] [Error] [gpu.foundation.plugin] carb::graphics::createInstance failed.
2025-04-15 17:00:47 [3,136ms] [Error] [carb.graphics-vulkan.plugin] VkResult: ERROR_INCOMPATIBLE_DRIVER
2025-04-15 17:00:47 [3,136ms] [Error] [carb.graphics-vulkan.plugin] vkCreateInstance failed. Vulkan 1.1 is not supported, or your driver requires an update.
2025-04-15 17:00:47 [3,136ms] [Error] [gpu.foundation.plugin] carb::graphics::createInstance failed.
2025-04-15 17:00:47 [3,645ms] [Error] [omni.gpu_foundation_factory.plugin] Failed to create any GPU devices, including an attempt with compatibility mode.
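
Because the same errors repeat across retries, it can help to condense the log into distinct messages per plugin before attaching it. The sketch below assumes every line follows the `timestamp [elapsed] [level] [source] message` layout seen in the excerpt above; the regular expression and the `summarize_errors` name are illustrative and not part of any Isaac Sim tooling.

```python
import re
from collections import Counter

# Layout assumed from the excerpt, e.g.:
# 2025-04-15 17:00:46 [2,594ms] [Error] [carb.graphics-vulkan.plugin] VkResult: ...
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"\[(?P<elapsed>[\d,]+)ms\] "
    r"\[(?P<level>\w+)\] "
    r"\[(?P<source>[^\]]+)\] "
    r"(?P<message>.*)$"
)


def summarize_errors(lines):
    """Count distinct (source, message) pairs among lines whose level is Error."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match and match.group("level") == "Error":
            counts[(match.group("source"), match.group("message"))] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "2025-04-15 17:00:46 [2,594ms] [Error] [carb.graphics-vulkan.plugin] VkResult: ERROR_INCOMPATIBLE_DRIVER",
        "2025-04-15 17:00:47 [3,136ms] [Error] [carb.graphics-vulkan.plugin] VkResult: ERROR_INCOMPATIBLE_DRIVER",
        "2025-04-15 17:00:47 [3,645ms] [Error] [omni.gpu_foundation_factory.plugin] Failed to create any GPU devices, including an attempt with compatibility mode.",
    ]
    for (source, message), count in summarize_errors(sample).items():
        print(f"{count}x [{source}] {message}")
```

Applied to the seven lines above, this would collapse them into four distinct errors, with the `vkCreateInstance` failure being the earliest one in each retry.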

For ./isaac-sim.sh it just gives me Segmentation fault (core dumped), but with more detailed errors. I've attached the output for both commands in this comment.

isaac-sim.selector.log isaac-sim.log

writingindy avatar Apr 15 '25 17:04 writingindy

Actually, there was an issue with the container that I have now resolved. It can detect the CUDA Toolkit and the NVIDIA devices when I run ./isaac-sim.sh, but it still gives me a segmentation fault (the output is attached to this comment).

isaac-sim-2.log

writingindy avatar Apr 15 '25 18:04 writingindy

I am currently encountering the same issue. Have you resolved it by now?

lovelyppp avatar May 28 '25 10:05 lovelyppp

Following up: your log file points to an issue with the driver. Please verify that you are using the recommended driver. If you are and the problem persists, please open a new issue as a bug report and include an updated error log. Thank you.
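
The driver check can be scripted so that cluster users can repeat it on any node. This is a rough sketch under stated assumptions: it relies on `nvidia-smi` being on the PATH, the helper names are hypothetical, and the minimum version is passed on the command line because the recommended driver is not stated in this thread and should be taken from the Isaac Sim requirements.

```python
import subprocess
import sys


def parse_driver_version(text):
    """Turn a version string such as '570.124.06' into a comparable tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def installed_driver_version():
    """Ask nvidia-smi for the first GPU's driver version; None if unavailable."""
    try:
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            check=True,
        ).stdout
        rows = out.strip().splitlines()
        return parse_driver_version(rows[0]) if rows else None
    except (OSError, subprocess.CalledProcessError, ValueError):
        return None  # nvidia-smi missing, failed, or printed something unexpected


def meets_minimum(installed, minimum):
    """True only when a version was detected and is at least the minimum."""
    return installed is not None and installed >= minimum


if __name__ == "__main__":
    installed = installed_driver_version()
    print("Installed driver:", installed)
    if len(sys.argv) > 1:
        # The minimum is supplied by the user; no value is assumed here.
        minimum = parse_driver_version(sys.argv[1])
        print("Meets minimum:", meets_minimum(installed, minimum))
```

Running it inside the container as well as on the host shows whether both see the same driver, which matters when the node is itself a container.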

RandomOakForest avatar Jun 02 '25 20:06 RandomOakForest