Level Zero Runtime Crash on Ubuntu 24.04 (Arc A770M, xe driver) - bindless_heaps_helper.cpp

Open markddrake opened this issue 1 month ago • 1 comments

I am running a clean installation of Ubuntu 24.04 LTS on an Intel NUC 12 Enthusiast (Serpent Canyon) with an Arc A770M dGPU.

The Intel graphics stack was installed via the recommended ppa:kobuk-team/intel-graphics PPA.

sudo apt update
sudo apt install -y software-properties-common dirmngr

# Add the Intel Graphics PPA
sudo add-apt-repository -y ppa:kobuk-team/intel-graphics

# Update package lists
sudo apt update

sudo apt install -y intel-opencl-icd libze-intel-gpu1 xpu-smi intel-media-va-driver-non-free


Update Grub to force the xe driver
vi /etc/default/grub
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash i915.force_probe=!5690 xe.force_probe=5690"
update-grub

reboot
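After the reboot, one way to confirm that both options actually reached the kernel is to look at /proc/cmdline. The sketch below is illustrative only: `cmdline_has` is a hypothetical helper name, and the two option strings are copied from the GRUB line above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of any Intel tooling): succeeds if the
# kernel command line in $1 contains the option $2 as a whole word.
cmdline_has() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Report whether each force_probe option from the GRUB line is present.
if [ -r /proc/cmdline ]; then
    cmdline="$(cat /proc/cmdline)"
    for opt in 'i915.force_probe=!5690' 'xe.force_probe=5690'; do
        if cmdline_has "$cmdline" "$opt"; then
            echo "present: $opt"
        else
            echo "missing: $opt"
        fi
    done
fi
```

If either option is reported missing, the GRUB change probably did not take effect (for example, update-grub was not run before rebooting).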

The correct xe kernel driver has been successfully forced for the dGPU.

Problem: The Level Zero compute runtime is crashing immediately upon attempting to run diagnostics, making all compute applications unusable.

Steps to Reproduce:

Ensure libze-intel-gpu1 is installed from the PPA.

Ensure xe is loaded for the dGPU (lspci -k confirms xe).

Run the diagnostic tool: /usr/bin/xpu-smi diag -d 0 -l 2
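The driver check in the second step can be scripted roughly as follows. This is a minimal sketch, assuming the usual `lspci -k` output layout with a "Kernel driver in use:" line; `driver_in_use` is a hypothetical helper name, 8086 is Intel's PCI vendor ID, and 5690 is the device ID taken from the force_probe options above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `lspci -k` output on stdin and print the value
# of every "Kernel driver in use:" line.
driver_in_use() {
    sed -n 's/^[[:space:]]*Kernel driver in use:[[:space:]]*//p'
}

# Query only the dGPU (vendor 8086, device 5690) and print its bound driver.
if command -v lspci >/dev/null 2>&1; then
    lspci -k -d 8086:5690 | driver_in_use || true
fi
```

On the setup described here this should print xe; if it prints i915, the force_probe options were not applied.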

Observed Result:

Abort was called at 70 line in file: ./shared/source/helpers/bindless_heaps_helper.cpp
Aborted (core dumped)

System Details:

OS: Ubuntu 24.04 LTS

Kernel: 6.8.0-87-generic (output of uname -r)

Hardware: Intel NUC 12 Enthusiast Kit (Serpent Canyon), Arc A770M

markddrake@deep-thought:~$ uname -r
6.8.0-87-generic

root@deep-thought:/home/markddrake# /usr/bin/xpu-smi diag -d 0 -l 2
Abort was called at 70 line in file: ./shared/source/helpers/bindless_heaps_helper.cpp
Aborted (core dumped)
root@deep-thought:/home/markddrake#

Please let me know if you need additional info or the core dump.

markddrake • Nov 16 '25 17:11

Hi @markddrake,

Thank you for your contribution and for bringing this issue to our attention. We made an attempt to recreate the problem you described and, indeed, on the provided KMD stack, the issue can be observed. However, during further testing, we found that the issue is no longer visible on the newer KMD version.

We recommend updating to the stable kernel release (6.17.8) where this problem does not occur.
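To check whether a machine is already on a kernel at or above the recommended release, a comparison along these lines may help. It is a rough sketch: `version_at_least` is a hypothetical helper, it relies on GNU `sort -V`, and it ignores any distro suffix after the first hyphen (such as "-87-generic").

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if version $1 is at least version $2.
version_at_least() {
    local have="${1%%-*}"   # drop a distro suffix such as "-87-generic"
    local want="$2"
    # With a version sort, the smaller of the two comes first; the check
    # passes when that smaller one is the required version.
    [ "$(printf '%s\n%s\n' "$want" "$have" | sort -V | head -n 1)" = "$want" ]
}

running="$(uname -r)"
if version_at_least "$running" "6.17.8"; then
    echo "Kernel $running is at or above 6.17.8"
else
    echo "Kernel $running is older than 6.17.8"
fi
```

On the kernel reported in this issue (6.8.0-87-generic), the check takes the "older than" branch.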

We appreciate your feedback and your efforts to help improve the project. If you encounter any further issues, please don't hesitate to let us know!

kgibala • Nov 18 '25 08:11