darktable crashes on start when running on Fedora 40, AMD GPU and mesa-libOpenCL

Open magicgoose opened this issue 1 year ago • 2 comments

Describe the bug

darktable crashes on start (seg fault)

clinfo also crashes, so this is possibly not a problem with darktable but I thought to file the bug just in case it still gives useful information.
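One way to confirm that the crash lies in the OpenCL stack rather than in darktable is to run clinfo on its own and look at how the process ended. The following is a minimal sketch, not something from the original report: the helper name `probe` is hypothetical, and it relies on the POSIX convention that Python's `subprocess` reports a negative return code when a child is killed by a signal (for example SIGSEGV).

```python
import subprocess


def probe(cmd):
    """Run cmd (a list of arguments) and classify how the process ended."""
    try:
        result = subprocess.run(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
    except FileNotFoundError:
        return "not installed"
    if result.returncode < 0:
        # On POSIX, a negative return code means the child was killed by a signal.
        return f"crashed (signal {-result.returncode})"
    if result.returncode != 0:
        return f"failed (exit {result.returncode})"
    return "ok"


if __name__ == "__main__":
    # If clinfo itself crashes, the OpenCL stack is the likely culprit.
    print("clinfo:", probe(["clinfo"]))
```

If clinfo is reported as crashed while darktable is not even involved, the problem most likely sits in the driver or ICD packages.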

Steps to reproduce

  1. Start with Fedora 40
  2. Run sudo dnf install darktable (agree to install) - at this point darktable should work, but OpenCL is not supported
  3. Run sudo dnf install mesa-libOpenCL (agree to install)
  4. Run darktable -d opencl

Expected behavior

darktable should start without crashing.

Logfile | Screenshot | Screencast

darktable_backtrace.txt

Commit

No response

Where did you obtain darktable from?

distro packaging

darktable version

darktable 4.6.1

What OS are you using?

Linux

What is the version of your OS?

Fedora 40

Describe your system?

No response

Are you using OpenCL GPU in darktable?

Yes

If yes, what is the GPU card and driver?

AMD R590X

Please provide additional context if applicable. You can attach files too, but might need to rename to .txt or .zip

The exact same GPU (and other hardware parts) worked with darktable (with OpenCL working) in Fedora 39 (rocm opencl package) and Arch Linux (amdgpu-pro from AUR). It also starts up with OpenCL on in Windows 10 on the same hardware.

In Fedora 40 there is apparently no combination that makes it work with OpenCL. (Without OpenCL everything is slow to the point of unusable.)

No difference between Wayland and X11.

magicgoose, May 12 '24 08:05

(I think this is the same driver that has a different kind of problem here https://github.com/darktable-org/darktable/issues/16717)

magicgoose, May 12 '24 08:05

Looks like there are a few teething problems related to the updated ROCm stack in F40, see also https://discuss.pixls.us/t/opencl-on-fedora-40-with-rocm-amd-not-working/43378

I wonder if seeking help in Fedora specific forums and support channels is maybe a better idea...

kmilos, May 12 '24 09:05

> clinfo also crashes,

In other words, you have a broken OpenCL stack.

> so this is possibly not a problem with darktable but I thought to file the bug just in case it still gives useful information.

So, what do you expect from the darktable developers?

victoryforce, May 12 '24 15:05

> Try with dnf install rocm-opencl. Remove the mesa one.

That's what I used on Fedora 39 with success, but on Fedora 40 the GPU is no longer recognized as usable. Output of darktable-cltest:

     0.0194 [dt_get_sysresource_level] switched to 2 as `large'
     0.0194   total mem:       64209MB
     0.0194   mipmap cache:    8026MB
     0.0194   available mem:   43893MB
     0.0194   singlebuff:      1003MB
     0.0313 [opencl_init] opencl disabled via darktable preferences
     0.0314 [dt_dlopencl_init] could not find default opencl runtime library 'libOpenCL'
     0.0314 [dt_dlopencl_init] could not find default opencl runtime library 'libOpenCL.so'
     0.0314 [opencl_init] opencl library 'libOpenCL.so.1' found on your system and loaded, preference 'default path'
     0.0828 [opencl_init] found 1 platform
     0.0829 [check platform] platform 'AMD Accelerated Parallel Processing' with key 'clplatform_amdacceleratedparallelprocessing' is NOT active
[opencl_init] found 0 device
     0.0829 [opencl_init] FINALLY: opencl is NOT AVAILABLE and NOT ENABLED.

and clinfo reports 0 devices for the platform, which probably shouldn't be so:

Number of platforms                               1
  Platform Name                                   AMD Accelerated Parallel Processing
  Platform Vendor                                 Advanced Micro Devices, Inc.
  Platform Version                                OpenCL 2.1 AMD-APP (3602.0)
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_icd cl_amd_event_callback 
  Platform Extensions function suffix             AMD
  Platform Host timer resolution                  1ns

  Platform Name                                   AMD Accelerated Parallel Processing
Number of devices                                 0

NULL platform behavior
  clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...)  AMD Accelerated Parallel Processing
  clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...)   No devices found in platform [AMD Accelerated Parallel Processing?]
  clCreateContext(NULL, ...) [default]            No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL)  No devices found in platform

ICD loader properties
  ICD loader Name                                 OpenCL ICD Loaderns
  ICD loader Vendor                               OCL Icd free softwarens
  ICD loader Version                              2.3.2ns
  ICD loader Profile                              OpenCL 3.0ns
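Both outputs above tell the same story: the platform is found but exposes no devices. As a rough sketch for checking this automatically, the two helpers below pull the final verdict out of a darktable-cltest log and the per-platform device count out of a clinfo listing. The function names `cltest_verdict` and `devices_per_platform` are hypothetical, and the regular expressions only assume the line shapes visible in the logs quoted here; other darktable or clinfo versions may format these lines differently.

```python
import re


def cltest_verdict(log):
    """Return the text after 'FINALLY: opencl is' in a darktable-cltest log, or None."""
    m = re.search(r"FINALLY: opencl is (.+?)\.?$", log, re.MULTILINE)
    return m.group(1) if m else None


def devices_per_platform(clinfo_text):
    """Map each platform name to the device count listed after it in clinfo output."""
    counts = {}
    current = None
    for line in clinfo_text.splitlines():
        name = re.match(r"\s*Platform Name\s{2,}(.+?)\s*$", line)
        if name:
            # Remember the most recent platform; its device count follows later.
            current = name.group(1)
            continue
        number = re.match(r"\s*Number of devices\s+(\d+)\s*$", line)
        if number and current is not None:
            counts[current] = int(number.group(1))
    return counts
```

Feeding the saved output of `darktable-cltest` and `clinfo` to these functions gives a quick yes/no answer without reading the full logs; a device count of 0 for the AMD platform matches the situation described in this comment.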

magicgoose, May 13 '24 17:05

> magicgoose, the CL drivers need to be available in your OS for darktable to be able to compile. I just updated to F40 today. ROCM-opencl loads correctly and clinfo reports the devices.

Thanks for the data point. Is this just a single package (rocm-opencl)? I also have it installed. And your GPU is not from the https://en.wikipedia.org/wiki/Radeon_500_series ? I've heard part of the problem is that AMD is neglecting some "older" devices in their recent Linux driver updates, but it is hard to understand what is really happening there.

magicgoose, May 14 '24 06:05

@magicgoose The open status of the issue means that the developers have something to at least pay attention to, if not fix. Since this is not a darktable issue, I see no point in keeping it open. So, I'm going to close the issue (or you can do it yourself).

victoryforce, Jul 03 '24 11:07