OpenCL: segmentation fault

Open mrusme opened this issue 1 year ago • 5 comments

Describe the bug

After dealing with constant crashes in #16985 I gave darktable --configdir "/tmp" a try and enabled OpenCL + ROCm (both installed and compiled in). The moment I imported a single photo and tried to edit it, darktable crashed and it keeps doing so:

[1]    25857 segmentation fault  darktable --configdir "/tmp"

Steps to reproduce

  1. darktable --configdir "/tmp"
  2. Enable OpenCL and ROCm
  3. Import a single photo
  4. Double click the photo

Expected behavior

Not crash

Logfile | Screenshot | Screencast

No response

Commit

No response

Where did you obtain darktable from?

self compiled

darktable version

darktable 4.6.1

What OS are you using?

Linux

What is the version of your OS?

Gentoo

Describe your system?

Same as #16985

AMD Ryzen 7 5800U with Radeon Graphics, ROCm installed, Wayland

equery u darktable
[ Legend : U - final flag setting for installation]
[        : I - package is installed with flag     ]
[ Colors : set, unset                             ]
 * Found these USE flags for media-gfx/darktable-4.6.1:
 U I
 + + avif                     : Add AV1 Image Format (AVIF) support
 - - colord                   : Support color management using x11-misc/colord
 + + cpu_flags_x86_avx        : Adds support for Advanced Vector Extensions instructions
 + + cpu_flags_x86_sse3       : Use the SSE3 instruction set ([pni] in cpuinfo, NOT ssse3)
 - - cups                     : Add support for CUPS (Common Unix Printing System)
 - - doc                      : Add extra documentation (API, Javadoc, etc). It is recommended to enable per package instead of globally
 - - gamepad                  : Support using game controllers as input devices
 + + geolocation              : Enable geotagging support
 + + gphoto2                  : Add digital camera support
 - - graphicsmagick           : Build and link against GraphicsMagick instead of ImageMagick (requires USE=imagemagick if optional)
 + + heif                     : Enable support for ISO/IEC 23008-12:2017 HEIF/HEIC image format
 - - jpeg2k                   : Support for JPEG 2000, a wavelet-based image compression format
 + + jpegxl                   : Add JPEG XL image support
 - - keyring                  : Enable support for freedesktop.org Secret Service API password store
 - - kwallet                  : Enable encrypted storage of passwords with kde-frameworks/kwallet
 - - l10n_cs                  : Czech
 - - l10n_de                  : German
 - - l10n_es                  : Spanish
 - - l10n_fi                  : Finnish
 - - l10n_fr                  : French
 - - l10n_hu                  : Hungarian
 - - l10n_it                  : Italian
 - - l10n_ja                  : Japanese
 - - l10n_nl                  : Dutch
 - - l10n_pl                  : Polish
 - - l10n_pt-BR               : Portuguese (Brazil)
 - - l10n_ru                  : Russian
 - - l10n_sl                  : Slovenian
 - - l10n_sq                  : Albanian
 - - l10n_tr                  : Turkish
 - - l10n_uk                  : Ukrainian
 - - l10n_zh-CN               : Chinese (China)
 - - l10n_zh-TW               : Chinese (Taiwan)
 - - lto                      : Enable link-time optimisations in the RawSpeed library
 + + lua                      : Enable Lua scripting support
 + + lua_single_target_lua5-4 : Build for Lua 5.4 only
 - - midi                     : Support using MIDI input devices such as Behringer X-Touch Mini, Arturia Beatstep or Korg nanoKONTROL2, as
                                input devices
 + + nls                      : Add Native Language Support (using gettext - GNU locale utilities)
 + + opencl                   : Enable OpenCL support (computation on GPU)
 + + openexr                  : Support for the OpenEXR graphics file format
 + + openmp                   : Build support for the OpenMP (support parallel computing), requires >=sys-devel/gcc-4.2 built with
                                USE="openmp"
 - - test                     : Enable dependencies and/or preparations necessary to run tests (usually controlled by FEATURES=test but
                                can be toggled independently)
 + + tools                    : Install tools for generating base curves and noise profiles
 + + webp                     : Add support for the WebP image format

Are you using OpenCL GPU in darktable?

Yes

If yes, what is the GPU card and driver?

AMD, amdgpu, ROCm

Please provide additional context if applicable. You can attach files too, but might need to rename to .txt or .zip

No response

mrusme avatar Jun 13 '24 16:06 mrusme

We have had a lot of problems related to AMD OpenCL and Arch-based distros, plus bad local builds.

To check this we need at least a proper log with -d pipe -d opencl options. No chance otherwise.
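
For anyone collecting the same kind of log, here is a minimal Python sketch (not part of darktable) that builds the command line with the `-d pipe -d opencl` options requested above and writes the combined output to a file. The helper names `build_debug_command` and `run_and_capture` and the log file name are made up for illustration; the flags themselves are the ones used in this thread.

```python
import shlex
import subprocess


def build_debug_command(configdir="/tmp", channels=("pipe", "opencl")):
    """Build the darktable command line with one -d option per debug channel."""
    cmd = ["darktable", "--configdir", configdir]
    for channel in channels:
        cmd += ["-d", channel]
    return cmd


def run_and_capture(cmd, logfile):
    """Run cmd, writing stdout and stderr to logfile; return the exit code.

    On POSIX a negative return code -N means the process was killed by
    signal N, so a segmentation fault (SIGSEGV) shows up as -11.
    """
    with open(logfile, "w") as handle:
        proc = subprocess.run(cmd, stdout=handle, stderr=subprocess.STDOUT)
    return proc.returncode


# Example use (hypothetical log file name):
#   cmd = build_debug_command()
#   print(shlex.join(cmd))
#   print("exit code:", run_and_capture(cmd, "darktable-debug.log"))
```

Capturing stderr together with stdout keeps the log lines in the order they were printed, which matters when the last lines before the crash are the interesting ones.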

jenshannoschwalm avatar Jun 15 '24 08:06 jenshannoschwalm

@jenshannoschwalm thank you for the reply. Does this help?

darktable --configdir "/tmp" -d pipe -d opencl
darktable 4.6.1
Copyright (C) 2012-2024 Johannes Hanika and other contributors.

Compile options:
  Bit depth              -> 64 bit
  Debug                  -> DISABLED
  SSE2 optimizations     -> ENABLED
  OpenMP                 -> ENABLED
  OpenCL                 -> ENABLED
  Lua                    -> ENABLED  - API version 9.2.0
  Colord                 -> DISABLED
  gPhoto2                -> ENABLED
  GMIC                   -> DISABLED - Compressed LUTs are NOT supported
  GraphicsMagick         -> DISABLED
  ImageMagick            -> DISABLED
  libavif                -> ENABLED
  libheif                -> ENABLED
  libjxl                 -> ENABLED
  OpenJPEG               -> DISABLED
  OpenEXR                -> ENABLED
  WebP                   -> ENABLED

See https://www.darktable.org/resources/ for detailed documentation.
See https://github.com/darktable-org/darktable/issues/new/choose to report bugs.

     0.0480 [dt_get_sysresource_level] switched to 1 as `default'
     0.0480   total mem:       63688MB
     0.0480   mipmap cache:    7961MB
     0.0480   available mem:   31844MB
     0.0480   singlebuff:      497MB
     0.0487 [opencl_init] opencl library 'libOpenCL' found on your system and loaded, preference 'default path'
     0.0926 [opencl_init] found 1 platform
[opencl_init] found 1 device

[dt_opencl_device_init]
   DEVICE:                   0: 'gfx90c:xnack-', NEW
   PLATFORM, VENDOR & ID:    AMD Accelerated Parallel Processing, Advanced Micro Devices, Inc., ID=4098
   CANONICAL NAME:           amdacceleratedparallelprocessinggfx90cxnack
   DRIVER VERSION:           3590.0 (HSA1.1,LC)
   DEVICE VERSION:           OpenCL 2.0
   DEVICE_TYPE:              GPU, dedicated mem
   GLOBAL MEM SIZE:          512 MB
   MAX MEM ALLOC:            384 MB
   MAX IMAGE SIZE:           16384 x 16384
   MAX WORK GROUP SIZE:      256
   MAX WORK ITEM DIMENSIONS: 3
   MAX WORK ITEM SIZES:      [ 1024 1024 1024 ]
   ASYNC PIXELPIPE:          NO
   PINNED MEMORY TRANSFER:   NO
   AVOID ATOMICS:            NO
   MICRO NAP:                250
   ROUNDUP WIDTH & HEIGHT    16x16
   CHECK EVENT HANDLES:      128
   TILING ADVANTAGE:         0.000
   DEFAULT DEVICE:           NO
   KERNEL BUILD DIRECTORY:   /usr/share/darktable/kernels
   KERNEL DIRECTORY:         /home/mrus/.cache/darktable/cached_v3_kernels_for_AMDAcceleratedParallelProcessinggfx90cxnack_35900HSA11LC
   CL COMPILER OPTION:       -cl-fast-relaxed-math
   CL COMPILER COMMAND:      -w -cl-fast-relaxed-math  -DAMD=1 -I"/usr/share/darktable/kernels"
   KERNEL LOADING TIME:       0.0357 sec
[opencl_init] OpenCL successfully initialized. internal numbers and names of available devices:
[opencl_init]		0	'AMD Accelerated Parallel Processing gfx90c:xnack-'
     0.3513 [opencl_init] FINALLY: opencl is AVAILABLE and ENABLED.
[opencl_init] opencl_scheduling_profile: 'default'
[opencl_init] opencl_device_priority: '*/!0,*/*/*/!0,*'
[opencl_init] opencl_mandatory_timeout: 400
     0.3513 [opencl_init] set scheduling profile to default, setup has changed.
[dt_opencl_update_priorities] these are your device priorities:
[dt_opencl_update_priorities] 		image	preview	export	thumbs	preview2
[dt_opencl_update_priorities]		0	-1	0	0	-1
[dt_opencl_update_priorities] show if opencl use is mandatory for a given pixelpipe:
[dt_opencl_update_priorities] 		image	preview	export	thumbs	preview2
[dt_opencl_update_priorities]		0	0	0	0	0
[opencl_synchronization_timeout] synchronization timeout set to 200
[dt_opencl_update_priorities] these are your device priorities:
[dt_opencl_update_priorities] 		image	preview	export	thumbs	preview2
[dt_opencl_update_priorities]		0	-1	0	0	-1
[dt_opencl_update_priorities] show if opencl use is mandatory for a given pixelpipe:
[dt_opencl_update_priorities] 		image	preview	export	thumbs	preview2
[dt_opencl_update_priorities]		0	0	0	0	0
[opencl_synchronization_timeout] synchronization timeout set to 200
    12.5744 dt_dev_pixelpipe_synch_all [full]                                  defaults 0.0441s, history 0.0117s
    12.5745 pixelpipe_cache_checkmem   [full]                                  64 lines (important=0, used=0). Freed 0MB. Using using 0MB, limit=995MB
    12.5745 pixelpipe starting CL      [full]                                  (   0/   0)  718x 957 scale=0.2374 --> (   0/   0)  718x 957 scale=0.2374 device=0 (amdacceleratedparallelprocessinggfx90cxnack)
    12.5745 [dt_opencl_check_tuning] use 12025908428739MB (headroom=OFF, pinning=OFF) on device `AMD Accelerated Parallel Processing gfx90c:xnack-' id=0
    12.5746 pixelpipe data: clip&zoom  [full]                                  (   0/   0) 3024x4032 scale=1.0000 --> (   0/   0)  718x 957 scale=0.2374
[1]    9579 segmentation fault  darktable --configdir "/tmp" -d pipe -d opencl

If not, kindly let me know what other parameters you'd like me to add to the darktable command.

mrusme avatar Jun 15 '24 16:06 mrusme

Yes, it helps me. There was a bug in 4.6 (fixed in master) that wrongly calculated memory for such low-spec cards (less than 500 MB of graphics memory), resulting in the crash. You should definitely disable OpenCL. Until 4.8 you can run 4.6 from the command line as darktable --disable-opencl
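
The symptom is visible in the 4.6.1 log above: the device reports `GLOBAL MEM SIZE: 512 MB`, yet `[dt_opencl_check_tuning]` announces `use 12025908428739MB`. The following is a rough Python sketch for spotting that mismatch in a saved log. It is not darktable code, the function name `implausible_tuning` is hypothetical, and it assumes a single OpenCL device whose device block appears before the tuning lines, as in the logs in this thread.

```python
import re

# Patterns match the log lines as they appear in this thread.
GLOBAL_MEM = re.compile(r"GLOBAL MEM SIZE:\s+(\d+)\s*MB")
TUNING_USE = re.compile(r"\[dt_opencl_check_tuning\] use (\d+)MB")


def implausible_tuning(log_text):
    """Return (global_mb, use_mb) pairs where tuning exceeds device memory."""
    global_mb = None
    problems = []
    for line in log_text.splitlines():
        match = GLOBAL_MEM.search(line)
        if match:
            # Remember the most recently reported device memory size.
            global_mb = int(match.group(1))
            continue
        match = TUNING_USE.search(line)
        if match and global_mb is not None:
            use_mb = int(match.group(1))
            if use_mb > global_mb:
                problems.append((global_mb, use_mb))
    return problems
```

Run against the 4.6.1 log this flags the 512 MB device; run against the 4.8.0 log (31844 MB reported, 21358 MB used) it returns an empty list.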

jenshannoschwalm avatar Jun 15 '24 16:06 jenshannoschwalm

DEVICE_TYPE: GPU, dedicated mem
GLOBAL MEM SIZE: 512 MB

I own a similar CPU/iGPU. The AMD drivers incorrectly report this as dedicated memory with a very small size. It should be shared memory. I disabled ROCm until there is a fix upstream.

gi-man avatar Jun 16 '24 22:06 gi-man

@mrusme Could you please check if 4.8.0 fixes this problem?

victoryforce avatar Jul 03 '24 09:07 victoryforce

@victoryforce I have tested it just now after compiling the new version from the Gentoo portage:

▲ ~ darktable --version
darktable 4.8.0
Copyright (C) 2012-2024 Johannes Hanika and other contributors.

Compile options:
  Bit depth              -> 64 bit
  Debug                  -> DISABLED
  SSE2 optimizations     -> ENABLED
  OpenMP                 -> ENABLED
  OpenCL                 -> ENABLED
  Lua                    -> ENABLED  - API version 9.3.0
  Colord                 -> DISABLED
  gPhoto2                -> ENABLED
  GMIC                   -> DISABLED - Compressed LUTs are NOT supported
  GraphicsMagick         -> DISABLED
  ImageMagick            -> DISABLED
  libavif                -> ENABLED
  libheif                -> ENABLED
  libjxl                 -> ENABLED
  OpenJPEG               -> DISABLED
  OpenEXR                -> ENABLED
  WebP                   -> ENABLED

See https://www.darktable.org/resources/ for detailed documentation.
See https://github.com/darktable-org/darktable/issues/new/choose to report bugs.
▲ ~ darktable --configdir "/tmp" -d pipe -d opencl
darktable 4.8.0
Copyright (C) 2012-2024 Johannes Hanika and other contributors.

Compile options:
  Bit depth              -> 64 bit
  Debug                  -> DISABLED
  SSE2 optimizations     -> ENABLED
  OpenMP                 -> ENABLED
  OpenCL                 -> ENABLED
  Lua                    -> ENABLED  - API version 9.3.0
  Colord                 -> DISABLED
  gPhoto2                -> ENABLED
  GMIC                   -> DISABLED - Compressed LUTs are NOT supported
  GraphicsMagick         -> DISABLED
  ImageMagick            -> DISABLED
  libavif                -> ENABLED
  libheif                -> ENABLED
  libjxl                 -> ENABLED
  OpenJPEG               -> DISABLED
  OpenEXR                -> ENABLED
  WebP                   -> ENABLED

See https://www.darktable.org/resources/ for detailed documentation.
See https://github.com/darktable-org/darktable/issues/new/choose to report bugs.

     0.0638 [dt_get_sysresource_level] switched to 1 as `default'
     0.0639   total mem:       63688MB
     0.0639   mipmap cache:    7961MB
     0.0639   available mem:   31844MB
     0.0639   singlebuff:      497MB
     0.0650 [opencl_init] opencl library 'libOpenCL' found on your system and loaded, preference 'default path'
     0.1649 [opencl_init] found 1 platform
[opencl_init] found 1 device

[dt_opencl_device_init]
   DEVICE:                   0: 'gfx90c:xnack-', NEW
   CONF KEY:                 cldevice_v5_amdacceleratedparallelprocessinggfx90cxnack
   PLATFORM, VENDOR & ID:    AMD Accelerated Parallel Processing, Advanced Micro Devices, Inc., ID=4098
   CANONICAL NAME:           amdacceleratedparallelprocessinggfx90cxnack
   DRIVER VERSION:           3590.0 (HSA1.1,LC)
   DEVICE VERSION:           OpenCL 2.0
   DEVICE_TYPE:              GPU, dedicated mem
   GLOBAL MEM SIZE:          31844 MB
   MAX MEM ALLOC:            27068 MB
   MAX IMAGE SIZE:           16384 x 16384
   MAX WORK GROUP SIZE:      256
   MAX WORK ITEM DIMENSIONS: 3
   MAX WORK ITEM SIZES:      [ 1024 1024 1024 ]
   ASYNC PIXELPIPE:          NO
   PINNED MEMORY TRANSFER:   NO
   AVOID ATOMICS:            NO
   MICRO NAP:                250
   ROUNDUP WIDTH & HEIGHT    16x16
   CHECK EVENT HANDLES:      128
   TILING ADVANTAGE:         0.000
   DEFAULT DEVICE:           NO
   KERNEL BUILD DIRECTORY:   /usr/share/darktable/kernels
   KERNEL DIRECTORY:         /home/mrus/.cache/darktable/cached_v3_kernels_for_AMDAcceleratedParallelProcessinggfx90cxnack_35900HSA11LC
   CL COMPILER OPTION:       -cl-fast-relaxed-math
   CL COMPILER COMMAND:      -w -cl-fast-relaxed-math  -DAMD=1 -I"/usr/share/darktable/kernels"
   KERNEL LOADING TIME:       8.6826 sec
[opencl_init] OpenCL successfully initialized. internal numbers and names of available devices:
[opencl_init]		0	'AMD Accelerated Parallel Processing gfx90c:xnack-'
     9.1200 [opencl_init] FINALLY: opencl PREFERENCE=ON is AVAILABLE and ENABLED.
[opencl_init] opencl_scheduling_profile: 'default'
[opencl_init] opencl_device_priority: '*/!0,*/*/*/!0,*'
[opencl_init] opencl_mandatory_timeout: 400
     9.1201 [opencl_init] set scheduling profile to default, setup has changed.
[dt_opencl_update_priorities] these are your device priorities:
[dt_opencl_update_priorities] 		image	preview	export	thumbs	preview2
[dt_opencl_update_priorities]		0	-1	0	0	-1
[dt_opencl_update_priorities] show if opencl use is mandatory for a given pixelpipe:
[dt_opencl_update_priorities] 		image	preview	export	thumbs	preview2
[dt_opencl_update_priorities]		0	0	0	0	0
[opencl_synchronization_timeout] synchronization timeout set to 200
[dt_opencl_update_priorities] these are your device priorities:
[dt_opencl_update_priorities] 		image	preview	export	thumbs	preview2
[dt_opencl_update_priorities]		0	-1	0	0	-1
[dt_opencl_update_priorities] show if opencl use is mandatory for a given pixelpipe:
[dt_opencl_update_priorities] 		image	preview	export	thumbs	preview2
[dt_opencl_update_priorities]		0	0	0	0	0
[opencl_synchronization_timeout] synchronization timeout set to 200
    22.0088 changed CAT for channelmixerrgb from (nil) to 0x5591e7032ad0
    22.1208 used preset                                    temperature            preset='as shot to reference': D65 2.672 1.000 1.347, AS-SHOT 2.355 1.000 1.617
    22.1369 [generate_profile_info] profile `<internal>': color space `RGB '
    22.3295 pipe state changing           [full]                                  zoomed, synch all,
    22.3332 [generate_profile_info] profile `<internal>': color space `RGB '
    22.3336 [generate_profile_info] profile `<internal>': color space `    22.3538 [generate_profile_info] profile `<internal>': color space `RGB '
    22.3700 modify roi OUT                [full]           flip                   (   0/   0) 6048x4024 scale=1.0000 --> (   0/   0) 4024x6048 scale=1.0000 ID=1
    22.3701 pipe cache check              [full]                                  64 lines (important=0, used=0). Freed 0MB. Using using 0MB, limit=995MB
    22.3701 pipe starting             CL0 [full]                                  (   0/   0)  718x1079 scale=0.1784 --> (   0/   0)  718x1079 scale=0.1784 ID=1, amdacceleratedparallelprocessinggfx90cxnack
    22.3701 [dt_opencl_check_tuning] use 21358MB (headroom=OFF, pinning=OFF) on device `AMD Accelerated Parallel Processing gfx90c:xnack-' id=0
    22.3702 modify roi IN                 [full]           flip                   (   0/   0) 1079x 718 scale=0.1784 --> (   0/   0)  718x1079 scale=0.1784 ID=1
    22.3702 modify roi IN                 [full]           demosaic               (   0/   0) 6047x4024 scale=1.0000 --> (   0/   0) 1079x 718 scale=0.1784 ID=1
    22.3702 modify roi IN                 [full]           highlights             (   0/   0) 6048x4024 scale=1.0000 --> (   0/   0) 6047x4024 scale=1.0000 ID=1
    22.3702 pipe data: full               [full]                                  (   0/   0) 6048x4024 scale=1.0000 --> (   0/   0) 6048x4024 scale=1.0000
    22.3921 process                   CL0 [full]           rawprepare             (   0/   0) 6048x4024 scale=1.0000 --> (   0/   0) 6048x4024 scale=1.0000   1 IOP_CS_RAW
    22.4127 process                   CL0 [full]           temperature            (   0/   0) 6048x4024 scale=1.0000 --> (   0/   0) 6048x4024 scale=1.0000   3 IOP_CS_RAW
    22.4295 process                   CL0 [full]           highlights             (   0/   0) 6048x4024 scale=1.0000 --> (   0/   0) 6047x4024 scale=1.0000   4 IOP_CS_RAW
    22.4501 opposed chroma            CL0 [full]           highlights             (   0/   0) 6048x4024 scale=1.0000 --> (   0/   0) 6047x4024 scale=1.0000 red: 0.0000, green: 0.0000, blue: 0.0000 for hash=2e8f5d919320142, saved to cache, unclipped
[1]    18926 segmentation fault  darktable --configdir "/tmp" -d pipe -d opencl

The crash happened when I tried adding a single DNG to the empty library.

mrusme avatar Jul 13 '24 02:07 mrusme

@mrusme I would be very interested in the problematic raw & xmp files you have been using here. Would you be able to share them here? Or email me at [email protected]; I would make sure to use them only to investigate further.

jenshannoschwalm avatar Jul 13 '24 04:07 jenshannoschwalm

Hey @jenshannoschwalm, sure thing, here you go!

DSC08412.ARW.tar.gz

It does however happen with every photo that I tested.

mrusme avatar Jul 13 '24 17:07 mrusme

Thanks for the files. Will investigate this issue further.

ATM I am not sure whether we do something bad in the roi_in calculation of the demosaic module or whether there is an AMD-specific bug in the opposed highlights OpenCL code ...
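
The suspicion about the highlights module comes from reading the tail of the `-d pipe` log: the last pipeline line before the segmentation fault names `highlights`. As a rough illustration, assuming the column layout seen in the logs in this thread (timestamp, action, pipe name in brackets, optional module name, then the roi in parentheses), a small Python sketch can pull out that last module. The names `pipeline_events` and `last_module` are hypothetical, and the regular expression is only a guess at the format, not something taken from darktable.

```python
import re

# timestamp, action, [pipe], optional module, then the "(x/y)" roi column
PIPE_LINE = re.compile(
    r"^\s*(?P<time>\d+\.\d+)\s+(?P<action>.+?)\s+\[(?P<pipe>\w+)\]"
    r"\s+(?P<module>\w+)?\s*\("
)


def pipeline_events(log_text):
    """Return (time, action, pipe, module) tuples for pipeline log lines."""
    events = []
    for line in log_text.splitlines():
        match = PIPE_LINE.match(line)
        if match:
            # Collapse the padding inside the action column.
            action = " ".join(match.group("action").split())
            events.append(
                (float(match.group("time")), action,
                 match.group("pipe"), match.group("module"))
            )
    return events


def last_module(log_text):
    """Return the module named on the last pipeline line that has one."""
    for _, _, _, module in reversed(pipeline_events(log_text)):
        if module is not None:
            return module
    return None
```

The last logged module only shows where output stopped, not necessarily where the fault occurred, so it is a starting point for investigation rather than a diagnosis.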

jenshannoschwalm avatar Jul 14 '24 18:07 jenshannoschwalm