
OpenCL, Rusticl: Square grid artifacts show up on raw photos from phones

Open garrett opened this issue 1 year ago • 5 comments

Describe the bug

I decided to try Rusticl with darktable, as ROCm OpenCL is not currently working on Fedora and I wanted to still use GPU acceleration with my AMD 7900 XTX.

Everything looked fine with Fuji and Ricoh files. When I looked at raw files from my Google Pixel 6 Pro, I noticed a tile grid that shows up on zoomed-out previews, at 1:1, and on exports too. Further investigation through decades of raw files shows that it also affects raw files from my old OnePlus 6T. Files from dedicated cameras, including my own older ones through the years and play raws from discuss.pixls.us, all seem to be fine, however.

Steps to reproduce

  1. Enable Rusticl.
    • On Fedora, it's in the mesa-libOpenCL package
    • The environment variable RUSTICL_ENABLE=radeonsi needs to be set
    • darktable needs to opt in to Rusticl on the "Processing" page in settings, and OpenCL needs to be enabled there too
    • More detailed instructions are @ https://discuss.pixls.us/t/opencl-on-fedora-40-with-rocm-amd-not-working/43378/5
  2. Load a raw file from a phone
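
The setup steps above can be sketched as a small launcher. This is only an illustration: the helper names `rusticl_env` and `darktable_command` are hypothetical, and it assumes a `darktable` binary on the PATH that accepts the `-d opencl` debug flag and the `--disable-opencl` switch for an A/B comparison. The Rusticl opt-in in darktable's settings still has to be done by hand.

```python
import os


def rusticl_env(driver="radeonsi", base_env=None):
    """Return a copy of the environment with RUSTICL_ENABLE set.

    `driver` is the Mesa driver name from the report (radeonsi).
    """
    env = dict(os.environ if base_env is None else base_env)
    env["RUSTICL_ENABLE"] = driver
    return env


def darktable_command(use_opencl=True):
    """Build a darktable command line with OpenCL debug output.

    With use_opencl=False the same command gets --disable-opencl, which
    gives the CPU-only rendering to compare against.
    """
    cmd = ["darktable", "-d", "opencl"]
    if not use_opencl:
        cmd.append("--disable-opencl")
    return cmd
```

To actually start darktable, pass both to `subprocess.run(darktable_command(), env=rusticl_env())`. Running it once with `use_opencl=True` and once with `use_opencl=False` on the same phone raw should make the difference visible.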

Expected behavior

darktable should render the photo the same whether Rusticl OpenCL is on or off

Logfile | Screenshot | Screencast

Affected image (a snapshot I used to share a food pic with family; it has a solid color plate which shows the bug well):

image

It should look like this (this is a screenshot of the same photo with OpenCL off in darktable):

image

Screenshot crop of the affected image in darktable:

image

It affects exports too (this was exported at 1000px max wide and lower quality JPEG settings, but it affects higher quality and 1:1 exports too):

image

Commit

No response

Where did you obtain darktable from?

distro packaging

darktable version

darktable-4.6.1-5.fc40.x86_64

What OS are you using?

Linux

What is the version of your OS?

Fedora 40, Silverblue 40.20240430.1

Describe your system?

GPU: AMD 7900 XTX; CPU: AMD Ryzen 9 7950X3D × 32; RAM: 64 GB; X11; GNOME

Are you using OpenCL GPU in darktable?

Yes

If yes, what is the GPU card and driver?

Rusticl with AMD 7900 XTX

Please provide additional context if applicable. You can attach files too, but might need to rename to .txt or .zip

Dedicated camera files appear to be unaffected (Fuji, Ricoh, Canon, Nikon, Leica, Olympus). In my testing so far, this seems to affect only phone cameras: the Google Pixel 6 Pro, and the OnePlus 6T with a sideloaded Google Camera app (but not with the default OnePlus camera app).

So this might be specifically related to raws from the Google Camera app with Rusticl OpenCL. I've edited photos from this phone and camera app before with darktable and ROCm turned on using the same exact hardware and didn't see this problem.

Here's the raw file as featured in the screenshot above, compressed in a ZIP so GitHub would accept it: PXL_20240421_095931520.RAW-02.ORIGINAL.zip

garrett avatar May 01 '24 11:05 garrett

@karolherbst another one for you. it's in rawprepare module doing the 1f_gainmaps kernel.

jenshannoschwalm avatar May 01 '24 18:05 jenshannoschwalm

what's the mesa version? Maybe it doesn't have the fix? Does it even have the workaround you added to darktable?

karolherbst avatar May 01 '24 20:05 karolherbst

The version of Mesa installed is 24.0.6.

All specific Mesa packages in Fedora that are installed on my system are:

mesa-filesystem-24.0.6-2.fc40.x86_64
mesa-libxatracker-24.0.6-2.fc40.x86_64
mesa-va-drivers-24.0.6-2.fc40.x86_64
mesa-vulkan-drivers-24.0.6-2.fc40.x86_64
mesa-libglapi-24.0.6-2.fc40.x86_64
mesa-dri-drivers-24.0.6-2.fc40.x86_64
mesa-libgbm-24.0.6-2.fc40.x86_64
mesa-libEGL-24.0.6-2.fc40.x86_64
mesa-libGL-24.0.6-2.fc40.x86_64
mesa-libOpenCL-24.0.6-2.fc40.x86_64

garrett avatar May 01 '24 20:05 garrett

@karolherbst I think it's something different here. Got some debugging pfm file from the rawprepare module, the issue is only evident if the special kernel handling the Gainmaps is in use. See 'data/kernels/basic.cl'

jenshannoschwalm avatar May 02 '24 05:05 jenshannoschwalm

okay, yeah and the fix I've written for the last issue is part of 24.0.6, just wanted to make sure it's a new one. Will take a look next week or so, because technically I'm off this week.

karolherbst avatar May 02 '24 08:05 karolherbst

okay, I can reproduce the issue with rusticl on my AMD card, but not on my Intel one. Hopefully I'll be able to figure out quickly what's going on here.

karolherbst avatar May 13 '24 10:05 karolherbst

I suspect the interpolator, in dt we only use that here in this kernel.

jenshannoschwalm avatar May 13 '24 12:05 jenshannoschwalm

looks like the output of rawprepare_1f_gainmap is the first thing that differs, and by enough to explain the wrong output. Will need to dig deeper on what's happening there.

karolherbst avatar May 13 '24 13:05 karolherbst

okay figured it out. There seems to be something going wrong with samplers and on radeonsi specifically it ends up using the sampleri instead, so it ends up doing nearest filtering instead of linear.

I think I already know why it happens, but will have to play around a bit and might come up with a fix tomorrow or so

karolherbst avatar May 15 '24 20:05 karolherbst
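
Why nearest instead of linear filtering would produce a visible tile grid can be shown with a toy model. The sketch below is not darktable's kernel or Mesa's sampler code. It assumes a hypothetical 1-D gain map with only two entries stretched over eight pixels, normalized coordinates, and clamp-to-edge addressing, which is the usual convention for linear image sampling in OpenCL.

```python
import math


def sample_nearest(grid, u):
    """Sample a coarse 1-D gain map at normalized u in [0, 1), nearest texel."""
    n = len(grid)
    return grid[min(int(u * n), n - 1)]


def sample_linear(grid, u):
    """Sample with linear filtering; texel centers sit at (i + 0.5) / n."""
    n = len(grid)
    x = u * n - 0.5
    i0 = math.floor(x)
    t = x - i0
    lo = min(max(i0, 0), n - 1)      # clamp to edge
    hi = min(max(i0 + 1, 0), n - 1)  # clamp to edge
    return grid[lo] * (1.0 - t) + grid[hi] * t


def apply_gain_map(grid, width, sampler):
    """Return the gain seen by each of `width` pixels, sampled at pixel centers."""
    return [sampler(grid, (p + 0.5) / width) for p in range(width)]


gains = [1.0, 2.0]  # hypothetical two-cell gain map
nearest = apply_gain_map(gains, 8, sample_nearest)
linear = apply_gain_map(gains, 8, sample_linear)
```

With nearest filtering every pixel inside a cell gets the same gain and the value jumps at the cell boundary, which on a smooth surface such as the solid-color plate in the report shows up as square tiles. With linear filtering the gain ramps gradually between cell centers, so no edge is visible.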

Actually it was something other than what I thought, and kinda simple. In any case, upstream MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29230 and marked as a backport candidate.

karolherbst avatar May 15 '24 21:05 karolherbst

Closing this as fixed upstream.

jenshannoschwalm avatar May 16 '24 17:05 jenshannoschwalm