openexr icon indicating copy to clipboard operation
openexr copied to clipboard

Slow opening of exr image in GIMP

Open haaninjo opened this issue 1 year ago • 8 comments

Over in GIMP we got a report of .exr images being slow to load recently: https://gitlab.gnome.org/GNOME/gimp/-/issues/12169

Could reproduce this and see that it didn't happen when using an older version of openexr, so bisected this, and got this result:

7e0da7f5b79902e5a8e38f227d8e6f36cc3ef655 is the first bad commit
commit 7e0da7f5b79902e5a8e38f227d8e6f36cc3ef655 (HEAD)
Author: Kimball Thurston <[email protected]>
Date:   Sun Apr 28 12:01:36 2024 +1200

    Convert scanline input file to use the core
    
    simplifies input file as well. Future changes may lift common code out
    of scanline input into common utility in context / somewhere.

Test case: Exported https://img.photographyblog.com/reviews/nikon_d850/photos/nikon_d850_21.nef to exr in Darktable, and opened the resulting image in GIMP. On earlier versions of openexr it took about 4 seconds, while it takes 50 seconds after commit 7e0da7f5b79902e5a8e38f227d8e6f36cc3ef655.

haaninjo avatar Oct 15 '24 08:10 haaninjo

What platform and what specific version of OpenEXR?

Is this by any chance addressed by #1868? Can you check the very latest to verify that it's still a problem?

lgritz avatar Oct 15 '24 21:10 lgritz

My system is Debian Testing, original reporter was also on some Linux system.

Still slow opening when building OpenEXR from latest main commit https://github.com/AcademySoftwareFoundation/openexr/commit/bfbb0358731645c3a16529f7d1c53d54e4ddbba6

haaninjo avatar Oct 15 '24 22:10 haaninjo

This seems strange, when you exported from dark table, what settings did you use (I have a Nikon camera, so can likely use my own image to test with)? We have seen nothing like what is reported in terms of performance degradation, although also have not actively tested with gimp - I would suspect that we have unknowingly introduced a scenario where threading is not applied consistently during reading.

kthurston avatar Oct 17 '24 21:10 kthurston

Used Darktable 4.8.1 on Debian Testing. I don't think I changed any settings from default, see image:

Darktable-formatoptions

I see now that Darktable shows a warning on exporting: [exr export] warning: exporting with anything but linear matrix profiles might lead to wrong results when opening the image

haaninjo avatar Oct 17 '24 21:10 haaninjo

I see now that Darktable shows a warning on exporting: [exr export] warning: exporting with anything but linear matrix profiles might lead to wrong results when opening the image

This is just about gamma tone curve (i.e. one should explicitly choose a "linear" color profile w/o a tone curve in the darktable export settings or the output color profile module for EXRs because that is implicit for the format), and it shouldn't affect I/O...

kmilos avatar Oct 18 '24 15:10 kmilos

@haaninjo , can you somehow upload the exr file (not the nef file) here that can reproduce this issue? Will that one be proprietary and/or too big?

I'm thinking about repro this issue outside GIMP, which will help to nail down the exact mishap.

lji-ilm avatar Oct 22 '24 17:10 lji-ilm

@lji-ilm That file was 500 MB, but I have created a smaller example from cropping another NEF file and exporting to exr in Darktable.

http://norsjovallen.se/files/nikon_d850_01_01.exr (47 MB)

This smaller file takes upwards 2 seconds for me to open in the stable GIMP 2.10.38 flatpak, but around 10 seconds on GIMP from master with OpenEXR at commit https://github.com/AcademySoftwareFoundation/openexr/commit/bfbb0358731645c3a16529f7d1c53d54e4ddbba6

haaninjo avatar Oct 22 '24 19:10 haaninjo

Thanks very much for the isolated repro data. 2 seconds vs 10 seconds is a very good differentiator already.

I'll see if i can get the same results by simply reading it via openEXR, without GIMP.

lji-ilm avatar Oct 22 '24 20:10 lji-ilm

yeah, if I read this file using exrmetrics, it reports that it reads the entire file in 0.2 seconds, although I of course have a different system. The pattern gimp uses to read files is different, but is easy to create a version which reads in the same manner and find the root cause, I will get to that shortly

kdt3rd avatar Oct 24 '24 09:10 kdt3rd

Ah, yes, the fix I did that removed the cache which was causing memory bloat means that for multi-scanline compression (i.e. piz), the scanline chunk is not being cached from read to read, so if you are only reading one scanline at a time (which gimp and gegl both seem to do, as opposed to reading a scanline range - although not sure why they don't), you will incur re-decoding, slowing the read down...

kdt3rd avatar Oct 26 '24 06:10 kdt3rd

Just for the documentation purpose, here is a code review of the access pattern of GIMP into OpenEXR. GIMP is reading one scanline at a time, requiring OpenEXR to interleaving RGB on read, so the in-memory layout after read is already packed by pixels(R-G-B).

The driver code of "for all scanline in range, read one line at a time", is here:

https://github.com/GNOME/gimp/blob/15f5b15e69ba0698ec32d7a99f8ba3e31adeb31d/plug-ins/file-exr/file-exr.c#L312

The actual read dispatch into OpenEXR InputFile is here:

https://github.com/GNOME/gimp/blob/15f5b15e69ba0698ec32d7a99f8ba3e31adeb31d/plug-ins/file-exr/openexr-wrapper.cc#L217

Two features of this read code is noted:

  1. It prepare three Slice objects that interleaves RGB into the target memory pointed by the pixels pointer. There is a bit worry in the comment about EXR requires a pointer at the start of the data window, but GIMP actually don't have any memory before the display window -- but that is not directly related here.
  2. It then call InputFile.readPixels with only one parameter, which will make the underlying OpenEXR library to read only one scan line.

@lgritz I'm trying to reconstrue this case with oiio API calls and will keep you posted. Does oiio has some built in caching/optimization that will cache multiple scanlines so this behaviour can get lost before it reaches the target library? I'll also review it myself.

lji-ilm avatar Nov 05 '24 19:11 lji-ilm

OIIO's basic ImageInput class has methods that read one scanline, many adjacent scanlines, or the whole image (which really is just calling read_scanlines in reasonable sized chunks). It should translate pretty straightforwardly to the obvious OpenEXR read_scanlines call, no extra layer of caching. Also OIIO uses slices to get an interleaved buffer back.

lgritz avatar Nov 05 '24 20:11 lgritz

In the GIMP code, the loop which reads 'num' scanlines one at a time could be made significantly more efficient by calling setFrameBuffer and then readPixels once for each 'num' scanlines, rather than once per scanline. This minimizes how often the internal temporary buffer is required, though it would still be used if the slice boundaries in the file don't align with the tile boundaries that GIMP uses.

  • It prepare three Slice objects that interleaves RGB into the target memory pointed by the pixels pointer. There is a bit worry in the comment about EXR requires a pointer at the start of the data window, but GIMP actually don't have any memory before the display window -- but that is not directly related here.

This is required behavior: the Slice takes the pointer to pixel (0,0) in the buffer. If the dataWindow doesn't contain (0,0), then the pointer passed to Slice won't be within the allocated memory.

peterhillman avatar Nov 05 '24 22:11 peterhillman

I'm so slow on this, and now the actual issue has been fixed weeks ago, so the following results will be more relevant to our EXR team rather than GIMP, but I have successfully construed this case by a simple snippet in OIIO then recompile oiio against the bisected commit of openEXR and the subsequent patch.

I was able to observe a 13.4 second read on the bisected commit and a 2.15 second read after the patch. So 6.15 fold performance impact. This is primarily a build exercise for the benchmark project i'm pursuing.

auto pixel_row = std::unique_ptr<unsigned char[]>(new unsigned char[xres * bpp]);
steady_clock::time_point startRead = steady_clock::now();
for (int y = 0; y < yres; ++ y)
{
    inp->read_scanline(y, 0, pixelType, &pixel_row[0]);
}
steady_clock::time_point endRead = steady_clock::now();
std::cout << " Read time : " << timing(startRead, endRead) << std::endl;

lji-ilm avatar Nov 14 '24 00:11 lji-ilm