
[BUG] Performance regression loading EXR files

Open jreichel-nvidia opened this issue 9 months ago • 1 comments

Describe the bug

We noticed a performance regression between OIIO 2.5.16.0 and 3.0.0.3 when loading EXR files:

With vcpkg 6e1219d^ (OIIO 2.5.16.0): t1: 0.00086496 t2: 0.0387352 sum: 0.0396002

With vcpkg 6e1219d (OIIO 3.0.0.3): t1: 0.0155549 t2: 0.0329414 sum: 0.0484963

(averages of 100 runs, release build)

Not only is the sum higher than before, but the first number, t1 (just obtaining the ImageInput/ImageSpec to read the metadata), is about 18 times higher than before (0.0155549 vs. 0.00086496).

OpenImageIO version and dependencies

vcpkg 6e1219d^
OIIO 2.5.16.0 | Linux/x86_64
Build compiler: gcc 12.2 | C++17/201703
HW features enabled at build: sse2
Dependencies: Boost 1.86.0, BZip2 1.0.8, fmt 11.0.2, JPEG 62, libjpeg-turbo 3.0.4, OpenEXR 3.3.1, OpenGL, PNG 1.6.44, Qt6 6.4.2, Robinmap, TIFF 4.7.0, ZLIB 1.3.1

vcpkg 6e1219d
OIIO 3.0.0.3 | Linux/x86_64
Build compiler: gcc 12.2 | C++17/201703
HW features enabled at build: sse2
No CUDA support (disabled / unavailable at build time)
Dependencies: fmt 11.0.2, Imath 3.1.12, JPEG 62, JXL NONE, libjpeg-turbo 3.0.4, OpenEXR 3.3.1, OpenGL, PNG 1.6.44, Python3 3.11.2, Qt6 6.4.2, Robinmap, TIFF 4.7.0, ZLIB 1.3.1

To Reproduce

The test is attached. The texture (to be passed as first argument) is at https://github.com/NVIDIA/MDL-SDK/blob/master/examples/mdl_sdk/dxr/content/hdri/hdrihaven_teufelsberg_inner_2k.exr

CMakeLists.txt

test.cpp.txt

jreichel-nvidia avatar Feb 07 '25 12:02 jreichel-nvidia

This is caused by new code that sets the colorspace in exrinput.cpp:400, which triggers OCIO initialization via OIIO::ColorConfig::default_colorconfig(). It is therefore a one-time cost per process: calling this method up front, or running all iterations within a single process, gives comparable numbers for both versions.

Still, it is surprising that this step is so expensive: the sums above differ by about 9 ms, but in a larger benchmark we see up to 60-100 ms (different machine, Windows).
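
To illustrate why only the first call pays, here is a minimal standard-library sketch of the lazy-initialization pattern described above. It does not use OIIO or OCIO: `FakeColorConfig`, `default_config()`, `timed_lookup_seconds()` and the 10 ms sleep are hypothetical stand-ins chosen for illustration, not the actual implementation behind `ColorConfig::default_colorconfig()`.

```cpp
#include <chrono>
#include <thread>

// Counts how often the expensive setup has run (illustration only).
static int g_init_count = 0;

// Hypothetical stand-in for an expensive, lazily created global object.
// In the issue this role is played by the default ColorConfig, whose
// creation initializes OCIO; the 10 ms sleep here is an arbitrary assumption.
struct FakeColorConfig {
    FakeColorConfig() {
        std::this_thread::sleep_for(std::chrono::milliseconds(10));
        ++g_init_count;
    }
};

// Function-local static: constructed on the first call only.
// Every later call returns the already constructed object.
const FakeColorConfig& default_config() {
    static FakeColorConfig config;
    return config;
}

// Hypothetical stand-in for the "t1" step: work that touches the
// lazily initialized config. Returns the elapsed wall time in seconds.
double timed_lookup_seconds() {
    const auto start = std::chrono::steady_clock::now();
    (void)default_config();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Calling `timed_lookup_seconds()` twice in one process puts the full setup cost in the first measurement and almost nothing in the second. A benchmark that starts a fresh process per run pays that cost every time, whereas calling `default_config()` once before the timed section moves it out of the measurement.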

jreichel-nvidia avatar Feb 11 '25 14:02 jreichel-nvidia