
OpenEXR 3.3.2 crashes when reading tiled files

Open rouault opened this issue 1 year ago • 2 comments

This was noticed with GDAL when alpine:edge upgraded from OpenEXR 3.1.13 to 3.3.2 (https://github.com/OSGeo/gdal/pull/11352).

It can be reproduced with the sample file temp.exr.zip and the following procedure:

docker run --rm -it -v $PWD:$PWD alpine:edge
# under Docker
apk add cmake make g++ proj-dev openexr-dev curl
curl https://download.osgeo.org/gdal/3.10.0/gdal-3.10.0.tar.gz --output gdal-3.10.0.tar.gz
tar xvzf gdal-3.10.0.tar.gz 
cd gdal-3.10.0
mkdir build
cd build
cmake .. -DCMAKE_BUILD_TYPE=Debug -DGDAL_BUILD_OPTIONAL_DRIVERS=OFF -DOGR_BUILD_OPTIONAL_DRIVERS=OFF -DGDAL_ENABLE_DRIVER_EXR=ON -DACCEPT_MISSING_LINUX_FS_HEADER=ON
make -j$(nproc)
export LD_LIBRARY_PATH=$PWD
export PATH=$PWD/apps:$PATH
apk add valgrind
valgrind gdallocationinfo /path/to/temp.exr 0 0

outputs:

==4165== Memcheck, a memory error detector
==4165== Copyright (C) 2002-2024, and GNU GPL'd, by Julian Seward et al.
==4165== Using Valgrind-3.24.0 and LibVEX; rerun with -h for copyright info
==4165== Command: gdallocationinfo /home/even/gdal/gdal/build_ci_alpine/temp.exr 0 0
==4165== 
Report:
  Location: (0P,0L)
  Band 1:
==4165== Invalid write of size 8
==4165==    at 0x6B2F833: ??? (in /usr/lib/libOpenEXRCore-3_3.so.32.3.3.2)
==4165==    by 0x6B2FFEC: ??? (in /usr/lib/libOpenEXRCore-3_3.so.32.3.3.2)
==4165==    by 0x6B3828A: exr_decoding_run (in /usr/lib/libOpenEXRCore-3_3.so.32.3.3.2)
==4165==    by 0x655B75E: ??? (in /usr/lib/libOpenEXR-3_3.so.32.3.3.2)
==4165==    by 0x6558BFF: Imf_3_3::TiledInputFile::readTiles(int, int, int, int, int, int) (in /usr/lib/libOpenEXR-3_3.so.32.3.3.2)
==4165==    by 0x655B505: Imf_3_3::TiledInputFile::readTile(int, int, int, int) (in /usr/lib/libOpenEXR-3_3.so.32.3.3.2)
==4165==    by 0x53F7F66: GDALEXRRasterBand::IReadBlock(int, int, void*) (exrdataset.cpp:162)
==4165==    by 0x551A1F7: GDALRasterBand::GetLockedBlockRef(int, int, int) (gdalrasterband.cpp:2006)
==4165==    by 0x55FF07D: GDALRasterBand::IRasterIO(GDALRWFlag, int, int, int, int, void*, int, int, GDALDataType, long long, long long, GDALRasterIOExtraArg*) (rasterio.cpp:458)
==4165==    by 0x5518A5A: GDALRasterBand::RasterIO(GDALRWFlag, int, int, int, int, void*, int, int, GDALDataType, long long, long long, GDALRasterIOExtraArg*) (gdalrasterband.cpp:435)
[ ... snip ... ]
==4165==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
==4165== 
==4165== 
==4165== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==4165==  Access not within mapped region at address 0x0
==4165==    at 0x6B2F833: ??? (in /usr/lib/libOpenEXRCore-3_3.so.32.3.3.2)
==4165==    by 0x6B2FFEC: ??? (in /usr/lib/libOpenEXRCore-3_3.so.32.3.3.2)
==4165==    by 0x6B3828A: exr_decoding_run (in /usr/lib/libOpenEXRCore-3_3.so.32.3.3.2)
==4165==    by 0x655B75E: ??? (in /usr/lib/libOpenEXR-3_3.so.32.3.3.2)
==4165==    by 0x6558BFF: Imf_3_3::TiledInputFile::readTiles(int, int, int, int, int, int) (in /usr/lib/libOpenEXR-3_3.so.32.3.3.2)
==4165==    by 0x655B505: Imf_3_3::TiledInputFile::readTile(int, int, int, int) (in /usr/lib/libOpenEXR-3_3.so.32.3.3.2)
==4165==    by 0x53F7F66: GDALEXRRasterBand::IReadBlock(int, int, void*) (exrdataset.cpp:162)
[ ... snip ... ]
==4165==  If you believe this happened as a result of a stack
==4165==  overflow in your program's main thread (unlikely but
==4165==  possible), you can try to increase the size of the
==4165==  main thread stack using the --main-stacksize= flag.
==4165==  The main thread stack size used in this run was 8388608.
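
To compare this trace against one taken with another OpenEXR build, it can help to reduce the Memcheck output to its stack frames. The following is only a sketch, not part of the original report: `parse_frames` and `first_named_frame` are hypothetical helper names, and the regular expression assumes the frame layout seen in the log above (`at`/`by`, an address, a symbol, then a parenthesised location).

```python
import re

# Matches Memcheck stack frames such as:
#   ==4165==    at 0x6B2F833: ??? (in /usr/lib/libOpenEXRCore-3_3.so.32.3.3.2)
#   ==4165==    by 0x53F7F66: GDALEXRRasterBand::IReadBlock(int, int, void*) (exrdataset.cpp:162)
# Assumption: the location is always the last parenthesised group on the line.
FRAME_RE = re.compile(
    r"^==\d+==\s+(?:at|by) 0x[0-9A-Fa-f]+: (?P<func>.*) \((?P<loc>[^()]*)\)\s*$"
)


def parse_frames(log_text):
    """Return a list of (function, location) tuples, in the order they appear."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.match(line)
        if match:
            loc = match.group("loc")
            # Frames without debug info are reported as "in <library path>".
            if loc.startswith("in "):
                loc = loc[3:]
            frames.append((match.group("func"), loc))
    return frames


def first_named_frame(frames):
    """Return the first frame whose symbol was resolved (not '???'), or None."""
    for func, loc in frames:
        if func != "???":
            return func, loc
    return None


# A few lines copied from the log above, used as sample input.
SAMPLE_LOG = """\
==4165== Invalid write of size 8
==4165==    at 0x6B2F833: ??? (in /usr/lib/libOpenEXRCore-3_3.so.32.3.3.2)
==4165==    by 0x6B2FFEC: ??? (in /usr/lib/libOpenEXRCore-3_3.so.32.3.3.2)
==4165==    by 0x6B3828A: exr_decoding_run (in /usr/lib/libOpenEXRCore-3_3.so.32.3.3.2)
==4165==    by 0x53F7F66: GDALEXRRasterBand::IReadBlock(int, int, void*) (exrdataset.cpp:162)
==4165==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
"""

if __name__ == "__main__":
    for func, loc in parse_frames(SAMPLE_LOG):
        print(f"{func}  [{loc}]")
```

On the sample above, the innermost frames are unresolved (`???`), so the first named frame is `exr_decoding_run` in libOpenEXRCore; a build with debug symbols would be needed to see further into the library.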

rouault, Nov 25 '24 10:11

Thanks for the report. This crash / memory sanitizer failure does not immediately reproduce with our own test harness, which is unfortunate. It suggests there is an assumption, or a way people read images, that we are not covering as a test case, so I will have to dig into that.

kdt3rd, Nov 25 '24 11:11

The crash also happens without Valgrind. It occurs around https://github.com/OSGeo/gdal/blob/master/frmts/exr/exrdataset.cpp#L131

rouault, Nov 25 '24 11:11