
opj2_decompress segfaults with large file

Open swalterfub opened this issue 10 years ago • 23 comments

opj2_decompress with OpenJPEG 2.1.0 on CentOS 7.2.1511 crashes with a segmentation fault (signal 11) in opj_t1_decode_cblks().

To reproduce, the large JP2 can be downloaded at http://hirise-pds.lpl.arizona.edu/PDS/RDR/ESP/ORB_028000_028099/ESP_028011_2055/ESP_028011_2055_RED.JP2

opj2_dump -i ESP_028011_2055_RED.JP2

gives:

[INFO] Start to read j2k main header (678).
[INFO] Main header has been correctly decoded.
Image info {
     x0=0, y0=0
     x1=28260, y1=52834
     numcomps=1
         component 0 {
         dx=1, dy=1
         prec=10
         sgnd=0
    }
}
Codestream info from main header: {
     tx0=0, ty0=0
     tdx=28260, tdy=52834
     tw=1, th=1
     default tile {
         csty=0x1
         prg=0x3
         numlayers=1
         mct=0
         comp 0 {
             csty=0x1
             numresolutions=10
             cblkw=2^6
             cblkh=2^6
             cblksty=0
             qmfbid=1
             preccintsize (w,h)=(8,8) (8,8) (8,8) (8,8) (8,8) (8,8) (8,8) (8,8) (8,8) (8,8) 
             qntsty=0
             numgbits=1
             stepsizes (m,e)=(0,11) (0,12) (0,12) (0,13) (0,12) (0,12) (0,13) (0,12) (0,12) (0,13) (0,12) (0,12) (0,13) (0,12) (0,12) (0,13) (0,12) (0,12) (0,13) (0,12) (0,12) (0,13) (0,12) (0,12) (0,12) (0,11) (0,11) (0,12) 
             roishift=0
         }
     }
}
Codestream index from main header: {
     Main header start position=678
     Main header end position=1010
     Marker list: {
         type=0xff4f, pos=678, len=2
         type=0xff51, pos=680, len=43
         type=0xff52, pos=723, len=24
         type=0xff5c, pos=747, len=33
         type=0xff64, pos=780, len=49
         type=0xff64, pos=829, len=56
         type=0xff64, pos=885, len=19
         type=0xff64, pos=904, len=94
         type=0xff55, pos=998, len=12
     }
}


swalterfub avatar Mar 07 '16 17:03 swalterfub

OpenJPEG git master does not crash on Debian GNU/Linux stretch, but fails with a memory error:

 [INFO] Start to read j2k main header (678).
 [INFO] Main header has been correctly decoded.
 [INFO] No decoded area parameters, set the decoded area to the whole image
 [ERROR] Not enough memory for tile data
 [ERROR] Cannot decode tile, memory error
 [ERROR] Failed to decode the codestream in the JP2 file
 ERROR -> opj_decompress: failed to decode image!

My host has 6 GB of memory available.

stweil avatar Mar 07 '16 18:03 stweil

The code in opj_tcd_init_tile includes a size limit of 1073741823 (< 1 GiB):

736         if ((((OPJ_UINT32)-1) / (OPJ_UINT32)sizeof(OPJ_UINT32)) < l_data_size) {
737             opj_event_msg(manager, EVT_ERROR, "Not enough memory for tile data\n");
738             return OPJ_FALSE;
739         }

l_data_size is 1493088840 in your test case. So it is not a limitation of my system but of the current implementation: increasing RAM would not help.

stweil avatar Mar 07 '16 18:03 stweil

ESP_028011_2055_RED.JP2 has size 811'792'452 B.

I am on '/sources':

Filesystem  1K-blocks      Used  Available Use% Mounted on
           1607579532 137528796 1388367360  10% /sources

I have just downloaded openjpeg-master.

bin/opj_decompress -i ../../ESP_028011_2055_RED.JP2 -o ESP_028011_2055_RED.JP2.png

 [INFO] Start to read j2k main header (678).
 [INFO] Main header has been correctly decoded.
 [INFO] No decoded area parameters, set the decoded area to the whole image
 [ERROR] Not enough memory for tile data
 [ERROR] Cannot decode tile, memory error
 [ERROR] Failed to decode the codestream in the JP2 file
 ERROR -> opj_decompress: failed to decode image!

Main memory is 16 GB.

kdu_expand -i ESP_028011_2055_RED.JP2 -o ESP_028011_2055_RED.JP2.tif

Copying Geo box info, size = 499

Consumed 1 tile-part(s) from a total of 1 tile(s).
Consumed 811,791,772 codestream bytes (excluding any file format) = 4.349597 bits/pel.
Processed using the multi-threaded environment, with 6 parallel threads of execution

ESP_028011_2055_RED.JP2.tif has size 1'866'362'507 B.

winfried

szukw000 avatar Mar 07 '16 19:03 szukw000

Thanks for pointing me in the right direction. There seems to be a hard-coded 4 GiB tile limit in the library. There is no way to remove the limit from the original source, but I stumbled across this fork of OpenJPEG: https://github.com/GrokImageCompression/ronin. Commenting out lines 738-741 of src/lib/openjp2/tcd.cpp led to a working solution! I don't know why there is a 32-bit int limit there, but the code seems to work without it. Maybe it should be changed to a 64-bit int limit?

Sebastian

swalterfub avatar Mar 08 '16 09:03 swalterfub

By the way, removing the 32-bit int limit causes one unit test to fail. Since only one unit test fails, I think it is safe to remove this limit, but we must investigate why the unit test fails.

boxerab avatar Mar 23 '16 01:03 boxerab

This is a 10 bit monochrome file.

Decompression seems to work fine. I decompressed to TIFF, but not all viewers will correctly display 10 bit monochrome TIFF files.

You can use graphics magick to convert to 12 bits:

gm convert foo.tif -define tiff:bits-per-sample=12 bar.tif

boxerab avatar Mar 23 '16 02:03 boxerab

I'm afraid that if you simply remove the limiting code, the resulting code will decompress only some part of the original image.

I just tried to decompress the large file with a modified OpenJPEG (opj_decompress -i ESP_028011_2055_RED.JP2 -o image.png). After about 30 minutes, during which up to 95 % of my 6 GB of RAM were in use, the program tried to allocate even more memory, but was killed by the Linux kernel. The last function which was called but did not return was opj_tcd_update_tile_data.

stweil avatar Mar 23 '16 19:03 stweil

Yes, true for OpenJPEG. The Ronin project decompresses the entire image. So, this limitation has been permanently removed in that project.

boxerab avatar Mar 23 '16 20:03 boxerab

How much RAM does it use? How long does it take?

stweil avatar Mar 23 '16 20:03 stweil

The Ronin project also uses a uint32_t l_tile_data_size, which is not sufficient for very large files. The expression l_tile_data_size *= (uint32_t)sizeof(uint32_t); will now overflow after removal of the overflow check. Are you sure that the entire image was decompressed? That seems very strange.

stweil avatar Mar 23 '16 20:03 stweil

I managed to decompress the file with a modified OpenJPEG. The process needed up to 16 GB of virtual memory and up to 6 GB of physical memory (I had added swap space for a total of 24 GB for this test on my machine). The resulting file takes 1.8 GB:

-rw-r--r-- 1 stefan stefan 1866783868 Mär 23 22:25 image.tif

Which viewer is suggested for such large files?

stweil avatar Mar 23 '16 21:03 stweil

You could use QGIS, http://www.qgis.org, but for browsing and zooming in and out of the image, you should add some pyramids to the tif first, e.g. using GDAL:

gdaladdo -ro image.tif 2 4 8 16 32

swalterfub avatar Mar 23 '16 22:03 swalterfub

@stweil Yes, I saw similar memory usage. Since I am on windows, I converted to 12 bit TIFF and used the standard windows viewer. QGIS would of course give a nicer interface.

boxerab avatar Mar 23 '16 23:03 boxerab

This is a 10-bit, mono, 1.5-gigapixel image. So, ideally, memory usage should be approximately

750 MB (compressed image) + 6 GB (uncompressed image stored in 32-bit fixed point)

The fact that memory grows to twice this value indicates a lot of wasted memory in the library.

boxerab avatar Mar 24 '16 11:03 boxerab

@boxerab there is nothing ideal in your description. Memory consumption should be limited to what is actually needed (~code block height x scanline length).

malaterre avatar Mar 24 '16 13:03 malaterre

@malaterre try doing a full resolution forward DWT transform with ( ~code block height x scanline length) memory - can't be done. You need at least (image width * image height * sizeof(int32)) bytes for the decompressed image, pre-DWT.

boxerab avatar Mar 24 '16 14:03 boxerab

@stweil due to memory optimizations, Ronin library uses under 7 GB of memory to decompress this file. Because there is only one tile, it is possible to avoid copying the data to secondary buffers.

boxerab avatar Mar 24 '16 17:03 boxerab

By the way, removing the 32-bit int limit causes one unit test to fail. Since only one unit test fails, I think it is safe to remove this limit, but we must investigate why the unit test fails.

The failing test is for issue432.jp2. That file was expected to fail at decoding because of the memory limit. Now, with a higher memory limit, it can be decoded which is unexpected, therefore the test fails.

Decoding issue432.jp2 needs a lot of memory and time: the total test time increased from 248 s to 430 s (maybe because my host has only 6 GB of RAM and needs a lot of swapping during the decode process). Decoding will still fail on 32-bit hosts (where the memory limit still applies) and also on 64-bit hosts without enough memory.

stweil avatar Mar 25 '16 09:03 stweil

PR #730 fixes this issue.

stweil avatar Mar 25 '16 09:03 stweil

For issue432.jp2, kakadu reports "illegal inclusion tag tree encountered while decoding a packet header", so we shouldn't rely on artificial memory limit to fail this test. Someone needs to investigate the tag tree.

boxerab avatar Mar 25 '16 13:03 boxerab

Indeed, jpylyzer also reports an invalid JP2. It names two reasons: foundExpectedNumberOfTiles and heightConsistentWithSIZ are both False. These tests should be added to OpenJPEG, too.

stweil avatar Mar 25 '16 16:03 stweil

Awesome, I love jpylyzer. (Although when I ran the program through pylint, it got a 5.9/10 code quality score.) Anyway, good idea about adding the test. The heightConsistentWithSIZ error makes sense: it looks like the x dimension is ~3000 and the y dimension is exactly 100000, and this is why it was failing the memory check.

boxerab avatar Mar 25 '16 16:03 boxerab

This is now fixed by https://github.com/uclouvain/openjpeg/pull/1010

for issue432.jp2. That file was expected to fail at decoding because of the memory limit. Now, with a higher memory limit, it can be decoded which is unexpected, therefore the test fails.

kakadu reports "illegal inclusion tag tree encountered while decoding a packet header",

It should decode reproducibly.

In fact it cannot be decoded, unless you extract the bitstream! https://github.com/uclouvain/openjpeg/issues/432#issuecomment-328206851

895b5a311e96f458e3c058f2f50f4fa3 issue432.jp2_0.pgx (issue432.jp2.pgx is already in the code).

ValZapod avatar May 11 '22 12:05 ValZapod