CImg
                                
                                
                                
                                    CImg copied to clipboard
                            
                            
                            
                        Undefined behaviour and correctness discrepancy while parsing a BMP file
Hi CImg :wave:
We found an undefined behaviour bug in CImg triggered when parsing a BMP file. The BMP is correct according to CImg; no exceptions or crashes when parsed without sanitisers. However, ImageMagick's identify flags the BMP as incorrect.
After debugging, we understand this correctness discrepancy is triggering the undefined behaviour.
Bug
CImg.h:54462:39: runtime error: signed integer overflow: 1048616 - -2147483608 cannot be represented in type 'int' 
The UB happens in function CImg<T>& _load_bmp(std::FILE *const file, const char *const filename)   and the line is:
const int xoffset = offset - 14 - header_size - 4*nb_colors;
offset has a value of 1048630 and header_size of -2147483608 while file size is 238.
Possible fix
offset and header_size could be compared to file_size to check that they are within range.
If this sounds reasonable, I can provide a PR for it.
How to reproduce:
#include "CImg.h"
using namespace cimg_library;
int main(int argc,char **argv) {
  const char *const file_i = cimg_option("-i",(char*)0,"Input image");  
  CImg<unsigned char> image(file_i);
  return 0;
}
testcase.zip (Github does not allow uploading BMP files)
This has been tested with clang++-14.
unzip testcase.zip testcase.bmp
clang++ -g -O0 -fsanitize=undefined,address -fno-sanitize-recover=all -Dcimg_display=0 main.cpp
./a.out -i testcase.bmp
Discrepancy with identify
identify -verbose testcase.bmp reports
identify-im6.q16: length and filesize do not match testcase.bmp @ error/bmp.c/ReadBMPImage/854. 
identify-im6.q16: insufficient image data in file testcase.bmp @ error/bmp.c/ReadBMPImage/1001.
We have discovered additional BMP files triggering discrepancies with identify, like above, meaning that CImg may wrongly consider them correct. Would you find it useful if we report them?
Thanks for sharing this project!
Best, Manuel.
Thanks for reporting. I've made a fix attempt : https://github.com/GreycLab/CImg/commit/55da1f4dc8256aa8846036a45d5dbb2bab1466c2 Could you tell me if that looks good?
Thanks for the quick response!
I think a fix should take this into account:
- A malformed BMP can provide at least inconsistent values for the  
offset,header_sizeandfile_sizevariables. Those variables can come from the input file's content. - A wrong combination of 
offsetandheader_sizecan lead to the reported UB. To fix this, we could check if they are in range with the file's size. 
I would propose a fix which performs these checks in this particular order:
- Sanitize 
file_sizeso it matches the actual file size. - Check that 
offsetandheader_sizeare consistent with the now sanitizedfile_size. 
From what I grasp, the fix is not sanitizing file_size. Also, it attempts to sanitize header_size using an unsanitized file_size. Last, it is not checking offset.
Here is my fix (on top of yours):
diff --git a/CImg.h b/CImg.h
index b5a934d..21db640 100644
--- a/CImg.h
+++ b/CImg.h
@@ -54419,6 +54419,11 @@ namespace cimg_library {
 
       const ulongT fsiz = file?(ulongT)cimg_max_buf_size:(ulongT)cimg::fsize(filename);
       std::FILE *const nfile = file?file:cimg::fopen(filename,"rb");
+      
+      cimg::fseek(nfile,0,SEEK_END);
+      cimg_long filesystem_size=cimg::ftell(nfile);
+      std::rewind(nfile);
+
       CImg<ucharT> header(54);
       cimg::fread(header._data,54,nfile);
       if (*header!='B' || header[1]!='M') {
@@ -54440,17 +54445,25 @@ namespace cimg_library {
         nb_colors = header[0x2E] + (header[0x2F]<<8) + (header[0x30]<<16) + (header[0x31]<<24),
         bpp = header[0x1C] + (header[0x1D]<<8);
 
+      if (file_size != filesystem_size){
+        throw CImgIOException(_cimg_instance
+                              "load_bmp(): Invalid file_size %i specified in filename '%s', expected %i.",
+                              cimg_instance,
+                              file_size,filename?filename:"(FILE*)", filesystem_size);
+      }
+
       if (header_size<0 || header_size>file_size)
         throw CImgIOException(_cimg_instance
                               "load_bmp(): Invalid header size %d specified in filename '%s'.",
                               cimg_instance,
                               header_size,filename?filename:"(FILE*)");
 
-      if (!file_size || file_size==offset) {
-        cimg::fseek(nfile,0,SEEK_END);
-        file_size = (int)cimg::ftell(nfile);
-        cimg::fseek(nfile,54,SEEK_SET);
-      }
+      if (offset<0 || offset>file_size)
+        throw CImgIOException(_cimg_instance
+                              "load_bmp(): Invalid offset %i specified in filename '%s'.",
+                              cimg_instance,
+                              offset,filename?filename:"(FILE*)");
+
       if (header_size>40) cimg::fseek(nfile,header_size - 40,SEEK_CUR);
 
       const int
From what I tested, this fixes the UB and should also fix the first reported discrepancy with identify.
The second discrepancy with ImageMagick seems to be about the image pixels' data. ImageMagick ensures that the total number of pixels read is consistent with the width and height.
I could also try to provide a patch if you like.
Best, Manuel.
Thanks. Here is my proposal then :
https://github.com/GreycLab/CImg/commit/009f0432fd32655f9399f2ec2fd5380c44e49a8e
Variable fsize actually already estimates the file size, with the std::tell() procedure.
Thanks!
It looks good to me for the case in which a file path is provided instead of a FILE*.
The issue is with fsiz is that it is computed as const ulongT fsiz = file?(ulongT)cimg_max_buf_size:(ulongT)cimg::fsize(filename);.
If a file pointer is provided (file is not NULL), you could still have issues similar to those reported here.
fsiz is set to cimg_max_buf_size, which could be higher than the actual size, thus allowing incorrect values for offset and header_size again.  cimg_max_buf_size seems high enough to allow the UB to happen again.
Best, Manuel.
Indeed, so https://github.com/GreycLab/CImg/commit/703ffcaf470d426e333f8349760957119590d38a should help.
Yes, it looks good to me in regard to the UB :+1:
A minor comment would be that the new fsize is not restoring the original file position. In our case, this is not an issue but it may be if used elsewhere.
I would save the position at the beginning long originalPos = std::ftell(file); and restore before returning     std::fseek(file, originalPos, SEEK_SET);
Best, Manuel.
Indeed, thanks ! I've made the changes.
Regards,
David.