codeql icon indicating copy to clipboard operation
codeql copied to clipboard

C++: Decompression Bombs

Open am0o0 opened this issue 2 years ago • 22 comments

This is part of All for one, one for all query submission, I'm going to submit an issue in github/securitylab for this pull request too. this query will be upgraded more in this week whether in this pull request or in another pull request, currently I'm adding minizip, xz, bzip2, zstd, LZ4 related libraries. I'm trying my best to add proper sanitizers too if there is any protection for each function as some functions are unsafe by default and can not be controlled during decompression.

am0o0 avatar Jun 25 '23 10:06 am0o0

Anyone not watching the repo in general, note this is part of a family of submissions:

https://github.com/github/codeql/pull/13553 https://github.com/github/codeql/pull/13554 https://github.com/github/codeql/pull/13555 https://github.com/github/codeql/pull/13556 https://github.com/github/codeql/pull/13557 https://github.com/github/codeql/pull/13558 https://github.com/github/codeql/pull/13560

am0o0 avatar Jun 25 '23 10:06 am0o0

QHelp previews:

cpp/ql/src/experimental/Security/CWE/CWE-409/DecompressionBombs.qhelp

User-controlled file decompression

Extracting Compressed files with any compression algorithm like gzip can cause denial of service attacks.

Attackers can compress a huge file consisting of repeated similiar bytes into a small compressed file.

Recommendation

When you want to decompress a user-provided compressed file you must be careful about the decompression ratio or read these files within a loop byte by byte to be able to manage the decompressed size in each cycle of the loop.

Example

Reading an uncompressed Gzip file within a loop and check for a threshold size in each cycle.

#include "zlib.h"

void SafeGzread(gzFile inFileZ) {
    const int MAX_READ = 1024 * 1024 * 4;
    const int BUFFER_SIZE = 8192;
    unsigned char unzipBuffer[BUFFER_SIZE];
    unsigned int unzippedBytes;
    unsigned int totalRead = 0;
    while (true) {
        unzippedBytes = gzread(inFileZ, unzipBuffer, BUFFER_SIZE);
        totalRead += unzippedBytes;
        if (unzippedBytes <= 0) {
            break;
        }

        if (totalRead > MAX_READ) {
            // Possible decompression bomb, stop processing.
            break;
        } else {
            // process buffer
        }
    }
}

The following example is unsafe, as we do not check the uncompressed size.

#include "zlib.h"

void UnsafeGzread(gzFile inFileZ) {
    const int BUFFER_SIZE = 8192;
    unsigned char unzipBuffer[BUFFER_SIZE];
    unsigned int unzippedBytes;
    while (true) {
        unzippedBytes = gzread(inFileZ, unzipBuffer, BUFFER_SIZE);
        if (unzippedBytes <= 0) {
            break;
        }

        // process buffer
    }
}

References

github-actions[bot] avatar Jun 25 '23 18:06 github-actions[bot]

Hi, I've completed the work on this query and I don't have any further updates/commits here.

am0o0 avatar Jul 03 '23 21:07 am0o0

Hi, currently for most of the libraries there are no sanitizers because I didn't know how can I set a limit on output buffer length for many of them, I think there is no way to control the output resource for some methods of these libs, the other problem is that being able to write proper sanitizer too which I succeeded to write one for gzread method. it seems that there must be sanitizers otherwise every usage of these decompression methods can be mark as vulnerable, however I think many of them are not checking the decompressed output size. also as there are some void* as source in taint tracking config, I must find the longest path that the inner nodes of this path has not appeared in other paths as source. I got help from one of your colleges but I didn't implement this yet.

am0o0 avatar Aug 10 '23 17:08 am0o0

@am0o0 how is this going? would you like me to open this PR to get the CodeQL team to look at it?

Kwstubbs avatar May 24 '24 05:05 Kwstubbs

@Kwstubbs yes, I can also open this PR if you can't.

am0o0 avatar May 24 '24 06:05 am0o0

@am0o0 sounds good please do

Kwstubbs avatar May 24 '24 06:05 Kwstubbs

pinging @codeql-c-analysis query is ready for review.

Kwstubbs avatar Jun 24 '24 21:06 Kwstubbs

@jketema this directory doesn't exist:

The query should be relocated to cpp/ql/src/experimental/query-tests/Security/CWE/CWE-409

am0o0 avatar Jun 25 '24 15:06 am0o0

@jketema this directory doesn't exist:

The query should be relocated to cpp/ql/src/experimental/query-tests/Security/CWE/CWE-409

Feel free to create it as part of this PR.

jketema avatar Jun 25 '24 15:06 jketema

Hi, as I don't have enough experience with C++ package/module systems, It takes some time to implement tests.

am0o0 avatar Jun 26 '24 10:06 am0o0

Hi, as I don't have enough experience with C++ package/module systems, It takes some time to implement tests.

Unfortunately you can't import external code into the C++ language tests, they have to be self-contained (so that they can run reliably anywhere).

You should be able to copy-paste the example code into a new test directory; delete the #includes and stub any library functions / types it uses instead (so that it compiles in isolation); then add a .qlref referencing the query; and get the test runner to generate a .expected file containing the output as it is currently. Such a test will help us all understand where the query is - what it's doing well and any areas it falls short (which is OK to expose in a test) - and the effect of changes made to the query.

I'm happy to talk you through this process step-by-step, or help if you get stuck.

geoffw0 avatar Jun 26 '24 14:06 geoffw0

@geoffw0 thank you a lot for helping me on this! can I start writing tests based on this test file "cpp/ql/test/experimental/query-tests/Security/CWE/CWE-078/test.cpp"?

am0o0 avatar Jun 26 '24 16:06 am0o0

can I start writing tests based on this test file "cpp/ql/test/experimental/query-tests/Security/CWE/CWE-078/test.cpp"?

I don't think there's much you'll need from that particular file besides perhaps a stub definition of size_t, but yes, you can use what you find in the other tests such as this one. You don't need to provide a main by the way - the .cpp files in tests are never actually run, only compiled and analyzed. So in practice you'll probably just have the code from your existing examples + stubs of stuff like strlen, z_stream etc so it compiles in isolation (none of the stubs need to do anything besides having any details your query is actually looking for).

geoffw0 avatar Jun 26 '24 17:06 geoffw0

@geoffw0 Hii, I need a little help now, before asking help I'm sorry for the delay, I forgot to continue working on this PR for a while :))

the stub is OK but I can't get the expected results, how can I create a codeql database from stubs?

am0o0 avatar Jul 14 '24 19:07 am0o0

I simply run codeql database create --language=cpp --source-root=. DB --overwrite in a directory that contains the cpp file with stubs but I'm encountering with following errors:


[2024-07-15 08:17:54] [build-stdout] [ 50%] Linking CXX executable bombsMain
[2024-07-15 08:17:54] [build-stdout] /home/am/.local/lib/python3.10/site-packages/cmake/data/bin/cmake -E cmake_link_script CMakeFiles/bombsMain.dir/link.txt --verbose=1
[2024-07-15 08:17:54] [build-stdout] /home/am/Cpp/Bombs_Zlib/DB/working/autobuild/bin/c++ -O3 -DNDEBUG CMakeFiles/bombsMain.dir/main.cpp.o CMakeFiles/bombsMain.dir/zlibTest.cpp.o CMakeFiles/bombsMain.dir/test.cpp.o -o bombsMain
[2024-07-15 08:17:54] [build-stdout] /usr/bin/ld: /usr/lib/gcc/x86_64-linux-gnu/11/../../../x86_64-linux-gnu/Scrt1.o: in function `_start':
[2024-07-15 08:17:54] [build-stdout] (.text+0x1b): undefined reference to `main'
[2024-07-15 08:17:54] [build-stdout] /usr/bin/ld: CMakeFiles/bombsMain.dir/zlibTest.cpp.o: in function `UnsafeInflate(int, char**)':
[2024-07-15 08:17:54] [build-stdout] zlibTest.cpp:(.text+0xc6): undefined reference to `deflateInit(z_stream*, int)'
[2024-07-15 08:17:54] [build-stdout] /usr/bin/ld: zlibTest.cpp:(.text+0xd3): undefined reference to `deflate(z_stream*, int)'
[2024-07-15 08:17:54] [build-stdout] /usr/bin/ld: zlibTest.cpp:(.text+0xdb): undefined reference to `deflateEnd(z_stream*)'
[2024-07-15 08:17:54] [build-stdout] /usr/bin/ld: zlibTest.cpp:(.text+0x118): undefined reference to `inflateInit(z_stream*)'
[2024-07-15 08:17:54] [build-stdout] /usr/bin/ld: zlibTest.cpp:(.text+0x122): undefined reference to `inflate(z_stream*, int)'
[2024-07-15 08:17:54] [build-stdout] /usr/bin/ld: zlibTest.cpp:(.text+0x12a): undefined reference to `inflateEnd(z_stream*)'
[2024-07-15 08:17:54] [build-stdout] /usr/bin/ld: CMakeFiles/bombsMain.dir/zlibTest.cpp.o: in function `UnsafeGzread(char**)':
[2024-07-15 08:17:54] [build-stdout] zlibTest.cpp:(.text+0x1a4): undefined reference to `send(int, void const*, int, int)'
[2024-07-15 08:17:54] [build-stdout] /usr/bin/ld: zlibTest.cpp:(.text+0x1ba): undefined reference to `gzopen(char*, char const*)'
[2024-07-15 08:17:54] [build-stdout] /usr/bin/ld: zlibTest.cpp:(.text+0x1c7): undefined reference to `operator<<(std::ostream const&, unsigned char)'
[2024-07-15 08:17:54] [build-stdout] /usr/bin/ld: zlibTest.cpp:(.text+0x1d4): undefined reference to `gzread(gzFile, unsigned char*, int)'
[2024-07-15 08:17:54] [build-stdout] /usr/bin/ld: zlibTest.cpp:(.text+0x1dd): undefined reference to `gzclose(gzFile)'
[2024-07-15 08:17:54] [build-stdout] /usr/bin/ld: CMakeFiles/bombsMain.dir/zlibTest.cpp.o: in function `UnsafeGzfread(char**)':
[2024-07-15 08:17:54] [build-stdout] zlibTest.cpp:(.text+0x23f): undefined reference to `send(int, void const*, int, int)'
[2024-07-15 08:17:54] [build-stdout] /usr/bin/ld: zlibTest.cpp:(.text+0x24e): undefined reference to `gzopen(char*, char const*)'
[2024-07-15 08:17:54] [build-stdout] /usr/bin/ld: zlibTest.cpp:(.text+0x266): undefined reference to `gzfread(char*, int, int, gzFile)'
[2024-07-15 08:17:54] [build-stdout] /usr/bin/ld: zlibTest.cpp:(.text+0x26f): undefined reference to `gzclose(gzFile)'
[2024-07-15 08:17:54] [build-stdout] /usr/bin/ld: CMakeFiles/bombsMain.dir/zlibTest.cpp.o: in function `UnsafeGzgets(char**)':
[2024-07-15 08:17:54] [build-stdout] zlibTest.cpp:(.text+0x2b6): undefined reference to `send(int, void const*, int, int)'
[2024-07-15 08:17:54] [build-stdout] /usr/bin/ld: zlibTest.cpp:(.text+0x2c5): undefined reference to `gzopen(char*, char const*)'
[2024-07-15 08:17:54] [build-stdout] /usr/bin/ld: zlibTest.cpp:(.text+0x2df): undefined reference to `gzgets(gzFile, char*, int)'
[2024-07-15 08:17:54] [build-stdout] /usr/bin/ld: zlibTest.cpp:(.text+0x2f1): undefined reference to `gzgets(gzFile, char*, int)'
[2024-07-15 08:17:54] [build-stdout] collect2: error: ld returned 1 exit status
[2024-07-15 08:17:54] [build-stdout] make[2]: *** [CMakeFiles/bombsMain.dir/build.make:132: bombsMain] Error 1
[2024-07-15 08:17:54] [build-stdout] make[2]: Leaving directory '/home/am/Cpp/Bombs_Zlib/_codeql_build_dir'
[2024-07-15 08:17:54] [build-stdout] make[1]: *** [CMakeFiles/Makefile2:86: CMakeFiles/bombsMain.dir/all] Error 2
[2024-07-15 08:17:54] [build-stdout] make[1]: Leaving directory '/home/am/Cpp/Bombs_Zlib/_codeql_build_dir'
[2024-07-15 08:17:54] [build-stdout] make: *** [Makefile:94: all] Error 2
[2024-07-15 08:17:54] [build-stdout] ~/Cpp/Bombs_Zlib
[2024-07-15 08:17:54] [build-stderr] cpp/autobuilder: No supported build command succeeded.
[2024-07-15 08:17:54] [build-stderr] cpp/autobuilder: autobuild summary.
[2024-07-15 08:17:55] [ERROR] Spawned process exited abnormally (code 1; tried to run: [/home/am/CodeQL-home/tools/linux64/preload_tracer, /home/am/CodeQL-home/cpp/tools/autobuild.sh])
A fatal error occurred: Exit status 1 from command: [/home/am/CodeQL-home/cpp/tools/autobuild.sh]

am0o0 avatar Jul 15 '24 06:07 am0o0

You want to run the tests you're adding with codeql test run instead of creating a database directly. See: https://docs.github.com/en/code-security/codeql-cli/codeql-cli-manual/test-run

jketema avatar Jul 15 '24 06:07 jketema

@jketema I need to create a database to debug my query.

am0o0 avatar Jul 15 '24 06:07 am0o0

Right, in that case, you'll need to specify an explicit build command to codeql database create with the -c flag. That build command should not attempt any linking.

jketema avatar Jul 15 '24 07:07 jketema

@jketema thank you a lot! the sources were incorrect and I fixed it now, also sources and sinks can be found but the path query is still not working, I should debug this ASAP. if you want I can push the new test cases now.

am0o0 avatar Jul 15 '24 08:07 am0o0

Yep, push the test cases so we can see what's going on / if anything doesn't look right.

geoffw0 avatar Jul 15 '24 10:07 geoffw0

as you can see in this screenshot, the additional flow step doesn't work for gzopen. ( I set the isSink to any() so I can debug the query) image

am0o0 avatar Jul 30 '24 11:07 am0o0

Hi, I decided to remove some libraries because the libraries don't have enough popularity and/or are outdated and don't have proper documentation. Also, the main reason is that I don't have enough time.

Also, I failed to finalize the ZSTD tests because I thought I needed a special flow step that I couldn't write.

am0o0 avatar Sep 03 '24 13:09 am0o0

Also, I failed to finalize the ZSTD tests because I thought I needed a special flow step that I couldn't write.

Do you mean you included it, but that it doesn't work as expected at the moment?

jketema avatar Sep 03 '24 14:09 jketema

Do you mean you included it, but that it doesn't work as expected at the moment?

yes

am0o0 avatar Sep 03 '24 16:09 am0o0

@am0o0 I've opened https://github.com/am0o0/codeql/pull/1 to do some clean up on this. If you're happy with the changes there, please merge and the changes should propagate to here. Given that you've indicated that you have little time, and since the bounty program is ending, this seems to be the best way forward to get this PR merged.

Also, I failed to finalize the ZSTD tests because I thought I needed a special flow step that I couldn't write.

Could you point me to the part of the code where you need an additional flow step, then I can have a look.

jketema avatar Sep 04 '24 12:09 jketema

@jketema thank you alot for helping on this PR!

am0o0 avatar Sep 04 '24 14:09 am0o0

You're welcome. How about the following?

Also, I failed to finalize the ZSTD tests because I thought I needed a special flow step that I couldn't write.

Could you point me to the part of the code where you need an additional flow step, then I can have a look.

jketema avatar Sep 04 '24 15:09 jketema

@jketema I created a review of the tests. I mentioned you to receive a notification. IDK why the review doesn't exist in the main in here.

am0o0 avatar Sep 04 '24 15:09 am0o0

@jketema I created a review of the tests. I mentioned you to receive a notification. IDK why the review doesn't exist in the main in here.

I haven't received any notifications (as of yet).

jketema avatar Sep 04 '24 15:09 jketema