CVE-2021-44228_scanner
Detect ZLIB Compression & scan all ZLIB compressed files instead of just *.jar, *.ear, *.war, *.zip
Sequel to #4:
Relying on the file extension isn't reliable, at least in theory. Check this post by @DidierStevens:
https://isc.sans.edu/forums/diary/Recognizing+ZLIB+Compression/25182/
Unfortunately I couldn't quickly find a way to do this using only native PowerShell. I'm sure it can be done, possibly by importing some .NET types, but that's not something I could figure out rapidly. There is, however, a Python option:
https://blog.didierstevens.com/2018/10/28/update-file-magic-py-version-0-0-4/
Yeah, it's possible. But I suppose I'd probably want to see an example of a real-world application that is vulnerable to the CVE before making a change that would increase both the time it takes to scan and also the attack surface of the scanner itself.
I'd probably want to see an example of a real-world application that is vulnerable to the CVE
Well, we ain't gonna find such an example if we can't easily hunt for one, but I also agree with everything else you said, so, catch-22...
I'll test it on a local data set to see if there is a delta between what it currently does...
Although now that I think about it, zip detection may only make sense on a top-level container. Because the checkers are inherently recursive in looking for jars within jars, this change would involve extracting every single file from every single archive just to see if it's zip-based. I struggle to think of a case where that would be worth it.
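To make the cost concrete, here is a rough sketch of what content-based detection inside an archive would involve. It is written in Java only to match the snippet later in this thread, and `NestedZipSniffer` and `findZipLikeEntries` are hypothetical names, not part of any scanner mentioned here. It assumes the outer container is readable with `java.util.zip.ZipInputStream` and, as a simplification, buffers each entry fully in memory.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper, for illustration only.
final class NestedZipSniffer {

    /** Returns paths of entries, at any depth, whose content starts with "PK\003\004". */
    static List<String> findZipLikeEntries(InputStream in, String prefix) throws IOException {
        List<String> hits = new ArrayList<>();
        ZipInputStream zin = new ZipInputStream(in);
        ZipEntry entry;
        while ((entry = zin.getNextEntry()) != null) {
            if (entry.isDirectory()) {
                continue;
            }
            // Simplification: the whole entry is decompressed into memory,
            // regardless of its name or extension.
            byte[] body = readAll(zin);
            if (startsWithZipSignature(body)) {
                String path = prefix + entry.getName();
                hits.add(path);
                // Recurse into the nested archive.
                hits.addAll(findZipLikeEntries(new ByteArrayInputStream(body), path + "!/"));
            }
        }
        return hits;
    }

    static boolean startsWithZipSignature(byte[] b) {
        return b.length >= 4 && b[0] == 0x50 && b[1] == 0x4B && b[2] == 3 && b[3] == 4;
    }

    private static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }
}
```

Every entry gets decompressed here whatever its extension, which is the overhead in question. An extension-based check can skip most entries without reading their contents.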
You can scan every file for the zip magic-number. If it contains this byte sequence anywhere, then there's a very good chance you've got a zip file. Here's my Java code for doing this from my scanner (https://github.com/mergebase/log4j-detector).
Personally I find it surprising how many files are actually zip files. (e.g., did you know *.nupkg are actually zip files !?!)
Over in our scanner we don't end up doing this. We only look for the magic number because Java's ZipInputStream reads from the beginning (most zip parsers start from the end), and reading from the beginning is a problem with those crazy Spring Boot executable jars, so the magic number's position gives us the number of bytes to skip.
// True if the four bytes match the zip local file header signature: 'P' 'K' 0x03 0x04.
private static boolean isZipSentinel(int[] chunk) {
    return chunk[0] == 0x50 && chunk[1] == 0x4B && chunk[2] == 3 && chunk[3] == 4;
}
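For anyone who wants to try the "number of bytes to skip" idea, here is a minimal sketch of how a check like `isZipSentinel` could be slid across a stream to find the offset of the first zip signature. The class `ZipOffsetFinder` and the method `findZipOffset` are hypothetical names made up for this example. This is not the actual log4j-detector implementation.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, for illustration only.
final class ZipOffsetFinder {

    /**
     * Returns the offset of the first "PK\003\004" signature in the stream,
     * or -1 if the stream ends without one.
     */
    static long findZipOffset(InputStream in) throws IOException {
        int[] chunk = new int[4]; // sliding window over the last four bytes read
        long bytesRead = 0;
        int b;
        while ((b = in.read()) != -1) {
            // Shift the window left by one and append the new byte.
            chunk[0] = chunk[1];
            chunk[1] = chunk[2];
            chunk[2] = chunk[3];
            chunk[3] = b;
            bytesRead++;
            if (bytesRead >= 4 && isZipSentinel(chunk)) {
                return bytesRead - 4;
            }
        }
        return -1;
    }

    private static boolean isZipSentinel(int[] chunk) {
        return chunk[0] == 0x50 && chunk[1] == 0x4B && chunk[2] == 3 && chunk[3] == 4;
    }
}
```

A caller would wrap the file in a `BufferedInputStream`, since this reads one byte at a time, then reopen the file, skip the returned number of bytes, and hand the rest to `ZipInputStream`. A match only says that the four bytes occur at that position. Data that is not a zip can contain the same sequence by coincidence.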
@juliusmusseau you might want to check that detection code against the post by @DidierStevens which I linked earlier. If I didn't misread it (which I might have), his covers a few more bases.
@no-identd - his code is a bit different, since he's detecting zlib compression, whereas I am trying to detect zip files. A zip file is an archive that includes an index of its entries; zlib compression only wraps a single compressed stream and does not include an index.
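To illustrate the difference, here is a small sketch that checks the first bytes of a buffer for each format. The zlib check follows the two-byte header layout in RFC 1950 (compression method 8, window-size field at most 7, and the 16-bit header value divisible by 31). It is not necessarily what the linked post does. `CompressionSniffer` and its methods are hypothetical names.

```java
// Hypothetical helper, for illustration only.
final class CompressionSniffer {

    /** True if the first two bytes form a plausible zlib (RFC 1950) header. */
    static boolean looksLikeZlib(byte[] head) {
        if (head.length < 2) {
            return false;
        }
        int cmf = head[0] & 0xFF;
        int flg = head[1] & 0xFF;
        boolean deflate = (cmf & 0x0F) == 8;             // CM = 8 means deflate
        boolean windowOk = (cmf >> 4) <= 7;              // CINFO above 7 is not allowed
        boolean checkOk = ((cmf << 8) | flg) % 31 == 0;  // FCHECK makes the header a multiple of 31
        return deflate && windowOk && checkOk;
    }

    /** True if the buffer starts with the zip local file header signature. */
    static boolean looksLikeZip(byte[] head) {
        return head.length >= 4
                && head[0] == 0x50 && head[1] == 0x4B && head[2] == 3 && head[3] == 4;
    }
}
```

A two-byte check like this produces false positives on arbitrary data, so it is only a hint that a zlib stream may follow. The four-byte zip signature is more specific but, as noted above, identifies a different thing.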
@juliusmusseau Have you seen a real-world case where log4j is present in a usable form outside of a jar/ear/war/zip?