Bypassing Zeek's file extraction processor by twisting Content-Type
Hello Zeek community!
Zeek (master and 4.1) corrupts a PE executable extracted from an HTTP response if the response carries an unexpected Content-Type. Here is a proof-of-concept PCAP with two HTTP responses returning the same file but with different Content-Types: the first with the correct Content-Type "application/x-msdownload", the second with "message/rfc822".
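For anyone who wants to regenerate similar traffic rather than rely on the attached PCAP, here is a minimal sketch of a test server that returns the same bytes under the two Content-Types. The two paths and the Content-Type values come from this report. `PAYLOAD`, `make_server`, and the loopback address are assumptions for illustration only, and the payload is a placeholder rather than a real PE file.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Placeholder bytes (assumption); a real test would serve an actual PE file.
PAYLOAD = b"MZ" + bytes(64)

# Path -> Content-Type, mirroring the two responses described in the report.
CONTENT_TYPES = {
    "/example.exe": "application/x-msdownload",
    "/example_twisted.exe": "message/rfc822",
}


class TwistedHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        content_type = CONTENT_TYPES.get(self.path)
        if content_type is None:
            self.send_error(404)
            return
        # Same body for both paths; only the Content-Type header differs.
        self.send_response(200)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(PAYLOAD)))
        self.end_headers()
        self.wfile.write(PAYLOAD)

    def log_message(self, format, *args):
        pass  # keep the console quiet during tests


def make_server(port: int = 0) -> HTTPServer:
    # Port 0 lets the OS pick a free port; bind to loopback only.
    return HTTPServer(("127.0.0.1", port), TwistedHandler)
```

Calling `make_server(8080).serve_forever()`, fetching both paths while capturing the traffic, and then replaying the capture through Zeek with file extraction enabled should give a comparable pair of extracted files to compare.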
Zeek extracts the first one correctly to:
/example.exe
dcb7bd00e64b07b676e61adc6182801d476b35524e57ac76395ce81dc6e45921
hexdump -C extract-1632487597.528663-HTTP-Fddoxgz03LJucdMWd | head -n 10
00000000 4d 5a 90 00 03 00 00 00 04 00 00 00 ff ff 00 00 |MZ..............|
00000010 b8 00 00 00 00 00 00 00 40 00 00 00 00 00 00 00 |........@.......|
00000020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000030 00 00 00 00 00 00 00 00 00 00 00 00 80 00 00 00 |................|
00000040 0e 1f ba 0e 00 b4 09 cd 21 b8 01 4c cd 21 54 68 |........!..L.!Th|
00000050 69 73 20 70 72 6f 67 72 61 6d 20 63 61 6e 6e 6f |is program canno|
00000060 74 20 62 65 20 72 75 6e 20 69 6e 20 44 4f 53 20 |t be run in DOS |
00000070 6d 6f 64 65 2e 0d 0d 0a 24 00 00 00 00 00 00 00 |mode....$.......|
00000080 50 45 00 00 4c 01 05 00 b4 3e e0 5a 00 16 00 00 |PE..L....>.Z....|
00000090 f8 01 00 00 e0 00 07 03 0b 01 02 38 00 0a 00 00 |...........8....|
...
The second one extracts to:
/example_twisted.exe
c012669f1ca703bd3d0bc5a85f6c989c46cf923db253ac2ce8d439a847848c6b
hexdump -C extract-1632487612.848029-HTTP-FMpTBX1I0eWzLLvf7f | head -n 10
00000000 24 00 00 00 00 00 00 00 50 45 00 00 4c 01 05 00 |$.......PE..L...|
00000010 b4 3e e0 5a 00 16 00 00 f8 01 00 00 e0 00 07 03 |.>.Z............|
00000020 0b 01 02 38 00 0d 0a 00 00 00 12 00 00 00 02 00 |...8............|
00000030 00 20 12 00 00 00 10 00 00 00 20 00 00 00 00 40 |. ........ ....@|
00000040 00 00 10 00 00 00 02 00 00 04 00 00 00 01 00 00 |................|
00000050 00 04 00 00 00 00 00 00 00 00 60 00 00 00 04 00 |..........`.....|
00000060 00 ec 55 00 00 03 00 00 00 00 00 20 00 00 10 00 |..U........ ....|
00000070 00 00 00 10 00 00 10 00 00 00 00 00 00 10 00 00 |................|
00000080 00 00 00 00 00 00 00 00 00 00 50 00 00 f4 02 00 |..........P.....|
00000090 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
...
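The two dumps can be lined up to see how the second file differs from the first. The sketch below uses only bytes transcribed from the hexdumps above; the variable names and the helper `common_prefix_len` are made up for illustration.

```python
# Bytes transcribed from the first dump, starting at offset 0x70.
ORIGINAL_FROM_0X70 = bytes.fromhex(
    "6d6f64652e0d0d0a2400000000000000"  # 0x70
    "504500004c010500b43ee05a00160000"  # 0x80
    "f8010000e00007030b010238000a0000"  # 0x90
)

# First 0x30 bytes of the second ("twisted") dump.
TWISTED_HEAD = bytes.fromhex(
    "2400000000000000504500004c010500"  # 0x00
    "b43ee05a00160000f8010000e0000703"  # 0x10
    "0b010238000d0a000000120000000200"  # 0x20
)


def common_prefix_len(a: bytes, b: bytes) -> int:
    """Return how many leading bytes a and b share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


# Where does the twisted file's first 8 bytes occur in the original?
idx = ORIGINAL_FROM_0X70.find(TWISTED_HEAD[:8])
start_offset = 0x70 + idx
aligned = ORIGINAL_FROM_0X70[idx:]

# How far do the two streams agree, and what differs at that point?
match_len = common_prefix_len(aligned, TWISTED_HEAD)

print(hex(start_offset))                          # 0x78
print(match_len)                                  # 37
print(aligned[match_len:match_len + 1])           # b'\n'
print(TWISTED_HEAD[match_len:match_len + 2])      # b'\r\n'
```

In the bytes shown, the second file starts at offset 0x78 of the original, right after the `0d 0d 0a` that ends the DOS stub message, and the bare LF at original offset 0x9d shows up as CRLF. That pattern would be consistent with line-oriented processing of the body, although the dumps alone do not show where in Zeek it happens.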
This is not exactly a bug, but malware writers can exploit this behavior to hide their second-stage payloads from sandboxes that extract payloads from PCAPs using Zeek.
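One way a pipeline might notice this kind of damage is a basic PE header check on each extracted file. This is a sketch only, assuming the pipeline has the extracted bytes in memory; `looks_like_pe` is a hypothetical helper, and the sample bytes are transcribed from the two hexdumps above.

```python
import struct


def looks_like_pe(data: bytes) -> bool:
    """Check the MZ magic and that e_lfanew (at 0x3C) points at 'PE\\0\\0'."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return False
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    return data[e_lfanew:e_lfanew + 4] == b"PE\x00\x00"


# First 0x90 bytes of the correctly extracted file (from the first dump).
GOOD_HEAD = bytes.fromhex(
    "4d5a90000300000004000000ffff0000"
    "b8000000000000004000000000000000"
    "00000000000000000000000000000000"
    "00000000000000000000000080000000"
    "0e1fba0e00b409cd21b8014ccd215468"
    "69732070726f6772616d2063616e6e6f"
    "742062652072756e20696e20444f5320"
    "6d6f64652e0d0d0a2400000000000000"
    "504500004c010500b43ee05a00160000"
)

# First 0x20 bytes of the second extracted file (from the second dump).
TWISTED_HEAD = bytes.fromhex(
    "2400000000000000504500004c010500"
    "b43ee05a00160000f8010000e0000703"
)

print(looks_like_pe(GOOD_HEAD))     # True
print(looks_like_pe(TWISTED_HEAD))  # False
```

A file that was requested as an .exe but fails this check would at least be worth flagging for a second look, instead of being silently treated as non-executable data.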
Thank you very much and congratulations on this amazing tool!
Cheers, Marcos
Zeek (master and 4.1) corrupts extracted PE executable from an HTTP response if the request has a non-expected Content-Type.
Looks like this comes from the specific Content-Type that's used here (message/rfc822): if I change the type to something else that is unknown but not message/*, extraction works as expected. The HTTP analyzer special-cases message/* parsing, so it's probably an issue on that code path. It will need some more digging to trace what's going on, but my guess is that it starts parsing the data as a sub-entity without forwarding the content to file analysis.