
olevba/ftguess: large MHT file incorrectly identified as OpenXML

Open decalage2 opened this issue 2 years ago • 0 comments

Affected tool: olevba, oleid, ftguess

Describe the bug: In some cases, large MHT files that contain raw binary data including a ZIP archive (e.g. an embedded OpenXML document) may be incorrectly identified as OpenXML. In that case, the MHT content (MIME parts) is not parsed, and VBA macros may go undetected.

Root cause: zipfile.is_zipfile in the Python standard library returns True for such a file, even though the file itself is not actually a ZIP file. Note that other archive tools such as 7-Zip or Total Commander do not identify the MHT as a ZIP file, and refuse to open it as such.
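The root cause can be illustrated without the real sample. The sketch below builds a small synthetic MHT-like blob (an assumption for illustration, not the reported malware file) with a raw ZIP archive inside one MIME part, then asks zipfile.is_zipfile about it. The helper name `build_fake_mht`, the boundary string and the member name are hypothetical. As far as I can tell, is_zipfile looks for the ZIP end-of-central-directory record near the end of the file rather than requiring the file to start with a ZIP signature, which would explain the false positive.

```python
import io
import zipfile


def build_fake_mht() -> bytes:
    """Build a synthetic MHT-like blob with a raw ZIP archive in one MIME part.

    Hypothetical helper: this only mimics the layout described in the issue.
    """
    # Create a tiny ZIP archive in memory, standing in for an embedded OpenXML file.
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("word/document.xml", "<w:document/>")

    header = (
        b"MIME-Version: 1.0\r\n"
        b'Content-Type: multipart/related; boundary="----=_NextPart"\r\n'
        b"\r\n"
        b"------=_NextPart\r\n"
        b"Content-Type: application/octet-stream\r\n"
        b"\r\n"
    )
    trailer = b"\r\n------=_NextPart--\r\n"
    # MIME text first, raw ZIP bytes in the middle, closing boundary last.
    return header + buf.getvalue() + trailer


fake_mht = build_fake_mht()
# The blob clearly starts with a MIME header, not with a ZIP signature...
print(fake_mht[:17])
# ...yet zipfile may still report it as a ZIP file.
print(zipfile.is_zipfile(io.BytesIO(fake_mht)))
```

If the second print shows True, any detection logic that trusts is_zipfile alone will route the file to the Zip/OpenXML branch before the MIME parts are ever examined.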

File/Malware sample to reproduce the bug

  • https://mp.weixin.qq.com/s/1L7o1C-aGlMBAXzHqR9udA
  • https://www.netskope.com/blog/abusing-microsoft-office-using-malicious-web-archive-files

How To Reproduce the bug: Run ftguess, oleid or olevba on one of the samples above.

Expected behavior: ftguess, oleid and olevba should identify the file as MHT, and parse it properly.

Version information:

  • OS: Windows
  • OS version: 10 64 bits
  • Python version: 3.9
  • oletools version: 0.60.1.dev1

Additional context: Bug reported by email by Aaron Lewis.

Potential solution: Change the order in which file formats are checked, so that MHT is tested before Zip/OpenXML. Verify that this does not cause other issues, such as an actual OpenXML file being detected as MHT if it contains a MIME header before the zip content.

Alternative: when zipfile.is_zipfile returns True, check where the zip structure starts.
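One possible reading of the alternative above, as a minimal sketch and not the fix adopted in oletools: accept a file as ZIP only if zipfile.is_zipfile agrees and the data begins with a ZIP signature at offset 0. The function name `is_plain_zip` and the constants are hypothetical, and the check assumes that genuine OpenXML files start with a local file header.

```python
import io
import zipfile

# Signatures from the ZIP format: local file header, and the
# end-of-central-directory record (first bytes of an empty archive).
ZIP_LOCAL_HEADER = b"PK\x03\x04"
ZIP_EMPTY_ARCHIVE = b"PK\x05\x06"


def is_plain_zip(data: bytes) -> bool:
    """Return True only if zipfile accepts the data AND it starts with a ZIP signature.

    Hypothetical helper: stricter than zipfile.is_zipfile alone, because it
    rejects containers that merely embed a ZIP archive after other content.
    """
    if not zipfile.is_zipfile(io.BytesIO(data)):
        return False
    return data.startswith((ZIP_LOCAL_HEADER, ZIP_EMPTY_ARCHIVE))


# Build a real ZIP archive in memory, then wrap it in MIME-like text.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("word/document.xml", "<w:document/>")
real_zip = buf.getvalue()
wrapped = b"MIME-Version: 1.0\r\n\r\n" + real_zip + b"\r\n------=_NextPart--\r\n"

print(is_plain_zip(real_zip))  # genuine archive, starts with PK\x03\x04
print(is_plain_zip(wrapped))   # starts with a MIME header, so rejected
```

The trade-off is that this also rejects legitimate ZIP files with prepended data, such as self-extracting archives. That may be acceptable when the goal is to recognise OpenXML documents, but it would need checking against real samples.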

decalage2, Jan 24 '22 09:01