
extract_macros (of VBA_PARSER) doesn't extract macrosheet code anymore

Open eyaltemps opened this issue 2 years ago • 8 comments

I'm not sure whether this is a bug or whether I'm missing a new feature or a specific step I should take, so I'll open it as a bug.

Affected tool: olevba

Bug description: oletools 0.56.2 extracts macrosheet macro code by default when calling extract_macros(), but oletools 0.60 does not.

File to reproduce the bug (password: Password1): food1.zip

How to reproduce the bug (Python):

from oletools.olevba import VBA_Parser

vbaparser = VBA_Parser(file_path)  # file_path: path to the sample file
for (filename, stream_path, vba_filename, vba_code) in vbaparser.extract_macros():
    print(vba_code)

CLI: The issue can also be seen with the command: olevba -jc {FILE_PATH}

Expected behavior: The extracted macro code contains the macrosheet XML, "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>\n<xm:macrosheet..............", as it does in 0.56.2.

Screenshots: [image: olevba 0.56 output] [image: olevba 0.60 output]

How can I make vbaparser.extract_macros() also extract the macrosheet code, as it did in 0.56.2?

eyaltemps, Dec 28 '21 12:12

Any advice on handling this issue with the latest olevba would be appreciated.

eyaltemps, Feb 14 '22 15:02

Hi @eyaltemps, indeed in 0.60 I made important changes to how XLM macros are extracted. I integrated XLMMacroDeobfuscator because it gives much better results than plugin_biff (thanks to emulation), and it supports more file formats. I removed the old code that extracted XLM macros from xlsm files because it only returned raw XML. This is why you see no macro in your case. Installing XLMMacroDeobfuscator should fix your issue.

However, by default XLMMacroDeobfuscator is not installed by pip. You can either install it separately (see https://github.com/DissectMalware/XLMMacroDeobfuscator) or simply update oletools with this command:

pip install -U oletools[full]

decalage2, Feb 21 '22 22:02

I need to check if olevba could fall back to the old code for XLSM if XLMMacroDeobfuscator is not available.

decalage2, Feb 21 '22 22:02
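
Whether the optional dependency is importable in a given environment can be checked before relying on olevba's XLM extraction. The following is a minimal sketch using only the standard library. The helper name has_xlm_deobfuscator is hypothetical, and the sketch assumes only the package's import name, XLMMacroDeobfuscator, as used in the import statement later in this thread.

```python
import importlib.util


def has_xlm_deobfuscator() -> bool:
    """Return True if the XLMMacroDeobfuscator package can be found.

    Hypothetical helper: it only looks the package up, it does not import it.
    """
    return importlib.util.find_spec("XLMMacroDeobfuscator") is not None


if __name__ == "__main__":
    if has_xlm_deobfuscator():
        print("XLMMacroDeobfuscator is installed")
    else:
        print("XLMMacroDeobfuscator is missing; try: pip install -U oletools[full]")
```

This only confirms that the package is present. It says nothing about whether deobfuscation will succeed on a particular sample.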

Hi @decalage2 ,

Thank you for your response. I tried to extract those macros with the new XLMMacroDeobfuscator, but I didn't get the expected results. Do you have any advice on that?

I ran the following code:

from XLMMacroDeobfuscator.deobfuscator import process_file

# Raw string so that "\f" in the Windows path is not read as a form-feed escape
result = process_file(file=r"C:\shared\food1.xlsm",
                      noninteractive=True,
                      noindent=True,
                      output_formula_format='[[CELL-ADDR]], [[INT-FORMULA]]',
                      return_deobfuscated=True,
                      timeout=30)
print("result is: ", result)

The printed results in the console are:

File: C:\shared\food1.xlsm

Unencrypted document or unsupported file format
Unencrypted xlsm file

[Loading Cells]
[Starting Deobfuscation]
[END of Deobfuscation]
time elapsed: 0.1169731616973877
**result is:  []**

That is, I got an empty result.

The file is attached. The behavior I expect is to be able to detect the XLM macro from a Python project, as I could with olevba 0.56.2. food1.zip (Password1)

EDIT: Although I ran pip install -U oletools[full] and had XLMMacroDeobfuscator installed, olevba still didn't extract the XLM macros as it did in 0.56.2 (which is why I tried the process_file call shown above). I used the following code:

vbaparser = VBA_Parser(file_path)
for (filename, stream_path, vba_filename, vba_code) in vbaparser.extract_macros():
    print(vba_code)

Thanks!

eyaltemps, Feb 22 '22 15:02

The issue is actually due to a Unicode error when running XLMMacroDeobfuscator:

xlmdeobfuscator -c food1.xlsm
XLMMacroDeobfuscator: defusedxml is not installed (required to securely parse XLSM files)

XLMMacroDeobfuscator(v0.2.5) - https://github.com/DissectMalware/XLMMacroDeobfuscator

Traceback (most recent call last):
  File "c:\program files\python39\lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "c:\program files\python39\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Program Files\Python39\Scripts\xlmdeobfuscator.exe\__main__.py", line 7, in <module>
  File "C:\Users\xxx\AppData\Roaming\Python\Python39\site-packages\XLMMacroDeobfuscator\deobfuscator.py", line 3125, in main
    defaults = json.load(config_file)
  File "c:\program files\python39\lib\json\__init__.py", line 293, in load
    return loads(fp.read(),
  File "c:\program files\python39\lib\codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf3 in position 14: invalid continuation byte

I will ask @DissectMalware to check this sample.

decalage2, Feb 22 '22 21:02
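
The traceback above shows json.load failing while a file is decoded as UTF-8. One common cause of this class of error is a file saved in a legacy single-byte code page, where a character such as "ó" is stored as the lone byte 0xf3. The contents of the actual config file are not shown in this thread, so the sketch below is only an illustration with made-up data: the JSON content, the helper name load_json_bytes, and the cp1252 fallback are all assumptions.

```python
import json

# Hypothetical config content; the real file from the traceback is not shown.
# Encoding with cp1252 stores "ó" (U+00F3) as the single byte 0xf3.
raw = '{"user": "Ram\u00f3n"}'.encode("cp1252")


def load_json_bytes(data, fallback_encoding="cp1252"):
    """Parse JSON bytes as UTF-8, retrying with a fallback encoding."""
    try:
        return json.loads(data.decode("utf-8"))
    except UnicodeDecodeError:
        # Assumption: the file was written in a legacy Windows code page.
        return json.loads(data.decode(fallback_encoding))


if __name__ == "__main__":
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        print("UTF-8 decoding failed:", exc.reason)
    print(load_json_bytes(raw))
```

A fallback like this hides the underlying encoding mismatch, so re-saving the affected file as UTF-8 is usually the cleaner fix when the file is under your control.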

@decalage2 Thank you. Regardless of the deobfuscation, is there any way to output the raw XML as in 0.56? If not with olevba, would you advise any other tool?

eyaltemps, Feb 23 '22 11:02

I plan to reintegrate the old code that extracted the raw XML as a fallback, but I will need some time to do it. In the meantime, you can still use olevba 0.56 if it works better for you.

decalage2, Feb 23 '22 20:02
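
For readers who only need the raw macrosheet XML in the meantime, one possible workaround is to read it directly from the xlsm package, which is a ZIP archive. This is a sketch, not the code olevba 0.56 used. It assumes that macro sheets are stored as XML parts under "xl/macrosheets/", which is the usual location but is not stated in this thread, and the function name read_macrosheets is hypothetical.

```python
import zipfile


def read_macrosheets(xlsm_path):
    """Return {archive member name: raw XML text} for each macro sheet part.

    Assumption: macro sheets live under "xl/macrosheets/" in the package.
    """
    sheets = {}
    with zipfile.ZipFile(xlsm_path) as package:
        for name in package.namelist():
            lowered = name.lower()
            if lowered.startswith("xl/macrosheets/") and lowered.endswith(".xml"):
                # Decode leniently so one bad byte does not abort the dump.
                sheets[name] = package.read(name).decode("utf-8", errors="replace")
    return sheets
```

Calling read_macrosheets(file_path) and printing each value gives the undecoded XML of every macro sheet part that was found. It performs no deobfuscation or emulation.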

The macrosheet seems to have only two formulas.

[Two screenshots of the macrosheet, not reproduced here]

DissectMalware, Feb 24 '22 06:02