
High RAM usage during find_external_relationships execution

Open baxitaurus opened this issue 2 years ago • 0 comments

Affected tool: ooxml, oletools

Describe the bug: While running the following code against an xlsm file (4.4 MB in size):

from oletools import ooxml, oleobj

xml_parser = ooxml.XmlParser(filepath)
for relationship, target in oleobj.find_external_relationships(xml_parser):
    ...  # <do stuff>

I noticed that execution was stuck in the find_external_relationships call while my RAM usage kept increasing continuously. I had to kill the Python process after memory usage had grown by 15 GB because the machine was starting to swap. After some inspection, I found that the process was busy parsing the subfile xl/pivotCache/pivotCacheRecords33.xml in the iter_xml call of XmlParser. That subfile is very large once unzipped:

$ ll sample.xlsm 
-rw-rw-r-- 1 user user 4,4M nov 24 16:51 sample.xlsm
$ du -sh unzippedsample
300M	unzippedsample
$ ll unzippedsample/xl/pivotCache/pivotCacheRecords33.xml
-rw-rw-r-- 1 user user 167M gen  1  1980 unzippedsample/xl/pivotCache/pivotCacheRecords33.xml
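
The same size check can be done without extracting the archive. This is a minimal sketch using only the standard-library zipfile module; the function name largest_members is made up for illustration and is not part of oletools.

```python
import zipfile


def largest_members(archive_path, top=5):
    """Return (name, uncompressed_bytes, compressed_bytes) for the biggest members.

    Only the zip central directory is read, so nothing is decompressed.
    """
    with zipfile.ZipFile(archive_path) as zf:
        infos = sorted(zf.infolist(), key=lambda info: info.file_size, reverse=True)
    return [(info.filename, info.file_size, info.compress_size) for info in infos[:top]]
```

Calling largest_members("sample.xlsm") should list the pivot cache subfile first for an archive like the one described above.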

I guess the elements produced by this parsing are being stored in memory somewhere, and that this is what causes the high RAM usage. It's only a guess, though; unfortunately I couldn't spend more time debugging the issue. I'll update the thread if I discover anything more.
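
For reference, this is the general technique for keeping memory bounded while parsing a large XML subfile. It is a sketch under the assumption that retained elements are the cause, and it uses only the standard library. It is not the oletools implementation, and count_tags is a hypothetical helper name.

```python
import collections
import xml.etree.ElementTree as ET
import zipfile


def count_tags(zip_path, member):
    """Count element tags in one XML member without keeping the whole tree."""
    counts = collections.Counter()
    with zipfile.ZipFile(zip_path) as zf, zf.open(member) as handle:
        context = ET.iterparse(handle, events=("start", "end"))
        _, root = next(context)  # the first event is the start of the root element
        depth = 1  # we are now inside the root element
        for event, elem in context:
            if event == "start":
                depth += 1
                continue
            depth -= 1
            counts[elem.tag] += 1
            if depth == 1:
                # A direct child of the root is complete: drop it (and its
                # subtree) from the root so it can be garbage collected.
                root.clear()
    return counts
```

The important part is root.clear(): clearing only the finished element would still leave one empty element attached to the root per record, which adds up in a file with millions of records.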

File/Malware sample to reproduce the bug / How To Reproduce the bug: The sample that caused the issue comes from a customer, so I can't share it with you. However, I think the issue could be reproduced by building a heavy Excel file with a large amount of data in one of its subfiles.
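
A rough way to build such a test archive is sketched below. The member name is taken from this report; the record content is placeholder data, and the result is not a valid workbook. Whether it actually triggers the same behaviour in oletools is untested. build_sample is a hypothetical helper name.

```python
import zipfile

# Member name copied from the report; everything written into it is made up.
MEMBER = "xl/pivotCache/pivotCacheRecords33.xml"


def build_sample(path, records=1_000_000):
    """Write a zip whose single XML member is large but compresses well."""
    with zipfile.ZipFile(path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        # Stream the member so the generator itself stays small in memory.
        with zf.open(MEMBER, "w", force_zip64=True) as member:
            member.write(b'<?xml version="1.0" encoding="UTF-8"?>\n')
            member.write(b"<pivotCacheRecords>")
            for i in range(records):
                member.write(b'<r><n v="%d"/></r>' % i)
            member.write(b"</pivotCacheRecords>")
    return path
```

Raising records increases the uncompressed size while the archive on disk stays comparatively small, which matches the size ratio of the sample described above.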

Version information:

  • OS: Ubuntu 18.04.5 LTS (Bionic Beaver)
  • Python version: 3.6.9
  • oletools version: 0.56

baxitaurus, Nov 24 '21 16:11