
OOM on DWARF's DIE traversal.

Open bieganski opened this issue 1 year ago • 7 comments

On my machine with 32 GB RAM + 8 GB swap, I get an OOM somewhere in the middle of iter_CUs() and die.iter_children() on an ELF with a 2 GB DWARF section.

I haven't investigated the issue thoroughly, but it looks like pyelftools keeps references to all the DIEs ever fetched from disk? Inserting gc.collect() after each CU handling does not help.

Is there an easy workaround for that?

bieganski avatar Nov 07 '24 23:11 bieganski

Sorry, there isn't. pyelftools caches aggressively. We have some vague plans for implementing a less memory-hungry mode, but nothing written down.

sevaa avatar Nov 08 '24 15:11 sevaa

thank you for the clear response, appreciated :+1:

bieganski avatar Nov 09 '24 20:11 bieganski

We are revisiting this. What is the source of your binary, please, and what exactly are you doing with it?

sevaa avatar Jan 02 '25 15:01 sevaa

I'm working on a proprietary library that I unfortunately cannot share, but I can tell you that we compile it with GCC 14.

The code that I execute looks as follows:


for i, cu in enumerate(elf.get_dwarf_info().iter_CUs()):
    die = cu.get_top_DIE()
    recursive_dump(die)

recursive_dump iterates over all DIE children (die.iter_children()), accesses attributes, possibly resolves DWARF references using die.get_DIE_from_attribute(...), and then calls itself on each child. Our program (the one that dumps the DWARF) is almost stateless on our side (besides the recursion stack, but that is probably negligible, as Python limits stack depth by default): we don't keep any structures, and everything goes to stdout based on some conditions. So the high memory usage is due to pyelftools allocations.
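For reference, here is a minimal runnable sketch of what such a recursive_dump can look like. The real script is proprietary, so the FakeDIE and AttributeValue stand-ins below are mine, mimicking just enough of pyelftools' DIE interface (tag, attributes, iter_children) for the sketch to run without an ELF:

```python
from collections import OrderedDict, namedtuple

# Stand-ins so the sketch runs without an ELF; the real classes are
# elftools.dwarf.die.DIE and elftools.dwarf.die.AttributeValue.
AttributeValue = namedtuple("AttributeValue", "name form value")

class FakeDIE:
    def __init__(self, tag, attrs=(), children=()):
        self.tag = tag
        self.attributes = OrderedDict((a.name, a) for a in attrs)
        self._children = list(children)

    def iter_children(self):
        return iter(self._children)

def recursive_dump(die, depth=0):
    # Dump this DIE's tag and attributes to stdout, then recurse into
    # children, keeping no state besides the recursion stack.
    print("  " * depth + die.tag)
    for name, attr in die.attributes.items():
        print("  " * depth + f"  {name} = {attr.value!r}")
    for child in die.iter_children():
        recursive_dump(child, depth + 1)

root = FakeDIE(
    "DW_TAG_compile_unit",
    attrs=[AttributeValue("DW_AT_name", "DW_FORM_string", "a.c")],
    children=[FakeDIE("DW_TAG_subprogram",
                      attrs=[AttributeValue("DW_AT_name", "DW_FORM_string", "main")])],
)
recursive_dump(root)
```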

Meanwhile I upgraded my PC to 64 GB RAM, and I can tell that the peak usage of my Python script is 59 GB. To provide some statistics, I added the following code just after the loop mentioned above, to print gc info at the point where memory usage is at its highest (59 GB):

import gc
import sys

from elftools.dwarf.die import DIE, AttributeValue

print(f"sys.getsizeof(DIE) = {sys.getsizeof(sample_die := cu.get_top_DIE())}", file=sys.stderr)
print(f"sys.getsizeof(AttributeValue) = {sys.getsizeof(next(iter(sample_die.attributes.values())))}", file=sys.stderr)
print(f"total objects alive before last GC: {len(gc.get_objects())}", file=sys.stderr)
print(f"total number of alive DIE objects: {len([x for x in gc.get_objects() if isinstance(x, DIE)])}", file=sys.stderr)
print(f"total number of alive AttributeValue objects: {len([x for x in gc.get_objects() if isinstance(x, AttributeValue)])}", file=sys.stderr)

from collections import defaultdict
a = defaultdict(int)
for x in gc.get_objects():
    a[type(x)] += 1
from operator import itemgetter
print(sorted([(v, k) for k, v in a.items()], key=itemgetter(0), reverse=True)[:10], file=sys.stderr)

and here is the output:

sys.getsizeof(DIE) = 48
sys.getsizeof(AttributeValue) = 63
total objects alive before last GC: 249930042
total number of alive DIE objects: 30795816
total number of alive AttributeValue objects: 144745192
[(144745192, <class 'elftools.dwarf.die.AttributeValue'>), (32688887, <class 'dict'>), (30795816, <class 'elftools.dwarf.die.DIE'>), (30795816, <class 'collections.OrderedDict'>), (956698, <class 'list'>), (4613, <class 'function'>), (2189, <class 'tuple'>), (873, <class 'builtin_function_or_method'>), (176, <class 'module'>), (175, <class '_frozen_importlib.ModuleSpec'>)]

So it seems that 99% of all allocations are due to AttributeValue, dict, DIE, and OrderedDict.

I'm not a memory-profiling expert, so I might have confused something; in that case let me know, and I will provide more useful data.
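Two quick sanity checks on the reported counts (keeping in mind that sys.getsizeof is shallow: a DIE's 48 bytes do not include its attributes dict or the attribute values it references):

```python
# Ratios computed from the gc statistics reported above.
n_dies = 30_795_816
n_attrs = 144_745_192
n_dicts = 32_688_887
n_ordered_dicts = 30_795_816
n_total = 249_930_042

# Roughly 4.7 attributes per DIE on average.
print(round(n_attrs / n_dies, 1))

# DIEs, their OrderedDicts, plain dicts, and AttributeValues together
# account for over 95% of all gc-tracked objects.
print((n_dies + n_attrs + n_dicts + n_ordered_dicts) / n_total > 0.95)
```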

bieganski avatar Jan 03 '25 10:01 bieganski

For completeness, I slightly modified the code above to show total bytes allocated rather than object counts (again, grouped by object type):

a = defaultdict(int)
for x in gc.get_objects():
    a[type(x)] += sys.getsizeof(x)

here is the result:

[(18005255248, <class 'collections.OrderedDict'>), (12737576896, <class 'elftools.dwarf.die.AttributeValue'>), (7494119328, <class 'dict'>), (1478199168, <class 'elftools.dwarf.die.DIE'>), (647969864, <class 'list'>), (664272, <class 'function'>), (129808, <class 'tuple'>), (62856, <class 'builtin_function_or_method'>), (12672, <class 'module'>), (8400, <class '_frozen_importlib.ModuleSpec'>)]

My conclusions (to be double-checked):

  • OrderedDict, AttributeValue, dict, DIE, and list account for 18 GB, 12.7 GB, 7.5 GB, 1.5 GB, and 0.6 GB respectively
  • that is around 40 GB in total, out of the 55 GB used by my script (the 59 GB mentioned above was for the whole OS, not a single process)
  • I don't create any dicts or lists in my script, so I guess those come from somewhere inside pyelftools as well
  • I don't know the source of the remaining 15 GB; possibly memory fragmentation?
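A note on that remaining ~15 GB: besides fragmentation, gc.get_objects() only sees gc-tracked objects. CPython never tracks atomic objects such as str and bytes (which is what DWARF string attribute values decode to), and it untracks dicts holding only atomic values, so their memory is invisible to the accounting above:

```python
import gc

# Atomic objects (str, bytes, ints) are never gc-tracked, so
# gc.get_objects() cannot account for their memory.
print(gc.is_tracked("x" * 1000))   # False: strings are atomic
print(gc.is_tracked([]))           # True: containers are tracked
print(gc.is_tracked({"a": 1}))     # False: dict of atomic values is untracked
print(gc.is_tracked({"a": []}))    # True: holds a tracked container
```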

bieganski avatar Jan 03 '25 11:01 bieganski

By the way, did you find a workaround for your case? While there is no official low-memory mode in the public API, it's possible to reduce memory consumption if you mess with pyelftools' internals.
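For anyone landing here: one way to "mess with the internals" is to drop each CU's DIE cache after processing it. In the pyelftools versions I've looked at, CompileUnit keeps parsed DIEs in the private _dielist/_diemap attributes; treat those names as an assumption about internals that may change between releases, not public API. A runnable sketch with a stand-in CU:

```python
# FakeCU stands in for elftools.dwarf.compileunit.CompileUnit; the
# _dielist/_diemap attribute names are an assumption about pyelftools
# internals, not public API.
class FakeCU:
    def __init__(self):
        self._dielist = ["die"] * 5          # parsed DIE cache
        self._diemap = [0, 10, 20, 30, 40]   # DIE offsets in the section

def process(cu):
    pass  # dump this CU's DIEs to stdout here

cus = [FakeCU() for _ in range(3)]
for cu in cus:
    process(cu)
    # Drop the per-CU cache so its DIEs become garbage-collectable
    # before the next CU is parsed.
    cu._dielist.clear()
    cu._diemap.clear()

print(all(not cu._dielist for cu in cus))  # True
```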

sevaa avatar Jan 07 '25 00:01 sevaa

Nope; since I upgraded my PC with more RAM, I no longer hit the OOM, so I stopped investigating the issue (I only spent some time providing you more data).

bieganski avatar Jan 08 '25 07:01 bieganski