root icon indicating copy to clipboard operation
root copied to clipboard

PyROOT memory leak reading TMap from a file

Open andy-slac opened this issue 3 years ago • 4 comments

  • [X] Checked for duplicates

Describe the bug

We have a long-running service which reads some information from ROOT files and we have noticed that this service regularly grows its memory use until it consumes all available memory and gets killed. I think I have tracked it down to reading one particular object type TMap. We have relatively large TMap object stored in files, the key in TMap is TObjString and values are TObjArray of TObjString. Every time when TMap is read from a file the memory use grows dramatically by many megabytes, even though the Python object is discarded almost immediately after use.

Expected behavior

Memory use should not grow indefinitely, memory should be deallocated when Python object is destroyed.

To Reproduce

I managed to reproduce this issue with a simple script, attached here:

import os
import tempfile

import psutil
import ROOT


def main():

    proc = psutil.Process()

    with tempfile.TemporaryDirectory() as tmpdir:

        file_name = os.path.join(tmpdir, "testfile.root")
        print("Creating  test root file", file_name)
        _make_file_with_tmap(file_name)

        mem0 = proc.memory_info()
        rootfile = ROOT.TFile.Open(file_name, "READ")
        while True:

            ann_obj = rootfile.Get("annotations")
            # ann_obj.DeleteAll()  # DeleteAll seems to slow leak but not 100%
            del ann_obj

            mem1 = proc.memory_info()
            _memstat(mem0, mem1)
            mem0 = mem1


def _make_file_with_tmap(file_name):

    rootfile = ROOT.TFile.Open(file_name, "RECREATE")

    map = ROOT.TMap()

    for i in range(50_000):

        key = ROOT.TObjString("long_histogram_name_" * 10 + str(i))
        value = ROOT.TObjArray()
        value.Add(ROOT.TObjString("annotation"))
        value.Add(ROOT.TObjString(f"10000"))
        map.Add(key, value)

    rootfile.WriteTObject(map, "annotations")
    rootfile.Close()


def _memstat(mem0, mem1):
    MB = 1024 * 1024
    rss_mb = mem1.rss // MB
    vms_mb = mem1.vms // MB
    rss_delta_mb = (mem1.rss - mem0.rss) / MB
    vms_delta_mb = (mem1.vms - mem0.vms) / MB
    print(f"Memory stat: RSS={rss_mb} MB, VMS={vms_mb} MB. Delta RSS={rss_delta_mb:.2f} MB VMS={vms_delta_mb:.2f} MB")


if __name__ == "__main__":
    main()

just setup ROOT from a LCG release and run it, it will print memory use on each iteration, for me it grows by ~35MB on each loop.

Setup

Tested this with ROOT 6.24.06 from LCG_101 and ROOT 6.26.04 from LCG_102 on CentOS7 host (x86_64-centos7-gcc11-opt).

Additional context

andy-slac avatar Aug 03 '22 04:08 andy-slac

Dear @andy-slac , Thank you for reaching out and for the easy reproducer! Indeed, I also see the memory leak with the code you posted. The issue seems to be with TObjArray rather than with TMap itself. In fact, the documentation says that

By default the TObjArray does not own the objects it points to and will not delete them unless explicitly asked (via a call to the Delete member function). To assign ownership of the content to the array, call: myarr->SetOwner(kTRUE);

Thus, if I modify slightly your reproducer as follows:

import os
import tempfile

import psutil
import ROOT


def main():

    proc = psutil.Process()

    with tempfile.TemporaryDirectory() as tmpdir:

        file_name = os.path.join(tmpdir, "testfile.root")
        print("Creating  test root file", file_name)
        _make_file_with_tmap(file_name)

        mem0 = proc.memory_info()
        rootfile = ROOT.TFile.Open(file_name, "READ")
        while True:

            ann_obj = rootfile.Get("annotations")
            ann_obj.DeleteAll()  # This deletes the contents of the current map from memory

            mem1 = proc.memory_info()
            _memstat(mem0, mem1)
            mem0 = mem1


def _make_file_with_tmap(file_name):

    rootfile = ROOT.TFile.Open(file_name, "RECREATE")

    map = ROOT.TMap()

    for i in range(50_000):

        key = ROOT.TObjString("long_histogram_name_" * 10 + str(i))
        value = ROOT.TObjArray()
        value.Add(ROOT.TObjString("annotation"))
        value.Add(ROOT.TObjString(f"10000"))
        value.SetOwner(True) # This makes sure that all the `TObjString`s are owned by the `TObjArray`
        map.Add(key, value)

    rootfile.WriteTObject(map, "annotations")
    rootfile.Close()


def _memstat(mem0, mem1):
    MB = 1024 * 1024
    rss_mb = mem1.rss // MB
    vms_mb = mem1.vms // MB
    rss_delta_mb = (mem1.rss - mem0.rss) / MB
    vms_delta_mb = (mem1.vms - mem0.vms) / MB
    print(f"Memory stat: RSS={rss_mb} MB, VMS={vms_mb} MB. Delta RSS={rss_delta_mb:.2f} MB VMS={vms_delta_mb:.2f} MB")


if __name__ == "__main__":
    main()

I get the following output

Creating  test root file /tmp/tmp8vyocdk_/testfile.root
Memory stat: RSS=325 MB, VMS=1220 MB. Delta RSS=77.00 MB VMS=76.89 MB
Memory stat: RSS=325 MB, VMS=1220 MB. Delta RSS=0.00 MB VMS=0.00 MB
Memory stat: RSS=325 MB, VMS=1220 MB. Delta RSS=0.00 MB VMS=0.00 MB
Memory stat: RSS=325 MB, VMS=1220 MB. Delta RSS=0.00 MB VMS=0.00 MB
Memory stat: RSS=325 MB, VMS=1220 MB. Delta RSS=0.00 MB VMS=0.00 MB
Memory stat: RSS=325 MB, VMS=1220 MB. Delta RSS=0.00 MB VMS=0.00 MB
Memory stat: RSS=325 MB, VMS=1220 MB. Delta RSS=0.00 MB VMS=0.00 MB
Memory stat: RSS=325 MB, VMS=1220 MB. Delta RSS=0.00 MB VMS=0.00 MB
Memory stat: RSS=325 MB, VMS=1220 MB. Delta RSS=0.00 MB VMS=0.00 MB
Memory stat: RSS=325 MB, VMS=1220 MB. Delta RSS=0.00 MB VMS=0.00 MB
Memory stat: RSS=325 MB, VMS=1220 MB. Delta RSS=0.00 MB VMS=0.00 MB
Memory stat: RSS=325 MB, VMS=1220 MB. Delta RSS=0.00 MB VMS=0.00 MB
Memory stat: RSS=325 MB, VMS=1220 MB. Delta RSS=0.00 MB VMS=0.00 MB
Memory stat: RSS=325 MB, VMS=1220 MB. Delta RSS=0.00 MB VMS=0.00 MB
Memory stat: RSS=325 MB, VMS=1220 MB. Delta RSS=0.00 MB VMS=0.00 MB
Memory stat: RSS=325 MB, VMS=1220 MB. Delta RSS=0.00 MB VMS=0.00 MB
Memory stat: RSS=325 MB, VMS=1220 MB. Delta RSS=0.00 MB VMS=0.00 MB
Memory stat: RSS=325 MB, VMS=1220 MB. Delta RSS=0.00 MB VMS=0.00 MB
Memory stat: RSS=325 MB, VMS=1220 MB. Delta RSS=0.00 MB VMS=0.00 MB
[...]

TLDR: The TObjArrays that you were writing to the file did not own the TObjStrings you were giving them. When reading them back, the TObjStrings were leaked into memory because they were not owned by anybody.

Can you confirm that you see the same results as me?

vepadulano avatar Aug 03 '22 21:08 vepadulano

@vepadulano, thanks for suggestion. Does it mean that SetOwner have to be called before writing object to a file? The problem appears when I read data produced by other application, should not ROOT always set ownership for the data which is read from a file? For Python it is quite unnatural to explicitly manage resources, everything should be deallocated when Python object is deleted.

andy-slac avatar Aug 03 '22 21:08 andy-slac

The problem appears when I read data produced by other application

I see, does the other application produce data like in your reproducer above? If so, there is a bug in that application, it doesn't follow the documentation that I linked above. I would suggest to open an issue with the developers in that case.

should not ROOT always set ownership for the data which is read from a file?

I understand the confusion, for the case of objects that were stored on disk it might be better to already have ownership of everything inside of them. But, it depends on the type of data and most importantly on the particular user application. The behaviour of ROOT object ownership may not always be the expected one, but is documented in the manual and in the relevant class references. There are applications that rely on that so it can't always be changed.

For Python it is quite unnatural to explicitly manage resources, everything should be deallocated when Python object is deleted.

I can't say I completely agree with this sentence. Let me break it down:

  • everything should be deallocated when Python object is deleted: do you mean when you call del obj ? Because in general that is not true, a Python object is deleted/deallocated when the garbage collector decides it is the right time. Then, here you don't have pure Python objects, but there are also C++ objects underneath. ROOT takes care of lifetime management of the object it knows about. In the code above the objects are not properly registered.
  • For Python it is quite unnatural to explicitly manage resources: in most cases I would agree. There are some notable exceptions, for example when you manage file I/O or even more when you manage databases (the standard Python sqlite3 library comes to mind). In the case of TFile, the semantics and the behaviour are closer to managing access to a database than reading a text file.

Finally, let me highlight that usage of TObjArray and such is discouraged nowadays, when possible it would be better to migrate to std::vector or similar.

vepadulano avatar Aug 04 '22 07:08 vepadulano

Hi @vepadulano,

I think I have to disagree on all counts. The application that writes the data to a file may have it's own memory management policy and it should not be required for it to always use SetOwner on its data. Memory management in the reading application should be responsibility of the reading application, not writing one.

Your examples of sqlite3 or Python files are misplaced, none of those is leaking memory like ROOT. Again, my point is that whatever Python extension does, it is not supposed to leak memory or require some special code to avoid memory leak. I do understand how Python garbage collection works but that has nothing to do with the memory leak in this case.

And of course it is not always possible to migrate from TObjArray to vector, the files are already written and archived for posteriority, no one is going to update or replace those files.

andy-slac avatar Aug 04 '22 17:08 andy-slac

Hi @andy-slac!

Does it mean that SetOwner have to be called before writing object to a file?

No, you can also do that at any time, also after opening:

            ann_obj = rootfile.Get("annotations")

            for key in ann_obj:
                ann_obj.GetValue(key).SetOwner(True)

The problem appears when I read data produced by other application, should not ROOT always set ownership for the data which is read from a file?

Yes, and it does that indeed! The problem is that there is nested data inside the TMap that is read from the file: the TObjArray that points to some elements. PyROOT takes care of cleaning the TMaps for you. But the actual leak happens because the TObjArray on the C++ side, which has no idea that it should delete the elements it points to unless you tell it to with SetOwner() (yeah manual memory management in C++ is also not good, but these are old classes that can't be changed anymore because they are written to many existing files, as you say yourself).

For Python it is quite unnatural to explicitly manage resources, everything should be deallocated when Python object is deleted.

That's right, but when the underlying C++ objects have no clue what they should do at deletion time because ROOTs memory management is easy to get wrong, PyROOT has not a remote chance to do the right thing.

Closing this because it's not a PyROOT issue, just a user error (a very understandable one, nobody should expect to read the class docs just to get the memory management right :slightly_smiling_face: but hey, that's C++ from the 90s).

guitargeek avatar Apr 03 '24 20:04 guitargeek