pymc-bart
multiprocessing.Manager() leaves child process alive after deleting Model object.
Hello,
I am not quite sure if this can be considered a bug, but I thought I would share. Feel free to close if this is too much of an edge issue.
I am running a simulation using the BART model that involves creating and deleting the PyMC model (with a BART component) in each iteration. I noticed that as I went through iterations I would accumulate Python processes that were no longer using CPU but appeared to hold memory (~50-100 MB each).
When many iterations were done I started having OOM issues due to these processes gradually taking up memory. These processes would die once the main process dies.
These processes were not the multi-chain/multi-thread processes used in the training/inference (those associated processes were spun-up/down correctly).
I believe the issue comes from the multiprocessing.Manager() used to create the 'all_trees' list.
To resolve the issue, I ran the following code block after each iteration completed:
import multiprocessing as mp
childs = mp.active_children()
for child in childs:
    child.kill()
This resolves the issue of lingering processes.
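For anyone copying this workaround, here is a slightly more defensive sketch of the same idea. The helper name `reap_children` and the `timeout` parameter are made up for illustration and are not part of pymc-bart. It joins each child after killing it so the process is fully reaped. Note that it kills every child of the current process, so it should only run once nothing else is using them.

```python
import multiprocessing as mp


def reap_children(timeout=5.0):
    """Kill and join every live child process of the current process.

    Hypothetical helper: this is indiscriminate, so call it only between
    iterations, when no sampler or Manager is still in use.
    """
    children = mp.active_children()
    for child in children:
        child.kill()
    for child in children:
        # join() waits for the killed process to exit so it can be reaped
        child.join(timeout)
    return len(children)


if __name__ == "__main__":
    # Stand-in for one simulation iteration that leaves a Manager behind
    manager = mp.Manager()
    shared = manager.list([1, 2, 3])
    print("children before:", len(mp.active_children()))
    reap_children()
    print("children after:", len(mp.active_children()))
```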
I am not sure if this should be considered a bug or not, since it only becomes an issue when a large number of BART models are created in a single Python script. I also don't know if there is a good general solution, because if you kill the child process created by the Manager too early, I would expect there to be issues with further use of the model.
That being said, I could see other users running into this issue when running a highly iterative process, and in general I would say that having a process that doesn't die when the model is deleted is unexpected behavior. So I just wanted to share my experience for future users' reference.
Feel free to close or remove this submission if it is unhelpful.
Thanks!
Hi, thanks for sharing. I think this is a bug even if it only affects a portion of users, and it is also related to the issues people have been observing on Mac. I am not sure of a good general solution either.
One potential solution could be to capture the PID of the manager when it is created (in the BART class), then add a destructor (__del__) that kills that process when the instance is deleted.
A simple example of this:
import multiprocessing as mp
import psutil as ps

# create class
class c1():
    def __init__(self):
        self.a = 1
        manager = mp.Manager()
        # collect the pid for the manager (note: _process is a private attribute)
        self.process = ps.Process(manager._process.ident)
        self.lst = manager.list()

    def __del__(self):
        print("DELETING PROCESS")
        self.process.kill()

    def get_process_id(self):
        # prints the psutil.Process handle, which includes the pid
        print(self.process)

class c2():
    def __init__(self):
        self.c11 = c1()
        print("CREATED A NEW MANAGER")
        self.c11.get_process_id()

# create an instance of the class with a Manager
c11 = c1()
# print the process id
print("This is the manager pid")
c11.get_process_id()
# print the active children (process id should match)
print("Above should be in this list")
print(mp.active_children())
print("Deleting the object will kill the manager process")
del c11
print("The list shouldn't contain the process")
print(mp.active_children())

# works when the class is contained in another class
c22 = c2()
print(mp.active_children())
del c22
print(mp.active_children())
As far as I can tell, the BART class instance persists for as long as the higher-level model instance is in use, so I wouldn't expect the process to be killed before the model instance is deleted. Based on this simple example, I believe that deleting the model instance will result in the BART instance being deleted and the process being properly killed. But I am also not super familiar with all of the PyMC internals, so this approach could cause some unexpected issues.
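The example above reaches into the private attribute manager._process and depends on psutil. A variant that avoids both, as a rough sketch only: keep a reference to the manager and call its documented shutdown() method from a weakref.finalize callback. The class name `ManagedList` and its `close` method are hypothetical stand-ins, not pymc-bart code, and I have not checked how this would interact with the PyMC internals.

```python
import multiprocessing as mp
import weakref


class ManagedList:
    """Hypothetical stand-in for a class that owns a Manager-backed list."""

    def __init__(self):
        self._manager = mp.Manager()
        self.lst = self._manager.list()
        # Call manager.shutdown() when this object is garbage collected
        # (or at interpreter exit). The callback does not reference self,
        # so it does not keep the object alive.
        self._finalizer = weakref.finalize(self, self._manager.shutdown)

    def close(self):
        """Shut the manager down explicitly; safe to call more than once."""
        self._finalizer()


if __name__ == "__main__":
    holder = ManagedList()
    holder.lst.append(1)
    print("children while alive:", len(mp.active_children()))
    del holder  # triggers the finalizer, which stops the manager process
    print("children after del:", len(mp.active_children()))
```

Using weakref.finalize instead of __del__ means the cleanup also runs at interpreter exit, and an explicit close() is available for code that does not want to rely on garbage collection timing.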
Would you like to give it a try and send a PR?
Yeah sure. Might be a few weeks before I can get to it, but I will give it a try.
Thank you! Take your time.