pandarallel
parallel_apply throws AssertionError: daemonic processes are not allowed to have children
Hello,
When trying to do a parallel_apply from a Celery-instantiated process (celery worker), this AssertionError is raised:
AssertionError: daemonic processes are not allowed to have children
In this case, I have 3 dataframes to construct in parallel, so I instantiated 3 Celery workers. Each worker creates a df using map functions in parallel_apply, but it seems that calling parallel_apply from a daemonic process is not supported:
Traceback (most recent call last):
File "/home/yonolo/anaconda3/lib/python3.7/site-packages/celery/app/trace.py", line 385, in trace_task
R = retval = fun(*args, **kwargs)
File "/home/yonolo/anaconda3/lib/python3.7/site-packages/celery/app/trace.py", line 648, in __protected_call__
return self.run(*args, **kwargs)
File "/home/yonolo/anaconda3/lib/python3.7/site-packages/celery/app/base.py", line 481, in run
raise task.retry(exc=exc, **retry_kwargs)
File "/home/yonolo/anaconda3/lib/python3.7/site-packages/celery/app/task.py", line 703, in retry
raise_with_context(exc)
File "/home/yonolo/anaconda3/lib/python3.7/site-packages/celery/app/base.py", line 472, in run
return task._orig_run(*args, **kwargs)
File "/path/to/my_file.py", line 35, in create_my_df
df['geo'] = df.parallel_apply(lambda row: set_geo(row), axis=1)
File "/home/yonolo/anaconda3/lib/python3.7/site-packages/pandarallel/pandarallel.py", line 355, in closure
manager = Manager()
File "/home/yonolo/anaconda3/lib/python3.7/multiprocessing/context.py", line 56, in Manager
m.start()
File "/home/yonolo/anaconda3/lib/python3.7/multiprocessing/managers.py", line 563, in start
self._process.start()
File "/home/yonolo/anaconda3/lib/python3.7/multiprocessing/process.py", line 110, in start
'daemonic processes are not allowed to have children'
AssertionError: daemonic processes are not allowed to have children
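The restriction shown in the traceback comes from the standard library, not from pandarallel or Celery: multiprocessing.Process.start asserts that the current process is not daemonic. A minimal sketch reproducing it outside Celery (the function names here are illustrative, not part of either library):

```python
import multiprocessing as mp

def spawn_grandchild(queue):
    """Runs inside a daemonic process and tries to start a child of its own,
    which is what pandarallel does when it creates its Manager."""
    try:
        grandchild = mp.Process(target=sorted, args=([],))
        grandchild.start()  # multiprocessing forbids this from a daemon
        grandchild.join()
        queue.put("started")
    except AssertionError as exc:
        queue.put(str(exc))

if __name__ == "__main__":
    queue = mp.Queue()
    # Celery's prefork pool marks its workers as daemonic, like this:
    worker = mp.Process(target=spawn_grandchild, args=(queue,), daemon=True)
    worker.start()
    worker.join()
    print(queue.get())  # the AssertionError message from the child
```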
Could you please check whether there is a workaround? The Ray framework, for example, enables using multiprocessing from a Celery worker.
Thanks!
@YonoloX , I ran into this issue too. An issue on the celery repo suggested running the celery worker with a thread pool:
celery worker -P threads
Thanks @mccarthyryanc for your answer, but that only helps with multithreading; the problem persists with multiprocessing.
Same problem with multiprocessing.
OK. It turns out this is a limitation of multiprocessing itself.
The same operation works well with concurrent.futures.ProcessPoolExecutor.
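A minimal sketch of what that workaround might look like, replacing the parallel_apply call with an explicit ProcessPoolExecutor.map over rows (set_geo and the row dicts are hypothetical stand-ins for the code in the issue, not pandarallel's API):

```python
from concurrent.futures import ProcessPoolExecutor

def set_geo(row):
    # Hypothetical stand-in for the per-row function from the issue.
    return {"geo": row["lat"] + row["lon"]}

def apply_in_processes(func, rows, max_workers=3):
    """Map func over rows in worker processes, preserving input order."""
    with ProcessPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(func, rows))

if __name__ == "__main__":
    rows = [{"lat": 1.0, "lon": 2.0}, {"lat": 3.0, "lon": 4.0}]
    print(apply_in_processes(set_geo, rows))  # [{'geo': 3.0}, {'geo': 7.0}]
```

The results can then be assigned back to a dataframe column, as the original parallel_apply call did.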
Then #164 can also be closed.
If I understand correctly, this can be closed. Feel free to reopen if not.