
Frozen executable - loky (spawn method) does not work

Open samuelstjean opened this issue 4 years ago • 11 comments

I'm wondering if the issue reported in #124 and #125 is fully fixed. I'm using loky through joblib 145.2 (which apparently vendors loky 2.6.0), and it throws a ton of errors on the default loky backend. By that I mean that limiting to 1 core, so that the sequential backend is used, works fine, but even adding the freeze_support line as required doesn't help.

I'm really not sure if the fault lies here or with pyinstaller, but the same code used to work with plain old multiprocessing. Here is the error I mean:

[screenshot of the error output]

It looks like it works fine at first, and then fails to dispatch more jobs. PyInstaller itself uses a kind of workaround for multiprocessing (see https://github.com/pyinstaller/pyinstaller/wiki/Recipe-Multiprocessing), so maybe that interferes with how loky dispatches things, since it rewrites a bunch of methods in the process.
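For context, the core of the linked PyInstaller recipe is the `freeze_support()` call at the very top of the entry point (a minimal sketch of the stdlib mechanism, not the full recipe from the wiki page, which also patches `Popen` on older PyInstaller versions):

```python
import multiprocessing

def work(x):
    return x * x

if __name__ == "__main__":
    # freeze_support() must run first thing in the entry point, so that
    # when the frozen executable is re-launched to bootstrap a worker,
    # it enters the worker loop instead of re-running main().
    multiprocessing.freeze_support()
    with multiprocessing.Pool(processes=2) as pool:
        print(pool.map(work, [1, 2, 3]))  # [1, 4, 9]
```

The problem discussed in this thread is that loky does its own worker bootstrapping, which this recipe does not cover.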

samuelstjean avatar Feb 27 '20 14:02 samuelstjean

It really looks like the frozen version cannot deal with some path trickery or rewriting of arguments that happens in loky. On Linux it simply complains that I didn't give the correct arguments when it calls the multiprocessed parts. I can hand out a frozen version of the (quite complex) code for any platform if that helps.

samuelstjean avatar Mar 02 '20 16:03 samuelstjean

Indeed loky needs to introspect the python executable and pass specific command line arguments to be able to launch worker processes. Could you please provide a minimal reproduction script that shows how you use pyinstaller and joblib? Or maybe even a reproduction script with just pyinstaller and loky directly?

ogrisel avatar Mar 10 '20 13:03 ogrisel

This should be a working example: https://gist.github.com/samuelstjean/7286b3377e448b8ca7370bc6dc628fd5 The first comment indicates how to run the whole thing.

If I comment out the parser so that it runs without asking for arguments (lines 30 and 31), it won't crash, but it just spawns a lot of processes which stay around even after I close the terminal, making the whole computer unresponsive in the process.

If I change the backend to multiprocessing it works as expected.
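What "change the backend" means concretely: joblib lets the caller pick the process/thread management layer per call, and "multiprocessing" is the legacy stdlib-based backend that still worked in the frozen build.

```python
from joblib import Parallel, delayed

# Explicitly selecting the legacy multiprocessing backend instead of the
# default loky backend; results come back in submission order.
out = Parallel(n_jobs=2, backend="multiprocessing")(
    delayed(abs)(-i) for i in range(4))
print(out)  # [0, 1, 2, 3]
```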

samuelstjean avatar Mar 10 '20 14:03 samuelstjean

Well, I tested it with the multiprocessing, threading and loky backends, and it definitely works for the other two (even on Linux), so loky seems to do something weird, possibly on all platforms, rendering it unusable in frozen applications. This is what I get (I updated the gist to run all backends by itself):

(testinst) samuel ~ $ ./dist/test aa aa
Entering inner loop
Inner loop finished for backend threading
Inner loop finished for backend multiprocessing
usage: test [-h] input output
test: error: unrecognized arguments: -m --process-name --pipe 17
usage: test [-h] input output
test: error: unrecognized arguments: -m --process-name --pipe 23
usage: test [-h] input output
test: error: unrecognized arguments: -m --process-name --pipe 20
usage: test [-h] input output
test: error: unrecognized arguments: -m --process-name --pipe 19
usage: test [-h] input output
test: error: the following arguments are required: output
usage: test [-h] input output
test: error: unrecognized arguments: -m --process-name --pipe 18
exception calling callback for <Future at 0x7f4a0a1e35d0 state=finished raised TerminatedWorkerError>
Traceback (most recent call last):
  File "anaconda3/envs/testinst/lib/python3.7/site-packages/joblib/externals/loky/_base.py", line 625, in _invoke_callbacks
    callback(self)
  File "anaconda3/envs/testinst/lib/python3.7/site-packages/joblib/parallel.py", line 347, in __call__
    self.parallel.dispatch_next()
  File "anaconda3/envs/testinst/lib/python3.7/site-packages/joblib/parallel.py", line 780, in dispatch_next
    if not self.dispatch_one_batch(self._original_iterator):
  File "anaconda3/envs/testinst/lib/python3.7/site-packages/joblib/parallel.py", line 847, in dispatch_one_batch
    self._dispatch(tasks)
  File "anaconda3/envs/testinst/lib/python3.7/site-packages/joblib/parallel.py", line 765, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "anaconda3/envs/testinst/lib/python3.7/site-packages/joblib/_parallel_backends.py", line 529, in apply_async
    future = self._workers.submit(SafeFunction(func))
  File "anaconda3/envs/testinst/lib/python3.7/site-packages/joblib/externals/loky/reusable_executor.py", line 178, in submit
    fn, *args, **kwargs)
  File "anaconda3/envs/testinst/lib/python3.7/site-packages/joblib/externals/loky/process_executor.py", line 1102, in submit
    raise self._flags.broken
joblib.externals.loky.process_executor.TerminatedWorkerError: A worker process managed by the executor was unexpectedly terminated. This could be caused by a segmentation fault while calling the function or by an excessive memory usage causing the Operating System to kill the worker.

The exit codes of the workers are {EXIT(2), EXIT(2)}
usage: test [-h] input output
test: error: unrecognized arguments: -m --process-name --pipe 21
usage: test [-h] input output
/tmp/_MEIs3o3Wc/joblib/externals/loky/backend/resource_tracker.py:120: UserWarning: resource_tracker: process died unexpectedly, relaunching.  Some folders/sempahores might leak.
test: error: unrecognized arguments: -m --process-name --pipe 22
Traceback (most recent call last):
  File "test.py", line 60, in <module>
    main()
  File "test.py", line 40, in main
    out = estimate_from_dwis()
  File "test.py", line 50, in estimate_from_dwis
    output = Parallel(n_jobs=ncores, verbose=verbose)(delayed(_inner)(data[i]) for i in ranger)
  File "anaconda3/envs/testinst/lib/python3.7/site-packages/joblib/parallel.py", line 1042, in __call__
    self.retrieve()
  File "anaconda3/envs/testinst/lib/python3.7/site-packages/joblib/parallel.py", line 921, in retrieve
    self._output.extend(job.get(timeout=self.timeout))
  File "anaconda3/envs/testinst/lib/python3.7/site-packages/joblib/_parallel_backends.py", line 540, in wrap_future_result
    return future.result(timeout=timeout)
  File "anaconda3/envs/testinst/lib/python3.7/concurrent/futures/_base.py", line 435, in result
    return self.__get_result()
  File "anaconda3/envs/testinst/lib/python3.7/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
  File "anaconda3/envs/testinst/lib/python3.7/site-packages/joblib/externals/loky/_base.py", line 625, in _invoke_callbacks
    callback(self)
  File "anaconda3/envs/testinst/lib/python3.7/site-packages/joblib/parallel.py", line 347, in __call__
    self.parallel.dispatch_next()
  File "anaconda3/envs/testinst/lib/python3.7/site-packages/joblib/parallel.py", line 780, in dispatch_next
    if not self.dispatch_one_batch(self._original_iterator):
  File "anaconda3/envs/testinst/lib/python3.7/site-packages/joblib/parallel.py", line 847, in dispatch_one_batch
    self._dispatch(tasks)
  File "anaconda3/envs/testinst/lib/python3.7/site-packages/joblib/parallel.py", line 765, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "anaconda3/envs/testinst/lib/python3.7/site-packages/joblib/_parallel_backends.py", line 529, in apply_async
    future = self._workers.submit(SafeFunction(func))
  File "anaconda3/envs/testinst/lib/python3.7/site-packages/joblib/externals/loky/reusable_executor.py", line 178, in submit
    fn, *args, **kwargs)
  File "anaconda3/envs/testinst/lib/python3.7/site-packages/joblib/externals/loky/process_executor.py", line 1102, in submit
    raise self._flags.broken
joblib.externals.loky.process_executor.TerminatedWorkerError: A worker process managed by the executor was unexpectedly terminated. This could be caused by a segmentation fault while calling the function or by an excessive memory usage causing the Operating System to kill the worker.

The exit codes of the workers are {EXIT(2), EXIT(2)}
[10938] Failed to execute script test
(testinst) samuel ~ $ usage: test [-h] input output
test: error: the following arguments are required: output

samuelstjean avatar Jun 27 '20 13:06 samuelstjean

Looks like it will never work for the time being, on any platform, due to loky using spawn: https://bugs.python.org/issue32146 Not sure if it's worth it anymore, as the only fix I could see would be to add a custom freeze_support for all platforms and not just Windows (I quickly tried it myself for fun, and it somehow shot up my RAM in a few seconds).
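The "custom freeze_support for all platforms" idea could be sketched like this (an assumption about the approach, not loky's actual API): detect the flags a spawn-based backend appends when re-launching the executable, and divert to a worker bootstrap instead of re-running `main()` forever.

```python
def looks_like_worker_invocation(argv):
    # Heuristic guard: these are the flags visible in the logs above
    # ("--process-name", "--pipe") plus multiprocessing's own worker flag.
    # A real freeze_support would hand control to the worker bootstrap
    # here instead of merely reporting the detection.
    flags = {"--multiprocessing-fork", "--process-name", "--pipe"}
    return any(arg in flags for arg in argv[1:])

# A normal launch is not mistaken for a worker invocation:
assert not looks_like_worker_invocation(["test", "input", "output"])
```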

samuelstjean avatar Jul 19 '20 11:07 samuelstjean

Thanks for the reproducer. To get the loky and spawn start methods to work with a frozen executable, we need to be able to generate a command line that starts the worker process using the python interpreter embedded in the pyinstaller-generated executable. This is probably related to https://github.com/pyinstaller/pyinstaller/issues/4865. I will need time to investigate the details, but unfortunately I don't have much at hand right now.
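The failure mode can be illustrated with a hedged sketch of the command-line generation (the module name below is a placeholder, not loky's exact bootstrap path):

```python
import sys

def worker_command(pipe_fd):
    # With a normal Python install, sys.executable is an interpreter, so
    # it understands "-m" plus the worker flags seen in the logs above.
    # In a PyInstaller bundle, sys.executable is the frozen application
    # itself: it feeds these flags to its own argparse parser, producing
    # "unrecognized arguments: -m --process-name --pipe".
    return [sys.executable,
            "-m", "loky_worker_bootstrap",  # placeholder module name
            "--process-name", "LokyProcess",
            "--pipe", str(pipe_fd)]
```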

ogrisel avatar Sep 10 '20 08:09 ogrisel

Is there a work around for this yet?

Traceback (most recent call last):
  File "DDP_GUI_JM_Drop_Down_Prototype_11_16_20.py", line 2367, in <module>
  File "C:\Users\jimmc\Python\Python39\Lib\site-packages\PyInstaller\hooks\rthooks\pyi_rth_multiprocessing.py", line 50, in _freeze_support
    name, value = arg.split('=')
ValueError: not enough values to unpack (expected 2, got 1)
[13496] Failed to execute script DDP_GUI_JM_Drop_Down_Prototype_11_16_20

Traceback (most recent call last):
  File "DDP_GUI_JM_Drop_Down_Prototype_11_16_20.py", line 2367, in <module>
  File "C:\Users\jimmc\Python\Python39\Lib\site-packages\PyInstaller\hooks\rthooks\pyi_rth_multiprocessing.py", line 50, in _freeze_support
    name, value = arg.split('=')
ValueError: not enough values to unpack (expected 2, got 1)
[2144] Failed to execute script DDP_GUI_JM_Drop_Down_Prototype_11_16_20

Traceback (most recent call last):
  File "DDP_GUI_JM_Drop_Down_Prototype_11_16_20.py", line 2240, in on_page_changing
  File "Semantic_Trend_GUI.py", line 203, in semantic_trend
  File "pyLDAvis\gensim_models.py", line 125, in prepare
  File "pyLDAvis\_prepare.py", line 442, in prepare
  File "pyLDAvis\_prepare.py", line 278, in _topic_info
  File "joblib\parallel.py", line 1054, in __call__
  File "joblib\parallel.py", line 933, in retrieve
  File "joblib\_parallel_backends.py", line 542, in wrap_future_result
  File "concurrent\futures\_base.py", line 445, in result
  File "concurrent\futures\_base.py", line 390, in __get_result
joblib.externals.loky.process_executor.TerminatedWorkerError: A worker process managed by the executor was unexpectedly terminated. This could be caused by a segmentation fault while calling the function or by an excessive memory usage causing the Operating System to kill the worker.

JimMcDonough avatar Jun 14 '21 14:06 JimMcDonough

Dear all, is there any news regarding this issue? Apparently, using PyInstaller to build an executable from a script using joblib's Parallel still causes the same symptoms reported above. Here is a piece of code reproducing the issue. When processed by PyInstaller, the main method is called endlessly (when using the loky or multiprocessing backends -- it works fine with the threading backend, but that does not take advantage of a multi-core environment). Using freeze_support does not help, unfortunately. Do you have any hint toward a possible solution? Thanks in advance.

test.py:

import argparse
from multiprocessing import freeze_support
from joblib import Parallel, delayed

def main():
    arguments_parser = argparse.ArgumentParser()
    arguments_parser.add_argument("-n", default=None, type=int)
    flags, _ = arguments_parser.parse_known_args()

    # safe guard otherwise new processes are spawned forever
    if flags.n is None:
        raise RuntimeError('main() was called again.')

    with Parallel(n_jobs=2, verbose=5) as parallel:
        parallel(delayed(print)(i)for i in range(flags.n))


if __name__ == '__main__':
    freeze_support()
    main()

distribute.sh

pyinstaller \
--noconfirm \
--log-level=WARN \
--onedir \
--nowindow \
test.py
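The threading workaround mentioned above can be made explicit (it never spawns a new process, so it is safe in a frozen executable, at the cost of the GIL limiting CPU-bound parallelism for pure-Python functions):

```python
from joblib import Parallel, delayed

# Selecting the threading backend avoids re-launching the executable
# entirely; only CPU-bound pure-Python workloads lose parallelism.
out = Parallel(n_jobs=2, backend="threading")(
    delayed(len)(s) for s in ["ab", "cde", "f"])
print(out)  # [2, 3, 1]
```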

gdoras avatar Nov 16 '21 09:11 gdoras

I never did solve the issue. Sorry.


JimMcDonough avatar Nov 16 '21 17:11 JimMcDonough

(quoting gdoras's comment and reproducer above)

I have been struggling with the same issue for a long time. Have you found a solution?

heygy avatar Jan 03 '23 04:01 heygy

Does this need more help with testing, or something else? I'd say that anyone wanting to ship binary builds simply can't use joblib with loky right now because of this, and the dask backend also seems to have some freezing issues.

samuelstjean avatar Jan 26 '24 20:01 samuelstjean