joblib
timeout semantics unclear
uname -a:
Linux bob 6.1.0-18-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.76-1 (2024-02-01) x86_64 GNU/Linux
python -v:
Python 3.11.2 (main, Mar 13 2023, 12:18:29) [GCC 12.2.0] on linux
pip freeze:
joblib==1.3.2
The documentation of the timeout parameter (https://joblib.readthedocs.io/en/latest/generated/joblib.Parallel.html) states:
Timeout limit for each task to complete. If any task takes longer a TimeOutError will be raised. Only applied when n_jobs != 1
(1) The unit of this value is not specified; I assume seconds.
(2) According to the documentation, an error should be raised as soon as any task runs longer than timeout seconds. The example below does not raise an error, although timeout=2 and several items sleep for more than 2 seconds. With timeout=1, however, an error is raised. (Edit: I would expect an error to be raised in the example below, too.)
from joblib import Parallel, delayed
import multiprocessing
import time

def run(t):
    s, to = t
    print(s, to)
    time.sleep(to)
    return s

l = [('hello', 1), ('world', 2), ('this', 3), ('is', 4), ('a', 5), ('test', 6)]
parallel = Parallel(n_jobs=multiprocessing.cpu_count() - 2, return_as="generator", timeout=2)
for r in parallel(delayed(run)(e) for e in l):
    print(r)
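One possible reading of the observed behaviour, offered as an assumption and not as a confirmed description of joblib's internals: if the timeout is measured from the moment the consumer starts waiting for a given result, and not from the moment the task starts, then tasks that run concurrently can each exceed the limit in total runtime without any single wait exceeding it. The sketch below reproduces that effect with the standard library's concurrent.futures (not joblib). The helper name consume_in_order and the scaled-down durations are hypothetical and chosen only for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


def consume_in_order(durations, timeout):
    """Start all tasks at once, then wait for each result in submission order.

    Returns the time spent waiting for each individual result.
    Raises FutureTimeout if a single wait exceeds `timeout` seconds.
    """
    waits = []
    with ThreadPoolExecutor(max_workers=len(durations)) as pool:
        # All tasks start (roughly) at the same time.
        futures = [pool.submit(time.sleep, d) for d in durations]
        for fut in futures:
            start = time.monotonic()
            # The timeout clock starts at this call, not when the task started.
            fut.result(timeout=timeout)
            waits.append(time.monotonic() - start)
    return waits


if __name__ == "__main__":
    # Two of the three tasks run longer than the timeout in total,
    # yet each individual wait stays below it, so nothing is raised.
    print(consume_in_order([0.2, 0.4, 0.6], timeout=0.35))

    # A task whose result is awaited from its very start does time out.
    try:
        consume_in_order([0.6], timeout=0.2)
    except FutureTimeout:
        print("timed out")
```

If joblib behaves the same way, the items in the report would finish after 1, 2, ..., 6 seconds, so each result would be awaited for only about 1 second. That would be consistent with no error at timeout=2 and an error at timeout=1, but it remains a guess until confirmed against joblib's implementation.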