
Deadlock on multiprocessing gp_minimize

Open · bkb3 opened this issue 3 years ago · 1 comment

Using gp_minimize on different processes (multiprocessing) causes a deadlock. Here's a simple example code to reproduce the issue.

from time import time
import numpy as np
from skopt import gp_minimize


def eggholder(x):
    # test optimisation function
    return (-(x[1] + 47) * np.sin(np.sqrt(abs(x[0]/2 + (x[1]  + 47))))
            -x[0] * np.sin(np.sqrt(abs(x[0] - (x[1]  + 47)))))

def test_f(arg):
    # wrapper around gp_minimize to return optimised values and cost
    # arg just changes the random state for this simple example
    res = gp_minimize(eggholder, dimensions=[(-512, 512), (-512, 512)], x0=[0., 0],
                      n_calls=10, n_random_starts=5, noise=0.1**2, 
                      random_state=arg)
    return [res.x, res.fun]

Calling the wrapper test_f serially, for example 10 times, causes no problems.

start = time()
res_lst = [test_f(i) for i in range(10)]
print("Time taken = {}".format(time() - start))
print(res_lst)

# Time taken = 15.435882091522217
# [[[-401, 381], -448.9347651723766], [[-370, 512], -560.8594084934316], [[512, 56], -512.1877843974866], [[-440, 348], -499.30784687293044], [[395, 512], -482.90050839518415], [[-450, 345], -441.19106371089254], [[458, -298], -707.6004688693381], [[-496, 28], -510.158816455889], [[-512, -270], -468.84578729934884], [[-474, 388], -886.2502612568253]]

Running the same calls through a multiprocessing pool with 10 workers (or any number) deadlocks, and the process never ends.

from multiprocessing import Pool
pools = Pool(10)
res_lst = pools.map(test_f, range(10))
pools.close()
pools.join()

Are there any solutions to this? I'm using skopt version 0.9.0 and Python 3.9.7.

Here's the traceback after interrupting the deadlocked run with Ctrl-C.


KeyboardInterrupt                         Traceback (most recent call last)
/tmp/ipykernel_4876/4227938556.py in <module>
      2 
      3 pools = Pool(10)
----> 4 res_lst = pools.map(test_f, range(10))
      5 pools.close()
      6 pools.join()

~/anaconda3/lib/python3.9/multiprocessing/pool.py in map(self, func, iterable, chunksize)
    362         in a list that is returned.
    363         '''
--> 364         return self._map_async(func, iterable, mapstar, chunksize).get()
    365 
    366     def starmap(self, func, iterable, chunksize=None):

~/anaconda3/lib/python3.9/multiprocessing/pool.py in get(self, timeout)
    763 
    764     def get(self, timeout=None):
--> 765         self.wait(timeout)
    766         if not self.ready():
    767             raise TimeoutError

~/anaconda3/lib/python3.9/multiprocessing/pool.py in wait(self, timeout)
    760 
    761     def wait(self, timeout=None):
--> 762         self._event.wait(timeout)
    763 
    764     def get(self, timeout=None):

~/anaconda3/lib/python3.9/threading.py in wait(self, timeout)
    572             signaled = self._flag
    573             if not signaled:
--> 574                 signaled = self._cond.wait(timeout)
    575             return signaled
    576 

~/anaconda3/lib/python3.9/threading.py in wait(self, timeout)
    310         try:    # restore state no matter what (e.g., KeyboardInterrupt)
    311             if timeout is None:
--> 312                 waiter.acquire()
    313                 gotit = True
    314             else:

KeyboardInterrupt:

bkb3 · Mar 24 '22 11:03

@bkb3 You may have already figured this out by now, but I get no errors running your code as is.

Also, you don't need pools.join() here: pools.map blocks until every result is in, so by the time res_lst is assigned there is nothing left to wait for. Maybe try:

import multiprocessing as mp

pools = mp.Pool(mp.cpu_count())
res_lst = pools.map(test_f, range(10))
pools.close()

I recommend mp.cpu_count() since the issue is likely something going on with the multiprocessing module rather than with skopt. I have the exact same versions you do and had no issues.
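For what it's worth, here is a minimal self-contained sketch of that pool pattern. It uses a cheap stand-in objective, cheap_f (hypothetical, replacing test_f so the snippet runs without skopt installed), and forces the "spawn" start method, which is a commonly suggested workaround when a forked worker inherits a process whose background threads (e.g. BLAS/OpenMP threads used by numpy/scipy) hold locks; I can't confirm that is the cause here.

```python
import multiprocessing as mp


def cheap_f(arg):
    # Stand-in for test_f: any picklable, top-level function works here.
    return arg * arg


if __name__ == "__main__":
    # "spawn" starts fresh interpreter processes instead of forking, which
    # avoids deadlocks that fork can cause in processes with active threads.
    ctx = mp.get_context("spawn")
    with ctx.Pool(mp.cpu_count()) as pools:
        # map() blocks until every result is in, so no join() is needed.
        res_lst = pools.map(cheap_f, range(10))
    print(res_lst)  # → [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```

Note the `if __name__ == "__main__":` guard: with the spawn start method, the main module is re-imported in each worker, so unguarded pool creation at module level would recurse.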

Sibajar · May 20 '22 21:05