oneTBB
TBB4PY patching not working with Python joblib nested parallelism
The TBB patch does not work with the Python joblib nested parallelism example below. I used an Intel synthetic dataset. The code gets stuck with the joblib threading backend. Kindly advise.
from sklearn.cluster import KMeans
import time
import numpy as np
import joblib
from joblib import Parallel, delayed, parallel_backend

path = "../dask_study/Intel_Bench_Datasets/Intel_Bench_Datasets/datasets/"
X = np.load(path + 'kmeans/X_blob_10000000x5_50.npy')
y = np.load(path + 'kmeans/y_blob_10000000x5_50.npy')

def kmeans(x):
    # Fit one KMeans model on a single chunk (inner level of parallelism).
    #kmeans=KMeans(n_clusters=5, random_state=123,copy_x= False)
    kmeans = KMeans(n_clusters=5, random_state=123)
    kmeans.fit(x)
    return kmeans

#X_copy=np.copy(X)
data_chunks = np.array_split(X, 16)
st = time.time()
# Outer level of parallelism: one joblib thread task per chunk.
with joblib.parallel_backend('threading'):
    results = Parallel(n_jobs=-1)(delayed(kmeans)(chunk) for chunk in data_chunks)
#KMeans(n_clusters=5, random_state=123).fit(X,y)
print(f"Time Taken to run Kmeans: {time.time()-st}")
@goplanid, which command are you using for TBB4PY patching? Have you tried disabling inter-process coordination by not specifying the --ipc option?
I am just using -m tbb. Not using --ipc.
Thank you. I could not reproduce the hang using the current master with a random dataset.
Could you please share your X_blob_10000000x5_50.npy and y_blob_10000000x5_50.npy?
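For anyone who wants to retry the reproduction without the original files, a stand-in dataset of the same shape could be generated roughly as follows. This is only a sketch: the helper name `make_blob_dataset`, the Gaussian-blob layout, and the reading of the `_50` file-name suffix as 50 cluster centers are all assumptions, not details confirmed in this thread.

```python
import numpy as np


def make_blob_dataset(n_samples=10_000_000, n_features=5, n_centers=50, seed=0):
    """Generate a random blob-like dataset (hypothetical stand-in, see assumptions above)."""
    rng = np.random.default_rng(seed)
    # Assumed layout: cluster centers drawn uniformly from a box.
    centers = rng.uniform(-10.0, 10.0, size=(n_centers, n_features))
    # Assign every sample to a random center, then add unit Gaussian noise.
    y = rng.integers(0, n_centers, size=n_samples)
    X = centers[y] + rng.standard_normal((n_samples, n_features))
    return X, y


if __name__ == "__main__":
    # Use a smaller n_samples for a quick check; the full 10,000,000 x 5
    # float64 array needs roughly 400 MB.
    X, y = make_blob_dataset(n_samples=100_000)
    print(X.shape, y.shape)
```

The resulting `X` can be passed to `np.array_split(X, 16)` in place of the loaded array in the script above.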
@goplanid is this issue still relevant?
If anyone encounters this issue in the future, please open a new issue with a link to this one.