sidneyzhu

Results 4 comments of sidneyzhu

i encountered the similar issues, my raw dataframe has 1k ids, 27k rows, 140 features. it can be well done by full feature extraction with MultiprocessingDistributor(n_workers=12) on a 64GB machine...

@AlessioBolpagni98 have you fixed this issue?

thanks for your reply. in my case , the code can be finished by multiprocess of n_jobs=8 in about 30mins, but it can't be finished in clustered 8 workers on...

i fixed it by shifting to dask_feature_extraction_on_chunk(), the ClusterDaskDistributor still failed with a lot of communication errors