
n_jobs Not Stopping CPU from Running at 100% on All Threads?

Open windowshopr opened this issue 2 years ago • 2 comments


Just performed a fresh install of TPOT (Windows 10, Python 3.7.9) using the following commands in succession (note that it's 'torch' now, not 'py-torch'):

pip install numpy scipy scikit-learn pandas joblib torch
pip install deap update_checker tqdm stopit xgboost
pip install "dask[delayed]" "dask[dataframe]" dask-ml "fsspec>=0.3.3" "distributed>=2.10.0"
pip install tpot

...and I'm having an issue with Dask specifically: even with n_jobs=2 set, 100% of all 12 of my CPU's threads are engaged when I start the classifier. That doesn't seem right. My dataset is quite large at 27,270 rows and 599 columns, so I shrank it down to 5,000 rows (keeping all columns), and it's still pinning my CPU pretty hard. Why would this be?
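For reference, a minimal sketch of the setup in question (everything except n_jobs=2 and use_dask=True is an assumed placeholder, including X and y):

from tpot import TPOTClassifier

# n_jobs=2 should cap TPOT at 2 parallel jobs, yet all 12 threads
# still spike to 100% (generations/population_size are example values;
# X, y stand in for the 27,270 x 599 dataset)
est = TPOTClassifier(generations=5, population_size=50,
                     n_jobs=2, use_dask=True, verbosity=2)
est.fit(X, y)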


Thanks!

windowshopr commented Aug 05 '21 02:08

I'm thinking it could be because my dataset is too big, even at 5,000 rows. I shrank it down to just a couple hundred rows and it's much quieter. Maybe once it has loaded all copies of the dataset into memory it settles down? Population size seems to play a factor as well...

windowshopr commented Aug 05 '21 03:08

When use_dask is set to True, the n_jobs argument is actually not used. This should probably be fixed. When using Dask, TPOT just calls dask.compute without setting the number of workers (here). My understanding is that this defaults to using all available threads.
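You can see that default behavior with a quick sketch outside of TPOT (dummy work, not TPOT's actual calls):

import dask

# Some dummy delayed tasks
tasks = [dask.delayed(sum)(range(n)) for n in range(1000)]

# Default local threaded scheduler: one thread per CPU core,
# so every core gets pinned regardless of what TPOT's n_jobs says
results = dask.compute(*tasks)

# The scheduler only limits itself if num_workers is passed explicitly,
# which TPOT currently does not do
results = dask.compute(*tasks, num_workers=2)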

To work around this, you need a Dask cluster/client wrapper around your .fit() call. For example:

from dask.distributed import Client, LocalCluster, performance_report
from tpot import TPOTClassifier

# Cap parallelism through the cluster itself, since n_jobs is ignored
# (the values below are examples; tune them for your machine)
with LocalCluster(n_workers=1, threads_per_worker=2,
                  processes=False, memory_limit="4GB") as cluster:
    with Client(cluster) as client:
        with performance_report(filename="dask_report.html"):  # optional
            est = TPOTClassifier(use_dask=True)
            est.fit(X, y)  # X, y: your features and target

On a single machine, I think it is best to use processes=False, n_workers=1, and threads_per_worker set to the number of cores/threads you want to use. (Whether one worker with many threads or many workers with one thread each performs better should probably be benchmarked; I'm not 100% sure which option wins.)
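If anyone wants to benchmark that, a sketch along these lines would do it (small generations/population_size just to keep the comparison quick; X, y are placeholders):

import time
from dask.distributed import Client, LocalCluster
from tpot import TPOTClassifier

def time_tpot(cluster_kwargs, X, y):
    # Time one short TPOT run under a given LocalCluster layout
    with LocalCluster(**cluster_kwargs) as cluster, Client(cluster):
        est = TPOTClassifier(generations=2, population_size=10, use_dask=True)
        start = time.perf_counter()
        est.fit(X, y)
        return time.perf_counter() - start

# Same total parallelism (4 threads), two layouts:
one_worker = {"n_workers": 1, "threads_per_worker": 4, "processes": False}
many_workers = {"n_workers": 4, "threads_per_worker": 1, "processes": True}
print(time_tpot(one_worker, X, y), time_tpot(many_workers, X, y))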

perib commented May 17 '22 22:05