Error using TPOT classifier in Google Colab: "No module named 'sklearn.metrics.scorer'"
Hi all,
I have read references 1 and 2 to implement TPOT in Google Colab.
However, my code gets this import error: "No module named 'sklearn.metrics.scorer'"
My code
!pip install TPOT
!pip install dask==2.20.0 dask-glm==0.2.0 dask-ml==1.0.0
!pip install tornado==5.0
!pip install distributed==2.2.0
!pip install xgboost==0.90
!pip install fsspec
from dask.distributed import Client
client = Client(processes=False)
import time
from tpot import TPOTClassifier
start = time.time()
# Assign the values outlined to the inputs
number_generations = 4
population_size = 4
offspring_size = 3
scoring_function = 'roc_auc'
# Create the tpot classifier
tpot_clf = TPOTClassifier(generations=number_generations, population_size=population_size,
                          offspring_size=offspring_size, scoring=scoring_function,
                          verbosity=2, random_state=0, config_dict='TPOT light', cv=5,
                          warm_start=True, use_dask=True)
# Fit the classifier (the feature-matrix name was lost from the post; X is a placeholder)
tpot_clf.fit(X, y)
end = time.time()
print(end - start)
My error
/usr/local/lib/python3.6/dist-packages/sklearn/utils/ DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
return f(*args, **kwargs)
Optimization Progress: 0%
0/16 [00:00<?, ?pipeline/s]
ModuleNotFoundError Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/tpot/ in _wrapped_cross_val_score(sklearn_pipeline, features, target, cv, scoring_function, sample_weight, groups, use_dask)
424 try:
--> 425 import dask_ml.model_selection # noqa
426 import dask # noqa
12 frames
/usr/local/lib/python3.6/dist-packages/dask_ml/model_selection/ in <module>()
5 """
----> 6 from ._hyperband import HyperbandSearchCV
7 from ._incremental import IncrementalSearchCV
/usr/local/lib/python3.6/dist-packages/dask_ml/model_selection/ in <module>()
---> 11 from ._incremental import BaseIncrementalSearchCV
12 from ._successive_halving import SuccessiveHalvingSearchCV
/usr/local/lib/python3.6/dist-packages/dask_ml/model_selection/ in <module>()
15 from sklearn.base import clone
---> 16 from sklearn.metrics.scorer import check_scoring
17 from sklearn.model_selection import ParameterGrid, ParameterSampler
ModuleNotFoundError: No module named 'sklearn.metrics.scorer'
During handling of the above exception, another exception occurred:
ImportError Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/tpot/ in fit(self, features, target, sample_weight, groups)
827 per_generation_function=self._check_periodic_pipeline,
--> 828 log_file=self.log_file_,
829 )
/usr/local/lib/python3.6/dist-packages/tpot/ in eaMuPlusLambda(population, toolbox, mu, lambda_, cxpb, mutpb, ngen, pbar, stats, halloffame, verbose, per_generation_function, log_file)
--> 228 population[:] = toolbox.evaluate(population)
/usr/local/lib/python3.6/dist-packages/tpot/ in _evaluate_individuals(self, population, features, target, sample_weight, groups)
1552 for sklearn_pipeline in sklearn_pipeline_list[
-> 1553 chunk_idx : chunk_idx + chunk_size
1554 ]
/usr/local/lib/python3.6/dist-packages/tpot/ in <listcomp>(.0)
1551 )
-> 1552 for sklearn_pipeline in sklearn_pipeline_list[
1553 chunk_idx : chunk_idx + chunk_size
/usr/local/lib/python3.6/dist-packages/stopit/ in wrapper(*args, **kwargs)
144 # ``result`` may not be assigned below in case of timeout
--> 145 result = func(*args, **kwargs)
146 return result
/usr/local/lib/python3.6/dist-packages/tpot/ in _wrapped_cross_val_score(sklearn_pipeline, features, target, cv, scoring_function, sample_weight, groups, use_dask)
429 msg = "'use_dask' requires the optional dask and dask-ml depedencies.\n{}".format(e)
--> 430 raise ImportError(msg)
ImportError: 'use_dask' requires the optional dask and dask-ml depedencies.
No module named 'sklearn.metrics.scorer'
During handling of the above exception, another exception occurred:
RuntimeError Traceback (most recent call last)
<ipython-input-201-63a9e7598fe2> in <module>()
14 verbosity=2, random_state=0,config_dict='TPOT light', cv=5, warm_start=True,use_dask=True)
---> 16, y)
18 print(tpot_clf.fitted_pipeline_)
/usr/local/lib/python3.6/dist-packages/tpot/ in fit(self, features, target, sample_weight, groups)
861 # raise the exception if it's our last attempt
862 if attempt == (attempts - 1):
--> 863 raise e
864 return self
/usr/local/lib/python3.6/dist-packages/tpot/ in fit(self, features, target, sample_weight, groups)
852 self._pbar.close()
--> 854 self._update_top_pipeline()
855 self._summary_of_best_pipeline(features, target)
856 # Delete the temporary cache before exiting
/usr/local/lib/python3.6/dist-packages/tpot/ in _update_top_pipeline(self)
960 # need raise RuntimeError because no pipeline has been optimized
961 raise RuntimeError(
--> 962 "A pipeline has not yet been optimized. Please call fit() first."
963 )
RuntimeError: A pipeline has not yet been optimized. Please call fit() first.
I have also installed the optional dependencies of dask
pip install dask-ml[xgboost] # also install xgboost and dask-xgboost
pip install dask-ml[complete] # install all optional dependencies
But it still returns the same error. Please advise. Thanks!
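The traceback above contains three chained exceptions ("During handling of the above exception, another exception occurred"), and the one raised first is usually the real cause. As a rough sketch, the chain can be walked back programmatically. The functions load_optional_backend and fit below are hypothetical stand-ins that only mimic the wrapping pattern visible in the traceback; they are not TPOT code, and the missing module name is made up.

```python
def root_cause(exc):
    """Follow the exception chain back to the first exception that was raised."""
    seen = set()
    while id(exc) not in seen:
        seen.add(id(exc))
        # __cause__ is set by "raise ... from ...", __context__ by raising inside "except"
        previous = exc.__cause__ or exc.__context__
        if previous is None:
            break
        exc = previous
    return exc


def load_optional_backend():
    """Hypothetical stand-in: wrap a failed optional import in an ImportError."""
    try:
        import module_that_is_not_installed_xyz  # noqa: F401 (deliberately missing)
    except ModuleNotFoundError as e:
        raise ImportError("optional dependencies are required.\n{}".format(e))


def fit():
    """Hypothetical stand-in: report a generic failure after the import error."""
    try:
        load_optional_backend()
    except ImportError:
        raise RuntimeError("A pipeline has not yet been optimized.")


try:
    fit()
except RuntimeError as err:
    first = root_cause(err)
    print(type(first).__name__, "-", first)
```

Applied to the traceback in this post, the same reading points at the ModuleNotFoundError raised inside dask_ml, not at the final RuntimeError.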
The error is most likely because scikit-learn 0.24 is installed. That release made breaking API changes, including removing the sklearn.metrics.scorer module that your installed dask-ml still imports (see #1176).
To fix this, you could update dask-ml (later versions appear to fix this issue), or, although this is not recommended, add the line
!pip install 'scikit-learn>=0.22.0,<0.24.0' --force-reinstall
after your other pip installs to force an install of an older version of scikit-learn that doesn't have these changes.
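To see which side of the change an environment is on, a small check like the one below may help. It is only a sketch: the helper names major_minor and scorer_module_status are made up for illustration, and the version boundaries encode the scikit-learn deprecation schedule for sklearn.metrics.scorer (deprecated in 0.22, removed in 0.24).

```python
import re


def major_minor(version):
    """Return (major, minor) as integers from a version string such as '0.24.1'."""
    parts = re.findall(r"\d+", version)
    if len(parts) < 2:
        raise ValueError("expected at least 'major.minor', got {!r}".format(version))
    return int(parts[0]), int(parts[1])


def scorer_module_status(sklearn_version):
    """Classify how 'import sklearn.metrics.scorer' behaves for a given version."""
    mm = major_minor(sklearn_version)
    if mm >= (0, 24):
        return "removed"      # the import raises ModuleNotFoundError
    if mm >= (0, 22):
        return "deprecated"   # the import works but emits a deprecation warning
    return "available"


try:
    import sklearn
    print(sklearn.__version__, "->", scorer_module_status(sklearn.__version__))
except ImportError:
    print("scikit-learn is not installed in this environment")
```

The pinned range '>=0.22.0,<0.24.0' corresponds to the "deprecated" band, so deprecation messages are expected after the downgrade while the import itself still succeeds.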
EDIT: A previous version of this comment mistakenly attributed the error to TPOT. On closer reading, this is actually an issue with a dask-ml function, so you should check your dask-ml version to see whether an updated one is available.
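One way to check the installed versions from inside the notebook is sketched below. It assumes Python 3.8 or newer for importlib.metadata (on older interpreters the helper simply returns None), and the helper names installed_version and module_available are illustrative only.

```python
import importlib.util

try:
    from importlib import metadata  # standard library from Python 3.8
except ImportError:
    metadata = None


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it cannot be found."""
    if metadata is None:
        return None
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def module_available(name):
    """Return True if the module can be located without importing the module itself."""
    try:
        # Parent packages are imported during the lookup, so a missing parent raises here
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        return False


for dist in ("scikit-learn", "dask-ml", "tpot"):
    print(dist, installed_version(dist) or "not found")
print("sklearn.metrics.scorer importable:", module_available("sklearn.metrics.scorer"))
```

If the last line prints False while dask-ml is an older release, the combination matches the failure described in this thread.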
Having the same issue. I tried --force-reinstall and installed dask per the website. Downgrading the scikit-learn package just causes errors about a deprecated module. A bit stuck on this. Using Anaconda on a local PC with Jupyter Notebook.