GridSearchCV not working with h2o4gpu.RandomForestClassifier()

Open Nezafati opened this issue 7 years ago • 3 comments

Environment (for bugs)

  • OS platform, distribution and version: Linux Ubuntu 16.04
  • Installed from (source or binary): binary
  • Version: h2o4gpu 0.2.0
  • Python version (optional): 3.6
  • CUDA/cuDNN version: cuda-9.0
  • GPU model (optional): Geforce Titan (Maxwell)
  • CPU model: Intel Xeon
  • RAM available: 16GB

Description

Calling .fit() on sklearn.model_selection.GridSearchCV with h2o4gpu.RandomForestClassifier() as the estimator raises a TypeError.

Code to reproduce:

import h2o4gpu
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV

dat = load_breast_cancer()
X = dat.data
y = dat.target

clf_gpu = h2o4gpu.RandomForestClassifier()
parameters_rf = {'n_estimators': [100, 200]}
search_gpu = GridSearchCV(estimator=clf_gpu, param_grid=parameters_rf, cv=10)
search_gpu.fit(X,y)


TypeError                                 Traceback (most recent call last)
in ()
      2 parameters_rf = {'n_estimators': [100, 200]}
      3 search_gpu = GridSearchCV(estimator=clf_gpu, param_grid=parameters_rf, cv=10)
----> 4 search_gpu.fit(X,y)

~/anaconda2/envs/py36/lib/python3.6/site-packages/sklearn/model_selection/_search.py in fit(self, X, y, groups, **fit_params)
    623                                      n_candidates * n_splits))
    624
--> 625         base_estimator = clone(self.estimator)
    626         pre_dispatch = self.pre_dispatch
    627

~/anaconda2/envs/py36/lib/python3.6/site-packages/sklearn/base.py in clone(estimator, safe)
     58                             % (repr(estimator), type(estimator)))
     59     klass = estimator.__class__
---> 60     new_object_params = estimator.get_params(deep=False)
     61     for name, param in six.iteritems(new_object_params):
     62         new_object_params[name] = clone(param, safe=False)

TypeError: get_params() got an unexpected keyword argument 'deep'

Nezafati, Jul 27 '18 20:07

@Nezafati thanks for submitting the issue! Looks like our RandomForestClassifier doesn't fully implement the http://scikit-learn.org/stable/modules/generated/sklearn.base.BaseEstimator.html interface. Will look into it (it looks like several of our other estimators have the same issue).

You can try writing a wrapper for the time being, something like (untested):

import h2o4gpu

class RandomForestClassifierWrapper(h2o4gpu.RandomForestClassifier):
    def get_params(self, deep=True):
        return super().get_params()

And try using that wrapper instead.
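Why the wrapper should help can be seen without a GPU. The sketch below is only an illustration: `LegacyEstimator`, `LegacyEstimatorWrapper`, and `clone_like` are hypothetical stand-ins, not part of h2o4gpu or scikit-learn. `clone_like` imitates the one step visible in the traceback, where `clone` calls `estimator.get_params(deep=False)` and rebuilds the estimator from the returned parameters.

```python
class LegacyEstimator:
    """Hypothetical stand-in for an estimator whose get_params() lacks `deep`."""

    def __init__(self, n_estimators=10):
        self.n_estimators = n_estimators

    def get_params(self):  # no `deep` parameter, like the failing estimator
        return {"n_estimators": self.n_estimators}


class LegacyEstimatorWrapper(LegacyEstimator):
    """Same idea as the suggested wrapper: accept `deep` and ignore it."""

    def get_params(self, deep=True):
        return super().get_params()


def clone_like(estimator):
    """Simplified imitation (an assumption) of the clone step in the traceback."""
    params = estimator.get_params(deep=False)
    return type(estimator)(**params)


# The unwrapped class fails in the same way as in the report.
try:
    clone_like(LegacyEstimator(n_estimators=100))
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# The wrapped class accepts `deep`, so the clone step goes through.
cloned = clone_like(LegacyEstimatorWrapper(n_estimators=100))
```

Note that the wrapper ignores `deep` entirely, which is fine for a flat estimator but would not expand nested estimators' parameters.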

mdymczyk, Jul 30 '18 01:07

Hi,

I am running into the same issue when using GridSearchCV with h2o4gpu.Ridge().

I added the following wrapper at the top of my code:

class RidgeWrapper(h2o4gpu.Ridge):               
    def get_params(self,deep=True):
        return super().get_params()

and initialize my model using:

    premodel=h2o4gpu.Ridge(fit_intercept=False,tol=0.0001)
    model=RidgeWrapper(premodel)

which results in the same error. Can you please help me with a temporary solution for this issue?

o-d-t, Oct 26 '18 14:10
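One detail in the Ridge snippet may be worth checking, though it is not confirmed as the cause of the error: the wrapper is a subclass, so `RidgeWrapper(premodel)` does not wrap the existing model. It passes `premodel` as the first positional constructor argument instead. The sketch below shows that effect with hypothetical stand-in classes (`BaseModel`, `BaseModelWrapper`, and their default values are assumptions, not h2o4gpu code).

```python
class BaseModel:
    """Hypothetical stand-in for an estimator such as h2o4gpu.Ridge."""

    def __init__(self, fit_intercept=True, tol=0.01):
        self.fit_intercept = fit_intercept
        self.tol = tol

    def get_params(self):
        return {"fit_intercept": self.fit_intercept, "tol": self.tol}


class BaseModelWrapper(BaseModel):
    """Subclass wrapper that only adds the `deep` argument."""

    def get_params(self, deep=True):
        return super().get_params()


premodel = BaseModel(fit_intercept=False, tol=0.0001)

# Passing an instance: it lands in `fit_intercept`, and `tol` keeps its default.
wrapped_instance = BaseModelWrapper(premodel)

# Passing the settings directly to the wrapper keeps them intact.
wrapped_kwargs = BaseModelWrapper(fit_intercept=False, tol=0.0001)
```

Under that reading, constructing the wrapper with the keyword arguments directly (for example `RidgeWrapper(fit_intercept=False, tol=0.0001)`) and passing that object to GridSearchCV would be the thing to try.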

I was able to get it to work using the suggested wrapper for random forest like this:

class RandomForestClassifierWrapper(h2o4gpu.RandomForestClassifier):
    def get_params(self, deep=True):
        return super().get_params()

rf = RandomForestClassifierWrapper(class_weight=train_proportions)
rf_random = RandomizedSearchCV(estimator = rf, param_distributions = random_grid, n_iter = 100, cv = 3, n_jobs = 1)

Thank you!

kellinpelrine, Dec 06 '19 02:12