Training custom sklearn classifier with TPOTClassifier?
I created a custom sklearn classifier and want to run it with TPOT, but when I run TPOT I get this warning:
Warning: MeanClassifier is not available and will not be used by TPOT.
My full toy code.
from tpot import TPOTClassifier
import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin

class MeanClassifier(BaseEstimator, ClassifierMixin):
    def __init__(self):
        pass

    def fit(self, X, y=None):
        self.treshold_ = (sum(X)/len(X))
        return self

    def _meaning(self, x):
        return( True if x >= self.treshold_ else False )

    def predict(self, X, y=None):
        try:
            getattr(self, "treshold_")
        except AttributeError:
            raise RuntimeError("You must train classifer before predicting data!")
        return([self._meaning(x) for x in X])

    def score(self, X, y=None):
        # counts number of values bigger than mean
        return(sum(self.predict(X)))

tpot_config = {
    'MeanClassifier': {
    },
}

X_train = [i for i in range(0, 100, 5)]
X_test = [i + 3 for i in range(-5, 95, 5)]
y_train = None
y_test= None

tpot = TPOTClassifier(generations=2, population_size=20, verbosity=2, random_state=42,config_dict=tpot_config)
tpot.fit(X_train, y_train)
print(tpot.score(X_test, y_test))
How can we run a custom sklearn estimator with TPOT? There is not much information on the web about running custom models with TPOT.
Hi @levedev, there are a couple of initial issues I notice:
First, your model is missing a number of methods/attributes that are needed to be compliant with the Scikit-Learn API. A good place to start is here:
https://scikit-learn.org/stable/developers/develop.html#rolling-your-own-estimator
E.g., it is required to define both get_params and set_params.
Briefly, you should be able to do the following without the second line returning errors:
>>> from sklearn.utils.estimator_checks import check_estimator
>>> check_estimator(MeanClassifier())
(Also, setting y_train and y_test to None will raise errors.)
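For reference, here is a minimal sketch of what a more conventional toy classifier could look like. The class name MeanThresholdClassifier, its offset parameter, and the thresholding rule are all made up for illustration, and the sketch has not been verified against check_estimator. It relies on BaseEstimator to supply get_params and set_params, takes a 2-D X with real labels in y, and lists ClassifierMixin first in the base classes, as the scikit-learn developer guide recommends.

```python
import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin
from sklearn.utils.multiclass import unique_labels
from sklearn.utils.validation import check_array, check_is_fitted, check_X_y


class MeanThresholdClassifier(ClassifierMixin, BaseEstimator):
    """Hypothetical toy classifier: thresholds the mean of each row of X."""

    def __init__(self, offset=0.0):
        # Only store constructor arguments here; no validation, no logic.
        self.offset = offset

    @staticmethod
    def _majority(labels, fallback):
        # Most frequent label; use the fallback labels if this side is empty.
        source = labels if labels.size else fallback
        values, counts = np.unique(source, return_counts=True)
        return values[np.argmax(counts)]

    def fit(self, X, y):
        X, y = check_X_y(X, y)  # enforces 2-D X and a matching, non-None y
        self.classes_ = unique_labels(y)
        self.threshold_ = X.mean() + self.offset
        above = X.mean(axis=1) >= self.threshold_
        self.label_above_ = self._majority(y[above], fallback=y)
        self.label_below_ = self._majority(y[~above], fallback=y)
        return self

    def predict(self, X):
        check_is_fitted(self)  # raises NotFittedError before fit() is called
        X = check_array(X)
        above = X.mean(axis=1) >= self.threshold_
        return np.where(above, self.label_above_, self.label_below_)


# Usage: note the column-vector X and the real labels in y.
X_demo = np.array([[0.0], [1.0], [10.0], [11.0]])
y_demo = np.array([0, 0, 1, 1])
clf = MeanThresholdClassifier().fit(X_demo, y_demo)
print(clf.get_params())         # inherited from BaseEstimator
print(clf.score(X_demo, y_demo))  # accuracy, inherited from ClassifierMixin
```

Because score is inherited from ClassifierMixin (it reports accuracy), a custom score method is only needed if you want a different metric.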
Second, TPOT needs to be able to import the estimator from a file in order for it to pass the preprocessing checks. Currently, TPOT doesn't support modules defined locally.
https://github.com/EpistasisLab/tpot/blob/6448bdb71ba08b4a0447c640d2f05a05e1affc21/tpot/operator_utils.py#L77
^ This line in particular raises an exception and causes TPOT to skip your custom module.
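To illustrate why a bare key such as 'MeanClassifier' gets skipped, here is a simplified stand-in for that kind of lookup, written with importlib. It is not TPOT's actual code, and resolve_operator is a hypothetical name. The assumption is that each config key is treated as a dotted import path, split into a module part and a class name, and dropped with a warning if the import fails.

```python
import importlib


def resolve_operator(key):
    """Hypothetical helper: turn a dotted config key into a class, or None."""
    # "pkg.module.ClassName" -> ("pkg.module", "ClassName")
    # "ClassName"            -> ("", "ClassName"), i.e. nothing to import from
    module_path, _, class_name = key.rpartition(".")
    try:
        module = importlib.import_module(module_path)
        return getattr(module, class_name)
    except (ImportError, AttributeError, ValueError):
        # An empty module path raises ValueError; a missing module raises
        # ImportError; a missing class name raises AttributeError.
        print(f"Warning: {key} is not available and will not be used.")
        return None


# A bare class name has no module part, so the lookup fails.
print(resolve_operator("MeanClassifier"))

# A fully qualified name from an importable module resolves.
print(resolve_operator("collections.OrderedDict"))
```

Under that assumption, moving the class into an importable file (say, a hypothetical my_estimators.py next to your script) and using 'my_estimators.MeanClassifier' as the config key should let the lookup succeed.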
Thank you for taking the time to point me to solutions for my problem. I am new to TPOT; could you please answer these questions?
- If my custom estimator passes sklearn's check_estimator checks, will it work with TPOT?
- Do I need to add a score method to my custom estimator?
- Can we use TPOT with an estimator (say, logistic regression) that was built with a different ML library (not sklearn)?