scikit-lego
scikit-lego copied to clipboard
[BUG] DecayEstimator should be either a classifier or regressor
Hi again!
When using the DecayEstimator
inside the ZeroInflatedRegressor
an error occurred, because DecayEstimator is neither a regressor nor a classifier.
from sklego.meta import DecayEstimator
from sklearn.base import is_regressor, is_classifier
from sklearn.linear_model import LogisticRegression, LinearRegression
is_regressor(DecayEstimator(LinearRegression())), is_classifier(DecayEstimator(LogisticRegression()))
# (False, False)
Maybe the DecayEstimator
can just mirror the behavior of the estimator we passed in somehow.
Ah yeah, interesting one. Technically we only know at runtime if the DecayEstimator
is a regressor or a classifier. We could try to make the decay estimator both a classifier and a regressor, but I wonder if there are consequences to doing that.
I don't know if there's a pretty way to attach a mixin to an object at runtime though 😅
Maybe a pragmatic observation, when is this an issue in practice? I agree it's an inconsistency, but I wonder when a user would really notice?
That's true. I just noticed it when I wanted to put sample weights on the regressor, but not on the classifier.
I'm using it for a time series regression problem for some sales: the regressor should be especially accurate on newer data. The classifier should pick up days (often sundays) where no sales are made, but I wouldn't enforce any sample weights here. But maybe this asymmetric approach is also not good, not sure.
I tried to use DecayEstimator
with MultiOutputClassifier
and when I fit the model following error shows up:
AttributeError: 'DecayEstimator' object has no attribute 'classes_'
After investigation for a while, I saw that the check if self._is_classifier()
is not returning True. Likely due the issue presented here. The model is an ExtraTreesClassifier
, when I run [p.__name__ for p in type(decay.model).__bases__]
I get:
['ForestClassifier']
And not any "ClassifierMixin".
Hm we probably need to implement a classes_ attribute that patches through to the base estimator, right?
On Fri, 18 Jun 2021 at 19:35, Victor Oliveira @.***> wrote:
I tried to use DecayEstimator with MultiOutputClassifier and when I fit the model following error shows up:
AttributeError: 'DecayEstimator' object has no attribute 'classes_'
After investigation for a while, I saw that the check if self._is_classifier() is not returning True. Likely due the issue presented here. The model is an ExtraTreesClassifier, when I run [p.name for p in type(decay.model).bases] I get:
['ForestClassifier']
And not any "ClassifierMixin".
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/koaning/scikit-lego/issues/463#issuecomment-864184315, or unsubscribe https://github.com/notifications/unsubscribe-auth/AA2ZET4ZEAM6WJV3EMZCBC3TTN7XNANCNFSM43ZMAC4Q .
Not necessarily, I just used is_classifier
function from sklearn.base
, in this way:
def _is_classifier(self):
return is_classifier(self.model)
(Actually, we could use is_classifier function directly rs)
I can send the PR, but it is just a simple adjustment, not sure how to proceed.