scikit-lego icon indicating copy to clipboard operation
scikit-lego copied to clipboard

[BUG] DecayEstimator should be either a classifier or regressor

Open Garve opened this issue 3 years ago • 6 comments

Hi again!

When using the DecayEstimator inside the ZeroInflatedRegressor an error occurred, because DecayEstimator is neither a regressor nor a classifier.

from sklego.meta import DecayEstimator
from sklearn.base import is_regressor, is_classifier
from sklearn.linear_model import LogisticRegression, LinearRegression

is_regressor(DecayEstimator(LinearRegression())), is_classifier(DecayEstimator(LogisticRegression()))

# (False, False)

Maybe the DecayEstimator can just mirror the behavior of the estimator we passed in somehow.

Garve avatar Apr 29 '21 08:04 Garve

Ah yeah, interesting one. Technically we only know at runtime if the DecayEstimator is a regressor or a classifier. We could try to make the decay estimator both a classifier and a regressor, but I wonder if there are consequences to doing that.

I don't know if there's a pretty way to attach a mixin to an object at runtime though 😅

koaning avatar Apr 29 '21 08:04 koaning

Maybe a pragmatic observation, when is this an issue in practice? I agree it's an inconsistency, but I wonder when a user would really notice?

koaning avatar Apr 29 '21 08:04 koaning

That's true. I just noticed it when I wanted to put sample weights on the regressor, but not on the classifier.

I'm using it for a time series regression problem for some sales: the regressor should be especially accurate on newer data. The classifier should pick up days (often sundays) where no sales are made, but I wouldn't enforce any sample weights here. But maybe this asymmetric approach is also not good, not sure.

Garve avatar Apr 29 '21 08:04 Garve

I tried to use DecayEstimator with MultiOutputClassifier and when I fit the model following error shows up:

AttributeError: 'DecayEstimator' object has no attribute 'classes_'

After investigation for a while, I saw that the check if self._is_classifier() is not returning True. Likely due the issue presented here. The model is an ExtraTreesClassifier, when I run [p.__name__ for p in type(decay.model).__bases__] I get:

['ForestClassifier']

And not any "ClassifierMixin".

vtoliveira avatar Jun 18 '21 17:06 vtoliveira

Hm we probably need to implement a classes_ attribute that patches through to the base estimator, right?

On Fri, 18 Jun 2021 at 19:35, Victor Oliveira @.***> wrote:

I tried to use DecayEstimator with MultiOutputClassifier and when I fit the model following error shows up:

AttributeError: 'DecayEstimator' object has no attribute 'classes_'

After investigation for a while, I saw that the check if self._is_classifier() is not returning True. Likely due the issue presented here. The model is an ExtraTreesClassifier, when I run [p.name for p in type(decay.model).bases] I get:

['ForestClassifier']

And not any "ClassifierMixin".

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/koaning/scikit-lego/issues/463#issuecomment-864184315, or unsubscribe https://github.com/notifications/unsubscribe-auth/AA2ZET4ZEAM6WJV3EMZCBC3TTN7XNANCNFSM43ZMAC4Q .

MBrouns avatar Jun 18 '21 17:06 MBrouns

Not necessarily, I just used is_classifier function from sklearn.base, in this way:

def _is_classifier(self):
    return is_classifier(self.model)

(Actually, we could use is_classifier function directly rs)

I can send the PR, but it is just a simple adjustment, not sure how to proceed.

vtoliveira avatar Jun 18 '21 17:06 vtoliveira