
ThirdPartyComponents.add_component check for explicit base class results in redundant inheritance

Open AmirAlavi opened this issue 3 years ago • 1 comments

Auto-sklearn is awesome, I'm really enjoying this great library!

Currently, this line here https://github.com/automl/auto-sklearn/blob/5c69ddf4584c5c7c4977203a1a579d042c6e3716/autosklearn/pipeline/components/base.py#L46

explicitly checks the .__bases__ attribute of the 3rd party component you are adding to make sure it inherits directly from e.g. AutoSklearnPreprocessingAlgorithm, rather than checking if it's an instance of it with isinstance(obj, self.base_class).

This results in errors like in https://github.com/automl/auto-sklearn/issues/1268#issuecomment-944974324.

This makes it awkward to extend the built-in preprocessors. In particular, I'm going to try a handful of different data preprocessing strategies that share a lot of code, so it makes sense to give them an inheritance structure. However, only the parent at the base of the hierarchy (e.g. MyAbstractCustomPreprocessingAlgorithm) will extend AutoSklearnPreprocessingAlgorithm directly and contain all the common code. The concrete classes will all extend MyAbstractCustomPreprocessingAlgorithm, so AutoSklearnPreprocessingAlgorithm will not be in their .__bases__ attribute, even though they do implement the preprocessing interface.
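The difference between the two checks can be reproduced without installing auto-sklearn. The sketch below uses empty stand-in classes: the names AutoSklearnPreprocessingAlgorithm and MyAbstractCustomPreprocessingAlgorithm mirror the ones in the text, while ConcretePreprocessor and the two helper functions are hypothetical and are not auto-sklearn code. Because the component being registered is a class rather than an instance, the sketch assumes `issubclass` as the class-level counterpart of the `isinstance` check mentioned above.

```python
import inspect


class AutoSklearnPreprocessingAlgorithm:
    """Stand-in for the auto-sklearn base class (not the real one)."""


class MyAbstractCustomPreprocessingAlgorithm(AutoSklearnPreprocessingAlgorithm):
    """Would hold the code shared by a family of custom preprocessors."""


class ConcretePreprocessor(MyAbstractCustomPreprocessingAlgorithm):
    """Hypothetical concrete preprocessor, two levels below the base."""


def passes_bases_check(obj, base_class):
    # Approximates the check described in the issue: only direct parents count.
    return inspect.isclass(obj) and base_class in obj.__bases__


def passes_subclass_check(obj, base_class):
    # Accepts any descendant of base_class, however deep in the hierarchy.
    return inspect.isclass(obj) and issubclass(obj, base_class)


base = AutoSklearnPreprocessingAlgorithm
direct_ok = passes_bases_check(MyAbstractCustomPreprocessingAlgorithm, base)
indirect_bases = passes_bases_check(ConcretePreprocessor, base)
indirect_subclass = passes_subclass_check(ConcretePreprocessor, base)

print(direct_ok)          # True: the base class is a direct parent
print(indirect_bases)     # False: the base class is only a grandparent
print(indirect_subclass)  # True: issubclass follows the whole hierarchy
```

One design point to keep in mind: `issubclass(base, base)` is also true, so a check written this way would accept the base class itself unless that case is excluded separately.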

The workaround is to also list AutoSklearnPreprocessingAlgorithm as a direct base class of every concrete class, but that shouldn't be necessary and feels kind of weird to do.
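For reference, a minimal sketch of what that redundant inheritance looks like, again with stand-in classes rather than the real auto-sklearn ones (ConcreteWithWorkaround, WrongOrder and shared_helper are hypothetical names). It also shows one wrinkle worth knowing: the extra base has to be listed after the intermediate class, otherwise Python cannot build a consistent method resolution order.

```python
class AutoSklearnPreprocessingAlgorithm:
    """Stand-in for the auto-sklearn base class (not the real one)."""


class MyAbstractCustomPreprocessingAlgorithm(AutoSklearnPreprocessingAlgorithm):
    """Intermediate class holding the shared code."""

    def shared_helper(self):
        return "shared"


# Redundant but accepted by a __bases__ check: the base class is repeated
# as a direct parent, listed AFTER the more specific intermediate class.
class ConcreteWithWorkaround(
    MyAbstractCustomPreprocessingAlgorithm, AutoSklearnPreprocessingAlgorithm
):
    pass


in_bases = AutoSklearnPreprocessingAlgorithm in ConcreteWithWorkaround.__bases__
print(in_bases)  # True

# Listing the base class first is rejected when the class is created,
# because a parent would have to precede its own subclass in the MRO.
mro_error = None
try:
    class WrongOrder(
        AutoSklearnPreprocessingAlgorithm, MyAbstractCustomPreprocessingAlgorithm
    ):
        pass
except TypeError as exc:
    mro_error = exc

print(type(mro_error).__name__)  # TypeError
```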

Is there a reason why we must check for an explicit base class?

AmirAlavi avatar Nov 06 '22 21:11 AmirAlavi

Hi @AmirAlavi,

I have no idea why we have this. My guess is that it's some relic of Python 2 code, but I wouldn't know for sure. I would rather do as you say and use isinstance(obj, cls), thanks for pointing it out.

Best, Eddie

eddiebergman avatar Nov 07 '22 08:11 eddiebergman