auto-sklearn
ThirdPartyComponents.add_component check for explicit base class results in redundant inheritance
Auto-sklearn is awesome, I'm really enjoying this great library!
Currently, this line here https://github.com/automl/auto-sklearn/blob/5c69ddf4584c5c7c4977203a1a579d042c6e3716/autosklearn/pipeline/components/base.py#L46
explicitly checks the .__bases__ attribute of the 3rd party component you are adding to make sure it inherits directly from e.g. AutoSklearnPreprocessingAlgorithm, rather than checking if it's an instance of it with isinstance(obj, self.base_class).
This results in errors like in https://github.com/automl/auto-sklearn/issues/1268#issuecomment-944974324.
This makes it awkward to extend the built-in preprocessors. In particular, I'm going to try a handful of different data preprocessor strategies that share a lot of code, so it makes a lot of sense to have some inheritance structure for these. However, only the parent at the base of the hierarchy (e.g. MyAbstractCustomPreprocessingAlgorithm) will extend directly from AutoSklearnPreprocessingAlgorithm and contain all the common code. The concrete classes will all extend MyAbstractCustomPreprocessingAlgorithm, so AutoSklearnPreprocessingAlgorithm will not be in their .__bases__ attribute, even though they do implement the preprocessing interface.
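A minimal sketch of the situation, using empty stand-in classes rather than the real auto-sklearn ones (the stub AutoSklearnPreprocessingAlgorithm and the name MyConcretePreprocessor are assumptions for illustration). Because .__bases__ is an attribute of classes, the component is taken here to be a class, so the inheritance-aware builtin is issubclass:

```python
# Stand-in stub for autosklearn's AutoSklearnPreprocessingAlgorithm
# (assumption: the real class is not imported here).
class AutoSklearnPreprocessingAlgorithm:
    pass


# Shared parent that holds the common code and extends the base directly.
class MyAbstractCustomPreprocessingAlgorithm(AutoSklearnPreprocessingAlgorithm):
    pass


# Hypothetical concrete strategy that only extends the shared parent.
class MyConcretePreprocessor(MyAbstractCustomPreprocessingAlgorithm):
    pass


base = AutoSklearnPreprocessingAlgorithm

# __bases__ lists only the direct parents, so the base class is missing.
direct_parent_check = base in MyConcretePreprocessor.__bases__

# issubclass walks the whole inheritance chain.
subclass_check = issubclass(MyConcretePreprocessor, base)

print(direct_parent_check)  # False
print(subclass_check)       # True
```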
The workaround is to also subclass AutoSklearnPreprocessingAlgorithm directly in each of the concrete classes, but that shouldn't be necessary and feels kind of weird to do.
Is there a reason why we must check for an explicit base class?
Hi @AmirAlavi,
I have no idea why we have this; my guess is it's some relic of Python 2 code, but I wouldn't know for sure. I would rather do as you say and use isinstance(obj, cls). Thanks for pointing it out.
Best, Eddie
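For reference, the relaxed check discussed in this thread could look roughly like the sketch below. This is not auto-sklearn's actual implementation: ThirdPartyComponentsSketch and the Base/Mid/Leaf classes are hypothetical names, and the registry storage is an assumption. It uses issubclass rather than isinstance on the assumption that components are registered as classes, not instances.

```python
import inspect
from collections import OrderedDict


class ThirdPartyComponentsSketch:
    """Hypothetical registry that accepts any subclass of base_class."""

    def __init__(self, base_class):
        self.base_class = base_class
        self.components = OrderedDict()

    def add_component(self, obj):
        # Accept direct and indirect subclasses alike. Note that issubclass
        # also returns True for base_class itself.
        if not (inspect.isclass(obj) and issubclass(obj, self.base_class)):
            raise TypeError(
                "add_component works only with a subclass of %s"
                % str(self.base_class)
            )
        self.components[obj.__name__] = obj


# Hypothetical three-level hierarchy for demonstration.
class Base:
    pass


class Mid(Base):
    pass


class Leaf(Mid):
    pass


registry = ThirdPartyComponentsSketch(Base)
registry.add_component(Leaf)  # indirect subclass is accepted
print(list(registry.components))  # ['Leaf']
```

With a check like this, the concrete classes would no longer need to list the base class among their direct parents.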