scikit-lego
[BUG] 'EstimatorTransformer' object has no attribute 'get_feature_names_out'
When calling `get_feature_names_out` on `EstimatorTransformer`, or on a `Pipeline` that contains an `EstimatorTransformer`, you get the following error:
AttributeError: 'EstimatorTransformer' object has no attribute 'get_feature_names_out'
Minimal reproducible example:
from sklego.meta import EstimatorTransformer
from sklearn.linear_model import LinearRegression
EstimatorTransformer(LinearRegression()).get_feature_names_out(None)
I thought this issue was resolved in scikit-learn >= 1.1 (see the "get_feature_names_out Available in all Transformers" release highlight), but apparently a manual implementation of `get_feature_names_out` is still needed for custom scikit-learn transformers.
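To illustrate the point about custom transformers without depending on scikit-lego, here is a minimal sketch that uses only scikit-learn. `Doubler` is a hypothetical transformer invented for this example. It inherits from `TransformerMixin` and `BaseEstimator` but defines no `get_feature_names_out`, so a `Pipeline` that contains it is expected to raise an `AttributeError` as well:

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


class Doubler(TransformerMixin, BaseEstimator):
    """Hypothetical custom transformer that defines no get_feature_names_out."""

    def fit(self, X, y=None):
        # Record a fitted attribute so the transformer counts as fitted.
        self.n_features_in_ = np.asarray(X).shape[1]
        return self

    def transform(self, X):
        return np.asarray(X) * 2


X = np.array([[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]])
pipe = Pipeline([("scale", StandardScaler()), ("double", Doubler())]).fit(X)

# The mixins alone do not provide the method.
has_method = hasattr(Doubler(), "get_feature_names_out")

# The pipeline cannot produce output names if one of its steps lacks the method.
try:
    pipe.get_feature_names_out()
    raised = False
except AttributeError:
    raised = True

print(f"has_method={has_method}, raised={raised}")
```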
Proposed solution sketch:
from sklearn.utils.validation import check_is_fitted

class EstimatorTransformer(TransformerMixin, MetaEstimatorMixin, BaseEstimator):
    ...

    def fit(self, X, y, **kwargs):
        ...
        # Store how many output columns the estimator has
        self.output_len_ = y.shape[1] if self.multi_output_ else 1
        ...

    def get_feature_names_out(self, feature_names_out=None) -> list:
        """
        Get names for output of EstimatorTransformer.
        Estimator must be fitted first before this function can be called.
        """
        check_is_fitted(self.estimator_)
        if self.multi_output_:
            feature_names = [f"prediction_{i}" for i in range(self.output_len_)]
        else:
            feature_names = ["prediction"]
        return feature_names
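
For reference, the idea above can be tried end to end with a stripped-down stand-in. This is only a sketch under assumptions: `PredictionTransformer` is a hypothetical class written for this example, not the sklego implementation. It derives `multi_output_` from the shape of `y`, and it names the argument `input_features` (the name scikit-learn's own transformers use) rather than `feature_names_out`:

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin, clone
from sklearn.linear_model import LinearRegression
from sklearn.utils.validation import check_is_fitted


class PredictionTransformer(TransformerMixin, BaseEstimator):
    """Hypothetical stand-in for EstimatorTransformer (not sklego's code)."""

    def __init__(self, estimator):
        self.estimator = estimator

    def fit(self, X, y):
        y = np.asarray(y)
        # Assumption: a 2-D target means a multi-output estimator.
        self.multi_output_ = y.ndim > 1
        # Store how many output columns the estimator has.
        self.output_len_ = y.shape[1] if self.multi_output_ else 1
        self.estimator_ = clone(self.estimator).fit(X, y)
        return self

    def transform(self, X):
        check_is_fitted(self, "estimator_")
        pred = self.estimator_.predict(X)
        # Always return a 2-D array so the output has explicit columns.
        return pred if self.multi_output_ else pred.reshape(-1, 1)

    def get_feature_names_out(self, input_features=None):
        check_is_fitted(self, "estimator_")
        if self.multi_output_:
            names = [f"prediction_{i}" for i in range(self.output_len_)]
        else:
            names = ["prediction"]
        return np.array(names, dtype=object)


X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([[0.0, 1.0], [1.0, 3.0], [2.0, 5.0], [3.0, 7.0]])

single = PredictionTransformer(LinearRegression()).fit(X, y[:, 0])
multi = PredictionTransformer(LinearRegression()).fit(X, y)

names_single = list(single.get_feature_names_out())
names_multi = list(multi.get_feature_names_out())
print(names_single, names_multi)
```

Returning a NumPy array of strings instead of a plain list follows what scikit-learn's built-in transformers do, which may matter once the transformer sits inside a `Pipeline` or `ColumnTransformer`.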
Happy to contribute this if you agree with the proposed solution. If this is a general problem, I'm also open to implementing `get_feature_names_out` for other transformers in scikit-lego.
Minor ask: you can attach a language to a code-block to get syntax highlighting. Like so:
```python
import pandas as pd
```
That said, hmm... I'm wondering which other meta estimators have the same issue. @CarloLepelaars I had a quick look at `VotingClassifier` in sklearn, and it seems that even in sklearn not every meta estimator has `get_feature_names_out` implemented.
I'm also curious if scikit-learn has tests for this behavior that we can copy. @CarloLepelaars did you check the sklearn repo for this by any chance?
> Minor ask: you can attach a language to a code-block to get syntax highlighting.
Makes sense! Added syntax highlighting in comment above.
🤔 Interesting! Seems odd that it is implemented for `VotingClassifier`, but not for other estimators in the `ensemble` module like `BaggingClassifier`.
Here is an example of a `get_feature_names_out` test case for LDA in sklearn:
https://github.com/scikit-learn/scikit-learn/blob/5bd81234e6e6501ddcddbfdfdc80b90a1302af55/sklearn/tests/test_discriminant_analysis.py#L659
@koaning, shall I go ahead and implement this for `EstimatorTransformer`? After that we can evaluate whether an implementation is needed for other meta estimators in sklego. I'm sure the implementation for `EstimatorTransformer` will give insight into the need for `get_feature_names_out` in other meta estimators.
Yes please!