Jérémie du Boisberranger

119 comments by Jérémie du Boisberranger

When making a transformer preserve float32, it's important to check that it does not introduce a performance regression. Currently, transformers that don't preserve float32 convert the data to float64 as a first step...
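A minimal NumPy sketch of the distinction (the two functions are hypothetical stand-ins, not scikit-learn code): one transform upcasts to float64 up front and therefore loses the input dtype, the other computes in the input dtype and preserves float32.

```python
import numpy as np

def scale_upcasting(x):
    # Converting to float64 as a first step discards the input dtype:
    # float32 input comes back as float64.
    x = np.asarray(x, dtype=np.float64)
    return (x - x.mean()) / x.std()

def scale_preserving(x):
    # Computing directly in the input dtype keeps float32 as float32,
    # halving memory traffic for large arrays.
    x = np.asarray(x)
    return (x - x.mean()) / x.std()

x32 = np.array([1.0, 2.0, 3.0], dtype=np.float32)
assert scale_upcasting(x32).dtype == np.float64
assert scale_preserving(x32).dtype == np.float32
```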

~~@svenstehle Thanks for looking into that. Would you like to open a PR?~~ EDIT: there's nothing to actually do besides removing it from the checklist. Thanks

@betatim not yet: this test only runs on float64 by default. To trigger the test on both float64 and float32, you need to set the `"preserve_dtype": [np.float64, np.float32]` estimator tag.
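Independently of the tag machinery, what such a parametrized check boils down to can be sketched in plain NumPy (the `transform` function here is a hypothetical stand-in for an estimator's transform):

```python
import numpy as np

def transform(x):
    # Toy transform that works in the input dtype.
    return (x - x.mean()) / x.std()

# Run the same check on both dtypes, as the tag would trigger:
for dtype in (np.float64, np.float32):
    x = np.array([1.0, 2.0, 3.0, 4.0], dtype=dtype)
    out = transform(x)
    assert out.dtype == dtype
```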

Thanks for the PR @diederikwp. Mostly looks good; however, it still does not work if we don't pass the feature names to `get_feature_names_out`:

```py
import pandas as pd
from sklearn.pipeline...
```

https://github.com/scikit-learn/scikit-learn/pull/23993 has been merged, which automatically fixes the issue reported in https://github.com/scikit-learn/scikit-learn/pull/24058#issuecomment-1238576406. I just added a test for that. @thomasjpfan is it still good for you?

Thanks for the PR @Harsh14901. If I'm not mistaken, ``train_count = n_samples - validation_mask.sum()``. It would be more efficient to compute it once before entering the outer loop.
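The suggestion in a self-contained sketch (the mask setup is invented for illustration; only the names `n_samples`, `validation_mask`, and `train_count` come from the comment):

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 10
validation_mask = rng.random(n_samples) < 0.3  # True marks validation rows

# Hoisted: compute once before the outer loop instead of recounting
# the training samples on every iteration.
train_count = n_samples - validation_mask.sum()

# Sanity check: counting the False entries gives the same number.
assert train_count == (~validation_mask).sum()
```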

Thanks for the PR @IvanLauLinTiong! Looks good. I just removed one of the tests because it will be covered by a common test.

~~Maybe we could make a "safe variance" function by using the fact that ``V(x) = xmax**2 * V(x / xmax)``?~~ EDIT: never mind, it will probably not be more stable...
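The identity itself is easy to verify numerically (this only demonstrates the algebra; as the edit notes, rescaling does not by itself remove the cancellation problem):

```python
import numpy as np

x = np.array([3.0, 5.0, 9.0, 11.0])
xmax = np.abs(x).max()

# V(x) = xmax**2 * V(x / xmax): exact in real arithmetic; floating
# point only changes which rounding errors you pay.
assert np.isclose(np.var(x), xmax**2 * np.var(x / xmax))
```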

We already observed numerical instabilities when reviewing https://github.com/scikit-learn/scikit-learn/pull/22806. The following was added to try to mitigate them. It looks like it was not good enough, or does it come from another...
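The classic instability with variance, sketched in float32 (a textbook catastrophic-cancellation example, not necessarily the exact failure mode in that PR): the naive one-pass formula subtracts two nearly equal large terms, while the two-pass formula subtracts the mean first.

```python
import numpy as np

x = np.array([10000.0, 10001.0, 10002.0], dtype=np.float32)

# Naive "E[x^2] - E[x]^2": both terms are ~1e8, so their difference
# of 2/3 is lost entirely in float32 precision.
naive = (x * x).mean() - x.mean() ** 2

# Two-pass formula: center first, then square; the small deviations
# -1, 0, 1 are represented exactly.
two_pass = ((x - x.mean()) ** 2).mean()

assert abs(two_pass - 2.0 / 3.0) < 1e-6   # accurate
assert abs(naive - 2.0 / 3.0) > 0.5       # badly wrong here
```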

> Could we flip the x- and y-axes such that predicted values are on the x-axis and observed values on the y-axis? This way, the x-axis would be the same as...
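A minimal matplotlib sketch of that orientation (the data is invented; this is not the plotting code from the PR):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for scripting
import matplotlib.pyplot as plt
import numpy as np

y_observed = np.array([1.0, 2.0, 3.0])
y_predicted = np.array([1.1, 1.9, 3.2])

fig, ax = plt.subplots()
# Predicted on the x-axis, observed on the y-axis, as suggested.
ax.scatter(y_predicted, y_observed)
ax.plot([0, 4], [0, 4], linestyle="--")  # perfect-prediction diagonal
ax.set_xlabel("Predicted values")
ax.set_ylabel("Observed values")

# The scatter's x-coordinates are indeed the predictions.
offsets = ax.collections[0].get_offsets()
assert np.allclose(offsets[:, 0], y_predicted)
```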