Results 396 comments of Jérôme Dockès

One question I have is whether the TableVectorizer should output a dataframe or a numpy array (I guess the name indicates the former). If we consider the TableVectorizer is the...

completed in three PRs: #888, #895 and #902

Thank you for reporting this problem! I can reproduce it. The TableVectorizer is undergoing some refactoring and improvement in #848; we'll make sure to add a test to ensure...

here is a minimal reproducer, to add as a test:

```python
import pandas as pd
from skrub import TableVectorizer

df = pd.DataFrame(dict(a=pd.Series(['0', '1'], dtype='category')))
TableVectorizer().fit(df).transform(df)
```

thanks a lot for reporting this! We'll make sure to address it in #877

here is a reproducer, to be added to our test suite:

```python
import pandas as pd
from skrub import TableVectorizer
from sklearn.pipeline import make_pipeline

df = pd.DataFrame(dict(a=[1.1, 2.2]))
tv =...
```

@koaning part of the changes Gaël mentions is being worked on in [#877](https://github.com/skrub-data/skrub/pull/877). Feel free to provide feedback and advice!

shrinking towards the overall aggregate across groups sounds like a useful option to add

I guess it only applies for some of the aggregation operations
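
to make the idea concrete, here is a rough sketch of what shrinking a per-group mean towards the overall mean could look like. It is plain Python, not skrub code: the function name `shrunken_group_means` and the `strength` parameter are made up for illustration, and the weighting (treating `strength` as a number of pseudo-observations placed at the overall mean) is only one common choice, assumed here.

```python
from collections import defaultdict


def shrunken_group_means(keys, values, strength=1.0):
    """Per-group mean shrunk towards the overall mean.

    Hypothetical helper, for illustration only. `strength` acts like that
    many pseudo-observations located at the overall mean: 0 gives the plain
    group means, larger values pull small groups closer to the overall mean.
    """
    if len(keys) != len(values):
        raise ValueError("keys and values must have the same length")
    if not values:
        return {}
    overall_mean = sum(values) / len(values)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for key, value in zip(keys, values):
        sums[key] += value
        counts[key] += 1
    # Weighted average of the group mean and the overall mean.
    return {
        key: (sums[key] + strength * overall_mean) / (counts[key] + strength)
        for key in sums
    }


keys = ["a", "a", "a", "b"]
values = [1.0, 2.0, 3.0, 10.0]
result = shrunken_group_means(keys, values, strength=1.0)
plain = shrunken_group_means(keys, values, strength=0.0)
```

with these toy values the single-row group `"b"` moves much further towards the overall mean than the three-row group `"a"`. The formula is written for the mean; it is not obvious how it would carry over to operations such as min, max or mode, which fits the guess that it only applies to some of the aggregation operations.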