Jérôme Dockès
One question I have is whether the TableVectorizer should output a dataframe or a numpy array (I guess the name indicates the former). If we consider the TableVectorizer is the...
Completed in three PRs: #888, #895 and #902.
Thank you for reporting this problem! I can reproduce it. The TableVectorizer is undergoing some refactoring and improvement in #848; we'll make sure to add a test to ensure...
Here is a minimal reproducer, to add as a test:

```python
import pandas as pd
from skrub import TableVectorizer

df = pd.DataFrame(dict(a=pd.Series(['0', '1'], dtype='category')))
TableVectorizer().fit(df).transform(df)
```

Thanks a lot for reporting this! We'll make sure to address it in #877.
here is a reproducer, to be added to our test suite: ```python import pandas as pd from skrub import TableVectorizer from sklearn.pipeline import make_pipeline df = pd.DataFrame(dict(a=[1.1, 2.2])) tv =...
@koaning part of the changes Gaël mentions are being worked on in [#877](https://github.com/skrub-data/skrub/pull/877). Feel free to provide feedback and advice!
Shrinking towards the overall aggregate across groups sounds like a useful option to add.
I guess it only applies to some of the aggregation operations.
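To make the idea concrete, here is a rough sketch of one common form of shrinkage, applied to the mean. This is not skrub's API: the function name `shrunk_group_means` and the `strength` pseudo-count are made up for illustration, and the weighting formula is an assumption about what such an option could look like.

```python
from collections import defaultdict


def shrunk_group_means(keys, values, strength=1.0):
    """Per-group means shrunk towards the overall mean.

    `strength` is a hypothetical pseudo-count: each group is treated as if
    it contained `strength` extra observations equal to the overall mean.
    With strength=0 this reduces to the plain per-group mean.
    """
    if len(values) == 0:
        raise ValueError("need at least one value")
    overall = sum(values) / len(values)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for key, value in zip(keys, values):
        sums[key] += value
        counts[key] += 1
    # Weighted average of the group mean and the overall mean.
    return {
        key: (sums[key] + strength * overall) / (counts[key] + strength)
        for key in sums
    }


keys = ["a", "a", "a", "b"]
values = [1.0, 2.0, 3.0, 10.0]
means = shrunk_group_means(keys, values, strength=1.0)
```

Group `b` has a single observation, so it is pulled further towards the overall mean than group `a`. A weighting like this makes sense for the mean; it is less obvious what the equivalent would be for operations such as `min`, `max` or `mode`.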