Riccardo Cappuzzo
As reported in #1424, a lot of docstrings have extremely long preambles that make it hard to understand what the function/object is doing. This PR addresses that issue by shortening...
At the moment, `MinHashEncoder`, `GapEncoder`, `StringEncoder`, `TextEncoder`, and `DatetimeEncoder` all have the note on single-column transformers at the very top of the docstring. I...
Currently, `skrub.selectors.Filter`, `skrub.selectors.NameFilter`, and `skrub.selectors.Selector` are public and appear in the documentation with an empty docstring. This was likely unintended, so they should be hidden from the docs.
### Describe the issue linked to the documentation

I am working on an example with some of the datasets provided in skrub.datasets, and for each of them I need to...
Adding an example of how to perform timeseries forecasting with lagged features using expressions
First version, needs editing cleanup and double checking the section on lagged features.
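For readers unfamiliar with lagged features, here is a minimal sketch of the idea in plain Python. It does not use skrub expressions, and it is not taken from the example in the PR: the helper `add_lagged_features` and the column names `y` and `y_lag_k` are hypothetical, chosen only for illustration.

```python
def add_lagged_features(values, lags):
    """Build one row per time step holding the current value and its lags.

    Time steps without enough history get None, mirroring the missing
    values that shifting a dataframe column would introduce.
    """
    rows = []
    for t, value in enumerate(values):
        row = {"y": value}
        for lag in lags:
            # The lag-k feature at time t is the value observed at time t - k.
            row[f"y_lag_{lag}"] = values[t - lag] if t - lag >= 0 else None
        rows.append(row)
    return rows


series = [10, 12, 15, 11, 14]
rows = add_lagged_features(series, lags=[1, 2])
for row in rows:
    print(row)
```

A forecasting model would then be trained to predict `y` from the lagged columns, usually after dropping the first rows where the lags are missing.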
At the moment, each dataset has its own version of the documentation. Some have info about the dataset, some have a description of the Bunch object. In general, it's messy...
Right now, the `ToDatetime` transformer tries to convert strings to datetimes by either guessing the format using pandas' timeseries parsing library, or it uses a format provided by the user....
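The two parsing modes described above (guess the format, or use one supplied by the user) can be sketched with the standard library alone. This is an illustration of the general idea, not the `ToDatetime` implementation: the function `parse_datetimes`, its `fmt` parameter, and the list of candidate formats are all assumptions made for this sketch.

```python
from datetime import datetime

# Hypothetical candidate formats tried in order when none is given.
CANDIDATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%Y-%m-%d %H:%M:%S")


def parse_datetimes(strings, fmt=None):
    """Parse all strings with a single format.

    If fmt is given, only that format is used. Otherwise each candidate
    format is tried in turn, and the first one that parses every string wins.
    """
    formats = (fmt,) if fmt is not None else CANDIDATE_FORMATS
    for candidate in formats:
        try:
            return [datetime.strptime(s, candidate) for s in strings]
        except ValueError:
            # At least one string does not match; try the next format.
            continue
    raise ValueError("no format matched all of the input strings")


guessed = parse_datetimes(["2024-01-31", "2024-02-01"])
explicit = parse_datetimes(["31/01/2024"], fmt="%d/%m/%Y")
```

Requiring one format to match every string keeps a column from being parsed with mixed conventions, at the cost of failing outright on heterogeneous input.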
Tests in `test_table_vectorizer.py` all use pandas-only example dataframes, rather than using `df_module` and testing pandas, pandas nullable types, and polars. We should update all the tests to address...
I think the skrub.datasets utilities for loading and returning datasets could be improved a bit for a better user experience: - skrub supports both pandas and polars, but datasets...