Jérôme Dockès
This feature was also requested in https://github.com/skrub-data/skrub/issues/1127#issue-2615979710
The spline and circular encodings were added in #1235, and holidays are already tracked by #710, so I think we can close this one.
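For readers unfamiliar with the term, here is a minimal sketch of what a circular encoding of a periodic feature does. It uses only the standard library and is not the implementation from #1235; the function names `circular_encode` and `circular_distance` are made up for illustration.

```python
import math


def circular_encode(value, period):
    """Map a periodic value (e.g. an hour, with period 24) onto the unit circle."""
    angle = 2 * math.pi * value / period
    return math.sin(angle), math.cos(angle)


def circular_distance(a, b, period):
    """Euclidean distance between the circular encodings of two values."""
    return math.dist(circular_encode(a, period), circular_encode(b, period))


# As raw integers, hour 23 and hour 0 are 23 apart; once encoded they are neighbours.
near = circular_distance(23, 0, period=24)
far = circular_distance(23, 12, period=24)
```

The point of the encoding is that values at the two ends of a cycle end up close together, which a plain integer feature cannot express.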
With the upcoming "Recipe" (or "PipeBuilder" or whatever its name will be), it will be easy to apply a transformation to only some columns. For example you would be able...
A conditional transformer might still be useful when something more general than selecting columns is needed, such as "apply a PCA if there are more than 200 columns".
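To make the idea concrete, here is a rough sketch of such a conditional transformer. Nothing here is an existing skrub or scikit-learn class: `ConditionalTransformer`, `more_than_200_columns` and `KeepFirstTwoColumns` are hypothetical names, the wrapper only assumes the wrapped object has `fit` and `transform`, and the table is assumed to be a plain list of rows.

```python
class ConditionalTransformer:
    """Hypothetical wrapper: apply `transformer` only if `condition(X)` holds at fit time."""

    def __init__(self, transformer, condition):
        self.transformer = transformer
        self.condition = condition

    def fit(self, X, y=None):
        # Decide once, during fit, so that transform() behaves the same way later on.
        self.applied_ = bool(self.condition(X))
        if self.applied_:
            self.transformer.fit(X, y)
        return self

    def transform(self, X):
        # Pass the data through unchanged when the condition was not met.
        return self.transformer.transform(X) if self.applied_ else X

    def fit_transform(self, X, y=None):
        return self.fit(X, y).transform(X)


def more_than_200_columns(X):
    # Assumes X is a non-empty list of rows of equal length.
    return len(X[0]) > 200


class KeepFirstTwoColumns:
    """Toy stand-in for a dimensionality reducer such as a PCA."""

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return [row[:2] for row in X]


wide = [[0.0] * 201 for _ in range(3)]
narrow = [[0.0] * 5 for _ in range(3)]
reducer = ConditionalTransformer(KeepFirstTwoColumns(), more_than_200_columns)
wide_out = reducer.fit_transform(wide)      # condition met: columns are reduced
narrow_out = reducer.fit_transform(narrow)  # condition not met: returned unchanged
```

Making the decision in `fit` rather than in `transform` keeps the output shape stable between training and prediction, which is probably what one wants in a pipeline.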
However, if the important part is not really the name "date" but rather applying the spline transformer to datetime columns only, you might already be able to use the TableVectorizer's...
For selecting all datetime columns you could use the `skrub.selectors.any_date()` selector -- I just need to update the PipeBuilder branch with the current state of PR #902 and I'll show...
> Do we want to assume that the user ran their dataframe code or do we want our library to infer that on their behalf? I am partially asking because...
If you wanted to manually control your pipeline you could do something like:

```python
import pandas as pd
import numpy as np
from skrub import ToDatetime
from skrub import selectors...
```
`allow_reject` means "let the `ToDatetime` transformer decide whether it should be applied to the column or not (and reject the columns that don't look like dates)". By default it is `False`.
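The semantics of `allow_reject` can be illustrated with a small standard-library sketch. This is not skrub's code: `RejectColumn`, `to_datetime` and `apply_to_columns` are hypothetical names, the table is assumed to be a dict of lists, and only ISO-formatted strings are treated as dates.

```python
from datetime import datetime


class RejectColumn(Exception):
    """Hypothetical signal: the transformer does not want to handle this column."""


def to_datetime(values):
    """Parse ISO-formatted strings; reject the column if any value fails to parse."""
    try:
        return [datetime.fromisoformat(v) for v in values]
    except (TypeError, ValueError) as exc:
        raise RejectColumn("column does not look like dates") from exc


def apply_to_columns(table, transform, allow_reject=False):
    """Apply `transform` to every column of `table` (assumed to be a dict of lists)."""
    result = {}
    for name, values in table.items():
        try:
            result[name] = transform(values)
        except RejectColumn:
            if not allow_reject:
                raise  # strict mode: a rejected column is an error
            result[name] = values  # lenient mode: keep the column unchanged
    return result


table = {"when": ["2024-01-02", "2024-03-04"], "city": ["Paris", "Lyon"]}
converted = apply_to_columns(table, to_datetime, allow_reject=True)
```

With `allow_reject=True` the `city` column passes through untouched, while with the default `False` the same call raises because `city` cannot be parsed.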
But if you want something completely automatic, e.g. because you are running on many datasets that you don't inspect manually, then you're probably better off using the TableVectorizer and let...