recipes
Feature: Extracting names of variable input and output
As the title says. I think it would be helpful to have some helper functions that would extract what variables got selected by a step and which variables were returned.
In the following example, step_dummy() takes cyl and returns cyl_X6 and cyl_X8, and step_zv() takes mpg, disp, hp, drat, wt, qsec, vs, am, gear, carb, zv, cyl_X6, cyl_X8 and returns mpg, disp, hp, drat, wt, qsec, vs, am, gear, carb, cyl_X6, cyl_X8.
I feel like this information should be in the hands of the user, not just shown when the recipe is printed, and having it available should help with debugging and understanding.
library(recipes)
mtcars$zv <- 0
mtcars$cyl <- as.character(mtcars$cyl)
rec_spec <- recipe(~., data = mtcars) |>
  step_dummy(all_nominal_predictors()) |>
  step_zv(all_predictors())
prep(rec_spec)
#>
#> ── Recipe ──────────────────────────────────────────────────────────────────────
#>
#> ── Inputs
#> Number of variables by role
#> predictor: 12
#>
#> ── Training information
#> Training data contained 32 data points and no incomplete rows.
#>
#> ── Operations
#> • Dummy variables from: cyl | Trained
#> • Zero variance filter removed: zv | Trained
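For illustration, here is a rough workaround sketch using only existing functions (recipe(), prep(), bake()). It builds the recipe one step at a time and diffs the column names before and after each step. The helpers cols_after() and step_io() are hypothetical names made up for this sketch and are not part of recipes. Note that this only shows which columns a step added or removed, not everything the step's selector matched (step_zv() selects all predictors but only drops zv).

```r
library(recipes)

# Work on a copy so the built-in mtcars is left alone
dat <- mtcars
dat$zv <- 0
dat$cyl <- as.character(dat$cyl)

# Build the recipe incrementally so each intermediate state can be inspected
rec_dummy <- recipe(~., data = dat) |>
  step_dummy(all_nominal_predictors())
rec_zv <- rec_dummy |>
  step_zv(all_predictors())

# Hypothetical helper: column names present once `rec` has been prepped
cols_after <- function(rec) {
  names(bake(prep(rec), new_data = NULL))
}

# Hypothetical helper: columns a step removed and added, given before/after names
step_io <- function(before, after) {
  list(
    removed = setdiff(before, after),
    added = setdiff(after, before)
  )
}

cols_start <- names(dat)
cols_dummy <- cols_after(rec_dummy)
cols_zv <- cols_after(rec_zv)

dummy_io <- step_io(cols_start, cols_dummy)
zv_io <- step_io(cols_dummy, cols_zv)

dummy_io  # columns consumed and created by step_dummy()
zv_io     # columns dropped by step_zv()
```

Re-prepping the recipe once per step is wasteful for long recipes, which is part of why built-in accessors would be preferable to this kind of diffing.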
scikit-learn exposes this through the feature_names_in_ attribute and the get_feature_names_out() method.
Closing in favor of https://github.com/tidymodels/recipes/issues/1158