
Feature: Extracting names of variable input and output

Open EmilHvitfeldt opened this issue 2 years ago • 1 comments

As the title says. I think it would be helpful to have some helper functions that would extract what variables got selected by a step and which variables were returned.

In the following example, step_dummy() takes cyl and returns cyl_X6 and cyl_X8, and step_zv() takes mpg, disp, hp, drat, wt, qsec, vs, am, gear, carb, zv, cyl_X6, cyl_X8 and returns mpg, disp, hp, drat, wt, qsec, vs, am, gear, carb, cyl_X6, cyl_X8.

I feel like this information should be available to the user programmatically, not only in the printed output, and it should help with debugging and understanding.

library(recipes)

mtcars$zv <- 0
mtcars$cyl <- as.character(mtcars$cyl)

rec_spec <- recipe(~., data = mtcars) |>
  step_dummy(all_nominal_predictors()) |>
  step_zv(all_predictors())

prep(rec_spec)
#> 
#> ── Recipe ──────────────────────────────────────────────────────────────────────
#> 
#> ── Inputs
#> Number of variables by role
#> predictor: 12
#> 
#> ── Training information
#> Training data contained 32 data points and no incomplete rows.
#> 
#> ── Operations
#> • Dummy variables from: cyl | Trained
#> • Zero variance filter removed: zv | Trained
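
Until such helpers exist, one possible workaround is to prep the recipe one step at a time and compare the column names before and after each step. The sketch below is only an illustration, not part of the recipes API: `baked_names()` is a hypothetical helper, and `cars`, `rec_0`, `rec_1`, and `rec_2` are names introduced here. It also only detects columns a step added or removed; it cannot tell which unchanged columns a step selected (for example, everything `step_zv()` inspected but kept).

```r
library(recipes)

# Same data setup as the example above, on a copy so mtcars itself is untouched
cars <- mtcars
cars$zv <- 0
cars$cyl <- as.character(cars$cyl)

# Build the recipe one step at a time so each intermediate state can be prepped
rec_0 <- recipe(~., data = cars)
rec_1 <- rec_0 |> step_dummy(all_nominal_predictors())
rec_2 <- rec_1 |> step_zv(all_predictors())

# Hypothetical helper: column names of the training data after applying a recipe
baked_names <- function(rec) {
  names(bake(prep(rec), new_data = NULL))
}

names_0 <- baked_names(rec_0)
names_1 <- baked_names(rec_1)
names_2 <- baked_names(rec_2)

# Columns consumed by step_dummy() and the columns it created
dummy_removed <- setdiff(names_0, names_1)
dummy_added <- setdiff(names_1, names_0)

# Columns dropped by step_zv()
zv_removed <- setdiff(names_1, names_2)

print(list(
  dummy_removed = dummy_removed,
  dummy_added = dummy_added,
  zv_removed = zv_removed
))
```

Prepping the recipe once per step is wasteful for long recipes, which is part of why a built-in accessor would be preferable to this kind of diffing.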

EmilHvitfeldt avatar Apr 28 '23 16:04 EmilHvitfeldt

scikit-learn uses feature_names_in_ and get_feature_names_out() for this.

EmilHvitfeldt avatar May 15 '23 22:05 EmilHvitfeldt

closing in favor of https://github.com/tidymodels/recipes/issues/1158

EmilHvitfeldt avatar May 26 '24 03:05 EmilHvitfeldt

This issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex https://reprex.tidyverse.org) and link to this issue.

github-actions[bot] avatar Jun 10 '24 00:06 github-actions[bot]