recipes

Feature Request: step_ function for matrix indexing

Open luisDVA opened this issue 3 years ago • 1 comment

Feature

This came up last week during the tidymodels workshop, and Max suggested that I open an issue.

Sometimes a dataset contains a mix of qualitative character variables and dummy-encoded variables. If we need to homogenize the data, a step_ function that collapses the dummy columns back into a single qualitative variable may be useful. It would use tidyselect for variable selection and take the name of the feature being described.

For example, going from this:

species   arboreal   terrestrial
sp a      0          1
sp b      1          0
sp c      1          0

to this:

species   locomotion
sp a      terrestrial
sp b      arboreal
sp c      arboreal

There are many ways to implement this. I have a silly write-up here, but a base approach would be better.
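As a rough illustration of what a base approach could look like, here is a minimal sketch in base R. The helper name `undummy()` and its arguments are hypothetical and not part of recipes, and the sketch assumes every row has exactly one indicator column set to 1.

```r
# Toy data matching the example above
df <- data.frame(
  species = c("sp a", "sp b", "sp c"),
  arboreal = c(0, 1, 1),
  terrestrial = c(1, 0, 0)
)

# Hypothetical helper (not a recipes function): collapse the indicator
# columns `cols` into one character column called `name`.
undummy <- function(data, cols, name) {
  indicators <- as.matrix(data[cols])
  # Assumption: exactly one indicator equals 1 in each row
  stopifnot(all(rowSums(indicators == 1) == 1))
  # max.col() returns, per row, the position of the largest value,
  # which is the column holding the 1
  data[[name]] <- cols[max.col(indicators, ties.method = "first")]
  # Drop the original indicator columns
  data[setdiff(names(data), cols)]
}

result <- undummy(df, c("arboreal", "terrestrial"), "locomotion")
print(result)
```

A real step_ function would also need to decide what to do with rows that have no indicator set, or more than one; this sketch simply stops with an error in those cases.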

  • If this already exists and I missed it because of my unfamiliarity with ML terms, please disregard.

luisDVA, Aug 01 '22 21:08

I like it. It is basically a reverse step_dummy().

EmilHvitfeldt, Aug 03 '22 09:08