Feature Request: step_ function for matrix indexing
Feature
This came up last week during the tidymodels workshop, and Max suggested that I open an issue.
Sometimes a dataset contains a mix of qualitative character variables and dummy-encoded variables. If we need to homogenize the data, a step_ function that collapses the dummy columns back into a single qualitative column may be useful: something that uses tidyselect for variable selection and takes the name of the feature the dummy columns describe.
For example, going from this:
| species | arboreal | terrestrial |
|---|---|---|
| sp a | 0 | 1 |
| sp b | 1 | 0 |
| sp c | 1 | 0 |
to this:
| species | locomotion |
|---|---|
| sp a | terrestrial |
| sp b | arboreal |
| sp c | arboreal |
There are many ways to implement this. I have a silly write-up here, but a base approach would be better.
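For illustration, here is a minimal base R sketch of the transformation in the tables above. It is not an existing recipes function: `undummy()`, its arguments, and the `animals` data frame are hypothetical names made up for this example, and it assumes every row has exactly one indicator set to 1.

```r
# Hypothetical helper (not part of recipes): collapse 0/1 indicator
# columns into a single factor column.
undummy <- function(data, cols, name) {
  ind <- as.matrix(data[, cols, drop = FALSE])
  # Assumption: each row has exactly one indicator equal to 1
  stopifnot(all(rowSums(ind == 1) == 1))
  # max.col() gives, for each row, the index of the column holding the maximum
  level_idx <- max.col(ind, ties.method = "first")
  # Drop the indicator columns and append the recovered factor
  out <- data[, setdiff(names(data), cols), drop = FALSE]
  out[[name]] <- factor(cols[level_idx], levels = cols)
  out
}

animals <- data.frame(
  species     = c("sp a", "sp b", "sp c"),
  arboreal    = c(0, 1, 1),
  terrestrial = c(1, 0, 0)
)

result <- undummy(animals, c("arboreal", "terrestrial"), "locomotion")
print(result)
```

A real step_ function would presumably take a tidyselect expression in place of the character vector `cols` and would need to decide how to handle rows with no indicator or several indicators set.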
- If this already exists and I missed it because I'm unfamiliar with ML terms, please disregard.
I like it. It is basically a reverse step_dummy().