recipes

Pipeable steps for feature engineering and data preprocessing to prepare for modeling

Results 124 recipes issues

## Feature This came up last week during the tidymodels workshop and Max suggested that I open an issue. Sometimes a dataset contains a mix of qualitative character variables and...

new steps
feature

For steps that create derived variables, such as from a date, the new variables are prefixed with the date column name: ``` library(tidymodels) d % step_holiday(date,holidays = c("LaborDay","ChristmasDay")) rec %>%...

feature
discussion

Many of the themis steps need to know which variables are predictors, and steps like `step_lencode_glm()` need to know what the outcome is. There should be a function here in {recipes}...

feature

This is the first PR that begins extending `recipes` into remote tables. The idea is to set up the infrastructure in the main functions, as well as to centralize support for...

Now that {multilevelmod} is on CRAN, it would be great to have one or more recipe steps that compute group-meaned and de-meaned variables like the new {datawizard} functions...

feature

We have a good bit of documentation about skipping vs. not skipping: - https://www.tmwr.org/recipes.html#skip-equals-true - https://recipes.tidymodels.org/articles/Skipping.html - the individual function pages, etc. However, people continue to have a hard time...

feature

Is it possible for derived variables to have multiple roles? This doesn't work: ``` recipe(HHV ~ ., data = biomass) %>% step_mutate(carbon_sqr = carbon ^ 2, role = "new") %>%...

bug

Similar to `step_pca()` or `step_umap()`, which can be useful for dimensionality reduction, `step_mca()` could be useful for reducing the dimensions of categorical predictors.

feature

Hi, Many `{recipes}` steps modify-in-place, so that the original column is modified in some way. A subset of steps will (*possibly*) add or remove columns (e.g. `step_pca()`, `step_ns()`, `step_dummy()`, etc.)....

long term
feature

## Feature In situations where time series data is not continuous, e.g. in the Kaggle bike sharing competition (https://www.kaggle.com/c/bike-sharing-demand/), it would be useful to be able to prevent...

reprex