Allow list columns?
Minimal, reproducible example:
test_data <- tibble::tibble(
  a = list(mtcars, mtcars),
  b = 1:2
)
recipes::recipe(test_data, b ~ .)
#> Error in model.frame.default(formula, data) :
#>   invalid type (list) for variable 'a'
I understand this might be a weird case, but there are steps in textrecipes that expect a list column, and I've written a step of my own that expects a tibble. I was trying to do some of the setup (which involves a Java engine and didn't play nicely with parallel processing) outside of the recipe, and then use that data in a couple of different recipes, but recipes won't let me start with a list column. I could understand this error if the column were still a list after prep(), but at this point I'm not done with it yet.
I think that we will eventually support this. I'll leave this open so that we can document thoughts/requirements here.
It turns out that recipes can handle list columns, but only when the variables are specified with the vars and roles arguments instead of a formula.
test_data <- tibble::tibble(
  a = list(mtcars, mtcars),
  b = 1:2
)
# roles are matched to vars by position; this mirrors the formula b ~ . (b is the outcome)
recipes::recipe(test_data, vars = c("a", "b"), roles = c("predictor", "outcome"))
#> Data Recipe
#>
#> Inputs:
#>
#>       role #variables
#>    outcome          1
#>  predictor          1
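For anyone who would rather not rely on positional matching between vars and roles, here is a minimal sketch of another way to avoid the formula: create the recipe with no roles, then assign them by column name with update_role(). I haven't checked this against every recipes release, so treat it as an assumption that list columns pass through this path the same way they do with vars and roles.

```r
library(recipes)

test_data <- tibble::tibble(
  a = list(mtcars, mtcars),
  b = 1:2
)

# No formula, vars, or roles: every column starts without a role.
rec <- recipe(test_data)

# Assign roles by column name instead of by position.
rec <- update_role(rec, b, new_role = "outcome")
rec <- update_role(rec, a, new_role = "predictor")

# One row per variable, with its assigned role.
info <- summary(rec)
info
```

The same test_data object can be passed to several recipe() calls, so the expensive setup only has to run once.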
The formula error originates in stats::model.frame(), which recipes:::get_rhs_vars() calls to extract the predictor names.
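To see that the error comes from base R rather than from recipes itself, model.frame() can be called directly on the same data. This is only a small sketch; the tryCatch() wrapper and the msg variable are mine, added to capture the message instead of stopping the script.

```r
library(tibble)

test_data <- tibble(
  a = list(mtcars, mtcars),
  b = 1:2
)

# Call model.frame() without going through recipes and keep the error message.
msg <- tryCatch(
  {
    stats::model.frame(b ~ ., data = test_data)
    NA_character_  # only reached if model.frame() does not error
  },
  error = function(e) conditionMessage(e)
)

msg
```

If msg matches the error in the original report, the formula path fails before any recipes-specific code handles the column.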
An unexpected side effect of https://github.com/tidymodels/recipes/pull/1283 is that recipes now supports list-columns with the formula interface as well:
library(recipes)
test_data <- tibble::tibble(
  a = list(mtcars, mtcars),
  b = 1:2
)
recipe(test_data, b ~ .)
#>
#> ── Recipe ──────────────────────────────────────────────────────────────────────
#>
#> ── Inputs
#> Number of variables by role
#> outcome: 1
#> predictor: 1