`unnest_wider` fails on certain types
I've been struggling to use `unnest_wider()` on certain results in a list column. I don't know why some types (tibbles) are correctly unnested into list columns, but others (`lm`) are not. Any advice? Or is this a bug?
For the third example here, I would expect `unnest_wider()` to create a list column of `lm` objects.
library(tidyverse)
packageVersion("tidyr")
#> [1] '1.3.0'
# works
x <- list(a = 1, b = "b")
tibble(y = list(x, x)) |> unnest_wider(y)
#> # A tibble: 2 × 2
#> a b
#> <dbl> <chr>
#> 1 1 b
#> 2 1 b
# works
x <- list(a = 1, b = "b", c = tibble(z = 1:3))
tibble(y = list(x, x)) |> unnest_wider(y)
#> # A tibble: 2 × 3
#> a b c
#> <dbl> <chr> <list>
#> 1 1 b <tibble [3 × 1]>
#> 2 1 b <tibble [3 × 1]>
# doesn't work
x <- list(a = 1, b = "b", c = lm(mpg ~ wt, mtcars))
tibble(y = list(x, x)) |> unnest_wider(y)
#> Error in `unnest_wider()`:
#> ℹ In column: `y`.
#> ℹ In row: 1.
#> Caused by error in `list_sizes()`:
#> ! `x$c` must be a vector, not a <lm> object.
#> Backtrace:
#> ▆
#> 1. ├─tidyr::unnest_wider(tibble(y = list(x, x)), y)
#> 2. │ └─tidyr:::col_to_wide(...)
#> 3. │ ├─tidyr:::with_indexed_errors(...)
#> 4. │ │ └─rlang::try_fetch(...)
#> 5. │ │ └─base::withCallingHandlers(...)
#> 6. │ └─purrr::map(...)
#> 7. │ └─purrr:::map_("list", .x, .f, ..., .progress = .progress)
#> 8. │ ├─purrr:::with_indexed_errors(...)
#> 9. │ │ └─base::withCallingHandlers(...)
#> 10. │ ├─purrr:::call_with_cleanup(...)
#> 11. │ └─tidyr (local) .f(.x[[i]], ...)
#> 12. │ └─tidyr:::elt_to_wide(...)
#> 13. │ └─vctrs::list_sizes(x)
#> 14. └─vctrs:::stop_scalar_type(`<fn>`(`<lm>`), "x$c", `<env>`)
#> 15. └─vctrs:::stop_vctrs(...)
#> 16. └─rlang::abort(message, class = c(class, "vctrs_error"), ..., call = call)
Created on 2023-06-26 with reprex v2.0.2
I get the same issue when trying to convert a column of `purrr::safely()` results into two columns:
library(tidyverse)
# Sample list with problematic function
data <- tibble(
  input = list(
    a = 1:3,
    b = NULL,
    c = "abc"
  )
)
# Apply safely function
data <- data |>
  mutate(
    output = map(
      input,
      safely(sum)
    )
  )
str(data$output)
#> List of 3
#> $ a:List of 2
#> ..$ result: int 6
#> ..$ error : NULL
#> $ b:List of 2
#> ..$ result: int 0
#> ..$ error : NULL
#> $ c:List of 2
#> ..$ result: NULL
#> ..$ error :List of 2
#> .. ..$ message: chr "invalid 'type' (character) of argument"
#> .. ..$ call : language .Primitive("sum")(..., na.rm = na.rm)
#> .. ..- attr(*, "class")= chr [1:3] "simpleError" "error" "condition"
# Unnest the result
data <- data |>
  unnest_wider(col = output)
#> Error in `unnest_wider()`:
#> ℹ In column: `output`.
#> ℹ In row: 3.
#> Caused by error in `list_sizes()`:
#> ! `x$error` must be a vector, not a <simpleError/error/condition> object.
#> Backtrace:
#> ▆
#> 1. ├─tidyr::unnest_wider(data, col = output)
#> 2. │ └─tidyr:::col_to_wide(...)
#> 3. │ ├─tidyr:::with_indexed_errors(...)
#> 4. │ │ └─rlang::try_fetch(...)
#> 5. │ │ └─base::withCallingHandlers(...)
#> 6. │ └─purrr::map(...)
#> 7. │ └─purrr:::map_("list", .x, .f, ..., .progress = .progress)
#> 8. │ ├─purrr:::with_indexed_errors(...)
#> 9. │ │ └─base::withCallingHandlers(...)
#> 10. │ ├─purrr:::call_with_cleanup(...)
#> 11. │ └─tidyr (local) .f(.x[[i]], ...)
#> 12. │ └─tidyr:::elt_to_wide(...)
#> 13. │ └─vctrs::list_sizes(x)
#> 14. └─vctrs:::stop_scalar_type(`<fn>`(`<smplErrr>`), "x$error", `<env>`)
#> 15. └─vctrs:::stop_vctrs(...)
#> 16. └─rlang::abort(message, class = c(class, "vctrs_error"), ..., call = call)
Created on 2023-08-21 with reprex v2.0.2
It would be great if this worked without the following detour:
# Unnest the result
data |>
  mutate(
    output = map_depth(output, 2, as.character)
  ) |>
  unnest_wider(col = output)
#> # A tibble: 3 × 3
#> input result error
#> <named list> <chr> <chr>
#> 1 <int [3]> 6 <NA>
#> 2 <NULL> 0 <NA>
#> 3 <chr [1]> <NA> "Error in .Primitive(\"sum\")(..., na.rm = na.rm): invali…
Created on 2023-08-21 with reprex v2.0.2
The problem is that each component needs to be a vector of length 1, not an arbitrary object that we generally think of as a scalar (e.g. a linear model or error object only ever represents a single model or a single error). So you can always fix this problem by wrapping that object inside another list:
library(tidyverse)
data <- tibble(input = list(a = 1:3, b = NULL, c = "abc"))
data |>
  mutate(output = map(input, safely(sum))) |>
  mutate(output = map(output, \(x) list(result = x$result, error = list(x$error)))) |>
  unnest_wider(output)
#> # A tibble: 3 × 3
#> input result error
#> <named list> <int> <list>
#> 1 <int [3]> 6 <list [1]>
#> 2 <NULL> 0 <list [1]>
#> 3 <chr [1]> NA <list [1]>
Created on 2023-11-01 with reprex v2.0.2
Another way of saying the same thing is that if you need a list-column to represent the unnested data (e.g. it's a list of linear models or a list of errors), then you'll currently need to make sure that your data is already a list.
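As a rough sketch, the same wrapping idea can be applied to the `lm` example from the original post. The names `res` and `models` are only illustrative, and this assumes tidyr 1.3.0 behaves as in the outputs shown above:

```r
library(tidyverse)

# vctrs treats a tibble as a vector, but an lm object as a scalar
vctrs::vec_is(tibble(z = 1:3))
vctrs::vec_is(lm(mpg ~ wt, mtcars))

# Wrap the model in a list so the component is a length-1 vector
x <- list(a = 1, b = "b", c = list(lm(mpg ~ wt, mtcars)))
res <- tibble(y = list(x, x)) |> unnest_wider(y)

# Each entry of `c` is itself a length-1 list, so pull out its first element
models <- map(res$c, 1)
```

Judging by the `<list [1]>` entries in the `safely()` output above, each entry of `c` should be a length-1 list holding the model, which is why the last step extracts the first element.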
It does seem like tidyr could handle this for you (perhaps with an explicit option).
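Until something like that exists, one possible workaround is a small helper that does the wrapping automatically. This is only a sketch: `wrap_non_vectors()` is a hypothetical name, not part of tidyr, and it assumes `vctrs::vec_is()` is a reasonable stand-in for the check that `unnest_wider()` performs internally:

```r
library(tidyverse)

# Hypothetical helper (not a tidyr function): wrap every component that
# vctrs does not consider a vector in a list; leave NULL and vectors as-is
wrap_non_vectors <- function(x) {
  map(x, \(el) if (is.null(el) || vctrs::vec_is(el)) el else list(el))
}

data <- tibble(input = list(a = 1:3, b = NULL, c = "abc"))

res <- data |>
  mutate(output = map(input, safely(sum))) |>
  mutate(output = map(output, wrap_non_vectors)) |>
  unnest_wider(output)
```

Unlike the `as.character` detour, this keeps the original error objects intact, at the cost of `error` being a list column.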