
`unnest_wider` fails on certain types

Open ArthurAndrews opened this issue 2 years ago • 3 comments

I've been struggling to unnest_wider certain results in a list column. I don't know why some types (tibbles) are correctly unnested into list columns, but others (lm) are not. Any advice? Or is this a bug?

For the third example here, I expect unnest_wider should create a list column of lm objects.

library(tidyverse)
packageVersion("tidyr")
#> [1] '1.3.0'

# works
x <- list(a = 1, b = "b")
tibble(y = list(x, x)) |> unnest_wider(y)
#> # A tibble: 2 × 2
#>       a b    
#>   <dbl> <chr>
#> 1     1 b    
#> 2     1 b

# works
x <- list(a = 1, b = "b", c = tibble(z = 1:3))
tibble(y = list(x, x)) |> unnest_wider(y)
#> # A tibble: 2 × 3
#>       a b     c               
#>   <dbl> <chr> <list>          
#> 1     1 b     <tibble [3 × 1]>
#> 2     1 b     <tibble [3 × 1]>

# doesn't work
x <- list(a = 1, b = "b", c = lm(mpg ~ wt, mtcars))
tibble(y = list(x, x)) |> unnest_wider(y)
#> Error in `unnest_wider()`:
#> ℹ In column: `y`.
#> ℹ In row: 1.
#> Caused by error in `list_sizes()`:
#> ! `x$c` must be a vector, not a <lm> object.
#> Backtrace:
#>      ▆
#>   1. ├─tidyr::unnest_wider(tibble(y = list(x, x)), y)
#>   2. │ └─tidyr:::col_to_wide(...)
#>   3. │   ├─tidyr:::with_indexed_errors(...)
#>   4. │   │ └─rlang::try_fetch(...)
#>   5. │   │   └─base::withCallingHandlers(...)
#>   6. │   └─purrr::map(...)
#>   7. │     └─purrr:::map_("list", .x, .f, ..., .progress = .progress)
#>   8. │       ├─purrr:::with_indexed_errors(...)
#>   9. │       │ └─base::withCallingHandlers(...)
#>  10. │       ├─purrr:::call_with_cleanup(...)
#>  11. │       └─tidyr (local) .f(.x[[i]], ...)
#>  12. │         └─tidyr:::elt_to_wide(...)
#>  13. │           └─vctrs::list_sizes(x)
#>  14. └─vctrs:::stop_scalar_type(`<fn>`(`<lm>`), "x$c", `<env>`)
#>  15.   └─vctrs:::stop_vctrs(...)
#>  16.     └─rlang::abort(message, class = c(class, "vctrs_error"), ..., call = call)

Created on 2023-06-26 with reprex v2.0.2

ArthurAndrews, Jun 26 '23 18:06

I get the same issue when trying to split a column of purrr::safely() results into two columns:

library(tidyverse)

# Sample list column; sum() fails on the character element
data <- tibble(
  input = list(
    a = 1:3,
    b = NULL,
    c = "abc"
  )
)

# Apply safely function
data <- data |> 
  mutate(
    output = map(
      input, 
      safely(sum)
    )
  )

str(data$output)
#> List of 3
#>  $ a:List of 2
#>   ..$ result: int 6
#>   ..$ error : NULL
#>  $ b:List of 2
#>   ..$ result: int 0
#>   ..$ error : NULL
#>  $ c:List of 2
#>   ..$ result: NULL
#>   ..$ error :List of 2
#>   .. ..$ message: chr "invalid 'type' (character) of argument"
#>   .. ..$ call   : language .Primitive("sum")(..., na.rm = na.rm)
#>   .. ..- attr(*, "class")= chr [1:3] "simpleError" "error" "condition"

# Unnest the result
data <- data |> 
  unnest_wider(col = output)
#> Error in `unnest_wider()`:
#> ℹ In column: `output`.
#> ℹ In row: 3.
#> Caused by error in `list_sizes()`:
#> ! `x$error` must be a vector, not a <simpleError/error/condition> object.
#> Backtrace:
#>      ▆
#>   1. ├─tidyr::unnest_wider(data, col = output)
#>   2. │ └─tidyr:::col_to_wide(...)
#>   3. │   ├─tidyr:::with_indexed_errors(...)
#>   4. │   │ └─rlang::try_fetch(...)
#>   5. │   │   └─base::withCallingHandlers(...)
#>   6. │   └─purrr::map(...)
#>   7. │     └─purrr:::map_("list", .x, .f, ..., .progress = .progress)
#>   8. │       ├─purrr:::with_indexed_errors(...)
#>   9. │       │ └─base::withCallingHandlers(...)
#>  10. │       ├─purrr:::call_with_cleanup(...)
#>  11. │       └─tidyr (local) .f(.x[[i]], ...)
#>  12. │         └─tidyr:::elt_to_wide(...)
#>  13. │           └─vctrs::list_sizes(x)
#>  14. └─vctrs:::stop_scalar_type(`<fn>`(`<smplErrr>`), "x$error", `<env>`)
#>  15.   └─vctrs:::stop_vctrs(...)
#>  16.     └─rlang::abort(message, class = c(class, "vctrs_error"), ..., call = call)

Created on 2023-08-21 with reprex v2.0.2

cstepper, Aug 21 '23 13:08

It would be great if this worked without the following detour:


# Unnest the result
data |> 
  mutate(
    output = map_depth(output, 2, as.character)
  ) |> 
  unnest_wider(col = output)
#> # A tibble: 3 × 3
#>   input        result error                                                     
#>   <named list> <chr>  <chr>                                                     
#> 1 <int [3]>    6       <NA>                                                     
#> 2 <NULL>       0       <NA>                                                     
#> 3 <chr [1]>    <NA>   "Error in .Primitive(\"sum\")(..., na.rm = na.rm): invali…

Created on 2023-08-21 with reprex v2.0.2

cstepper, Aug 21 '23 13:08

The problem is that unnest_wider() needs vectors of length 1, not arbitrary objects that we generally think of as scalars (e.g. a linear model or error object only ever represents a single model or a single error). So you can always fix this problem by wrapping that object inside another list:

library(tidyverse)

data <- tibble(input = list(a = 1:3, b = NULL, c = "abc"))

data |> 
  mutate(output = map(input, safely(sum))) |> 
  mutate(output = map(output, \(x) list(result = x$result, error = list(x$error)))) |> 
  unnest_wider(output)
#> # A tibble: 3 × 3
#>   input        result error     
#>   <named list>  <int> <list>    
#> 1 <int [3]>         6 <list [1]>
#> 2 <NULL>            0 <list [1]>
#> 3 <chr [1]>        NA <list [1]>

Created on 2023-11-01 with reprex v2.0.2

Another way of saying the same thing is that if you need a list-column to represent the unnested data (e.g. it's a list of linear models or a list of errors), then you'll currently need to make sure that your data is already a list.
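For the lm case from the original report, the same idea might look like the sketch below. It assumes tidyr 1.3.0 as used in this thread; the name `out` and the final normalising step are illustrative additions, not part of tidyr. The normalising step is there because, judging by the `<list [1]>` cells in the output above, each cell may still be a length-1 list holding the model rather than the model itself.

```r
library(tidyverse)

# Wrap the model in list() so that `c` is a length-1 vector,
# which is what unnest_wider() asks for.
x <- list(a = 1, b = "b", c = list(lm(mpg ~ wt, mtcars)))

out <- tibble(y = list(x, x)) |>
  unnest_wider(y)

# Assumption: each cell of `c` is either the model itself or a
# length-1 list holding it. Normalise so every cell is a bare lm object.
out <- out |>
  mutate(c = map(c, \(m) if (inherits(m, "lm")) m else m[[1]]))

# The models can now be used directly, e.g. to pull out the slope for wt
map_dbl(out$c, \(m) coef(m)[["wt"]])
```

The extra `mutate()` is a no-op if the cells already hold bare models, so it should be safe to keep either way.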

It does seem like tidyr could handle this for you (perhaps with an explicit option).

hadley, Nov 01 '23 19:11