accumulate() issues with list output of one row

Open erictleung opened this issue 4 years ago • 1 comments

This is an odd "bug" I've encountered. It is not unexpected given the functions used, so I'm asking both whether there is a better way to do this and whether the functionality should be updated. If this is beyond the scope of purrr, feel free to close.

I wish to accumulate a list of unique values into a cell, but when we accumulate on a group with a single row, accumulate() returns a bare value rather than a list. That value doesn't fit with the outputs for the other groups, which are lists.

library(purrr)
library(dplyr)

running_unique <- function(x, y) { unique(c(x, y)) }

# Example simple function use
running_unique(1, c(1, 12, 3))
#> [1]  1 12  3

# Example using accumulate
accumulate(1:3, running_unique)
#> [[1]]
#> [1] 1
#> 
#> [[2]]
#> [1] 1 2
#> 
#> [[3]]
#> [1] 1 2 3
accumulate(1, running_unique)
#> [1] 1


df <- tribble(
  ~a, ~b,
  1, 1,
  2, 1,
  2, 2
)
df
#> # A tibble: 3 x 2
#>       a     b
#>   <dbl> <dbl>
#> 1     1     1
#> 2     2     1
#> 3     2     2

df %>%
  group_by(a) %>%
  mutate(accum = accumulate(b, running_unique))
#> Error: Problem with `mutate()` input `accum`.
#> x Input `accum` must return compatible vectors across groups
#> i Input `accum` is `accumulate(b, running_unique)`.
#> i Result type for group 1 (a = 1): <double>.
#> i Result type for group 2 (a = 2): <list>.

Created on 2020-12-04 by the reprex package (v0.3.0)

If we focus only on groups with more than one row, all is well. Notice that the first row, whose accumulated result holds a single value, is still stored as a list element.

df %>%
  filter(a == 2) %>%
  group_by(a) %>%
  mutate(accum = accumulate(b, running_unique))
#> # A tibble: 2 x 3
#> # Groups:   a [1]
#>       a     b accum    
#>   <dbl> <dbl> <list>   
#> 1     2     1 <dbl [1]>
#> 2     2     2 <dbl [2]>

erictleung avatar Dec 05 '20 03:12 erictleung

I faced the same issue with purrr::accumulate(). It does not seem to be type-stable when the accumulation function returns a single value at every step. base::Reduce(fn, x, accumulate = TRUE) has the same issue.

fn <- function(x,y) unique(c(x,y))
purrr::accumulate(list("A", "A", "A"), fn)
#> [1] "A" "A" "A"
# The expected result is a list:
#> [[1]]
#> [1] "A"
#> [[2]]
#> [1] "A"
#> [[3]]
#> [1] "A"

I worked around it by converting the result back to a list: purrr::accumulate(list("A", "A", "A"), fn) %>% as.list()

dabsingh avatar Feb 11 '21 01:02 dabsingh
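
The as.list() workaround above can also be applied to the grouped mutate() from the original report. This is a minimal sketch, not a purrr feature: `accumulate_list()` is a hypothetical helper name introduced here, and it simply converts whatever accumulate() returns (atomic vector or list) back into a list so that every group yields the same type.

```r
library(purrr)
library(dplyr)

running_unique <- function(x, y) unique(c(x, y))

# Hypothetical helper (not part of purrr): always return a list,
# whether or not accumulate() simplified its result to an atomic vector.
accumulate_list <- function(x, f) as.list(accumulate(x, f))

df <- tribble(
  ~a, ~b,
  1, 1,
  2, 1,
  2, 2
)

# Every group now contributes a list, so mutate() can combine them
# into a single list-column.
res <- df %>%
  group_by(a) %>%
  mutate(accum = accumulate_list(b, running_unique)) %>%
  ungroup()

str(res$accum)
```

Because the conversion happens after accumulate() has run, this does not depend on how or when simplification is triggered. It only assumes that a simplified result is an atomic vector with one element per input.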

Somewhat more minimal reprex:

library(purrr)

str(accumulate(1:3, union))
#> List of 3
#>  $ : int 1
#>  $ : int [1:2] 1 2
#>  $ : int [1:3] 1 2 3
str(accumulate(1:2, union))
#> List of 2
#>  $ : int 1
#>  $ : int [1:2] 1 2
str(accumulate(1, union))
#>  num 1

Created on 2022-08-24 by the reprex package (v2.0.1)

This is caused by accumulate()'s automatic simplification, which collapses the result to an atomic vector when every element has length one:

library(purrr)

str(accumulate(1:3, `+`))
#>  int [1:3] 1 3 6
str(accumulate(1:2, `+`))
#>  int [1:2] 1 3
str(accumulate(1L, `+`))
#>  int 1

Created on 2022-08-24 by the reprex package (v2.0.1)

So we probably need to offer some way to opt out.

hadley avatar Aug 24 '22 08:08 hadley
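
For readers arriving later, here is a sketch of what opting out can look like. It assumes purrr 1.0.0 or later, where, to the best of my knowledge, accumulate() accepts a `.simplify` argument; check `?accumulate` for your installed version before relying on it.

```r
library(purrr)

# Assumption: purrr >= 1.0.0, where accumulate() has a `.simplify` argument.
# .simplify = FALSE requests a list even when every result has length one.
one_row <- accumulate(1, union, .simplify = FALSE)
sums <- accumulate(1:3, `+`, .simplify = FALSE)

str(one_row)
str(sums)
```

With simplification switched off, the single-row case and the multi-row case should both come back as lists, which is what the grouped mutate() in the original report needs.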