
Unexpected behaviour when passing string to be evaluated as expression

Open leungi opened this issue 7 years ago • 2 comments

Below is a reprex.

The goal of the function is to allow custom variables and functions for a group_by() + summarise().

In many tidyeval examples I see, authors use ... as function arguments to do this; however, I'd like to have an explicit definition of the function to be passed to summarise().

If there's a better way to do this, please enlighten me!

library(dplyr)
library(friendlyeval)

## this works
GroupNSummarise <- function(.data, .var, .group, .fun) {
  group <- treat_strings_as_cols(.group)
  func <- treat_string_as_expr(.fun)
  var <- treat_strings_as_cols(.var)
  
  .data %>% 
    group_by(!!!group) %>% 
    summarise_at(vars(.var), ~ eval(func))
}

mtcars %>% 
  GroupNSummarise(.var = c("mpg", "wt"),
                  .group = c("cyl"),
                  .fun = 'mean(., na.rm = TRUE)')
#> Warning: package 'bindrcpp' was built under R version 3.4.4
#> # A tibble: 3 x 3
#>     cyl   mpg    wt
#>   <dbl> <dbl> <dbl>
#> 1     4  26.7  2.29
#> 2     6  19.7  3.12
#> 3     8  15.1  4.00

## this fails
GroupNSummarise2 <- function(.data, .var, .group, .fun) {
  group <- treat_strings_as_cols(.group)
  func <- treat_string_as_expr(.fun)
  var <- treat_strings_as_cols(.var)
  
  .data %>% 
    group_by(!!!group) %>% 
    summarise_at(vars(.var), ~ !!(func))
}

mtcars %>% 
  GroupNSummarise2(.var = c("mpg", "wt"),
                   .group = c("cyl"),
                   .fun = 'mean(., na.rm = TRUE)')
#> Error in summarise_impl(.data, dots): Evaluation error: invalid argument type.

leungi, Sep 14 '18 17:09

Thanks for this example!

So the first case is pretty interesting to me. At this time I can't explain how it works!
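
One guess about why the second case fails, not verified against friendlyeval's internals: a formula built with base `~` is not a quasiquoting context, so `!!` is never treated as "unquote". It is parsed as two logical negations applied to the object, and negating a language object is an error. The base-R sketch below illustrates this; `expr` is only an assumed stand-in for whatever treat_string_as_expr() returns.

```r
# Assumption: `expr` stands in for the result of treat_string_as_expr();
# here it is a plain call object created with base R's quote().
expr <- quote(mean(., na.rm = TRUE))

# Base `~` simply captures its right-hand side; it does not process `!!`.
f <- ~ !!expr
rhs <- f[[2]]

# The captured code is a call to `!` (double negation), not the unquoted expression.
print(rhs[[1]])

# Applying `!` to a call object is an error; capture the message instead of stopping.
res <- tryCatch(!expr, error = function(e) conditionMessage(e))
print(res)
```

By contrast, `~ eval(func)` defers the work to run time, when `func` is looked up and evaluated inside the lambda that summarise_at() builds from the formula.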

One thing to think about is that to use the . notation with summarise_at you need to end up passing a formula, and a formula is already a way of 'quoting' code, so why not have it appear at the top level? E.g:

GroupNSummarise3 <- function(.data, .var, .group, .fun) {
  group <- treat_strings_as_cols(.group)
  var <- treat_strings_as_cols(.var)
  
  .data %>% 
    group_by(!!!group) %>% 
    summarise_at(vars(.var), .fun)
}


mtcars %>% 
  GroupNSummarise3(.var = c("mpg", "wt"),
                   .group = c("cyl"),
                   .fun = ~mean(., na.rm = TRUE))

If you absolutely need to construct a formula from a string then this seems to work:

GroupNSummarise4 <- function(.data, .var, .group, .fun) {
  group <- treat_strings_as_cols(.group)
  var <- treat_strings_as_cols(.var)
  fun <- rlang::new_formula(lhs = NULL, 
                            rhs = treat_string_as_expr(.fun))
  
  .data %>% 
    group_by(!!!group) %>% 
    summarise_at(vars(.var), fun)
}


mtcars %>% 
  GroupNSummarise4(.var = c("mpg", "wt"),
                   .group = c("cyl"),
                   .fun = 'mean(., na.rm = TRUE)')

This could be wrapped up more neatly in something like treat_string_as_formula(). That said, when a function argument accepts a formula, the convention is usually to offer it as an alternative to an ordinary closure, so I'm unsure how general this would be.
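
A minimal sketch of what such a wrapper might look like. treat_string_as_formula() is hypothetical and does not exist in friendlyeval; rlang::parse_expr() is used here in place of treat_string_as_expr(), which is an assumption made so the example depends only on rlang.

```r
library(rlang)

# Hypothetical helper, not part of friendlyeval: build a one-sided formula
# from a single string so it can be passed wherever a formula lambda is accepted.
treat_string_as_formula <- function(string, env = rlang::caller_env()) {
  stopifnot(is.character(string), length(string) == 1)
  rlang::new_formula(lhs = NULL, rhs = rlang::parse_expr(string), env = env)
}

f <- treat_string_as_formula("mean(., na.rm = TRUE)")

# rlang::as_function() turns the formula into a function in which `.` is the
# first argument, which is roughly how formula lambdas are interpreted.
summarise_fn <- rlang::as_function(f)
summarise_fn(c(1, NA, 3))  # 2
```

The resulting formula could then be handed to summarise_at() in the same way as `fun` in GroupNSummarise4().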

MilesMcBain, Sep 15 '18 02:09

Thanks for the prompt response, @MilesMcBain!

I agree most would avoid specifying arguments as strings, so GroupNSummarise3() is the way to go.

Appreciate the useful tip - "a formula is already a way of 'quoting' code" :+1:

leungi, Sep 15 '18 19:09