Possible regression re. use of glue in dplyr::across

Open jimjam-slam opened this issue 2 years ago • 2 comments

This is discussed in https://twitter.com/mjskay/status/1660770865087148032

Essentially, users are reporting errors when using glue inside an anonymous function given to dplyr::across(). The same code did not error in earlier versions.

On my setup, with R 4.2.2 on macOS, and with dplyr 1.0.10 and glue 1.6.2, the following works to modify two columns:

library(dplyr)
library(glue)
mtcars |> mutate(across(c(mpg, cyl), \(x) glue("Oh hai {x}")))

[image: screenshot of the resulting output]

But Matthew Kay reports that they get an error, unless they add dots to the anonymous function or save it to a variable before passing it to across():

[image: screenshot of the error]
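
For reference, the two workarounds described above would look roughly like this. This is a sketch reconstructed from the report, not something confirmed against every dplyr version, and `oh_hai` is a hypothetical name chosen for illustration.

```r
library(dplyr)
library(glue)

# Workaround 1 (as reported): add dots to the anonymous function's signature.
with_dots <- mtcars |>
  mutate(across(c(mpg, cyl), \(x, ...) glue("Oh hai {x}")))

# Workaround 2 (as reported): save the function to a variable first,
# then pass that variable to across().
oh_hai <- \(x) glue("Oh hai {x}")
with_variable <- mtcars |>
  mutate(across(c(mpg, cyl), oh_hai))
```

In both cases the function keeps its own frame when it is called, so glue() can still find `x` there.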

Cameron Patrick also gets an error on R 4.2.3 for Windows with dplyr 1.1.1 and glue 1.6.2 (the same version of glue as me). They get this using the purrr anonymous function syntax:

[image: screenshot of the error]

I'm not sure whether this is a problem with glue, dplyr or an interaction between the two, but since the glue version is the same in my successful reprex and Cameron's failing one, I've chosen to file here.

jimjam-slam, May 23 '23 01:05

More precisely, the error is that glue is unable to find variables defined in the function (with both function(x) and ~ specifications) and prints out object '.x' not found. (This may help others find the issue too, I guess?)

ErdaradunGaztea, Jun 07 '23 11:06

This is a consequence of across() inlining anonymous functions, so what you end up with is an expression that looks like this:

mtcars |> mutate(mpg = glue("Oh hai {x}"), cyl = glue("Oh hai {x}"))

This inlining is an intended feature of across(). Ideally we'd like across() to be fully handled by dplyr before handing off the generated expressions to something like dbplyr or dtplyr or arrow (so they don't need to know about across()), and this is part of how that would work.
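
To see why the inlined form fails, here is a minimal sketch using glue on its own, with no dplyr involved. It relies on glue()'s default of evaluating `{}` expressions in the caller's environment. The `bare_env` environment is a hypothetical stand-in for the context in which the inlined expression ends up being evaluated; it is not dplyr's actual data mask.

```r
library(glue)

# Inside a function, glue() looks up {x} in the function's own frame,
# where the argument `x` is bound.
fn <- function(x) glue("Oh hai {x}")
wrapped <- fn(21)

# Stand-in for the inlined call: the same glue() expression, evaluated in an
# environment that has `glue` and a column `mpg`, but no `x`.
bare_env <- new.env(parent = baseenv())
bare_env$glue <- glue
bare_env$mpg <- 21

# The lookup of `x` fails, so we capture the error message instead of a result.
inlined <- tryCatch(
  eval(quote(glue("Oh hai {x}")), bare_env),
  error = function(e) conditionMessage(e)
)
```

Here `wrapped` holds the glued string, while `inlined` holds an error message, because once the function wrapper is gone nothing named `x` exists for glue() to find.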

We haven't fully decided what to do about this particular issue yet, but if this is annoying you, then in the meantime you can always create a separate (not anonymous) function and supply that instead. That will always work.

library(dplyr)
library(glue)

fn <- function(x) {
  glue("Oh hai {x}")
}

mtcars |> mutate(across(c(mpg, cyl), fn))

DavisVaughan, Nov 03 '23 17:11