dplyr icon indicating copy to clipboard operation
dplyr copied to clipboard

`mutate()` preserves attributes of grouped data frames

Open romainfrancois opened this issue 3 years ago • 5 comments

closes #6100

library(dplyr, warn.conflicts = FALSE)

library(purrr)
attr(mtcars, "test") <- "foo"
mtcars_grouped <- group_by(mtcars, cyl)

mtcars %>% 
  mutate(new_col = 1) %>% 
  attr("test")
#> [1] "foo"

mtcars_grouped %>% 
  mutate(new_col = 1) %>% 
  attr("test")
#> [1] "foo"

Created on 2021-11-29 by the reprex package (v2.0.1.9000)

romainfrancois avatar Nov 29 '21 11:11 romainfrancois

Are you thinking of this for 1.0.8 or 1.1.0?

hadley avatar Dec 01 '21 17:12 hadley

I believe this should should be 1.0.8, and then we can think of something more intentional for 1.1.0: inform or warn about unknown attributes, hint about creating a custom class that deals with them, ...

What we have right now "keeping attributes only for ungrouped data frames" is surprising, as in #6100

romainfrancois avatar Dec 02 '21 08:12 romainfrancois

I do wonder if we should hold off on this until dplyr 1.1.0. We have the same issue with dplyr_row_slice()

library(dplyr)

df <- data.frame(x = 1)
gdf <- group_by(df, x)

attr(df, "keepme") <- "foo"
attr(gdf, "keepme") <- "foo"

# kept
attr(slice(df, 1), "keepme")
#> [1] "foo"

# dropped
attr(slice(gdf, 1), "keepme")
#> NULL

I think the whole generic API here needs another round of thinking through to tighten it up, and I'm a little worried that fixing something here would only be a partial solution.

DavisVaughan avatar Dec 02 '21 15:12 DavisVaughan

🤔 yeah perhaps as a follow up to this, dplyr_row_slice.* should all call dplyr_reconstruct_core() on the result of vec_slice() and then reconstruct the groups.

But otherwise, I don't mind waiting for 1.1 so that we can design this more.

romainfrancois avatar Dec 02 '21 16:12 romainfrancois

Yeah, I'd rather we wait, especially since we're close to release.

hadley avatar Dec 02 '21 21:12 hadley

Let's hold off on this as it's looking like .by might provide a simpler alternative.

hadley avatar Nov 17 '22 16:11 hadley