dplyr
dplyr copied to clipboard
`mutate()` preserves attributes of grouped data frames
closes #6100
library(dplyr, warn.conflicts = FALSE)
library(purrr)
attr(mtcars, "test") <- "foo"
mtcars_grouped <- group_by(mtcars, cyl)
mtcars %>%
mutate(new_col = 1) %>%
attr("test")
#> [1] "foo"
mtcars_grouped %>%
mutate(new_col = 1) %>%
attr("test")
#> [1] "foo"
Created on 2021-11-29 by the reprex package (v2.0.1.9000)
Are you thinking of this for 1.0.8 or 1.1.0?
I believe this should should be 1.0.8, and then we can think of something more intentional for 1.1.0: inform or warn about unknown attributes, hint about creating a custom class that deals with them, ...
What we have right now "keeping attributes only for ungrouped data frames" is surprising, as in #6100
I do wonder if we should hold off on this until dplyr 1.1.0. We have the same issue with dplyr_row_slice()
library(dplyr)
df <- data.frame(x = 1)
gdf <- group_by(df, x)
attr(df, "keepme") <- "foo"
attr(gdf, "keepme") <- "foo"
# kept
attr(slice(df, 1), "keepme")
#> [1] "foo"
# dropped
attr(slice(gdf, 1), "keepme")
#> NULL
I think the whole generic API here needs another round of thinking through to tighten it up, and I'm a little worried that fixing something here would only be a partial solution.
🤔 yeah perhaps as a follow up to this, dplyr_row_slice.* should all call dplyr_reconstruct_core() on the result of vec_slice() and then reconstruct the groups.
But otherwise, I don't mind waiting for 1.1 so that we can design this more.
Yeah, I'd rather we wait, especially since we're close to release.
Let's hold off on this as it's looking like .by might provide a simpler alternative.