tidyr

pivot doesn't preserve attributes

Open ilikegitlab opened this issue 3 years ago • 3 comments

It seems most (I tested filter, select, left_join, mutate, head) dplyr functions happily (and nicely) copy over attributes with metadata. For some reason, pivot functions do not:

new_tibble(tibble(a=3,b=4),metadata="test") %>% validate_tibble() %>% attr("metadata")
[1] "test"

but:

new_tibble(tibble(a=3,b=4),metadata="test") %>% validate_tibble() %>% pivot_longer(cols=c(a,b)) %>% attr("metadata")
NULL

This may be related to the discussions about extending tibbles, but I feel redefining the class adds lots of complexity for a simple user attribute (and I honestly couldn't make much sense of github.com/tidyverse/tibble/issues/275, and it being open means it is still not clarified I guess).

ilikegitlab avatar Jul 17 '22 18:07 ilikegitlab

Can you please provide a minimal reprex (reproducible example)? The goal of a reprex is to make it as easy as possible for me to recreate your problem so that I can fix it: please help me help you! If you've never heard of a reprex before, start by reading about the reprex package, including the advice further down the page. Please make sure your reprex is created with the reprex package as it gives nicely formatted output and avoids a number of common pitfalls.

hadley avatar Oct 10 '22 21:10 hadley

library(tidyverse)

new_tibble(tibble(a=3, b=4), metadata="test") %>% validate_tibble() %>% attr("metadata")
#> [1] "test"

new_tibble(tibble(a=3, b=4),metadata="test") %>% validate_tibble() %>% pivot_longer(cols=c(a, b)) %>% attr("metadata")
#> NULL

ilikegitlab avatar Oct 11 '22 11:10 ilikegitlab

Somewhat more minimal reprex:

library(tidyr)

df <- tibble(a = 1)
attr(df, "metadata") <- "test"

df |>
  pivot_longer(a) |>
  attr("metadata")
#> NULL

Created on 2022-10-11 with reprex v2.0.2

hadley avatar Oct 11 '22 13:10 hadley

It's not clear to me that pivot_ functions should preserve attributes — I think it's reasonable to argue that they create new data frames in a way similar to dplyr::summarise(), rather than modifying an existing data frame like dplyr::mutate().

hadley avatar Oct 18 '22 14:10 hadley

Why not label this as a design decision (attr are not supported with dplyr)? At least that would be clear.

I don't think it's reasonable that in a pipeline I'm losing metadata depending on which functions I'm using. I cannot really follow your argument, as here I'm just reshaping the same numbers after all (but then, I would even make a case for summarize). But when I join data together, suddenly the attr of one of them are there? Maybe it makes sense to you.

ilikegitlab avatar Oct 18 '22 15:10 ilikegitlab

I'm not sure whether this addresses the same problem, but for me pivot_longer() does an amazing job at preserving attributes of variables. But I think preserving the attributes depends on supplying a class attribute when several columns are combined into one.

I'm afraid the following minimal example is not fully minimalistic, but I'll do my best:

library(tidyr)

wide <-
  structure(
    list(
      x_1 = structure(c(1), label = "X"),
      y_1 = structure(c(2), label = "Y", labels = c(A = 1), class = c("haven_labelled", "vctrs_vctr", "double")),
      y_2 = structure(c(3), label = "Y", labels = c(A = 1), class = c("haven_labelled", "vctrs_vctr", "double")),
      z_1 = structure(c(2), label = "Z"),
      z_3 = structure(c(3), label = "Z")
    ),
    row.names = c(NA, -1L),
    class = c("tbl_df", "tbl", "data.frame")
  )

long <- 
  wide %>% 
  pivot_longer(everything(), names_sep = "_", names_to=c('.value', 'wave'))

attributes(long$x)
#> $label
#> [1] "X"
attributes(long$y)
#> $label
#> [1] "Y"
#> 
#> $labels
#> A 
#> 1 
#> 
#> $class
#> [1] "haven_labelled" "vctrs_vctr"     "double"
attributes(long$z)
#> NULL

Attributes of "x" are preserved, because there is only one occurrence.

Attributes of "y" are preserved - both variables are merged into one, with class provided.

Attributes of "z" are lost - both variables merged into one, without class provided.

helge-baumann avatar Oct 26 '22 12:10 helge-baumann
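
For the data-frame-level attribute from the reprex earlier in the thread, one possible workaround is to save the custom attributes before pivoting and put them back afterwards. The sketch below is only an illustration: keep_attrs() is a hypothetical helper, not part of tidyr or dplyr, and it assumes that names, row.names and class are the only attributes the reshaping function should own.

```r
library(tidyr)

# Hypothetical helper (not part of tidyr): apply `fn` to `df`, then copy back
# any data-frame-level attributes that `fn` dropped. The structural attributes
# (names, row.names, class) are left to whatever `fn` produced.
keep_attrs <- function(df, fn) {
  structural <- c("names", "row.names", "class")
  saved <- attributes(df)
  saved <- saved[setdiff(names(saved), structural)]

  out <- fn(df)

  # Restore only the attributes that the result does not already carry
  for (nm in setdiff(names(saved), names(attributes(out)))) {
    attr(out, nm) <- saved[[nm]]
  }
  out
}

df <- tibble(a = 1, b = 2)
attr(df, "metadata") <- "test"

long <- keep_attrs(df, function(d) pivot_longer(d, cols = c(a, b)))
attr(long, "metadata")
#> [1] "test"
```

This blindly copies every non-structural attribute, so it is probably not appropriate for attributes that describe the shape of the data (for example the grouping information of a grouped data frame). It also does nothing for column-level attributes such as the "label" on z above.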