`names<-.tbl_df` has different behavior inside vs outside RStudio
It looks like this is a leftover workaround in tibble from RStudio 1.1: https://github.com/tidyverse/tibble/blob/490cd3774e32fee08cdb73e79c46488b11ee843e/R/tbl_df.R#L81
This results in significantly different behavior when running inside vs outside RStudio. Example (the first call is run inside RStudio):
tibble::tibble(x = 1, y = 2) |> unname() |> names()
#> NULL
Sys.unsetenv("RSTUDIO") # Make it look like we're running outside RStudio
tibble::tibble(x = 1, y = 2) |> unname() |> names()
#> [1] NA NA
For me, this caused some pretty unexpected behavior: I was getting different results when a test case was run via testthat (which will run some tests in an env without RSTUDIO set) than when I ran it interactively. It took quite a bit of effort to track down.
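One way to catch this kind of mismatch early is to evaluate the same expression with `RSTUDIO` set and with it unset, side by side. Below is a minimal base R sketch. `with_rstudio_env()` is a hypothetical helper (not part of tibble or testthat), and it assumes the workaround keys off the `RSTUDIO` environment variable, as the example above suggests.

```r
# Hypothetical helper: evaluate `code` with the RSTUDIO environment
# variable temporarily set to `value` (or unset when `value` is NULL),
# then restore whatever was there before.
with_rstudio_env <- function(value, code) {
  old <- Sys.getenv("RSTUDIO", unset = NA)
  on.exit(
    if (is.na(old)) Sys.unsetenv("RSTUDIO") else Sys.setenv(RSTUDIO = old),
    add = TRUE
  )
  if (is.null(value)) Sys.unsetenv("RSTUDIO") else Sys.setenv(RSTUDIO = value)
  force(code)  # `code` is lazy, so it is evaluated only after the variable changes
}

# Sanity check that the helper really toggles the variable.
inside  <- with_rstudio_env("1", Sys.getenv("RSTUDIO", unset = NA))
outside <- with_rstudio_env(NULL, Sys.getenv("RSTUDIO", unset = NA))

# With tibble installed, the two environments could then be compared, e.g.:
# with_rstudio_env("1",  names(unname(tibble::tibble(x = 1, y = 2))))
# with_rstudio_env(NULL, names(unname(tibble::tibble(x = 1, y = 2))))
```

The tibble calls are left commented out so the sketch runs without extra packages. If the two results differ, the code under test depends on the environment.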
I'm assuming RStudio v1.1 isn't supported anymore. Is the workaround still necessary in modern RStudio? Would you consider removing it to avoid this brittle behavior? If not, maybe we could add a warning when the workaround is triggered?
Note also that in the first case (with the workaround triggered), the result is an unprintable tibble:
tibble::tibble(x = 1, y = 2) |> unname() |> print()
#> Error in names[old] <- names(x)[j[old]] : replacement has length zero
Sys.unsetenv("RSTUDIO") # Make it look like we're running outside RStudio
tibble::tibble(x = 1, y = 2) |> unname() |> print()
#> # A tibble: 1 × 2
#>    ``    ``
#>   <dbl> <dbl>
#> 1     1     2
Thanks for filing this.
The workaround enables setting the names of a tibble to NULL, which isn't really well supported. The non-RStudio behavior is to assign NA to the names.
I suspect that disallowing this altogether is an uphill battle. Would you like to share your use case? Why is it important to be able to call unname() on a tibble?
The use case was around transposing a tibble into a list of row-lists, where I wanted each row to be unnamed. The following works as expected in RStudio:
Sys.setenv(RSTUDIO = "1")
tibble::tibble(
  foo = c(1, 2),
  bar = c(3, 4)
) |>
  unname() |>
  purrr::transpose()
#> [[1]]
#> [[1]][[1]]
#> [1] 1
#>
#> [[1]][[2]]
#> [1] 3
#>
#>
#> [[2]]
#> [[2]][[1]]
#> [1] 2
#>
#> [[2]][[2]]
#> [1] 4
But of course it results in lists with NA names outside of RStudio:
Sys.unsetenv("RSTUDIO")
tibble::tibble(
  foo = c(1, 2),
  bar = c(3, 4)
) |>
  unname() |>
  purrr::transpose()
#> [[1]]
#> [[1]]$<NA>
#> [1] 1
#>
#> [[1]]$<NA>
#> [1] 3
#>
#>
#> [[2]]
#> [[2]]$<NA>
#> [1] 2
#>
#> [[2]]$<NA>
#> [1] 4
I discovered this because it resulted in weird downstream behaviors that caused my code to break in unit tests via testthat. You can imagine my surprise when I discovered this "branch on RSTUDIO" condition that was causing the mismatch :P
It's easily fixable by converting to a list before unnaming:
tibble::tibble(
  foo = c(1, 2),
  bar = c(3, 4)
) |>
  as.list() |>
  unname() |>
  purrr::transpose()
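For anyone who wants to avoid calling `unname()` on a tibble entirely, the same idea can be sketched in base R. `rows_unnamed()` is a hypothetical helper, not something tibble or purrr provides. It uses a plain `data.frame` here so it runs without extra packages, though it should work the same way on a tibble since it converts to a plain list first.

```r
# Hypothetical helper: turn a data frame into a list of unnamed row-lists.
# Names are dropped from a plain list, never from the data frame itself,
# so no `names<-` method for data frame subclasses is involved.
rows_unnamed <- function(df) {
  cols <- unname(as.list(df))  # plain list of columns, names set to NULL
  lapply(seq_len(nrow(df)), function(i) {
    lapply(cols, function(col) col[[i]])  # i-th element of every column
  })
}

df <- data.frame(foo = c(1, 2), bar = c(3, 4))
rows <- rows_unnamed(df)
str(rows)
```

Because the names are removed after `as.list()`, the result should not depend on whether `RSTUDIO` is set.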
I realize this is pretty ingrained behavior by this point, but I think we need better guards here (at least a warning?) to prevent other folks from going down this rabbit hole like I did.