
In `select` `!` and `-` work differently

Open Moohan opened this issue 1 year ago • 1 comments

I'm not sure if this is intended behaviour but it came up in a recent training session and wasn't what we expected.

`!` and `-` seem interchangeable when used with `c(...)`, but when they prefix individual variables passed to `select()` through `...`, `!` doesn't work. Hopefully the reprex below makes this clear.


library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

# Works
starwars |> 
  select(-name, -height)
#> # A tibble: 87 × 12
#>     mass hair_color    skin_color  eye_color birth_year sex    gender  homeworld
#>    <dbl> <chr>         <chr>       <chr>          <dbl> <chr>  <chr>   <chr>    
#>  1    77 blond         fair        blue            19   male   mascul… Tatooine 
#>  2    75 <NA>          gold        yellow         112   none   mascul… Tatooine 
#>  3    32 <NA>          white, blue red             33   none   mascul… Naboo    
#>  4   136 none          white       yellow          41.9 male   mascul… Tatooine 
#>  5    49 brown         light       brown           19   female femini… Alderaan 
#>  6   120 brown, grey   light       blue            52   male   mascul… Tatooine 
#>  7    75 brown         light       blue            47   female femini… Tatooine 
#>  8    32 <NA>          white, red  red             NA   none   mascul… Tatooine 
#>  9    84 black         light       brown           24   male   mascul… Tatooine 
#> 10    77 auburn, white fair        blue-gray       57   male   mascul… Stewjon  
#> # ℹ 77 more rows
#> # ℹ 4 more variables: species <chr>, films <list>, vehicles <list>,
#> #   starships <list>

# Doesn't work
starwars |> 
  select(!name, !height)
#> # A tibble: 87 × 14
#>    height  mass hair_color    skin_color  eye_color birth_year sex    gender   
#>     <int> <dbl> <chr>         <chr>       <chr>          <dbl> <chr>  <chr>    
#>  1    172    77 blond         fair        blue            19   male   masculine
#>  2    167    75 <NA>          gold        yellow         112   none   masculine
#>  3     96    32 <NA>          white, blue red             33   none   masculine
#>  4    202   136 none          white       yellow          41.9 male   masculine
#>  5    150    49 brown         light       brown           19   female feminine 
#>  6    178   120 brown, grey   light       blue            52   male   masculine
#>  7    165    75 brown         light       blue            47   female feminine 
#>  8     97    32 <NA>          white, red  red             NA   none   masculine
#>  9    183    84 black         light       brown           24   male   masculine
#> 10    182    77 auburn, white fair        blue-gray       57   male   masculine
#> # ℹ 77 more rows
#> # ℹ 6 more variables: homeworld <chr>, species <chr>, films <list>,
#> #   vehicles <list>, starships <list>, name <chr>

# Works
starwars |> 
  select(-c(name, height))
#> # A tibble: 87 × 12
#>     mass hair_color    skin_color  eye_color birth_year sex    gender  homeworld
#>    <dbl> <chr>         <chr>       <chr>          <dbl> <chr>  <chr>   <chr>    
#>  1    77 blond         fair        blue            19   male   mascul… Tatooine 
#>  2    75 <NA>          gold        yellow         112   none   mascul… Tatooine 
#>  3    32 <NA>          white, blue red             33   none   mascul… Naboo    
#>  4   136 none          white       yellow          41.9 male   mascul… Tatooine 
#>  5    49 brown         light       brown           19   female femini… Alderaan 
#>  6   120 brown, grey   light       blue            52   male   mascul… Tatooine 
#>  7    75 brown         light       blue            47   female femini… Tatooine 
#>  8    32 <NA>          white, red  red             NA   none   mascul… Tatooine 
#>  9    84 black         light       brown           24   male   mascul… Tatooine 
#> 10    77 auburn, white fair        blue-gray       57   male   mascul… Stewjon  
#> # ℹ 77 more rows
#> # ℹ 4 more variables: species <chr>, films <list>, vehicles <list>,
#> #   starships <list>

# Works
starwars |> 
  select(!c(name, height))
#> # A tibble: 87 × 12
#>     mass hair_color    skin_color  eye_color birth_year sex    gender  homeworld
#>    <dbl> <chr>         <chr>       <chr>          <dbl> <chr>  <chr>   <chr>    
#>  1    77 blond         fair        blue            19   male   mascul… Tatooine 
#>  2    75 <NA>          gold        yellow         112   none   mascul… Tatooine 
#>  3    32 <NA>          white, blue red             33   none   mascul… Naboo    
#>  4   136 none          white       yellow          41.9 male   mascul… Tatooine 
#>  5    49 brown         light       brown           19   female femini… Alderaan 
#>  6   120 brown, grey   light       blue            52   male   mascul… Tatooine 
#>  7    75 brown         light       blue            47   female femini… Tatooine 
#>  8    32 <NA>          white, red  red             NA   none   mascul… Tatooine 
#>  9    84 black         light       brown           24   male   mascul… Tatooine 
#> 10    77 auburn, white fair        blue-gray       57   male   mascul… Stewjon  
#> # ℹ 77 more rows
#> # ℹ 4 more variables: species <chr>, films <list>, vehicles <list>,
#> #   starships <list>

Moohan, Apr 29 '24 11:04

This is not a bug; it follows from the way tidyselect works: https://tidyselect.r-lib.org/articles/syntax.html#bare-names

starwars |> 
  select(!name, !height)

When you pass multiple expressions to `select()` through `...`, each one is evaluated separately and the results are then combined with `c()`, which takes their union. So in this case, `!name` selects every column except `name`, and `!height` selects every column except `height`. After combining the two, you have every column again, with `name` moved to the end.
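As a rough sketch of the point above (not taken from the original reprex, and assuming a recent dplyr/tidyselect where the `|` and `&` selection operators are available), the union can be written out explicitly and compared with the forms that actually drop both columns. The object names are only illustrative.

```r
library(dplyr)

# Separate `!` arguments are unioned, so this should match the explicit `|` form.
separate <- starwars |> select(!name, !height)
unioned  <- starwars |> select(!name | !height)

# To drop both columns with `!`, intersect the complements or negate the set.
intersected <- starwars |> select(!name & !height)
negated_set <- starwars |> select(!c(name, height))

# The `-` form from the reprex, for comparison.
minus_style <- starwars |> select(-name, -height)

ncol(separate)     # 14, as in the "Doesn't work" output
ncol(intersected)  # 12, as in the "Works" outputs
```

If you want `!` rather than `-`, `!c(name, height)` is probably the most readable choice, since it states the set being excluded in one place.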

kylebutts, May 19 '24 00:05