tune icon indicating copy to clipboard operation
tune copied to clipboard

select_by_*() silently ignores ordering selection if it is set as a string

Open EmilHvitfeldt opened this issue 3 years ago • 1 comments

If you set "K" instead of K, it creates and sorts by the constant string "K" effectively ignoring it. This should ideally throw an warning/error

library(tune)
data("example_ames_knn")

select_by_one_std_err(ames_grid_search, metric = "rmse", K)
#> # A tibble: 1 × 13
#>       K weigh…¹ dist_…²   lon   lat .metric .esti…³   mean     n std_err .config
#>   <int> <chr>     <dbl> <int> <int> <chr>   <chr>    <dbl> <int>   <dbl> <chr>  
#> 1     5 rank      0.411     2     7 rmse    standa… 0.0740    10 0.00328 Prepro…
#> # … with 2 more variables: .best <dbl>, .bound <dbl>, and abbreviated variable
#> #   names ¹​weight_func, ²​dist_power, ³​.estimator
#> # ℹ Use `colnames()` to see all variable names
select_by_one_std_err(ames_grid_search, metric = "rmse", "K")
#> # A tibble: 1 × 13
#>       K weigh…¹ dist_…²   lon   lat .metric .esti…³   mean     n std_err .config
#>   <int> <chr>     <dbl> <int> <int> <chr>   <chr>    <dbl> <int>   <dbl> <chr>  
#> 1    21 cos       0.626     1     4 rmse    standa… 0.0746    10 0.00359 Prepro…
#> # … with 2 more variables: .best <dbl>, .bound <dbl>, and abbreviated variable
#> #   names ¹​weight_func, ²​dist_power, ³​.estimator
#> # ℹ Use `colnames()` to see all variable names
select_by_one_std_err(ames_grid_search, metric = "rmse", 1)
#> # A tibble: 1 × 13
#>       K weigh…¹ dist_…²   lon   lat .metric .esti…³   mean     n std_err .config
#>   <int> <chr>     <dbl> <int> <int> <chr>   <chr>    <dbl> <int>   <dbl> <chr>  
#> 1    21 cos       0.626     1     4 rmse    standa… 0.0746    10 0.00359 Prepro…
#> # … with 2 more variables: .best <dbl>, .bound <dbl>, and abbreviated variable
#> #   names ¹​weight_func, ²​dist_power, ³​.estimator
#> # ℹ Use `colnames()` to see all variable names

Created on 2022-07-26 by the reprex package (v2.0.1)

EmilHvitfeldt avatar Jul 26 '22 19:07 EmilHvitfeldt

Since the ... are passed on to arrange() and I guess we want things like desc(K) to work, there isn't a whole lot we can do here.

arrange() is data-masking and mutate-like, so we can't make the ... of select_by_one_std_err() tidyselect-like like I did with yardstick's metric functions like in https://github.com/tidymodels/yardstick/pull/328

The only thing we could do is iterate over the dots we capture with dots <- rlang::enquos(...) and check that quo_is_symbol(quo) || quo_is_call(quo). If that isn't true, then the user probably provided a literal value like "K" or 1 and we should error there

DavisVaughan avatar Oct 27 '22 20:10 DavisVaughan