`tune::collect_metrics()` sorts metrics in a different order than `yardstick::metric_set()`

Open mikemahoney218 opened this issue 2 years ago • 1 comments

The problem

With current HEAD, tune::collect_metrics() appears to return metrics in alphabetical order, while a metric set created with yardstick::metric_set() returns metrics in the order they were initially provided.

This isn't a big deal, but is an inconsistency that I've needed to point out when teaching ("look, you get the same set of metrics from either function -- they're just sorted differently"). Would it make sense to standardize on one of these methods for ordering metrics?

Reproducible example

sim_data <- modeldata::sim_regression(1000)
sim_formula <- reformulate(
  grep("predictor", names(sim_data), value = TRUE), 
  "outcome"
)

sim_model <- lm(sim_formula, sim_data)
sim_data$predictions <- predict(sim_model, sim_data)

metrics <- yardstick::metric_set(
  yardstick::rmse, yardstick::mae, yardstick::huber_loss_pseudo
)
metrics(sim_data, truth = outcome, estimate = predictions)
#> # A tibble: 3 × 3
#>   .metric           .estimator .estimate
#>   <chr>             <chr>          <dbl>
#> 1 rmse              standard        18.6
#> 2 mae               standard        13.5
#> 3 huber_loss_pseudo standard        12.6

cv_folds <- rsample::vfold_cv(sim_data)
lm_spec <- parsnip::linear_reg()
lm_workflow <- workflows::workflow() |> 
  workflows::add_model(lm_spec) |> 
  workflows::add_formula(sim_formula)
lm_workflow |> 
  tune::fit_resamples(cv_folds, metrics = metrics) |> 
  tune::collect_metrics()
#> # A tibble: 3 × 6
#>   .metric           .estimator  mean     n std_err .config             
#>   <chr>             <chr>      <dbl> <int>   <dbl> <chr>               
#> 1 huber_loss_pseudo standard    13.0    10   0.479 Preprocessor1_Model1
#> 2 mae               standard    13.9    10   0.482 Preprocessor1_Model1
#> 3 rmse              standard    18.9    10   0.690 Preprocessor1_Model1

Created on 2023-10-12 with reprex v2.0.2
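
Until the ordering is standardized, one possible workaround is to reorder the collected metrics by hand. The following is only a sketch under stated assumptions: `collected` is a hypothetical stand-in for the `collect_metrics()` result (a plain data frame with values copied from the output above, so it runs without tidymodels), and `metric_order` is a hypothetical vector that repeats the order passed to `yardstick::metric_set()`.

```r
# Stand-in for the tibble returned by tune::collect_metrics() in the reprex
# (assumption: values copied from the printed output; a base data.frame is
# used here so the sketch has no package dependencies).
collected <- data.frame(
  .metric    = c("huber_loss_pseudo", "mae", "rmse"),
  .estimator = "standard",
  mean       = c(13.0, 13.9, 18.9),
  n          = 10L,
  std_err    = c(0.479, 0.482, 0.690),
  .config    = "Preprocessor1_Model1"
)

# Hypothetical helper vector: the order in which the metrics were
# originally supplied to yardstick::metric_set().
metric_order <- c("rmse", "mae", "huber_loss_pseudo")

# match() gives each row's position in metric_order;
# order() then sorts the rows by that position.
reordered <- collected[order(match(collected$.metric, metric_order)), ]
rownames(reordered) <- NULL

reordered$.metric
```

The same `order(match(...))` indexing should also work on the tibble returned by `collect_metrics()`, since it only relies on the `.metric` column. With several `.config` values, the rows would additionally need to be ordered within each configuration.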

Session info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.3.1 (2023-06-16)
#>  os       Ubuntu 22.04.3 LTS
#>  system   x86_64, linux-gnu
#>  ui       X11
#>  language (EN)
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       America/New_York
#>  date     2023-10-12
#>  pandoc   3.1.1 @ /usr/lib/rstudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package      * version    date (UTC) lib source
#>  class          7.3-22     2023-05-03 [1] CRAN (R 4.3.1)
#>  cli            3.6.1      2023-03-23 [1] CRAN (R 4.3.0)
#>  codetools      0.2-19     2023-02-01 [1] CRAN (R 4.3.0)
#>  colorspace     2.1-0      2023-01-23 [1] CRAN (R 4.3.0)
#>  data.table     1.14.8     2023-02-17 [1] CRAN (R 4.3.0)
#>  dials          1.2.0      2023-04-03 [1] CRAN (R 4.3.0)
#>  DiceDesign     1.9        2021-02-13 [1] CRAN (R 4.3.0)
#>  digest         0.6.33     2023-07-07 [1] CRAN (R 4.3.1)
#>  dplyr          1.1.2      2023-04-20 [1] CRAN (R 4.3.0)
#>  evaluate       0.22       2023-09-29 [1] CRAN (R 4.3.1)
#>  fansi          1.0.5      2023-10-08 [1] CRAN (R 4.3.1)
#>  fastmap        1.1.1      2023-02-24 [1] CRAN (R 4.3.0)
#>  foreach        1.5.2      2022-02-02 [1] CRAN (R 4.3.0)
#>  fs             1.6.3      2023-07-20 [1] CRAN (R 4.3.1)
#>  furrr          0.3.1      2022-08-15 [1] CRAN (R 4.3.0)
#>  future         1.33.0     2023-07-01 [1] CRAN (R 4.3.1)
#>  future.apply   1.11.0     2023-05-21 [1] CRAN (R 4.3.0)
#>  generics       0.1.3      2022-07-05 [1] CRAN (R 4.3.0)
#>  ggplot2        3.4.3      2023-08-14 [1] CRAN (R 4.3.1)
#>  globals        0.16.2     2022-11-21 [1] CRAN (R 4.3.0)
#>  glue           1.6.2      2022-02-24 [1] CRAN (R 4.3.0)
#>  gower          1.0.1      2022-12-22 [1] CRAN (R 4.3.0)
#>  GPfit          1.0-8      2019-02-08 [1] CRAN (R 4.3.0)
#>  gtable         0.3.4      2023-08-21 [1] CRAN (R 4.3.1)
#>  hardhat        1.3.0.9000 2023-06-01 [1] Github (tidymodels/hardhat@ac2dfd0)
#>  htmltools      0.5.6.1    2023-10-06 [1] CRAN (R 4.3.1)
#>  ipred          0.9-14     2023-03-09 [1] CRAN (R 4.3.0)
#>  iterators      1.0.14     2022-02-05 [1] CRAN (R 4.3.0)
#>  knitr          1.44       2023-09-11 [1] CRAN (R 4.3.1)
#>  lattice        0.21-9     2023-10-01 [1] CRAN (R 4.3.1)
#>  lava           1.7.2.1    2023-02-27 [1] CRAN (R 4.3.0)
#>  lhs            1.1.6      2022-12-17 [1] CRAN (R 4.3.0)
#>  lifecycle      1.0.3      2022-10-07 [1] CRAN (R 4.3.0)
#>  listenv        0.9.0      2022-12-16 [1] CRAN (R 4.3.0)
#>  lubridate      1.9.2      2023-02-10 [1] CRAN (R 4.3.0)
#>  magrittr       2.0.3      2022-03-30 [1] CRAN (R 4.3.0)
#>  MASS           7.3-60     2023-05-04 [1] CRAN (R 4.3.1)
#>  Matrix         1.6-1.1    2023-09-18 [1] CRAN (R 4.3.1)
#>  modeldata      1.1.0      2023-01-25 [1] CRAN (R 4.3.0)
#>  modelenv       0.1.1      2023-03-08 [1] CRAN (R 4.3.0)
#>  munsell        0.5.0      2018-06-12 [1] CRAN (R 4.3.0)
#>  nnet           7.3-19     2023-05-03 [1] CRAN (R 4.3.1)
#>  parallelly     1.36.0     2023-05-26 [1] CRAN (R 4.3.0)
#>  parsnip      * 1.1.0.9002 2023-06-01 [1] Github (tidymodels/parsnip@145bac2)
#>  pillar         1.9.0      2023-03-22 [1] CRAN (R 4.3.0)
#>  pkgconfig      2.0.3      2019-09-22 [1] CRAN (R 4.3.0)
#>  prodlim        2023.03.31 2023-04-02 [1] CRAN (R 4.3.0)
#>  purrr          1.0.1      2023-01-10 [1] CRAN (R 4.3.0)
#>  R.cache        0.16.0     2022-07-21 [1] CRAN (R 4.3.0)
#>  R.methodsS3    1.8.2      2022-06-13 [1] CRAN (R 4.3.0)
#>  R.oo           1.25.0     2022-06-12 [1] CRAN (R 4.3.0)
#>  R.utils        2.12.2     2022-11-11 [1] CRAN (R 4.3.0)
#>  R6             2.5.1      2021-08-19 [1] CRAN (R 4.3.0)
#>  Rcpp           1.0.11     2023-07-06 [1] CRAN (R 4.3.1)
#>  recipes        1.0.6.9000 2023-06-01 [1] Github (tidymodels/recipes@a5e53e2)
#>  reprex         2.0.2      2022-08-17 [1] CRAN (R 4.3.0)
#>  rlang          1.1.1      2023-04-28 [1] CRAN (R 4.3.0)
#>  rmarkdown      2.23       2023-07-01 [1] CRAN (R 4.3.1)
#>  rpart          4.1.19     2022-10-21 [1] CRAN (R 4.3.0)
#>  rsample        1.2.0.9000 2023-10-12 [1] Github (tidymodels/rsample@be593b9)
#>  rstudioapi     0.15.0     2023-07-07 [1] CRAN (R 4.3.1)
#>  scales         1.2.1      2022-08-20 [1] CRAN (R 4.3.0)
#>  sessioninfo    1.2.2      2021-12-06 [1] CRAN (R 4.3.0)
#>  styler         1.10.1     2023-06-05 [1] CRAN (R 4.3.1)
#>  survival       3.5-5      2023-03-12 [1] CRAN (R 4.3.0)
#>  tibble         3.2.1      2023-03-20 [1] CRAN (R 4.3.0)
#>  tidyr          1.3.0      2023-01-24 [1] CRAN (R 4.3.0)
#>  tidyselect     1.2.0      2022-10-10 [1] CRAN (R 4.3.0)
#>  timechange     0.2.0      2023-01-11 [1] CRAN (R 4.3.0)
#>  timeDate       4022.108   2023-01-07 [1] CRAN (R 4.3.0)
#>  tune           1.1.2.9000 2023-10-12 [1] Github (tidymodels/tune@4e34edc)
#>  utf8           1.2.3      2023-01-31 [1] CRAN (R 4.3.0)
#>  vctrs          0.6.3      2023-06-14 [1] CRAN (R 4.3.0)
#>  withr          2.5.1      2023-09-26 [1] CRAN (R 4.3.1)
#>  workflows      1.1.3.9000 2023-06-01 [1] Github (tidymodels/workflows@882b06b)
#>  xfun           0.40       2023-08-09 [1] CRAN (R 4.3.1)
#>  yaml           2.3.7      2023-01-23 [1] CRAN (R 4.3.0)
#>  yardstick      1.2.0.9001 2023-10-12 [1] Github (tidymodels/yardstick@6c0b76f)
#> 
#>  [1] /home/mikemahoney218/R/x86_64-pc-linux-gnu-library/4.3
#>  [2] /usr/local/lib/R/site-library
#>  [3] /usr/lib/R/site-library
#>  [4] /usr/lib/R/library
#> 
#> ──────────────────────────────────────────────────────────────────────────────

mikemahoney218, Oct 12 '23 13:10

Heard. I agree that we ought to look into this!

Just wanted to add here that this is also the case with CRAN tune, so this wasn't an unintended side effect of https://github.com/tidymodels/tune/pull/730.

simonpcouch, Oct 23 '23 12:10