Use racing methods to tune xgboost models and predict home runs | Julia Silge
Models like xgboost have many tuning hyperparameters, but racing methods can help identify parameter combinations that are not performing well.
I thought my computer was fast but tune_race_anova() showed me otherwise.
Hi Julia,
When I run the tune_race_anova() function I get the following error:
Creating pre-processing data to finalize unknown parameter: mtry
Racing will minimize the mn_log_loss metric.
Resamples are analyzed in a random order.
Error: There were no valid metrics for the ANOVA model.
Run `rlang::last_error()` to see where the error occurred.
In addition: Warning message:
All models failed. See the `.notes` column.
What am I doing wrong? I've followed the tutorial step by step so far, so I suspect there is an issue with dependencies here.
@JunaidMB Hmmmmm, there are two things that come to mind: I know I was using the development version of dials from GitHub and there was a very recent version of finetune released to CRAN. I'd check to make sure you have both of those installed. I really have got to start adding session info to my blog posts. 😬
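For example, a quick way to check what you have installed and update from CRAN (a sketch; the exact versions needed aren't pinned here):
packageVersion("dials")
packageVersion("finetune")
install.packages(c("dials", "finetune"))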
Hi Julia, it's a very useful tutorial. However, I wanted to point out that you've missed a "scales::" in the second code chunk, just before "percent" in the fourth line. :)
Hi @julia and @JunaidMB,
I also experienced the exact same error in my workflow set tuning, and I don't understand why.
wflwset_setup <- workflow_set(
  preproc = list(
    normalized = recipe_normal,
    rm_corr = recipe_corr,
    rm_unbalan = recipe_remove,
    impute_mean = recipe_impute_mean,
    impute_knn = recipe_impute_knn
  ),
  models = list(
    lm = lm_model.wf,
    glm = glm_model.wf,
    spline = spline_model.wf,
    knn = knn_model.wf,
    svm = svm_model.wf,
    RF = rf_model.wf,
    XGB = xgb_model.wf,
    CatB = catboost_model.wf
  ),
  cross = TRUE
)
set.seed(579)
if (exists("wflwset_tune_results_cv")) rm("wflwset_tune_results_cv")
# Initializing parallel processing
doParallel::registerDoParallel()
# Workflowset tuning
wflwset_tune_results_cv <- wflwset_setup %>%
  workflowsets::workflow_map(
    fn = "tune_race_anova",
    resamples = cv.fold.wf,
    grid = 15,
    metrics = multi.metric.wf,
    verbose = TRUE
  )
# Terminating parallel session
doParallel::stopImplicitCluster()
i No tuning parameters. `fit_resamples()` will be attempted
i 1 of 35 resampling: normalized_lm
Warning: All models failed. See the `.notes` column.
x 1 of 35 resampling: normalized_lm failed with preprocessor 1/1, model 1/1: Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...): 0 (non-NA) cases
i 2 of 35 tuning: normalized_glm
Warning: All models failed. See the `.notes` column.
x 2 of 35 tuning: normalized_glm failed with: There were no valid metrics for the ANOVA model.
i No tuning parameters. `fit_resamples()` will be attempted
i 3 of 35 resampling: normalized_knn
Warning: All models failed. See the `.notes` column.
x 3 of 35 resampling: normalized_knn failed with preprocessor 1/1, model 1/1: Error in best[1, 2]: subscript out of bounds
i No tuning parameters. `fit_resamples()` will be attempted
i 4 of 35 resampling: normalized_svm
Warning: All models failed. See the `.notes` column.
x 4 of 35 resampling: normalized_svm failed with preprocessor 1/1, model 1/1: Error in if (any(co)) {: missing value where TRUE/FALSE needed
i 5 of 35 tuning: normalized_RF
i Creating pre-processing data to finalize unknown parameter: mtry
@kamaulindhardt It looks like your models are failing to fit in the first place (which is why you can't then do an ANOVA model on the results). I would try fitting some of those workflows individually outside of the workflow set, to debug which one is the problem and why.
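For example, something like this (a sketch using the names from your code; your_training_data is a placeholder for your training set):
# Pull a single workflow out of the set and fit it on its own to
# surface the underlying error:
library(workflowsets)
wf_lm <- extract_workflow(wflwset_setup, id = "normalized_lm")
fit(wf_lm, data = your_training_data)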
Thank you @juliasilge,
I am trying to fit the individual models separately, but I find it difficult to interpret the errors. For example, with my knn model:
"Error: Problem with `mutate()` column `.row`. ℹ `.row = orig_rows`. ℹ `.row` must be size 37 or 1, not 40."
What does that mean? I cannot find information online.
From the recipe:
base_recipe <-
  recipe(formula = logRR ~ ., data = af.train.wf) %>%
  update_role(Latitude,
              Longitude,
              new_role = "sample ID") %>%
  step_zv(all_predictors(), skip = TRUE) %>%             # remove any columns with a single unique value
  step_normalize(all_numeric_predictors(), skip = TRUE)  # normalize numeric data: mean of zero, standard deviation of one

filter_recipe <-
  base_recipe %>%
  step_corr(all_numeric_predictors(), threshold = 0.8, skip = TRUE)
Model spec
knn_spec <-
  nearest_neighbor(neighbors = tune(),
                   weight_func = tune()) %>%
  set_engine("kknn") %>%
  set_mode("regression")
Model tuning with tune_grid()
knn_fit <- tune_grid(knn_spec,
                     preprocessor = filter_recipe,
                     resamples = cv.fold.wf,
                     metrics = multi.metric.wf)
knn_fit
Error(s):
Warning: This tuning result has notes. Example notes on model fitting include:
preprocessor 1/1, model 5/10 (predictions): Error: Problem with `mutate()` column `.row`.
ℹ `.row = orig_rows`.
ℹ `.row` must be size 37 or 1, not 40.
preprocessor 1/1, model 1/10 (predictions): Error: Problem with `mutate()` column `.row`.
ℹ `.row = orig_rows`.
ℹ `.row` must be size 37 or 1, not 40.
preprocessor 1/1, model 2/10 (predictions): Error: Problem with `mutate()` column `.row`.
ℹ `.row = orig_rows`.
ℹ `.row` must be size 39 or 1, not 40.
# Tuning results
# 10-fold cross-validation
It's hard to say without a reprex, but I am guessing your problem is using skip = TRUE here, where you are not applying some steps to new data. You can check out this discussion of what skipping steps for new data means.
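As a minimal sketch of that behavior (using mtcars as a stand-in dataset): a step with skip = TRUE is applied when the recipe is prepped on the training data, but not when baking new data:
library(recipes)
rec <- recipe(mpg ~ ., data = mtcars) %>%
  step_normalize(all_numeric_predictors(), skip = TRUE) %>%
  prep()

bake(rec, new_data = NULL)    # training data: the skipped step WAS applied
bake(rec, new_data = mtcars)  # "new" data: the skipped step is NOT applied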
I now added an imputation step, step_impute_mean(all_predictors()), in the recipe, and that seems to work:
base_recipe <-
recipe(formula = logRR ~ ., data = af.train.wf) %>%
step_impute_mean(all_predictors())
update_role(Latitude,
Longitude,
new_role = "sample ID") %>%
step_zv(all_predictors(), skip = TRUE) %>% # remove any columns with a single unique value
step_normalize(all_numeric_predictors(), skip = TRUE) # normalize numeric data: standard deviation of one and a mean of zero.
filter_recipe <-
base_recipe %>%
step_corr(all_numeric_predictors(), threshold = 0.8, skip = TRUE)
How come the random forest and kNN models cannot cope with missing values? I thought at least RF was designed for dealing with missing values. On the other hand, my XGBoost models don't seem to be bothered(?)
Thank you!
@kamaulindhardt Again, it's hard to say without a reprex, but now it looks to me like you aren't using anything past step_impute_mean() because you don't have a %>% at the end of that line. This model is probably succeeding because you are no longer trying to use the skip = TRUE steps; using skip = TRUE for steps like step_normalize() is a pretty bad idea. I suggest reading through the sections I linked above to understand what skipping steps for new data means.
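For illustration, here is a sketch of that recipe with the missing %>% restored so every step is actually chained, and without skip = TRUE (update_role() moved first so Latitude/Longitude are no longer predictors when imputing):
base_recipe <-
  recipe(formula = logRR ~ ., data = af.train.wf) %>%
  update_role(Latitude, Longitude, new_role = "sample ID") %>%
  step_impute_mean(all_predictors()) %>%
  step_zv(all_predictors()) %>%
  step_normalize(all_numeric_predictors())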
I also recommend creating a small, self-contained reproducible example to ask for help. Truly, people are just guessing if you don't do this. I know that creating a reprex can feel like a lot of work, but we have found that it is really the only way for someone who needs help online to reliably get the right answer. If you ask a question online without a reprex, think of yourself as just blindly flailing in the dark; when you ask a question with a reprex that demonstrates your problem, think of yourself as having given people the tools to help you.
Hi Julia, I would like to know how to unfold the folds created with vfold_cv(), to better inspect which samples are in each fold. Thanks
@data-datum You might find it helpful to use the tidy() method, or to check out this article on handling rset objects for examples on how to call analysis(). Or you can manually get the indices out; they are in in_id:
library(tidyverse)
library(rsample)
car_folds <- vfold_cv(mtcars, v = 3)
map(car_folds$splits, "in_id")
#> [[1]]
#> [1] 1 2 3 5 9 11 12 14 15 16 17 18 21 22 23 24 25 26 27 31 32
#>
#> [[2]]
#> [1] 1 2 4 6 7 8 9 10 11 12 13 14 17 19 20 22 23 28 29 30 32
#>
#> [[3]]
#> [1] 3 4 5 6 7 8 10 13 15 16 18 19 20 21 24 25 26 27 28 29 30 31
Created on 2021-10-28 by the reprex package (v2.0.1)
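Continuing that sketch: tidy() gives a long data frame of fold membership, and analysis()/assessment() return the actual rows of a single split:
tidy(car_folds)                    # which row is in which fold
analysis(car_folds$splits[[1]])    # training (analysis) rows for fold 1
assessment(car_folds$splits[[1]])  # held-out (assessment) rows for fold 1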
I have the same issue when using racing to tune a few models:
race_ctrl <-
  control_race(
    save_pred = TRUE,
    parallel_over = "everything",
    save_workflow = TRUE,
    verbose = TRUE,
    pkgs = c("stringr")
  )

race_results_time <-
  system.time(
    race_results <-
      all_workflows %>%
      workflow_map(
        "tune_race_anova",
        seed = 1503,
        resamples = vfolds,
        grid = 25,
        verbose = TRUE,
        control = race_ctrl
      )
  )
i 1 of 8 tuning: pca_norm_recipe_RF
i Creating pre-processing data to finalize unknown parameter: mtry
*** recursive gc invocation
Warning: stack imbalance in 'lapply', 154 then 152
x 1 of 8 tuning: pca_norm_recipe_RF failed with: There were no valid metrics for the ANOVA model.
i 2 of 8 tuning: pca_norm_recipe_boosting
It is only successful when I switch from racing to the standard tune_grid():
grid_ctrl <-
  control_grid(
    save_pred = TRUE,
    parallel_over = "everything",
    save_workflow = TRUE,
    pkgs = c("stringr")
  )

full_results_time <-
  system.time(
    grid_results <-
      all_workflows %>%
      workflow_map(
        seed = 1503,
        resamples = vfolds,
        grid = 25,
        control = grid_ctrl,
        verbose = TRUE
      )
  )
i 1 of 8 tuning: pca_norm_recipe_RF
i Creating pre-processing data to finalize unknown parameter: mtry
v 1 of 8 tuning: pca_norm_recipe_RF (21m 29.6s)
i 2 of 8 tuning: pca_norm_recipe_boosting
Wow @tsengj I have not seen a garbage collection error from these functions. Can you create a reprex (a minimal reproducible example) for this and post it on the finetune repo? The goal of a reprex is to make it easier for us to recreate your problem so that we can understand it and/or fix it.
If you've never heard of a reprex before, you may want to start with the tidyverse.org help page. You may already have reprex installed (it comes with the tidyverse package), but if not you can install it with:
install.packages("reprex")
Thanks! 🙌
@juliasilge It turns out that removing the line pkgs = c('stringr') from control_race() fixed the error above. The stringr package was used in a simple step_mutate() recipe step, postcode = as.numeric(str_sub(suburb, -4, -1)). Excluding that from the recipe resolved the issue. I haven't had the opportunity to raise a reprex in the finetune repo. It doesn't appear as though finetune supports loading packages yet. I use parallel processing (doParallel).
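For reference, an untested sketch of the same transformation in base R, which would avoid needing to ship stringr to the parallel workers at all (the suburb value here is hypothetical):
suburb <- "Carlton 3053"                                      # hypothetical value
as.numeric(substr(suburb, nchar(suburb) - 3, nchar(suburb)))  # 3053, same as str_sub(suburb, -4, -1)
# i.e. inside the recipe:
# step_mutate(postcode = as.numeric(substr(suburb, nchar(suburb) - 3, nchar(suburb))))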
Hi Julia,
Thank you for your valued contributions!
When I run the tune_race_anova() function on a workflow containing an XGBoost model, I also get the following error: min_preproc_xgboost failed with: There were no valid metrics for the ANOVA model. All the other models are OK. I've been able to run XGBoost on the same machine using the approach below, and it worked fine then.
I have a hard time debugging this one, do you have any ideas at what might cause this error?
I've made a reprex using the diamonds dataset and session info (hope it's done correctly as this is my first reprex).
Any help is much appreciated.
library(tidymodels)
#> Registered S3 method overwritten by 'tune':
#> method from
#> required_pkgs.model_spec parsnip
library(tidyverse)
library(here)
#> here() starts at /private/var/folders/pw/540tsbnx2r3gtmk605nm1fsc0000gn/T/RtmpTkNsqu/reprex-381939e26d6f-sand-viper
library(baguette)
library(rules)
#>
#> Attaching package: 'rules'
#> The following object is masked from 'package:dials':
#>
#> max_rules
library(finetune)
library(dials)
options(tidymodels.dark = TRUE)
doParallel::registerDoParallel()
carat <- diamonds %>%
select(price, cut, carat, clarity)
## Build models
set.seed(123)
carat_split <- initial_split(carat, strata = price)
carat_train <- training(carat_split)
carat_test <- testing(carat_split)
set.seed(234)
carat_folds <- vfold_cv(carat_train, strata = price)
carat_folds
#> # 10-fold cross-validation using stratification
#> # A tibble: 10 × 2
#> splits id
#> <list> <chr>
#> 1 <split [36405/4048]> Fold01
#> 2 <split [36406/4047]> Fold02
#> 3 <split [36407/4046]> Fold03
#> 4 <split [36408/4045]> Fold04
#> 5 <split [36408/4045]> Fold05
#> 6 <split [36408/4045]> Fold06
#> 7 <split [36408/4045]> Fold07
#> 8 <split [36409/4044]> Fold08
#> 9 <split [36409/4044]> Fold09
#> 10 <split [36409/4044]> Fold10
ranger_spec <-
  rand_forest(trees = 1e3, min_n = tune(), mtry = tune()) %>%
  set_engine("ranger") %>%
  set_mode("regression")

xgb_spec <-
  boost_tree(tree_depth = tune(), learn_rate = tune(), loss_reduction = tune(),
             min_n = tune(), sample_size = tune(), trees = tune()) %>%
  set_engine("xgboost") %>%
  set_mode("regression")

cubist_spec <-
  cubist_rules(committees = tune(), neighbors = tune()) %>%
  set_engine("Cubist") %>%
  set_mode("regression")

base_rec <-
  recipe(formula = price ~ carat + cut + clarity,
         data = carat_train) %>%
  step_string2factor(cut, clarity)

min_pre_proc <-
  workflow_set(
    preproc = list(min_preproc = base_rec),
    models = list(RF = ranger_spec, xgboost = xgb_spec, Cubist = cubist_spec)
  )
## Evaluate models
race_ctrl <-
  control_race(
    save_pred = TRUE,
    parallel_over = "everything",
    save_workflow = TRUE
  )

race_results_carat <-
  min_pre_proc %>%
  workflow_map("tune_race_anova",
               seed = 1503,
               resamples = carat_folds,
               grid = 25,
               control = race_ctrl,
               verbose = TRUE)
#> i 1 of 3 tuning: min_preproc_RF
#> i Creating pre-processing data to finalize unknown parameter: mtry
#> ✓ 1 of 3 tuning: min_preproc_RF (4m 25.9s)
#> i 2 of 3 tuning: min_preproc_xgboost
#> Warning: All models failed. See the `.notes` column.
#> x 2 of 3 tuning: min_preproc_xgboost failed with: There were no valid metrics for the ANOVA model.
#> i 3 of 3 tuning: min_preproc_Cubist
#> ✓ 3 of 3 tuning: min_preproc_Cubist (3m 46.4s)
Created on 2022-01-31 by the reprex package (v2.0.1)
Session info
sessionInfo()
#> R version 4.1.2 (2021-11-01)
#> Platform: aarch64-apple-darwin20 (64-bit)
#> Running under: macOS Monterey 12.1
#>
#> Matrix products: default
#> BLAS: /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/lib/libRblas.0.dylib
#> LAPACK: /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/lib/libRlapack.dylib
#>
#> locale:
#> [1] nl_BE.UTF-8/nl_BE.UTF-8/nl_BE.UTF-8/C/nl_BE.UTF-8/nl_BE.UTF-8
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] Cubist_0.3.0 lattice_0.20-44 xgboost_1.5.0.2 ranger_0.13.1
#> [5] vctrs_0.3.8 rlang_0.4.12 finetune_0.1.0 rules_0.1.2
#> [9] baguette_0.1.1 here_1.0.1 forcats_0.5.1 stringr_1.4.0
#> [13] readr_2.1.1 tidyverse_1.3.1 yardstick_0.0.9 workflowsets_0.1.0
#> [17] workflows_0.2.4 tune_0.1.6 tidyr_1.1.4 tibble_3.1.6
#> [21] rsample_0.1.1 recipes_0.1.17 purrr_0.3.4 parsnip_0.1.7
#> [25] modeldata_0.1.1 infer_1.0.0 ggplot2_3.3.5 dplyr_1.0.7
#> [29] dials_0.0.10 scales_1.1.1 broom_0.7.11 tidymodels_0.1.4
#>
#> loaded via a namespace (and not attached):
#> [1] minqa_1.2.4 colorspace_2.0-2 ellipsis_0.3.2 class_7.3-19
#> [5] rprojroot_2.0.2 fs_1.5.2 rstudioapi_0.13 listenv_0.8.0
#> [9] furrr_0.2.3 earth_5.3.1 mvtnorm_1.1-3 prodlim_2019.11.13
#> [13] fansi_1.0.2 lubridate_1.8.0 xml2_1.3.3 codetools_0.2-18
#> [17] splines_4.1.2 doParallel_1.0.16 libcoin_1.0-9 knitr_1.37
#> [21] Formula_1.2-4 jsonlite_1.7.3 nloptr_1.2.2.3 pROC_1.18.0
#> [25] dbplyr_2.1.1 compiler_4.1.2 httr_1.4.2 backports_1.4.1
#> [29] assertthat_0.2.1 Matrix_1.3-4 fastmap_1.1.0 cli_3.1.1
#> [33] prettyunits_1.1.1 htmltools_0.5.2 tools_4.1.2 partykit_1.2-15
#> [37] gtable_0.3.0 glue_1.6.0 reshape2_1.4.4 Rcpp_1.0.8
#> [41] cellranger_1.1.0 DiceDesign_1.9 nlme_3.1-152 iterators_1.0.13
#> [45] inum_1.0-4 timeDate_3043.102 gower_0.2.2 xfun_0.29
#> [49] globals_0.14.0 lme4_1.1-27.1 rvest_1.0.2 lifecycle_1.0.1
#> [53] future_1.23.0 MASS_7.3-54 ipred_0.9-12 hms_1.1.1
#> [57] parallel_4.1.2 yaml_2.2.1 C50_0.1.5 TeachingDemos_2.12
#> [61] rpart_4.1-15 stringi_1.7.6 highr_0.9 plotrix_3.8-2
#> [65] foreach_1.5.1 lhs_1.1.3 boot_1.3-28 hardhat_0.1.6
#> [69] lava_1.6.10 pkgconfig_2.0.3 evaluate_0.14 tidyselect_1.1.1
#> [73] parallelly_1.30.0 plyr_1.8.6 magrittr_2.0.1 R6_2.5.1
#> [77] generics_0.1.1 DBI_1.1.2 pillar_1.6.4 haven_2.4.3
#> [81] withr_2.4.3 survival_3.2-13 nnet_7.3-16 future.apply_1.8.1
#> [85] modelr_0.1.8 crayon_1.4.2 utf8_1.2.2 tzdb_0.2.0
#> [89] rmarkdown_2.11 grid_4.1.2 readxl_1.3.1 data.table_1.14.2
#> [93] plotmo_3.6.1 reprex_2.0.1 digest_0.6.29 GPfit_1.0-8
#> [97] munsell_0.5.0
@wdkeyzer xgboost models require all-numeric predictors; they can't handle factor predictors like diamonds$clarity or diamonds$cut. You can check out this appendix for more info on the preprocessing needed for different models.
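For example, a minimal sketch of one fix is to add a step that makes dummy variables out of the nominal predictors, so xgboost only sees numeric columns:
base_rec <-
  recipe(price ~ carat + cut + clarity, data = carat_train) %>%
  step_dummy(all_nominal_predictors())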
Also, if you ever run into trouble with a workflow set like this, I recommend trying to just plain fit the workflow on your training data, or to use tune_grid(). You will likely get a better understanding of where the problems are.
Thank you @juliasilge for your help! I've come across the appendix before but didn't think about that. Regarding plain fit and tune_grid(), that's a pro tip that should improve my problem solving in the future. Thank you for pointing this out.
Hi Julia, in the section where you describe "Let’s use last_fit() to fit one final time to the training data and evaluate one final time on the testing data": what in the code demonstrates that the model is being used on the test set? For example:
collect_predictions(xgb_last) %>% mn_log_loss(is_home_run, .pred_HR)
@pspangler1 It's this code, where we use last_fit():
xgb_last <- xgb_wf %>%
  finalize_workflow(select_best(xgb_rs, "mn_log_loss")) %>%
  last_fit(bb_split)
If you look at the number of predictions coming out of collect_predictions(xgb_last), you'll notice it is the number of observations in the test set.
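A quick way to see this (a sketch using the objects from the post):
nrow(collect_predictions(xgb_last))  # predictions from last_fit()
nrow(testing(bb_split))              # observations in the test set: same number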
Is there a way to also get predictions for the training set?
@cseibold47 We recommend against repredicting the training set for most typical use cases, but you can use predict() with a fitted model on any data, which could include the training set.
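For example (a sketch, assuming a recent tune version where extract_workflow() works on last_fit() results, and bb_train being training(bb_split) as in the post):
fitted_wf <- extract_workflow(xgb_last)  # fitted workflow from last_fit()
predict(fitted_wf, new_data = bb_train)  # repredict the training set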
Hi Julia, would you be able to tell me, in the tune_race_anova() step: I know you say it's doing ANOVA to determine which parameter combinations aren't likely to be winners, but is it comparing the models using roc_auc or mn_log_loss?
@jtag04 You can read more about this in the docs, but the default is to use the first entry in the default metrics() for your model. You can instead specify a different metric to use via the metrics argument.
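For example, a sketch (xgb_wf and bb_folds assumed from the post); racing uses the first metric in the set to decide which configurations to drop:
library(finetune)
set.seed(123)
xgb_rs <- tune_race_anova(
  xgb_wf,
  resamples = bb_folds,
  grid = 15,
  metrics = metric_set(mn_log_loss, roc_auc)  # mn_log_loss drives the racing
)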
Thanks Julia, that's a big help
Hi Julia, I think I've run into a bug using finetune::tune_sim_anneal(): https://github.com/tidymodels/dials/issues/258. Is this something you've encountered before?
@jtag04 Hmmm, I haven't seen that before. Opening an issue was the right call, and it would be definitely helpful if you could create a reprex (a minimal reproducible example) for that issue. The goal of a reprex is to make it easier for people to recreate your problem so that they can understand it and/or fix it. If you've never heard of a reprex before, you may want to start with the tidyverse.org help page.
Yeah, totally; creating a reprex is going to take a little bit of doing, as the model/workflow contains sensitive data. I'll totally give it a shot if I don't hear from Max Kuhn in the coming days. Was hoping I might get lucky and someone would recognise what was going on. Has got me miffed. Cheers
Hi Julia, I've added a reprex to that dials package issue I logged. Hopefully that's some help. Cheers, Julian
Hey @juliasilge, I do recognise that we're in "open-source world"... but is there any special way of getting some attention to that Dials issue I've raised?