Fine tuning which variable to smooth in GAM

Open gundalav opened this issue 3 years ago • 1 comments

As far as I understand, we currently have to specify manually which variable to smooth, as in the example below:

library(tidymodels)
#> Registered S3 method overwritten by 'tune':
#>   method                   from   
#>   required_pkgs.model_spec parsnip
#> Warning: package 'broom' was built under R version 4.1.2
tidymodels_prefer()

gam_spec <- gen_additive_mod(select_features = tune()) %>% set_mode("regression")

gam_wflow <- 
  workflow() %>% 
  # smoothing must be specified here:
  add_model(gam_spec, formula = mpg ~ s(disp) + wt + gear) %>% 
  add_variables(predictors = c(everything()), outcomes = mpg)

set.seed(1)
car_folds <- bootstraps(mtcars, times = 5)

gam_res <- 
  gam_wflow %>% tune_grid(resamples = car_folds)

show_best(gam_res, metric = "rmse")

Especially this line:

add_model(gam_spec, formula = mpg ~ s(disp) + wt + gear)

Here we have to state manually that disp is the variable to smooth. Is there a way to tune the model so that it determines automatically which variable to smooth?

G.V.

gundalav, Jul 07 '22 05:07

You can't tune this in the sense of tagging something with tune(), but you could use the workflowsets package!

You'd make a workflow set from workflows that differ in the model formula, and tune that set via workflow_map():

library(tidymodels)
tidymodels_prefer()

gam_spec <- gen_additive_mod(select_features = tune()) %>%
   set_mode("regression")

gam_wflow <- 
   workflow() %>% 
   add_model(gam_spec) %>% 
   add_variables(predictors = c(everything()), outcomes = mpg)

candidate_formulas <- list(
   mpg ~ s(disp) + wt + gear,
   mpg ~ disp + s(wt) + gear,
   mpg ~ disp + wt + s(gear, k = 3)  # gear takes only 3 distinct values, so a small basis is needed
)

wflows <- map(candidate_formulas, function(candidate_formula) {
   gam_wflow %>% 
      update_model(gam_spec, formula = as.formula(candidate_formula))
})
names(wflows) <- c("disp", "wt", "gear")
wflows <- do.call(as_workflow_set, wflows)

set.seed(1)
car_folds <- bootstraps(mtcars, times = 5)

gam_res <- 
   wflows %>% workflow_map(fn = "tune_grid", resamples = car_folds)
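
If there are many predictors, the list of candidate formulas could be generated rather than typed out. Here is a minimal base R sketch, assuming you want exactly one smoothed predictor per candidate. `smooth_one_formula()` is a hypothetical helper written for this example, not a function from tidymodels or workflowsets:

```r
# Hypothetical helper (not part of any package): build a formula in which
# only `smooth_var` is wrapped in s() and all other predictors stay linear.
smooth_one_formula <- function(outcome, predictors, smooth_var) {
  terms <- ifelse(
    predictors == smooth_var,
    paste0("s(", predictors, ")"),
    predictors
  )
  # reformulate() is base R (stats): it assembles a formula from term labels
  reformulate(terms, response = outcome)
}

predictors <- c("disp", "wt", "gear")

# One candidate formula per predictor, named after the smoothed variable
candidate_formulas <- lapply(
  predictors,
  function(v) smooth_one_formula("mpg", predictors, smooth_var = v)
)
names(candidate_formulas) <- predictors

candidate_formulas$wt
```

The resulting named list should be usable in place of a hand-written one, and its names can serve as the workflow names. A predictor with very few distinct values (gear has only three in mtcars) may still need a hand-written term with a smaller basis, such as s(gear, k = 3).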

More on workflow sets at https://workflowsets.tidymodels.org/

hfrick, Apr 27 '23 12:04