bonsai
Feature idea - Linking hyperparameters during CV
Problem
Within LightGBM, `num_leaves` is capped at `2 ^ max_depth`. For example, if `num_leaves` is set to 1000 and `max_depth` is set to 5, then LightGBM will likely end up creating a full-depth tree with 32 (2^5) leaves per iteration.
`{bonsai}`/`{parsnip}` have no knowledge of the relationship between these parameters. As a result, during cross-validation, Bayesian optimization and other CV search methods will spend a significant amount of time exploring meaningless hyperparameter space where `num_leaves > 2 ^ max_depth`. This results in longer CV times, especially for large models with many parameters.
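To make the wasted search space concrete, here is a quick sketch (the candidate values are arbitrary, chosen only for illustration) counting how much of a naive regular grid violates the cap:

```r
# Hypothetical candidate values, for illustration only.
grid <- expand.grid(
  num_leaves = c(31, 127, 511, 1023),
  max_depth  = c(3, 5, 8, 12)
)

# Fraction of combinations where num_leaves exceeds the
# 2^max_depth cap and so can never actually be reached.
mean(grid$num_leaves > 2^grid$max_depth)
#> [1] 0.5625
```

Over half of this grid is spent on settings LightGBM will silently truncate.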
Idea
One potential solution is to explicitly link `num_leaves` and `max_depth` specifically for the LightGBM model spec. I implemented this link in my treesnip fork by essentially adding two engine arguments:
- `link_max_depth` - Boolean. When `FALSE`, `max_depth` is equal to whatever is passed via the engine/model argument. When `TRUE`, `max_depth` is set to `floor(log2(num_leaves)) + link_max_depth_add`.
- `link_max_depth_add` - Integer. Value added to `max_depth`. For example, if `link_max_depth` is `TRUE`, `num_leaves` is 1000, and `link_max_depth_add` is 2, then `max_depth = floor(log2(1000)) + 2`, or 11.
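A minimal sketch of that linking rule (the helper name is mine, not taken from the fork):

```r
# Derive max_depth from num_leaves: a depth roughly matching
# log2(num_leaves), plus a user-supplied slack term.
linked_max_depth <- function(num_leaves, link_max_depth_add = 0) {
  floor(log2(num_leaves)) + link_max_depth_add
}

# The example from above: num_leaves = 1000, add = 2.
linked_max_depth(1000, link_max_depth_add = 2)
#> [1] 11
```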
This would improve cross-validation times by restricting the hyperparameter space that needs to be explored while leaving the default options untouched. Ideally, it could even be generalized (within `{parsnip}`) to other model types that have intrinsically linked hyperparameters. However, I'm not sure if this fits with the tidymodels way of doing things. If it's totally out of scope, then feel free to close this issue.
Thanks for the issue! This seems worth looking into and also like it may have applications/need beyond this extension package—will chat about this with @topepo and get back to you sooner than later. :)
I think that the best way to handle this is to make methods for the `grid_*()` functions for workflows and model specifications. That's really the only time that we could intercept the parameters and add a constraint (for a specific model).
We'll discuss this.
Appreciate the attention on this issue! Let me know if there's any way I can assist (debugging, testing, PR, etc.).
I'll add that the `grid_*()` functions don't really have this issue, since you can manually filter or create a hyperparameter grid with such constraints built in prior to CV. IMO this issue is more applicable to `tune_bayes()`, since you can't filter/intercept the hyperparameters chosen by the sub-model.
That said, general methods for doing this linking/filtering with `grid_*()` functions would still be incredibly useful.
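For anyone landing here, the manual filtering mentioned above can look something like this (a sketch using `tidyr`/`dplyr`; the candidate values are arbitrary, and the resulting data frame can be passed to `tune_grid()` via its `grid` argument):

```r
library(dplyr)
library(tidyr)

# Build a regular grid over both parameters, then drop the
# combinations that violate num_leaves <= 2^max_depth before
# handing the grid to the tuning function.
grid <- crossing(
  num_leaves = c(31, 127, 511, 1023),
  max_depth  = c(3, 5, 8, 12)
) %>%
  filter(num_leaves <= 2^max_depth)
```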