
[FEATURE] Support for grid search over hyperparameters

Open tmostak opened this issue 2 years ago • 0 comments

🚀 Feature

Add the capability to the UI to kick off a grid search over a set of hyperparameters (with specified search increments for continuous parameters and specified allowed values for categorical parameters) that finds the combination of hyperparameters yielding the best validation result.
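
To make the request concrete, here is a minimal sketch of what the grid expansion could look like. It is not based on any existing H2O LLM Studio code; the helper names (`linear_steps`, `expand_grid`, `best_run`), the parameter names, and the backbone labels are all hypothetical, and the scoring function stands in for a real training run.

```python
from itertools import product


def linear_steps(start, stop, step):
    """Inclusive list of values from start to stop in fixed increments."""
    count = int(round((stop - start) / step)) + 1
    # Round to hide floating-point noise such as 0.00030000000000000003.
    return [round(start + i * step, 10) for i in range(count)]


def expand_grid(search_space):
    """Return one config dict per combination of the listed values."""
    names = list(search_space)
    return [dict(zip(names, combo)) for combo in product(*search_space.values())]


def best_run(grid, validation_loss):
    """Pick the config with the lowest validation loss.

    `validation_loss` is a hypothetical callable that would train a model
    for the given config and return its validation metric.
    """
    return min(grid, key=validation_loss)


# Hypothetical search space: one continuous and one categorical parameter.
search_space = {
    "learning_rate": linear_steps(1e-4, 3e-4, 1e-4),
    "llm_backbone": ["backbone-a", "backbone-b"],
}
grid = expand_grid(search_space)  # 3 learning rates x 2 backbones
```

Each entry in `grid` would correspond to one experiment in the sweep, so the number of experiments grows multiplicatively with every parameter added.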

Motivation

Currently, finding the best hyperparameters for model training requires manually launching many experiments, one for each hyperparameter setting. It would be useful if the UX supported kicking off a "meta-experiment" that systematically sweeps a specified set of hyperparameters, letting the user specify the step increment for continuous parameters and the allowed values (e.g. model backbone) for categorical ones.

I would assume simple grid search would be a good start, but if possible the hyperparameter search feature could support other search algorithms as desired.

To save space, I assume that the user should be able to specify that the sub-models not actually be saved after training, or perhaps only the best model be saved.

Ideally the different experiments could be grouped into a single meta-experiment for easy comparison of the impact of hyperparameter changes on model accuracy. If that's difficult, they could just be output as individual experiments: the user would have more work sifting through the results manually, but at least they wouldn't have to orchestrate kicking off all the experiments.

Also ideally:

  1. Parameter steps could be specified as linear increments or as increments by a power, e.g. LoRA rank could be incremented by powers of two.
  1. Certain hyperparameters could be specified as dependent on hyperparameters being searched, e.g. LoRA alpha could be specified as always 2X the rank.
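
As a rough illustration of the two points above, power-based steps and dependent parameters might be expressed like this. The helper names and the `lora_r` / `lora_alpha` keys are assumptions for the sake of the example, not an existing API.

```python
def power_steps(start, stop, base=2):
    """Values start, start*base, start*base**2, ... up to and including stop."""
    if start <= 0 or base <= 1:
        raise ValueError("start must be positive and base must exceed 1")
    values = []
    value = start
    while value <= stop:
        values.append(value)
        value *= base
    return values


def apply_dependencies(config, dependencies):
    """Fill in derived parameters from the searched ones.

    `dependencies` maps a parameter name to a function of the config.
    """
    resolved = dict(config)
    for name, rule in dependencies.items():
        resolved[name] = rule(resolved)
    return resolved


# Search LoRA rank by powers of two; alpha is never searched directly.
ranks = power_steps(4, 32)
dependencies = {"lora_alpha": lambda cfg: 2 * cfg["lora_r"]}
configs = [apply_dependencies({"lora_r": r}, dependencies) for r in ranks]
```

Keeping derived parameters out of the search space means they don't multiply the number of experiments; here four ranks produce four runs, each with alpha fixed at twice the rank.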

I know this is probably a massive feature and may not fit with the goals of this project, but I think it would provide a lot of power for building the best models possible in the most automated way possible.

tmostak avatar Jul 05 '23 16:07 tmostak