Feature Request: add timeout parameter to the .fit() method

Open fingoldo opened this issue 1 year ago • 6 comments

Adding a timeout parameter to the .fit() method would force the library to return the best solution found so far once the given number of seconds has elapsed since the start of training. This would make it possible to meet training SLAs when a user has only a limited time budget for training a model. It would also allow a fair comparison of different hyperparameters.

Reaching the timeout should have the same effect as reaching the maximum number of iterations, perhaps with an additional warning and/or an attribute set, so that the reason training finished is clear to the end user.

fingoldo avatar Aug 08 '24 09:08 fingoldo

Can you achieve this with a custom callback?

RAMitchell avatar Aug 08 '24 10:08 RAMitchell

I did not realize a callback could be used to solve this problem. If I return True from my custom callback to stop training while also using early stopping, will xgboost set the best iteration correctly, or will some training progress be lost?

fingoldo avatar Aug 08 '24 16:08 fingoldo

> Can you achieve this with a custom callback?

Just to connect these 2 conversations... that is what I suggested in the feature request opened in LightGBM at the same time: https://github.com/microsoft/LightGBM/issues/6596#issuecomment-2275855955

jameslamb avatar Aug 09 '24 03:08 jameslamb

Right, it seemed very natural to me to use a direct timeout instead of (or along with) n_estimators, and ideally I would like to have a universal parameter for that (similar to n_estimators) in the major gradient boosting libraries. In most cases, I'd say, the exact maximum number of trees is not important to the user; it's the maximum time spent that actually matters. And some hyperparameter combinations can lead to vastly different runtimes even with the same n_estimators. The timeout parameter would solve this problem.

fingoldo avatar Aug 10 '24 05:08 fingoldo

Last but not least, imagine that aliens have attacked the Earth and we have only one minute to compute the trajectories of their missiles with ML. If this feature request is approved, the responsible person just sets timeout=60, we intervene, and we survive.

fingoldo avatar Aug 10 '24 05:08 fingoldo

You can do that with a Python thread or process with polling, or with an xgboost callback that raises an exception. I don't think this will be part of an ML lib.

trivialfis avatar Feb 14 '25 07:02 trivialfis
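
For reference, the callback approach suggested above could look roughly like the sketch below. It is a minimal illustration, not an official xgboost feature: the class name `TimeBudget`, its `seconds` and `clock` parameters, and the `timed_out` attribute are all hypothetical names introduced here. It relies only on the documented `xgboost.callback.TrainingCallback` hooks, where returning True from `after_iteration` stops training.

```python
import time

try:
    from xgboost.callback import TrainingCallback
except ImportError:
    # Fallback so the timing logic can be exercised without xgboost installed.
    TrainingCallback = object


class TimeBudget(TrainingCallback):
    """Hypothetical callback: stop boosting once `seconds` have elapsed."""

    def __init__(self, seconds, clock=time.monotonic):
        super().__init__()
        self.seconds = seconds
        self.clock = clock        # injectable clock, mainly for testing
        self.timed_out = False    # records why training finished
        self._start = None

    def before_training(self, model):
        # Start the timer when training begins; reset state for reuse.
        self._start = self.clock()
        self.timed_out = False
        return model

    def after_iteration(self, model, epoch, evals_log):
        # Returning True asks xgboost to stop after the current round.
        self.timed_out = self.clock() - self._start >= self.seconds
        return self.timed_out
```

Assuming the usual training entry point, it would be passed as `xgb.train(params, dtrain, num_boost_round=10_000, callbacks=[TimeBudget(60)])`, and `timed_out` can be inspected afterwards. The check runs only between boosting rounds, so the actual runtime can exceed the budget by up to one round. How this interacts with early stopping and the reported best iteration is not verified here.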