xgboost
Feature Request: add timeout parameter to the .fit() method
Adding a timeout parameter to the .fit() method, which would force the library to return the best solution found so far as soon as the given number of seconds since the start of training has elapsed, would make it possible to satisfy training SLAs when a user has only a limited time budget to finish a model training. It would also enable fair comparison of different hyperparameter configurations.
Reaching the timeout should have the same effect as reaching the maximum number of iterations, perhaps with an additional warning and/or an attribute set so that the reason the training job finished is clear to the end user.
Can you achieve this with a custom callback?
I did not realize it could be used to solve this problem. If, while using early stopping, I return True from my custom callback to stop training, will the best iteration be set correctly by xgboost, or will some training progress be lost?
> Can you achieve this with a custom callback?
Just to connect these 2 conversations... that is what I suggested in the feature request opened in LightGBM at the same time: https://github.com/microsoft/LightGBM/issues/6596#issuecomment-2275855955
Right, it seemed very natural to me to use a direct timeout instead of (or along with) n_estimators, and I would ideally like a universal parameter for that (similar to n_estimators) in the major gradient boosting libraries. In most cases I'd say the exact maximum number of trees is not important to the user; it's actually the maximum time spent that matters. And some hyperparameter combinations can lead to vastly different runtimes even with the same n_estimators. The timeout parameter would solve this problem.
Last but not least, imagine that aliens have attacked the Earth and we only have one minute to compute the trajectories of their missiles with ML. If this feature request is approved, the responsible person just sets timeout=60, we intervene, and survive.
You can do that with a Python thread or process plus polling, or with an xgboost callback that raises an exception. I don't think this will become part of an ML library.
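A process-based sketch of the polling approach mentioned above (all names here are hypothetical, and train_job merely stands in for a long .fit() call): run the training in a child process, join it with a timeout, and terminate it if the budget is exceeded. Unlike the callback route, a terminated process discards all training progress, which is why the in-library timeout was requested in the first place.

```python
import multiprocessing as mp
import time

def train_job(queue):
    # Stand-in for a long-running .fit() call (hypothetical).
    time.sleep(60)
    queue.put("model")

def run_with_timeout(target, timeout):
    queue = mp.Queue()
    proc = mp.Process(target=target, args=(queue,))
    proc.start()
    proc.join(timeout)  # wait at most `timeout` seconds
    if proc.is_alive():
        proc.terminate()  # hard stop: all training progress is lost
        proc.join()
        return None
    return queue.get()

if __name__ == "__main__":
    # The 60 s job exceeds the 0.5 s budget, so we get back None.
    print(run_with_timeout(train_job, timeout=0.5))
```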