LightGBM
Cross validation early stopping
Currently, cross validation early stopping happens based on the mean metric across folds. But it seems more correct to use the minimum (worst) value across all folds at each iteration, if we want to choose num_iterations from best_iteration for training a model on the complete dataset after cv.
https://sites.google.com/site/lauraeppx/xgboost/cross-validation - it seems @Laurae2 also talks about this here.
For example, if on a 3-fold cv we got accuracy at iteration 35) 0.9, 0.9, 0, mean = 0.6 and at iteration 29) 0.59, 0.58, 0.57, mean = 0.58 - it seems iteration 29 is the better choice of num_iterations for training a model on the complete set, even though the mean at iteration 35 is better.
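For concreteness, a minimal sketch of the two aggregation rules applied to the numbers above (plain numpy, not the lgbm.cv API; the iteration numbers and scores are just the ones from the example):

```python
import numpy as np

# per-fold accuracy at the two candidate iterations from the example
scores = {
    35: np.array([0.9, 0.9, 0.0]),
    29: np.array([0.59, 0.58, 0.57]),
}

best_by_mean = max(scores, key=lambda it: scores[it].mean())   # -> 35 (mean 0.6 vs 0.58)
best_by_worst = max(scores, key=lambda it: scores[it].min())   # -> 29 (worst 0.57 vs 0.0)
print(best_by_mean, best_by_worst)
```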
Is there any way to switch lgbm.cv from mean to min mode? Or do I have to write my own cv with ordinary lgbm.train calls?
Also, if I write my own - does lgbm.cv have performance benefits over calling lgbm.train several times that I could use? Does it load the data once or several times? A rough sketch of a hand-rolled version is below.
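A possible sketch of such a custom loop, assuming the lightgbm Python package and sklearn's KFold; the metric name "binary_logloss", the function name cv_worst_fold, and all parameter values are illustrative assumptions, not anything from lgbm.cv itself:

```python
import numpy as np
import lightgbm as lgb
from sklearn.model_selection import KFold

def cv_worst_fold(X, y, params, num_boost_round=200, n_splits=3):
    fold_curves = []  # per-fold metric value at every boosting iteration
    for train_idx, valid_idx in KFold(n_splits=n_splits, shuffle=True).split(X):
        train_set = lgb.Dataset(X[train_idx], label=y[train_idx])
        valid_set = lgb.Dataset(X[valid_idx], label=y[valid_idx], reference=train_set)
        evals = {}
        lgb.train(
            params,
            train_set,
            num_boost_round=num_boost_round,
            valid_sets=[valid_set],
            valid_names=["valid"],
            callbacks=[lgb.record_evaluation(evals)],
        )
        fold_curves.append(evals["valid"]["binary_logloss"])
    # worst (largest) loss across folds at each iteration, then the best such iteration
    worst = np.max(np.vstack(fold_curves), axis=0)
    return int(np.argmin(worst)) + 1  # 1-based num_iterations

# num_iterations = cv_worst_fold(X, y, {"objective": "binary", "verbosity": -1})
```

As far as I understand, lgb.cv builds its folds from a single Dataset, so the data is loaded once; in the sketch above each fold just slices the in-memory numpy arrays, which should be comparable.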
I don't think so. It's possible that you have a fold whose error is monotonically decreasing but still higher than the other folds', whereas the other folds reach their minimums in early rounds. Then choosing the worst error will always set the best iteration to the total number of iterations.
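A toy illustration of that failure mode (the numbers here are made up): fold 3's error keeps decreasing but is always the worst, so the worst-fold curve is minimized only at the final iteration even though the other two folds bottom out early.

```python
import numpy as np

fold1 = np.array([0.30, 0.25, 0.24, 0.26, 0.28])  # best at iteration 3
fold2 = np.array([0.31, 0.26, 0.25, 0.27, 0.29])  # best at iteration 3
fold3 = np.array([0.60, 0.55, 0.50, 0.45, 0.40])  # still decreasing, always worst

worst = np.max(np.vstack([fold1, fold2, fold3]), axis=0)  # equals fold3 everywhere
print(np.argmin(worst) + 1)  # -> 5, the last iteration, regardless of folds 1 and 2
```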