[R] Fix xgb.cv() for AFT models
This PR fixes https://github.com/dmlc/xgboost/issues/7118
In order to keep the xgb.cv() API as it is:
data must be an xgb.DMatrix object which contains the two info fields 'label_lower_bound' and 'label_upper_bound'.
Automatic stratified splitting is deactivated with a warning. @david-cortes this is something we need to keep in mind for multioutput regressions as well.
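For reference, a minimal sketch of the intended usage, assuming an xgboost version in which xgb.cv() accepts an AFT xgb.DMatrix as described above. The data is synthetic and the parameter values are illustrative only; `stratified = FALSE` is passed explicitly so the call does not rely on the automatic fallback.

```r
library(xgboost)

set.seed(1)
n <- 50
X <- matrix(rnorm(n * 4), ncol = 4)

# Synthetic survival times: strictly positive lower bounds.
y_lower <- rexp(n) + 0.1
y_upper <- y_lower
# Mark the first 10 rows as right-censored (upper bound unknown).
y_upper[1:10] <- Inf

# No 'label' is set; AFT uses the two bound fields instead.
dtrain <- xgb.DMatrix(X)
setinfo(dtrain, "label_lower_bound", y_lower)
setinfo(dtrain, "label_upper_bound", y_upper)

params <- list(
  objective = "survival:aft",
  eval_metric = "aft-nloglik",
  aft_loss_distribution = "normal",
  aft_loss_distribution_scale = 1.0
)

# Stratification by label is not meaningful here, so it is turned off.
cv <- xgb.cv(
  params = params,
  data = dtrain,
  nrounds = 5,
  nfold = 3,
  stratified = FALSE,
  verbose = FALSE
)
```

The per-round metrics should then be available in `cv$evaluation_log`.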
Thanks for looking into this. Left a small comment.
Although I am also thinking: if it constructs a DMatrix directly from data and labels - wouldn't it also require just a few lines to allow such a transformation with bounds as function arguments?
We could add more arguments to xgb.cv(). I was actually thinking in the other direction: let the function work only with xgb.DMatrix input and remove the "label" argument. Why? The current signature of xgb.cv() is very incomplete. For instance there is no "weight" argument.
Yes, that makes sense too - then we wouldn't need to update things in two places if new parameters come out.
cc @hcho3 for aft.
There's now a function xgb.DMatrix.hasinfo which could be used here to check whether the DMatrix has a given field without involving data copies:
https://github.com/dmlc/xgboost/blob/ff3d82c006083c01f7e1d9796564d32844ea952c/R-package/R/xgb.DMatrix.R#L223
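A rough sketch of how such a check could look, assuming an xgboost version that exports xgb.DMatrix.hasinfo(object, info). The variable name `has_bounds` and the error message are made up for illustration and are not taken from the PR.

```r
library(xgboost)

# Small DMatrix carrying only the AFT bound fields.
dtrain <- xgb.DMatrix(matrix(rnorm(40), ncol = 4))
setinfo(dtrain, "label_lower_bound", rep(1, 10))
setinfo(dtrain, "label_upper_bound", rep(2, 10))

# Check for both fields without pulling their contents into R.
has_bounds <- xgb.DMatrix.hasinfo(dtrain, "label_lower_bound") &&
  xgb.DMatrix.hasinfo(dtrain, "label_upper_bound")

if (!has_bounds) {
  stop("AFT objectives need 'label_lower_bound' and 'label_upper_bound' in the DMatrix")
}
```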
Closed in favour of #10031