[Bug] xgboost version conflict
There have been some changes in xgboost callbacks since xgboost 1.6.0. Some classes and functions have been removed: https://github.com/dmlc/xgboost/pull/7280
TVM still uses the old xgboost callback function, which has been deprecated since xgboost 1.3.0, so TVM requires 1.1.0 < xgboost < 1.6.0. This should be declared in the install doc and setup.py. If users install the latest xgboost, they will get an error: https://github.com/apache/tvm/issues/11409
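To make the constraint concrete, here is a minimal sketch of a version guard. It assumes the range quoted above (1.1.0 < xgboost < 1.6.0, both bounds exclusive); the helper names `parse_version` and `supports_legacy_callbacks` are hypothetical and are not part of TVM or xgboost.

```python
def parse_version(version: str) -> tuple:
    """Turn '1.5.2' into (1, 5, 2); trailing non-numeric text such as 'rc1' is ignored."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits) if digits else 0)
    # Pad short versions such as '1.5' to three components.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def supports_legacy_callbacks(version: str) -> bool:
    """True if the version falls inside the range quoted in this issue (assumed exclusive)."""
    return (1, 1, 0) < parse_version(version) < (1, 6, 0)
```

In practice the string would come from `xgboost.__version__`, and the same range could be expressed as a version specifier in setup.py.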
I can help upgrade the callback function in the xgboost cost model, or fix the version conflict in the doc and setup script. I would like to discuss this with community members.
CC: @zxybazh
Thanks for reporting this issue. I can confirm the old-style callback function is currently used in TVM AutoScheduler and MetaSchedule. Users could see warnings about that if they are using an xgboost version later than 1.3.0. There are several steps to take:
- [x] Publish an XGBoost version notice in doc & set up tutorial. #12050
- [x] Upgrade the callback function in MetaSchedule XGBoost Cost Model #12141
- [x] Upgrade the callback function in AutoScheduler XGBoost Cost Model #12144 #14036
- [ ] Update the document and tutorials.
Given that XGBoost has completely deprecated the old-style callback function, I think it's necessary to migrate to the new callback function and have tests for it. It would be great if you could help send PRs to update the documentation, tutorials & code library. On our side, we'll prioritize the upgrade and keep track of the progress here.
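For reference, a new-style callback looks roughly like the sketch below. It assumes the `xgboost.callback.TrainingCallback` interface with `after_iteration(model, epoch, evals_log)`; the class name `LogEveryN` and its `every` parameter are hypothetical, and this is not the implementation used in TVM. The import fallback is only there so the snippet can be read and run without xgboost installed.

```python
try:
    from xgboost.callback import TrainingCallback
except ImportError:  # fallback so the sketch still runs without xgboost
    TrainingCallback = object


class LogEveryN(TrainingCallback):
    """Hypothetical callback: record the latest metrics every `every` rounds."""

    def __init__(self, every=10):
        super().__init__()
        self.every = every
        self.history = []

    def after_iteration(self, model, epoch, evals_log):
        # evals_log maps dataset name -> metric name -> list of values so far.
        if epoch % self.every == 0:
            latest = {
                f"{data}-{metric}": values[-1]
                for data, metrics in evals_log.items()
                for metric, values in metrics.items()
            }
            self.history.append((epoch, latest))
        # Returning False tells xgboost to continue training.
        return False
```

With xgboost installed, an instance would be passed through the `callbacks` argument, e.g. `xgb.train(params, dtrain, callbacks=[LogEveryN(every=5)])`.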
Thanks for your reply. I'd appreciate the chance to contribute to TVM. You've really given me detailed steps. I will be working on TVM in the next few days, so hi to everyone in the TVM community!
The first PR related to task 1: Publish an XGBoost version notice in doc & set up tutorial. https://github.com/apache/tvm/pull/12050 @zxybazh
Thanks for submitting the first PR! I just sent a couple comments, let me know if you have any questions!
I am working on upgrading the callback function in the AutoScheduler XGBoost Cost Model. I noticed some code related to cross-validation (cv), but it's not clear to me when the cross-validation path is taken. When would env.cvfolds be set?
In the latest xgboost code, it seems that cross-validation is a separate function, and the deprecated train method sets cvfolds=None.
https://github.com/dmlc/xgboost/blob/e7decb9775dae440fd829fd37d1f56c5bcedb138/python-package/xgboost/training.py#L74
Ignore this reply if you are not familiar with this part.
```python
def callback(env):
    """internal function"""
    if not state:
        init(env)

    bst = env.model
    i = env.iteration
    cvfolds = env.cvfolds

    res_dict = {}

    if i % skip_every == 1:
        return

    ##### evaluation #####
    if cvfolds is not None:
        for feval in fevals:
            tmp = aggcv([f.eval(i, feval) for f in cvfolds])
            for k, mean, std in tmp:
                res_dict[k] = [mean, std]
    else:
        for feval in fevals:
            bst_eval = bst.eval_set(evals, i, feval)
            res = [x.split(":") for x in bst_eval.split()]
            for kv in res[1:]:
                res_dict[kv[0]] = [float(kv[1])]
```
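To illustrate what the non-cv branch above does with the output of `bst.eval_set`, here is a small standalone sketch. It assumes the evaluation string has the form `[iteration]` followed by whitespace-separated `name:value` pairs; the function name `parse_eval_string` and the sample string are hypothetical.

```python
def parse_eval_string(bst_eval: str) -> dict:
    """Parse '[3]\ttrain-rmse:0.25\tvalid-rmse:0.5' into {name: [value]}."""
    res_dict = {}
    # Split on whitespace, then split each token on ':'.
    res = [x.split(":") for x in bst_eval.split()]
    # res[0] is the '[iteration]' token, so skip it.
    for kv in res[1:]:
        res_dict[kv[0]] = [float(kv[1])]
    return res_dict


sample = "[3]\ttrain-rmse:0.25\tvalid-rmse:0.5"
parsed = parse_eval_string(sample)
```

Each value is wrapped in a one-element list so it has the same shape as the `[mean, std]` entries produced by the cv branch.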
@Sunny-Island It looks like cvfolds defaults to None when using the deprecated callback function in xgboost here.
#12144 The second PR.