xgboost model warning : `ntree_limit` is deprecated, use `iteration_range` instead
Running an xgboost model using the caret package gives the following warning:
WARNING: amalgamation/../src/c_api/c_api.cc:718: `ntree_limit` is deprecated, use `iteration_range` instead.
Minimal, reproducible example:
library(caret)

# eXtreme Gradient Boosting
set.seed(123)
modelFit <- train(Species ~ ., data = iris,
                  preProcess = c("center", "scale"),
                  method = "xgbTree")
I have tried setting `warning = FALSE` and `message = FALSE` in the chunk options, but the warning still appears in the knitted document. How can I remove it?
Session Info:
>sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19042)
Matrix products: default
locale:
[1] LC_COLLATE=English_India.1252 LC_CTYPE=English_India.1252
[3] LC_MONETARY=English_India.1252 LC_NUMERIC=C
[5] LC_TIME=English_India.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] caret_6.0-90 lattice_0.20-45 ggplot2_3.3.5
loaded via a namespace (and not attached):
[1] tidyselect_1.1.1 purrr_0.3.4 reshape2_1.4.4
[4] listenv_0.8.0 splines_4.1.2 colorspace_2.0-2
[7] vctrs_0.3.8 generics_0.1.1 stats4_4.1.2
[10] utf8_1.2.2 survival_3.2-13 prodlim_2019.11.13
[13] rlang_0.4.11 e1071_1.7-9 ModelMetrics_1.2.2.2
[16] pillar_1.6.4 glue_1.6.0 withr_2.4.3
[19] DBI_1.1.2 xgboost_1.5.0.2 foreach_1.5.1
[22] lifecycle_1.0.1 plyr_1.8.6 lava_1.6.10
[25] stringr_1.4.0 timeDate_3043.102 munsell_0.5.0
[28] gtable_0.3.0 future_1.23.0 recipes_0.1.17
[31] codetools_0.2-18 parallel_4.1.2 class_7.3-19
[34] fansi_0.5.0 Rcpp_1.0.7 scales_1.1.1
[37] ipred_0.9-12 jsonlite_1.7.2 parallelly_1.30.0
[40] digest_0.6.29 stringi_1.7.5 dplyr_1.0.7
[43] grid_4.1.2 tools_4.1.2 magrittr_2.0.1
[46] proxy_0.4-26 tibble_3.1.6 crayon_1.4.2
[49] future.apply_1.8.1 pkgconfig_2.0.3 ellipsis_0.3.2
[52] MASS_7.3-54 Matrix_1.3-4 data.table_1.14.2
[55] pROC_1.18.0 lubridate_1.8.0 gower_0.2.2
[58] assertthat_0.2.1 iterators_1.0.13 R6_2.5.1
[61] globals_0.14.0 rpart_4.1-15 nnet_7.3-16
[64] nlme_3.1-153 compiler_4.1.2
Hello, I also encountered this problem. How did you solve it?
You can follow this. Just add `verbosity = 0` within the `train` function.
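For anyone landing here, a minimal sketch of that workaround applied to the example from the original post. It assumes that caret forwards arguments it does not recognise through `...` to the underlying xgboost training call, which is what the suggestion above relies on with the versions in this thread (caret 6.0-90, xgboost 1.5.x); other xgboost versions may handle `verbosity` differently.

```r
library(caret)

# Same model as in the original post, plus the suggested workaround.
set.seed(123)
modelFit <- train(
  Species ~ .,
  data = iris,
  preProcess = c("center", "scale"),
  method = "xgbTree",
  # Not a caret argument: assumed to be forwarded via `...` to xgboost,
  # where verbosity = 0 asks the library to stay silent.
  verbosity = 0
)
```

This only hides the message. caret still supplies the deprecated argument internally, so the underlying issue remains until caret itself is updated.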
As missuse suggested: "The current warning means xgboost is changing the name of an argument, but caret is still supplying the old name. Currently it works, but with new xgboost versions the argument will be completely replaced; if caret's function code is not updated by then, the warning will be replaced by an error." So it would be better if caret's function code were updated.
Agreed. I am teaching a class using caret and I think these warnings are confusing for students.
@topepo Any plans to change this please? Seems a simple one-liner?
Source of warning
The warning is raised when `predict()` is called with the `ntreelimit` parameter. See the code at the link below:
https://github.com/topepo/caret/blob/5f4bd2069bf486ae92240979f9d65b5c138ca8d4/models/files/xgbDART.R#L164
So yes, @Jon77Ruler, this is an easy fix if `ntreelimit` is replaced with `iteration_range`. I have posted a reprex below to demonstrate the issue using {xgboost}.
However, I am not sure what the repository rules are for this kind of "simple" bug fix. @topepo wrote that {caret} is on the "backburner", see issue https://github.com/topepo/caret/issues/1365, so it might be a while before we get a fix. It's not a breaking issue yet, but it might become one in the future.
Demonstration of problem and solution
library(xgboost)

data(
  agaricus.train,
  package = 'xgboost'
)
data(
  agaricus.test,
  package = 'xgboost'
)

# + estimate model
simple_model <- xgboost(
  data = agaricus.train$data,
  label = agaricus.train$label,
  nrounds = 2
)
#> [1] train-rmse:0.350593
#> [2] train-rmse:0.246082

# + in caret
first <- predict(
  simple_model,
  agaricus.test$data,
  # in caret
  ntreelimit = 2
)
#> [10:44:28] WARNING: src/c_api/c_api.cc:935: `ntree_limit` is deprecated, use `iteration_range` instead.
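second <- predict(
  simple_model,
  agaricus.test$data,
  # in xgboost: the R argument is spelled `iterationrange`; it is 1-based and
  # half-open, so c(1, 3) uses rounds 1 and 2
  iterationrange = c(1, 3)
)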
setequal(
first,
second
)
#> [1] TRUE
Created on 2024-07-19 with reprex v2.1.0
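If it helps whoever picks up the fix, here is a rough sketch of how an existing `ntreelimit` value could be translated into the range form. The helper name `ntreelimit_to_iterationrange` is hypothetical and is not part of caret or xgboost. The mapping assumes the semantics described in the xgboost R documentation for `predict()`: the range is 1-based and half-open, and `c(1, 1)` means "use all trees".

```r
# Hypothetical helper (not part of caret or xgboost): convert an old-style
# `ntreelimit` into a c(start, end) range, assuming a 1-based, half-open range.
ntreelimit_to_iterationrange <- function(ntreelimit) {
  # NULL meant "no limit" in the old interface, so pass NULL through.
  if (is.null(ntreelimit)) {
    return(NULL)
  }
  stopifnot(is.numeric(ntreelimit), length(ntreelimit) == 1, ntreelimit >= 0)
  # The first n rounds are rounds 1..n, i.e. the half-open range [1, n + 1).
  # For n = 0 this gives c(1, 1), which is assumed to mean "all trees".
  c(1, ntreelimit + 1)
}

ntreelimit_to_iterationrange(2)
```

With a helper like this, a call such as `predict(model, newdata, ntreelimit = n)` could become `predict(model, newdata, iterationrange = ntreelimit_to_iterationrange(n))`, assuming the installed xgboost version accepts that argument.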