xgboost model warning : `ntree_limit` is deprecated, use `iteration_range` instead

Open bappa10085 opened this issue 3 years ago • 6 comments

Running an xgboost model through the caret package gives the following warning:

WARNING: amalgamation/../src/c_api/c_api.cc:718: ntree_limit is deprecated, use iteration_range instead.

Minimal, reproducible example:

library(caret)

#eXtreme Gradient Boosting
set.seed(123)
modelFit <- train(Species~., data=iris, 
                preProcess=c("center", "scale"), 
                method="xgbTree")

I have tried setting warning = FALSE and message = FALSE in the chunk options, but the warning still appears in the knitted document. How can I remove it?

Session Info:

>sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19042)

Matrix products: default

locale:
[1] LC_COLLATE=English_India.1252  LC_CTYPE=English_India.1252   
[3] LC_MONETARY=English_India.1252 LC_NUMERIC=C                  
[5] LC_TIME=English_India.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] caret_6.0-90    lattice_0.20-45 ggplot2_3.3.5  

loaded via a namespace (and not attached):
 [1] tidyselect_1.1.1     purrr_0.3.4          reshape2_1.4.4      
 [4] listenv_0.8.0        splines_4.1.2        colorspace_2.0-2    
 [7] vctrs_0.3.8          generics_0.1.1       stats4_4.1.2        
[10] utf8_1.2.2           survival_3.2-13      prodlim_2019.11.13  
[13] rlang_0.4.11         e1071_1.7-9          ModelMetrics_1.2.2.2
[16] pillar_1.6.4         glue_1.6.0           withr_2.4.3         
[19] DBI_1.1.2            xgboost_1.5.0.2      foreach_1.5.1       
[22] lifecycle_1.0.1      plyr_1.8.6           lava_1.6.10         
[25] stringr_1.4.0        timeDate_3043.102    munsell_0.5.0       
[28] gtable_0.3.0         future_1.23.0        recipes_0.1.17      
[31] codetools_0.2-18     parallel_4.1.2       class_7.3-19        
[34] fansi_0.5.0          Rcpp_1.0.7           scales_1.1.1        
[37] ipred_0.9-12         jsonlite_1.7.2       parallelly_1.30.0   
[40] digest_0.6.29        stringi_1.7.5        dplyr_1.0.7         
[43] grid_4.1.2           tools_4.1.2          magrittr_2.0.1      
[46] proxy_0.4-26         tibble_3.1.6         crayon_1.4.2        
[49] future.apply_1.8.1   pkgconfig_2.0.3      ellipsis_0.3.2      
[52] MASS_7.3-54          Matrix_1.3-4         data.table_1.14.2   
[55] pROC_1.18.0          lubridate_1.8.0      gower_0.2.2         
[58] assertthat_0.2.1     iterators_1.0.13     R6_2.5.1            
[61] globals_0.14.0       rpart_4.1-15         nnet_7.3-16         
[64] nlme_3.1-153         compiler_4.1.2

bappa10085, Jan 10 '22 09:01

Hello, I also encountered this problem. How did you solve it?

Jack-make, Apr 15 '22 13:04

You can do the following: just add verbosity = 0 within the train() call.
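For reference, the call from the original example with that argument added might look like the sketch below. Note that verbosity is not an argument of train() itself; the assumption is that caret forwards it through `...` to xgboost, as with the caret 6.0-90 and xgboost 1.5.0.2 versions listed in the session info. A plausible reason the chunk options have no effect is that the message is printed by xgboost's compiled code rather than raised as an R warning, although that is not confirmed in this thread.

```r
library(caret)

# eXtreme Gradient Boosting, same call as in the original example
set.seed(123)
modelFit <- train(
  Species ~ .,
  data = iris,
  preProcess = c("center", "scale"),
  method = "xgbTree",
  # assumption: forwarded via `...` to xgboost to silence its own messages
  verbosity = 0
)
```

This only hides the message; caret still passes the deprecated argument internally.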

bappa10085, Apr 15 '22 15:04

As suggested by missuse: "The current warning means xgboost is changing the name of an argument, but caret is still supplying the old name. Currently it works, but with new xgboost versions the argument will be completely replaced; if caret's function code is not updated by then, the warning will be replaced by an error." So it would be better if caret's function code were updated.

bappa10085, Jul 07 '22 07:07

Agreed. I am teaching a class using caret and I think these warnings are confusing for students.

ifellows, Nov 19 '22 23:11

@topepo Any plans to change this please? Seems a simple one-liner?

Jon77Ruler, Mar 18 '24 11:03

Source of warning

The warning is triggered when predict() is called with the ntreelimit parameter. See the linked line below:

https://github.com/topepo/caret/blob/5f4bd2069bf486ae92240979f9d65b5c138ca8d4/models/files/xgbDART.R#L164

So yes, @Jon77Ruler, this is an easy fix: ntreelimit just needs to be replaced with iteration_range. I have posted a reprex below to demonstrate the issue using {xgboost}.
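As a rough sketch of what that replacement involves: assuming the R predict() method spells the new argument iterationrange and expects a 1-based, end-exclusive pair of boosting rounds (an assumption based on the xgboost 1.x R interface, and ignoring settings such as num_parallel_tree), an old tree limit n would correspond to c(1, n + 1). The helper name below is hypothetical and is not part of caret or xgboost.

```r
# Hypothetical helper (not part of caret or xgboost): translate an old-style
# tree limit into a 1-based, end-exclusive range of boosting rounds.
ntree_to_range <- function(ntreelimit) {
  stopifnot(
    is.numeric(ntreelimit),
    length(ntreelimit) == 1,
    ntreelimit >= 1
  )
  # rounds 1..ntreelimit, written as the half-open range [1, ntreelimit + 1)
  c(1, ntreelimit + 1)
}

# a limit of 2 rounds maps to the range c(1, 3)
ntree_to_range(2)
```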

However, I am not sure what the repository rules are for this kind of "simple" bug fix. @topepo wrote that {caret} is on the "backburner" (see issue https://github.com/topepo/caret/issues/1365), so it might be a while before we get a fix. It's not a breaking issue yet, but it might become one in the future.

Demonstration of problem and solution

library(xgboost)
data(
  agaricus.train, 
  package = 'xgboost'
)

data(
  agaricus.test, 
  package = 'xgboost'
)

# + estimate model
simple_model <- xgboost(
  data = agaricus.train$data,
  label = agaricus.train$label,
  nrounds = 2
)
#> [1]  train-rmse:0.350593 
#> [2]  train-rmse:0.246082

# + in caret
first <- predict(
  simple_model,
  agaricus.test$data,
  # in caret
  ntreelimit = 2
)
#> [10:44:28] WARNING: src/c_api/c_api.cc:935: `ntree_limit` is deprecated, use `iteration_range` instead.
second <- predict(
  simple_model, 
  agaricus.test$data, 
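  # in xgboost (the R argument is spelled `iterationrange`: 1-based,
  # end-exclusive; an unrecognised `iteration_range` would be ignored via `...`)
  iterationrange = c(1, 3)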
)
setequal(
  first,
  second
)
#> [1] TRUE

Created on 2024-07-19 with reprex v2.1.0
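A slightly stricter variant of the check above, as a sketch. With only two rounds trained, a limit of 2 selects the whole model, so the comparison cannot show whether the argument has any effect. The version below trains five rounds and compares element-wise with all.equal() instead of setequal(). It assumes the xgboost 1.x R interface, where the predict() argument is spelled iterationrange and takes a 1-based, end-exclusive range; newer releases may differ.

```r
library(xgboost)

data(agaricus.train, package = "xgboost")
data(agaricus.test, package = "xgboost")

# train more rounds than we predict with, so the limit actually matters
model <- xgboost(
  data = agaricus.train$data,
  label = agaricus.train$label,
  nrounds = 5,
  verbose = 0
)

# old-style limit: first 2 rounds (this call emits the deprecation warning)
p_old <- predict(model, agaricus.test$data, ntreelimit = 2)

# assumed new-style equivalent: rounds 1 and 2, i.e. the half-open range [1, 3)
p_new <- predict(model, agaricus.test$data, iterationrange = c(1, 3))

# all 5 rounds, for contrast
p_all <- predict(model, agaricus.test$data)

# element-wise comparisons
all.equal(p_old, p_new)
isTRUE(all.equal(p_old, p_all))
```

If the two limited predictions agree while the full-model prediction differs, the range is doing the same job as the old limit.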

serkor1, Jul 19 '24 08:07