
Suppressing warning message 'WARNING: amalgamation/../src/learner.cc:438'

Open huangwb8 opened this issue 1 year ago • 3 comments

Recently, I upgraded to xgboost v1.6.0.1 in R. When I use a model trained with an older xgboost version to predict, the following log appears:

WARNING: amalgamation/../src/learner.cc:438: 
  If you are loading a serialized model (like pickle in Python, RDS in R) generated by
  older XGBoost, please export the model by calling `Booster.save_model` from that version
  first, then load it back in current version. See:

    https://xgboost.readthedocs.io/en/latest/tutorials/saving_model.html

  for more details about differences between saving model and serializing.

This warning message cannot be suppressed by setting verbose=F or verbose=0:

pred <- predict(object, newdata, verbose=0)

After reading the introduction at https://xgboost.readthedocs.io/en/latest/tutorials/saving_model.html, I still have no idea how to get rid of this log. Any suggestions?
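For context, wrapping the call in R's usual silencers would presumably not help either, since the message seems to be printed by XGBoost's native (C++) logger rather than raised as an R warning or message. A minimal sketch (using the bundled agaricus data purely for illustration):

```r
library(xgboost)

# Train a tiny throwaway model so the snippet is self-contained
data(agaricus.train, package = "xgboost")
bst <- xgboost(data = agaricus.train$data, label = agaricus.train$label,
               nrounds = 2, objective = "binary:logistic", verbose = 0)

# These silence R-level warnings/messages only, not text written
# directly to the console by XGBoost's C++ logging:
pred <- suppressWarnings(predict(bst, agaricus.train$data))
pred <- suppressMessages(predict(bst, agaricus.train$data))
```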

huangwb8 avatar Sep 16 '22 05:09 huangwb8

By the way, the code and the model work fine; the only problem is the annoying logs.

huangwb8 avatar Sep 16 '22 05:09 huangwb8

I also tried code like the following, but it does not suppress the warning message either:

xgb.save(old_bst, 'xgb.model')
new_bst <- xgb.load('xgb.model')
if (file.exists('xgb.model')){
  file.remove('xgb.model')
  pred <- predict(new_bst, newdata)
}

Any suggestions?

huangwb8 avatar Sep 16 '22 05:09 huangwb8

This seems like an issue with detecting the old model. Let me take a deeper look tomorrow.

trivialfis avatar Sep 18 '22 17:09 trivialfis

Hi, would you mind sharing your model (possibly in private)? I couldn't reproduce the warning using models from 1.0.0.

trivialfis avatar Oct 05 '22 00:10 trivialfis

@trivialfis OK!

Here is the GitHub repository of my book, where I use xgboost to solve classification problems.

I used some code based on the GSClassifier package I developed:

# Package
# Install "devtools" package
if (!requireNamespace("devtools", quietly = TRUE))
  install.packages("devtools")
# Install dependencies
if (!requireNamespace("luckyBase", quietly = TRUE))
  devtools::install_github("huangwb8/luckyBase")
# Install the "GSClassifier" package
if (!requireNamespace("GSClassifier", quietly = TRUE))
  devtools::install_github("huangwb8/GSClassifier")
# Load needed packages
library(GSClassifier)


# Data
testData <- readRDS(system.file("extdata", "testData.rds", package = "GSClassifier"))
M <- readRDS(system.file("extdata", "PAD.train_20220916.rds", package = "GSClassifier")) 

# Test data and model
X <- testData$PanSTAD_expr_part
X_bined <- GSClassifier:::trainDataProc_X(X[,1:10], geneSet = M$geneSet)
X_bined_matrix <- X_bined$dat$Xbin
model <- M$ens$Model[[1]]$`2`
X_bined_matrix2 <- X_bined_matrix[,model$genes]

# Prediction
pred <- predict(model$bst, X_bined_matrix2)

Here is the actual output containing the warning message about xgboost version via RMarkdown of RStudio/Posit:

[screenshot: RMarkdown output containing the xgboost version warning]

Then I inspected model$bst, the underlying xgboost model:

print(model$bst)

The output looks like:

##### xgb.Booster
raw: 60.3 Kb 
call:
  xgb.train(params = params, data = dtrain, nrounds = nrounds, 
    watchlist = watchlist, verbose = verbose, print_every_n = print_every_n, 
    early_stopping_rounds = early_stopping_rounds, maximize = maximize, 
    save_period = save_period, save_name = save_name, xgb_model = xgb_model, 
    callbacks = callbacks, max_depth = ..1, eta = ..2, nthread = ..3, 
    objective = "binary:logistic")
params (as set within xgb.train):
  max_depth = "10", eta = "0.5", nthread = "10", objective = "binary:logistic", validate_parameters = "TRUE"
xgb.attributes:
  niter
callbacks:
  cb.print.evaluation(period = print_every_n)
  cb.evaluation.log()
# of features: 352 
niter: 15
nfeatures : 352 
evaluation_log:
    iter train_logloss
       1      0.400678
       2      0.267733
---                   
      14      0.031730
      15      0.029224

Finally, here is the information about the current R session:

R version 4.0.3 (2020-10-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)

Matrix products: default

locale:
[1] LC_COLLATE=Chinese (Simplified)_China.936 
[2] LC_CTYPE=Chinese (Simplified)_China.936   
[3] LC_MONETARY=Chinese (Simplified)_China.936
[4] LC_NUMERIC=C                              
[5] LC_TIME=Chinese (Simplified)_China.936    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] GSClassifier_0.1.25 luckyBase_0.1.0    

loaded via a namespace (and not attached):
  [1] colorspace_2.0-3     ggsignif_0.6.3       rjson_0.2.21        
  [4] ellipsis_0.3.2       class_7.3-20         rprojroot_2.0.3     
  [7] circlize_0.4.15      GlobalOptions_0.1.2  fs_1.5.2            
 [10] clue_0.3-57          ggpubr_0.4.0         listenv_0.8.0       
 [13] remotes_2.4.2        prodlim_2019.11.13   fansi_1.0.3         
 [16] lubridate_1.8.0      codetools_0.2-18     splines_4.0.3       
 [19] doParallel_1.0.17    cachem_1.0.6         knitr_1.30          
 [22] pkgload_1.2.4        jsonlite_1.8.0       pROC_1.18.0         
 [25] caret_6.0-92         broom_1.0.0          cluster_2.1.3       
 [28] png_0.1-7            compiler_4.0.3       backports_1.4.1     
 [31] assertthat_0.2.1     Matrix_1.4-1         fastmap_1.1.0       
 [34] cli_3.3.0            htmltools_0.5.2      prettyunits_1.1.1   
 [37] tools_4.0.3          gtable_0.3.0         glue_1.6.2          
 [40] reshape2_1.4.4       dplyr_1.0.9          Rcpp_1.0.8.3        
 [43] carData_3.0-5        vctrs_0.4.1          nlme_3.1-149        
 [46] iterators_1.0.14     timeDate_3043.102    gower_1.0.0         
 [49] xfun_0.33            stringr_1.4.0        globals_0.15.1      
 [52] ps_1.4.0             testthat_3.1.0       lifecycle_1.0.1     
 [55] devtools_2.4.3       rstatix_0.7.0        future_1.26.1       
 [58] MASS_7.3-53          scales_1.2.0         ipred_0.9-12        
 [61] parallel_4.0.3       RColorBrewer_1.1-3   ComplexHeatmap_2.4.3
 [64] yaml_2.3.5           memoise_2.0.1        ggplot2_3.3.6       
 [67] rpart_4.1.16         stringi_1.7.6        desc_1.4.1          
 [70] randomForest_4.6-14  foreach_1.5.2        hardhat_1.1.0       
 [73] pkgbuild_1.3.1       lava_1.6.10          shape_1.4.6         
 [76] tuneR_1.4.0          rlang_1.0.2          pkgconfig_2.0.3     
 [79] evaluate_0.15        lattice_0.20-41      purrr_0.3.4         
 [82] recipes_0.2.0        processx_3.7.0       tidyselect_1.1.2    
 [85] parallelly_1.32.0    plyr_1.8.7           magrittr_2.0.3      
 [88] bookdown_0.21        R6_2.5.1             generics_0.1.2      
 [91] DBI_1.1.3            pillar_1.7.0         withr_2.5.0         
 [94] abind_1.4-5          survival_3.3-1       nnet_7.3-17         
 [97] tibble_3.1.7         future.apply_1.9.0   crayon_1.5.1        
[100] car_3.1-0            xgboost_1.6.0.1      utf8_1.2.2          
[103] rmarkdown_2.14       GetoptLong_1.0.5     usethis_2.1.3       
[106] grid_4.0.3           data.table_1.14.2    callr_3.7.0         
[109] ModelMetrics_1.2.2.2 digest_0.6.29        tidyr_1.2.0         
[112] signal_0.7-7         stats4_4.0.3         munsell_0.5.0       
[115] sessioninfo_1.2.2 

I hope this helps with your debugging. Thank you very much!

huangwb8 avatar Oct 05 '22 00:10 huangwb8

The model training process looks like this:

cvRes <- xgb.cv(data = dtrain,
                nrounds = params$nrounds,
                nthread = params$nthread,
                nfold = params$nfold,
                max_depth = params$max_depth,
                eta = params$eta,
                early_stopping_rounds = 2,
                metrics = list("logloss", "auc"),
                objective = "binary:logistic",
                verbose = verbose)

bst <- xgboost(data = Xbin,
               label = Ybin,
               max_depth = params$max_depth,
               eta = params$eta,
               nrounds = cvRes$best_iteration,
               nthread = params$nthread,
               objective = "binary:logistic",
               verbose = ifelse(verbose, 1, 0))

Any suggestions?

huangwb8 avatar Oct 05 '22 01:10 huangwb8

Hi, I appreciate the detailed information, but could you please extract a simpler code snippet that I can copy and run? I tried to install your GSClassifier package but ran into version conflicts after working through other package-installation issues:

Warning message:
package ‘ComplexHeatmap’ is not available for this version of R

It would be great if there's a simpler way that I can get started.

trivialfis avatar Oct 09 '22 08:10 trivialfis

@trivialfis Thanks for your attention!

First, download the files testData.rds and PAD.train_20220916.rds from here: https://github.com/huangwb8/GSClassifier/tree/master/inst/extdata

Second, just use code like:

testData <- readRDS('E:/RCloud/RFactory/GSClassifier/inst/extdata/testData.rds')
M <- readRDS('E:/RCloud/RFactory/GSClassifier/inst/extdata/PAD.train_20220916.rds') 

# Test data and model
X <- testData$PanSTAD_expr_part
X_bined <- GSClassifier:::trainDataProc_X(X[,1:10], geneSet = M$geneSet)
X_bined_matrix <- X_bined$dat$Xbin
model <- M$ens$Model[[1]]$`2`
X_bined_matrix2 <- X_bined_matrix[,model$genes]

# Prediction
pred <- predict(model$bst, X_bined_matrix2)

Replace the paths in the readRDS calls with the local paths of these files before running the code above.

I hope this helps!

huangwb8 avatar Oct 09 '22 10:10 huangwb8

I'm not sure why https://github.com/dmlc/xgboost/issues/8248#issuecomment-1248940061 doesn't work on your environment.

I tried both the master branch and the 1.6 release (from CRAN and built from source) and ran this (your script with an extra for loop):

library(xgboost)
testData <- readRDS('./testData.rds')
M <- readRDS('./PAD.train_20220916.rds')

# Test data and model
X <- testData$PanSTAD_expr_part
X_bined <- GSClassifier:::trainDataProc_X(X[,1:10], geneSet = M$geneSet)
X_bined_matrix <- X_bined$dat$Xbin
model <- M$ens$Model[[1]]$`2`
X_bined_matrix2 <- X_bined_matrix[,model$genes]

print("save")
xgb.save(model$bst, "saved-old")

print("load")
loaded <- xgb.load("saved-old")
## model$bst
# Prediction
print("prediction")
for (i in seq(0, 4)) {
  print(i)
  pred <- predict(loaded, X_bined_matrix2)
}

Here is the output; the warning is only generated during the call to xgb.save, and the prediction is silent.

[1] "save"
[16:50:24] WARNING: amalgamation/../src/learner.cc:1040: 
  If you are loading a serialized model (like pickle in Python, RDS in R) generated by
  older XGBoost, please export the model by calling `Booster.save_model` from that version
  first, then load it back in current version. See:

    https://xgboost.readthedocs.io/en/latest/tutorials/saving_model.html

  for more details about differences between saving model and serializing.

[16:50:24] WARNING: amalgamation/../src/learner.cc:749: Found JSON model saved before XGBoost 1.6, please save the model using current version again. The support for old JSON model will be discontinued in XGBoost 2.3.
[16:50:24] WARNING: amalgamation/../src/learner.cc:438: 
  If you are loading a serialized model (like pickle in Python, RDS in R) generated by
  older XGBoost, please export the model by calling `Booster.save_model` from that version
  first, then load it back in current version. See:

    https://xgboost.readthedocs.io/en/latest/tutorials/saving_model.html

  for more details about differences between saving model and serializing.

[1] TRUE
[1] "load"
[1] "prediction"
[1] 0
[1] 1
[1] 2
[1] 3
[1] 4

trivialfis avatar Oct 10 '22 08:10 trivialfis

Yep, that's the problem!

The code above was just a simple example.

During real model training and subtype calling via GSClassifier, a similar process is repeated many times, which produces many similar warnings that are redundant and annoying.

Well, the model actually works. I just want to hide the warning messages when using xgboost, if possible. However, neither verbose=F nor verbose=0 helps.

Any suggestions?

huangwb8 avatar Oct 10 '22 09:10 huangwb8

You mentioned that you tried to save and load the model using the latest xgboost https://github.com/dmlc/xgboost/issues/8248#issuecomment-1248940061 . Can you replace the old model with the new one?
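A minimal sketch of that replacement, assuming the object layout shown earlier in this thread (M$ens$Model[[1]]$`2`$bst) and xgboost >= 1.6; file names are illustrative:

```r
library(xgboost)

# Load the ensemble containing the old booster
M <- readRDS("PAD.train_20220916.rds")
old_bst <- M$ens$Model[[1]]$`2`$bst

# Export with the new xgboost (warning fires once here), then load the
# exported model back; the reloaded booster is in the current format
xgb.save(old_bst, "model.json")
M$ens$Model[[1]]$`2`$bst <- xgb.load("model.json")

# Persist the refreshed ensemble so later predict() calls stay silent
saveRDS(M, "PAD.train_updated.rds")
```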

trivialfis avatar Oct 10 '22 09:10 trivialfis

@trivialfis Thanks for your suggestions! I think that is the best solution. The models should be updated to the latest version to avoid the warning.

huangwb8 avatar Oct 10 '22 09:10 huangwb8

You are welcome!

trivialfis avatar Oct 10 '22 09:10 trivialfis