
XGBoost is much slower on Ryzen 7 3700X than on Core i5-1135G7 (with the same performance rating)

Open ibobak opened this issue 2 years ago • 3 comments

I have XGBoost 2.0.0 installed on two machines:

  • one with a 4-core Intel Core i5-1135G7
  • another with an 8-core AMD Ryzen 7 3700X

Both CPUs have almost the same single-thread rating (according to the PassMark website), while the multi-thread rating of the Ryzen 7 is more than twice as high.

I am running the same code on the same data on both PCs. The code performs a hyperparameter search with Optuna, training an XGBoost model in each trial. Optuna measures the time of every single trial, so I can build a histogram of the model-training time, and this is what I see:

[screenshot: ksnip_20231018-144230]

[screenshot: ksnip_20231018-144228]

It was not a surprise to me that Optuna completed about 1500 trials on both PCs: the Ryzen has twice as many cores as the Core i5, but XGBoost training is twice as slow per trial, so we end up with the same number of iterations.
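For reference, the per-trial timing can be reproduced with a sketch like the following (a minimal stand-in: the real code uses an Optuna objective with a sampled hyperparameter set, and `train_fn` represents the actual XGBoost training call):

```python
import time


def time_trials(train_fn, n_trials):
    """Run train_fn n_trials times and return per-trial durations in seconds.

    This mirrors what Optuna records per trial; in the real code each
    trial trains an XGBoost model with sampled hyperparameters.
    """
    durations = []
    for _ in range(n_trials):
        start = time.perf_counter()
        train_fn()
        durations.append(time.perf_counter() - start)
    return durations
```

A histogram of the returned `durations` (e.g. with matplotlib) yields plots like the ones above.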

I tried to recompile XGBoost with different optimization flags on the Ryzen machine:

  • -march=native, -march=znver2
  • -O3
  • -flto
  • -mavx2
  • -mfma

But none of this helps.
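For what it's worth, one way to confirm that the recompiled library is the one Python actually imports is `xgboost.build_info()`, which reports compile-time details on recent versions. A guarded sketch (assuming XGBoost >= 1.6; returns None when xgboost is not installed):

```python
def xgboost_build_summary():
    """Return version and build info of the installed XGBoost, or None.

    xgboost.build_info() exposes compile-time details and is handy for
    verifying that a locally recompiled wheel is the one being loaded.
    """
    try:
        import xgboost
    except ImportError:
        return None
    return {"version": xgboost.__version__, "build": xgboost.build_info()}
```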

ibobak avatar Oct 18 '23 11:10 ibobak

There are some ad-hoc internal constants scattered around the code base that can be tuned. If you are interested in this, I can try to gather them into one place.

trivialfis avatar Oct 18 '23 12:10 trivialfis

Yes, I'd gladly change those constants and try to recompile, and I will report the results here.

ibobak avatar Oct 18 '23 12:10 ibobak

Please take a look at https://github.com/dmlc/xgboost/pull/9694, specifically tuning.h. Considering that AMD is known for putting large caches in its CPUs, you might want to increase some of the parameters.
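If you want to sanity-check the actual cache sizes before adjusting those constants, on Linux they are exposed through sysfs. A sketch (the paths are kernel-dependent and absent on other operating systems, so this returns None when unavailable):

```python
from pathlib import Path


def l3_cache_size(cpu=0):
    """Read the L3 cache size string (e.g. '32768K') for a CPU on Linux.

    Returns None when the sysfs entries are missing (non-Linux system,
    or the CPU reports no L3 cache).
    """
    base = Path(f"/sys/devices/system/cpu/cpu{cpu}/cache")
    if not base.exists():
        return None
    for index in sorted(base.glob("index*")):
        level = (index / "level").read_text().strip()
        ctype = (index / "type").read_text().strip()
        if level == "3" and ctype in ("Unified", "Data"):
            return (index / "size").read_text().strip()
    return None
```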

trivialfis avatar Oct 19 '23 04:10 trivialfis