
Training on GPU degrades F1 score

Open atomic opened this issue 1 year ago • 2 comments

Hi folks, we have run multiple experiments indicating that training on GPU on a very skewed dataset produces worse metrics than training on CPU. Both trainings are distributed using https://github.com/ray-project/xgboost_ray.

On the test set: with CPU training, we consistently produce F1 ≈ 0.15 and F2 ≈ 0.176. With GPU training, we only reach F1 ≈ 0.135 and F2 ≈ 0.166 at best over multiple runs.

XGBoost version: 1.5.2 (both CPU and GPU). Pre-processing steps: vectorize for DMatrix.
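For reference, a minimal single-node sketch of the kind of comparison described here (the actual runs are distributed via xgboost_ray; the dataset, threshold, and parameter values below are hypothetical placeholders, with only `tree_method` differing between the two configurations):

```python
# Minimal sketch of the CPU-vs-GPU comparison on a skewed binary problem.
# The real runs are distributed via xgboost_ray; plain xgboost is used here
# for illustration, and the data/parameters are hypothetical placeholders.
import numpy as np
import xgboost as xgb
from sklearn.metrics import fbeta_score

rng = np.random.default_rng(0)
X = rng.normal(size=(100_000, 50)).astype(np.float32)
y = (rng.random(100_000) < 0.01).astype(np.int32)  # heavily skewed positive class

split = int(0.7 * len(y))  # 70% train / 30% validation, as in the issue
dtrain = xgb.DMatrix(X[:split], label=y[:split])
dval = xgb.DMatrix(X[split:], label=y[split:])

def run(tree_method):
    params = {
        "objective": "binary:logistic",
        "eval_metric": "logloss",
        "max_depth": 8,
        "eta": 0.1,
        "tree_method": tree_method,  # "hist" (CPU) vs "gpu_hist" (GPU)
    }
    booster = xgb.train(params, dtrain, num_boost_round=1000)
    pred = (booster.predict(dval) > 0.5).astype(np.int32)  # threshold assumed
    return (fbeta_score(y[split:], pred, beta=1),   # F1
            fbeta_score(y[split:], pred, beta=2))   # F2

print("CPU (hist):    ", run("hist"))
print("GPU (gpu_hist):", run("gpu_hist"))
```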

atomic · Nov 08 '23 19:11

Hello atomic,

What version of XGBoost did you use for CPU and GPU?

What CPU and GPU hardware did you use? Can you list the tech specs?

Can you please describe the data preprocessing steps you took? Was this done the same for both CPU and GPU?

What hyperparameters did you use for CPU and GPU?

How did you do the train/test splits? Was randomization used for the batches?

  • Ty

tyrellto · Nov 14 '23 23:11

@tyrellto XGBoost version: 1.5.2, both for CPU and GPU.

As for preprocessing, it's done outside of XGBoost as a pre-training step that is identical for both CPU and GPU training, so I don't think it has anything to do with the discrepancy.

The hyperparameters we use are:

{
  type: "classification"
  max_depth: 8
  min_child_weight: 1.0
  max_count: 1000
  l2_regularization: 1.0
  learn_rate: 0.10000000149011612
  base_score: 0.5
  column_sample_by_level: 1.0
  column_sample_by_tree: 1.0
  loss: "logloss"
  monotone_constraints: "(0)"
  subsample_ratio: 1.0
  tree_method: "gpu_hist"
  scale_pos_weight: 1.0
  optimization_objective: "binary:logistic"
}

The split is 70% training / 30% validation.
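For clarity, the wrapper keys above map roughly onto native XGBoost parameters as sketched below (the mapping, e.g. `max_count` → `num_boost_round` and `learn_rate` → `eta`, is an assumption rather than something stated in the config); only `tree_method` differs between the CPU and GPU runs:

```python
# Assumed mapping of the wrapper config above onto native XGBoost parameters.
# "max_count" is taken to be the number of boosting rounds; this is an assumption.
common_params = {
    "objective": "binary:logistic",   # optimization_objective
    "eval_metric": "logloss",         # loss
    "max_depth": 8,
    "min_child_weight": 1.0,
    "lambda": 1.0,                    # l2_regularization
    "eta": 0.1,                       # learn_rate
    "base_score": 0.5,
    "colsample_bylevel": 1.0,         # column_sample_by_level
    "colsample_bytree": 1.0,          # column_sample_by_tree
    "subsample": 1.0,                 # subsample_ratio
    "scale_pos_weight": 1.0,
    "monotone_constraints": "(0)",
}
num_boost_round = 1000                # max_count (assumed)

cpu_params = {**common_params, "tree_method": "hist"}
gpu_params = {**common_params, "tree_method": "gpu_hist"}
```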

atomic · Nov 30 '23 03:11