
[Question/Bug] Out of memory issues while building inference

Open tizycki opened this issue 4 years ago • 3 comments

Describe the bug Every time I try to convert my xgboost model (using daal4py.get_gbt_model_from_xgboost) I run out of memory. I moved the training script from my local machine to GCP, kept increasing resources, and finally gave up at n1-highmem-32 (204GB of RAM). Is there any way to reduce the number of workers, or otherwise accomplish this task? I understand that my model is quite complex, but 204GB of RAM should be sufficient for a task of this kind.

To Reproduce Work in progress
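
A hypothetical minimal sketch of the failing step (not the reporter's actual script); "model.xgb" is a placeholder file name:

import xgboost as xgb
import daal4py

# Load a previously trained model from disk.
booster = xgb.Booster()
booster.load_model("model.xgb")

# This conversion is the call that reportedly exhausts memory.
daal_model = daal4py.get_gbt_model_from_xgboost(booster)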

Expected behavior Being able to build daal4py inference for my xgboost model with limited resources (e.g. 102GB of RAM).

Output/Screenshots None

Environment: GCP n1-highmem-32 machine. daal4py installed using pip.

tizycki avatar Mar 30 '21 10:03 tizycki

Hi tizycki, We checked the memory consumption of the model converter by saving all the necessary information to a file and performing a clean conversion (without importing xgboost). The file size for a rather large model of ours came out at a little under 20MB. You can get the same file for your workload by calling model_xgb.dump_model(filename) and checking its size. Several runs showed RAM consumption slightly above 2x the size of the saved model throughout the conversion process, which is expected: we read the file and build an equivalent model in the oneDAL representation. Could you please share more details about your workload? Your dataset and xgboost training parameters would be a good place to start.
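
For reference, a self-contained sketch of that size check; the toy dataset and model parameters below are placeholders, and depending on the xgboost version the dump may need to go through get_booster():

import os
from sklearn.datasets import make_classification
from xgboost import XGBClassifier

# Train a small placeholder model just to make the sketch runnable.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
model_xgb = XGBClassifier(n_estimators=10, max_depth=4).fit(X, y)

# Dump all trees to a text file and report its size on disk.
model_xgb.get_booster().dump_model("model_dump.txt")
size_mb = os.path.getsize("model_dump.txt") / (1024 ** 2)
print(f"dump size: {size_mb:.2f} MB")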

RukhovichIV avatar Mar 31 '21 08:03 RukhovichIV

@RukhovichIV My model saved using model_xgb.save_model in xgb format is 48.2MB; saved as a pickle it is 91.6MB. The classifier configuration is the following:

XGBClassifier(base_score=0.5, booster='gbtree',
              colsample_bylevel=0.4503841871781403, colsample_bynode=1,
              colsample_bytree=0.9195352964526833, eval_metric='logloss',
              gamma=8.168958221061441e-09, gpu_id=-1, importance_type='gain',
              interaction_constraints='', learning_rate=0.07356404539935663,
              max_delta_step=5, max_depth=91, min_child_weight=2, missing=nan,
              monotone_constraints='()', n_estimators=388, n_jobs=-1,
              num_parallel_tree=1, random_state=0,
              reg_alpha=0.00010376808625045426, reg_lambda=476.96194787286544,
              scale_pos_weight=1.3165669602830552, subsample=0.387658500562527,
              tree_method='approx', use_label_encoder=False,
              validate_parameters=1, verbosity=None)

I cannot share the model and dataset because of sensitive data, but I'll try to create a synthetic dataset to reproduce this issue.
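
A hedged sketch of what such a synthetic reproduction might look like; the dataset shape is an assumption, and only a few parameters from the config above are kept. The very deep trees (max_depth=91) combined with 388 estimators are the most plausible source of the blow-up, since node count, and hence converter memory, can grow rapidly with depth:

from sklearn.datasets import make_classification
from xgboost import XGBClassifier
import daal4py

# Placeholder dataset; the real data is sensitive and not shared.
X, y = make_classification(n_samples=200_000, n_features=50,
                           n_informative=30, random_state=0)

# Key suspects from the config above: very deep trees, many estimators.
clf = XGBClassifier(n_estimators=388, max_depth=91,
                    subsample=0.39, tree_method='approx',
                    use_label_encoder=False)
clf.fit(X, y)

# The step that runs out of memory on the real model.
daal_model = daal4py.get_gbt_model_from_xgboost(clf.get_booster())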

tizycki avatar Apr 06 '21 09:04 tizycki

I can also add that the daal4py conversion works when the classifier is trained on a small sample of the dataset, so it is not an environment issue.

tizycki avatar Apr 06 '21 09:04 tizycki

You can try reducing the number of threads used for inference by calling daalinit:

import daal4py
daal4py.daalinit(nthreads=2)

A good starting point would be to limit it to the number of physical cores on the system.
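
A sketch of that, assuming psutil for counting physical cores (any equivalent method works):

import daal4py
import psutil

# Query the physical (non-hyperthreaded) core count and cap daal4py threads.
physical_cores = psutil.cpu_count(logical=False)
daal4py.daalinit(nthreads=physical_cores)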

napetrov avatar Apr 28 '23 15:04 napetrov