
[Bug] LightGBMError: bin size 257 cannot run on GPU

Open · rohan-gt opened this issue · 13 comments

I'm getting the following error while running the latest LightGBM on GPU with these params:

params = {
    'device_type': 'gpu',
    'gpu_device_id': 0,
    'gpu_platform_id': 0,
    'gpu_use_dp': 'false',
    'max_bin': 255,
}

on Google Colab using this Kaggle dataset: https://www.kaggle.com/c/ieee-fraud-detection. I'm dropping all the categorical variables:

LightGBMError: bin size 257 cannot run on GPU
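
For reference, here's roughly what I'm running (a minimal sketch; train_transaction.csv and the isFraud target are from the competition data, and objective is added here for completeness):

import lightgbm as lgb
import pandas as pd

# load the competition data and keep only numerical columns,
# i.e. drop all the categorical variables
train = pd.read_csv("train_transaction.csv")
y = train["isFraud"]
X = train.drop(columns=["isFraud"]).select_dtypes(include="number")

params = {
    'objective': 'binary',
    'device_type': 'gpu',
    'gpu_device_id': 0,
    'gpu_platform_id': 0,
    'gpu_use_dp': 'false',
    'max_bin': 255,
}
booster = lgb.train(params, lgb.Dataset(X, label=y))  # raises the error above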

rohan-gt · Aug 28 '20

I have the same problem. Is there any way to solve it?

hengzhe-zhang · Oct 22 '20

Sorry for missing this issue. max_bin actually cannot limit the number of bins for categorical features. There are two workarounds (see the sketch after this list):

  1. Use categorical encodings, converting categorical features to numerical ones.
  2. Split one categorical feature into multiple categorical features, and make sure the number of categories in each split feature is smaller than 256.
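
A minimal sketch of both workarounds, assuming a pandas DataFrame df whose column "cat" is a high-cardinality categorical (all column names here are hypothetical):

import pandas as pd

# toy frame with one high-cardinality categorical column
df = pd.DataFrame({"cat": [f"id_{i % 400}" for i in range(2000)]})

# Workaround 1: replace the categorical column with a numerical encoding
# (a simple frequency encoding here), so max_bin applies to it
df["cat_freq"] = df["cat"].map(df["cat"].value_counts(normalize=True))

# Workaround 2: split one categorical feature into several smaller ones,
# each with fewer than 256 categories, by decomposing the category codes
# base-256; then train with categorical_feature=["cat_lo", "cat_hi"]
codes = df["cat"].astype("category").cat.codes
df["cat_lo"] = codes % 256
df["cat_hi"] = codes // 256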

guolinke · Oct 22 '20

I hit this randomly, with categorical_feature explicitly set to empty, so it has nothing to do with categorical features. The test that hit this has passed 1000 times before.

File "/opt/h2oai/dai/cuda-10.0/lib/python3.6/site-packages/lightgbm_gpu/sklearn.py", line 794, in fit
    categorical_feature=categorical_feature, callbacks=callbacks, init_model=init_model)
  File "/opt/h2oai/dai/cuda-10.0/lib/python3.6/site-packages/lightgbm_gpu/sklearn.py", line 637, in fit
    callbacks=callbacks, init_model=init_model)
  File "/opt/h2oai/dai/cuda-10.0/lib/python3.6/site-packages/lightgbm_gpu/engine.py", line 230, in train
    booster = Booster(params=params, train_set=train_set)
  File "/opt/h2oai/dai/cuda-10.0/lib/python3.6/site-packages/lightgbm_gpu/basic.py", line 2104, in __init__
    ctypes.byref(self.handle)))
  File "/opt/h2oai/dai/cuda-10.0/lib/python3.6/site-packages/lightgbm_gpu/basic.py", line 52, in _safe_call
    raise LightGBMError(_LIB.LGBM_GetLastError().decode('utf-8'))
lightgbm.basic.LightGBMError: bin size 257 cannot run on GPU

The number of bins was 255, and no categorical features were explicitly chosen.

pseudotensor · Mar 16 '21

The same issue happened for me:

  File "/usr/local/lib/python3.6/dist-packages/lightgbm/engine.py", line 228, in train
    booster = Booster(params=params, train_set=train_set)
  File "/usr/local/lib/python3.6/dist-packages/lightgbm/basic.py", line 2237, in __init__
    ctypes.byref(self.handle)))
  File "/usr/local/lib/python3.6/dist-packages/lightgbm/basic.py", line 110, in _safe_call
    raise LightGBMError(_LIB.LGBM_GetLastError().decode('utf-8'))
lightgbm.basic.LightGBMError: bin size 257 cannot run on GPU

George3d6 · Jul 18 '21

Some of the features in my case are categorical, but none has as many as 257 distinct values. Combined with @pseudotensor's comment, I assume this is something else.

George3d6 · Jul 18 '21

I am also getting the same error.

lightgbm.basic.LightGBMError: bin size 257 cannot run on GPU

lewis-morris · Jan 12 '22

What causes this error? Is it that the bin size of a categorical feature is bigger than max_bin, or is it that there is not enough memory? The model works on CPU. Thank you!

MAxx8371 · Feb 14 '22

lightgbm.basic.LightGBMError: bin size 407 cannot run on GPU

This is a bug in LightGBM on GPU; on CPU it works fine, so LightGBM's GPU support needs improvement.

jiluojiluo · Jul 11 '22

Same error encountered, any update?

ChiHangChen · Jul 27 '22

Same here.

lightgbm.basic.LightGBMError: bin size 670 cannot run on GPU

aforadi · Aug 04 '22

It seems that I've identified the cause of the error: the calculation of num_total_bin used during Exclusive Feature Bundling (https://github.com/microsoft/LightGBM/blob/665c47313d6938a8d80559c824073a142c8bd870/src/io/dataset.cpp#L134) doesn't align completely with the way num_total_bin is calculated when creating a FeatureGroup (https://github.com/microsoft/LightGBM/blob/665c47313d6938a8d80559c824073a142c8bd870/include/LightGBM/feature_group.h#L68). As a result, the max_bin_per_group limit (= 256) is enforced during bundling, but not when the FeatureGroup is created. When I replaced the GetDefaultBin() at dataset.cpp#L134 with GetMostFreqBin(), the issue was resolved. I tested with the case reported in https://github.com/microsoft/LightGBM/issues/4082.
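
To make the off-by-one concrete, here is a toy illustration (the numbers and the exact subtraction rule are assumptions for illustration, not the verbatim LightGBM logic): both passes fold one shared bin per feature into the group's zero offset, but they test different bins, so they can disagree by one bin per feature.

num_bins       = [128, 130]    # bins per feature in one EFB bundle
default_is_0   = [True, True]  # what the bundling pass checks (GetDefaultBin)
most_freq_is_0 = [True, False] # what FeatureGroup checks (GetMostFreqBin)

# bundling pass: 127 + 129 = 256, so the bundle is accepted
bundling_total = sum(b - d for b, d in zip(num_bins, default_is_0))
# FeatureGroup: 127 + 130 = 257 -> "bin size 257 cannot run on GPU"
group_total = sum(b - m for b, m in zip(num_bins, most_freq_is_0))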

CVPaul · Aug 04 '23

Same issue here. Can we prioritize a PR with the fix?

XQ-UT · Nov 28 '23