
enable convert_lightgbm to output tensor type

Open huzq2016 opened this issue 3 years ago • 12 comments

Description: When I used convert_lightgbm, I got a graph (lightgbm) whose output includes probabilities as a map type (for example, [{0: 0.9, 1: 0.1}, {0: 0.2, 1: 0.8}]), which is not yet supported by onnxruntime server (see: https://github.com/microsoft/onnxruntime/issues/2385).

By comparison, when I used convert_xgboost, I got a graph (xgboost) whose output includes a probabilities tensor (i.e., [0.1, 0.8]), which is supported by onnxruntime server.
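To make the difference between the two output types concrete, here is a minimal sketch in plain Python that turns map-style output into a dense tensor-like nested list. The helper name `zipmap_to_tensor` is hypothetical and is not part of onnxmltools or onnxruntime; it only illustrates a client-side workaround, assuming the class labels are known in advance. The exact tensor shape the converters emit is not shown in this thread, so the `[n_samples, n_classes]` layout is an assumption.

```python
from typing import Dict, Hashable, List, Sequence


def zipmap_to_tensor(
    rows: Sequence[Dict[Hashable, float]],
    classes: Sequence[Hashable],
) -> List[List[float]]:
    """Turn map-style output (one {label: probability} dict per sample)
    into a dense [n_samples, n_classes] nested list, with columns ordered
    as in `classes`."""
    return [[float(row[c]) for c in classes] for row in rows]


# Map-style output, using the example values from the description.
map_output = [{0: 0.9, 1: 0.1}, {0: 0.2, 1: 0.8}]

tensor_output = zipmap_to_tensor(map_output, classes=[0, 1])
print(tensor_output)  # [[0.9, 0.1], [0.2, 0.8]]
```

This only helps when the caller receives the map output itself; it does not help if the serving layer rejects the map type before returning a result, which is the situation described in the linked onnxruntime issue.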

The map type in convert_lightgbm is most likely caused by https://github.com/onnx/onnxmltools/blob/master/onnxmltools/convert/lightgbm/operator_converters/LightGbm.py#L454. The corresponding code in convert_xgboost is https://github.com/onnx/onnxmltools/blob/master/onnxmltools/convert/xgboost/operator_converters/XGBoost.py#L291, which does not appear to use ZipMap.

Describe the solution you'd like: Converting a LightGBM model to ONNX should be able to output the probabilities as a tensor type.

huzq2016 avatar Mar 11 '21 09:03 huzq2016

@xadupre , can you help with that?

wenbingl avatar Mar 11 '21 18:03 wenbingl

It is possible to remove the ZipMap operator by using options. The first example, "Convert a pipeline with a LightGBM model", shows how to use sklearn-onnx with LightGBM. The second, "One model, many possible conversions with options", shows how to remove the ZipMap node.

xadupre avatar Mar 11 '21 18:03 xadupre

Thanks @xadupre. The example only covers sklearn-onnx. Could a similar solution be implemented for convert_lightgbm in onnxmltools?

huzq2016 avatar Mar 11 '21 21:03 huzq2016

It could be, but I would probably choose a simpler way: just adding a parameter, convert_lightgbm(..., zipmap=True).

xadupre avatar Mar 11 '21 23:03 xadupre

@xadupre, we have actually tried sklearn-onnx, but it always failed due to some issue. So it would be better if this could be implemented here.

huzq2016 avatar Mar 12 '21 02:03 huzq2016

Ok I have two bugs to fix then. I'll start by adding a parameter zipmap to convert_lightgbm.

xadupre avatar Mar 12 '21 08:03 xadupre

I started to work on this on PR #452.

xadupre avatar Mar 12 '21 09:03 xadupre

Much appreciated, thank you for your kind help, @xadupre.

huzq2016 avatar Mar 12 '21 09:03 huzq2016

@xadupre, sorry I forgot to mention that my model type is lightgbm.basic.Booster

The code snippet is as follows:

import lightgbm

# First, a model was trained with mmlspark.LightGBMClassifier,
# then the model was saved to a text file,
# then lightgbm.Booster was used to load the saved model.
bst = lightgbm.Booster(model_file='mmpspark_lightgbmclassifier_model.txt')

print('The type of model: ', type(bst))
# The type of model:  <class 'lightgbm.basic.Booster'>

I found that the code here does not handle the Booster type.

huzq2016 avatar Mar 15 '21 15:03 huzq2016

Hi, I am having the same issue. What is the solution for this? I trained the Lightgbm Binary Classifier using LightGBM.exe and the output probabilities type is 'SEQUENCE TYPE'. I am getting 'Allocation Failure' while inferencing because of this. Please let me know the solution. Thanks.

Bhuvanamitra avatar Aug 03 '21 05:08 Bhuvanamitra

A new version was released two days ago. Did you try with this one?

xadupre avatar Aug 23 '21 12:08 xadupre

I verified that the fix works well with the following versions:

  • lightgbm : 3.3.2
  • onnxmltools : 1.11.1

onnx_model = convert_lightgbm(
    clf,
    initial_types=initial_type,
    zipmap=False)

I suppose we can close this issue. Thanks @xadupre for this fix.
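For readers who want to reproduce the call, here is a slightly fuller sketch. It assumes `clf` is an already fitted LightGBM classifier and that onnxmltools is installed. The wrapper names `convert_without_zipmap` and `output_kinds`, the input name "input", and the `n_features` argument are illustrative choices, not something confirmed in this thread. Only the `zipmap=False` argument comes from the verified snippet.

```python
def convert_without_zipmap(clf, n_features):
    """Convert a fitted LightGBM classifier to ONNX, asking for tensor
    probabilities instead of a ZipMap output.

    Imports are done inside the function so that this sketch can be
    loaded even where onnxmltools is not installed.
    """
    from onnxmltools import convert_lightgbm
    from onnxmltools.convert.common.data_types import FloatTensorType

    # One float input with a free batch dimension and n_features columns.
    # The name "input" is an arbitrary choice for this sketch.
    initial_type = [("input", FloatTensorType([None, n_features]))]

    # zipmap=False is the option verified in the comment above.
    return convert_lightgbm(clf, initial_types=initial_type, zipmap=False)


def output_kinds(onnx_model):
    """Return {output_name: kind} for each graph output, where kind is the
    populated field of the ONNX TypeProto (e.g. 'tensor_type')."""
    return {o.name: o.type.WhichOneof("value") for o in onnx_model.graph.output}
```

Calling `output_kinds(convert_without_zipmap(clf, n_features))` is one way to check what the converter produced: if the option behaves as described in this thread, the probabilities output should be reported as `tensor_type` rather than a sequence or map type.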

abgoswam avatar Jan 29 '23 23:01 abgoswam