
Support for LightGBM Booster and XGBoost Booster

Open chris-smith-zocdoc opened this issue 4 years ago • 13 comments

We're training our LightGBM model outside of Python (in Spark), so we need to load it from a model file before passing it to m2c. I don't believe LightGBM can load a model file directly into LGBMRegressor, though; it must be loaded into lgb.Booster.

It would be nice if m2cgen supported lgb.Booster

Example

import lightgbm as lgb
import m2cgen as m2c

model = lgb.Booster(model_file='model.txt')

# this fails
# m2c.export_to_java(model)

# This works but is awkward 
from lightgbm.sklearn import LGBMRegressor
r = LGBMRegressor()
r._Booster = model

code = m2c.export_to_java(r)

chris-smith-zocdoc avatar Sep 27 '19 14:09 chris-smith-zocdoc

Hey @chris-smith-zocdoc, thanks for reporting this!

I think support for the Booster object is worth adding to m2cgen. As part of this effort, I'd also suggest adding direct Booster instance support for XGBoost models.

Btw, PR is very welcome if you're up to it :)

izeigerman avatar Sep 28 '19 22:09 izeigerman

I can give it a shot. Can you point me to the appropriate files that would need to be changed?

chris-smith-zocdoc avatar Sep 29 '19 01:09 chris-smith-zocdoc

Thanks, @chris-smith-zocdoc! You can begin with the following lines:

https://github.com/BayesWitnesses/m2cgen/blob/master/m2cgen/assemblers/boosting.py#L139 - for LightGBM
https://github.com/BayesWitnesses/m2cgen/blob/master/m2cgen/assemblers/boosting.py#L86 - for XGBoost

This is where we're accessing the underlying Booster instances from the scikit-learn compatible wrappers. I believe we can check what's being passed to us, a wrapper or a Booster instance, and if it's a wrapper, retrieve the underlying Booster instance from it.

izeigerman avatar Sep 30 '19 15:09 izeigerman

The classifier doesn't work.

yuanjie-ai avatar May 20 '20 06:05 yuanjie-ai

Also, for LGBMRegressor I see an issue with the comparison operation type:

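import lightgbm as lgb
import m2cgen as m2c
from lightgbm.sklearn import LGBMRegressor

# data_path is the directory that holds the model file (defined elsewhere)
model_name = data_path + "LightGBM_model1.txt"
model = lgb.Booster(model_file=model_name)

# Attach the loaded Booster to an empty scikit-learn wrapper
r = LGBMRegressor()
r._Booster = model

code = m2c.export_to_java(r)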

results in

  File "../venv/lib/python3.8/site-packages/m2cgen/assemblers/boosting.py", line 318, in _assemble_tree
    assert op == ast.CompOpType.LTE, "Unexpected comparison op"
AssertionError: Unexpected comparison op

I debugged it and saw an EQ operation coming from the model.

alexeymaksakov-tomtom avatar Aug 03 '21 11:08 alexeymaksakov-tomtom

EQ operations seem to appear only if categorical_feature was specified in the training parameters.

alexeymaksakov-tomtom avatar Aug 03 '21 11:08 alexeymaksakov-tomtom

Also, sadly, the ranking objective is not supported:

  File "../venv/lib/python3.8/site-packages/m2cgen/assemblers/boosting.py", line 293, in _single_convert_output
    raise ValueError(
ValueError: Unsupported objective function 'lambdarank'

alexeymaksakov-tomtom avatar Aug 03 '21 11:08 alexeymaksakov-tomtom

@alexeymaksakov-tomtom

> EQ operations seem to appear only if categorical_feature was specified in the training parameters.

Yeah, you are right. Categorical features are not supported yet, unfortunately. #102

StrikerRUS avatar Aug 03 '21 21:08 StrikerRUS
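
To find out ahead of time whether a model will hit the "Unexpected comparison op" assertion, one option is to look for `==` splits in the output of LightGBM's `Booster.dump_model()`. The sketch below is an illustration under stated assumptions: the function names are hypothetical, and `example_dump` is a hand-written stand-in that mimics only the keys of the dumped structure used here (`tree_info`, `tree_structure`, `split_feature`, `decision_type`, `left_child`, `right_child`, `leaf_value`), not real model output.

```python
def count_categorical_splits(node):
    """Count split nodes in a dumped LightGBM tree that use '=='."""
    if "split_feature" not in node:
        # Leaf nodes carry a leaf_value and have no children.
        return 0
    own = 1 if node.get("decision_type") == "==" else 0
    return (
        own
        + count_categorical_splits(node["left_child"])
        + count_categorical_splits(node["right_child"])
    )


def model_uses_categorical_splits(dumped_model):
    """Return True if any tree in the dump contains a categorical split."""
    return any(
        count_categorical_splits(tree["tree_structure"]) > 0
        for tree in dumped_model["tree_info"]
    )


# Hand-written stand-in for lgb.Booster(model_file=...).dump_model()
example_dump = {
    "tree_info": [
        {
            "tree_structure": {
                "split_feature": 0,
                "threshold": "0||2",      # categorical split: set of categories
                "decision_type": "==",
                "left_child": {"leaf_value": 0.1},
                "right_child": {
                    "split_feature": 1,
                    "threshold": 3.5,     # numerical split
                    "decision_type": "<=",
                    "left_child": {"leaf_value": -0.2},
                    "right_child": {"leaf_value": 0.4},
                },
            }
        }
    ]
}

print(model_uses_categorical_splits(example_dump))
```

With a real model, the call would be `model_uses_categorical_splits(model.dump_model())`; if it returns True, the export is likely to fail until categorical features are supported.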

When using m2cgen v0.9.0 to convert a pickled model to JS, we get the error below:

packages/m2cgen/assemblers/__init__.py", line 141, in get_assembler_cls
    raise NotImplementedError(f"Model '{model_name}' is not supported")
NotImplementedError: Model 'xgboost_Booster' is not supported

So, the XGBoost Booster is not supported yet?

tangdiforx avatar Feb 10 '22 08:02 tangdiforx

@tangdiforx

> So, the XGBoost Booster is not supported yet?

Unfortunately no, Booster class is not supported yet.

StrikerRUS avatar Feb 11 '22 01:02 StrikerRUS

> @tangdiforx
>
> > So, the XGBoost Booster is not supported yet?
>
> Unfortunately no, Booster class is not supported yet.

Got it, and thanks for your reply. Do you plan to implement it?

tangdiforx avatar Feb 11 '22 04:02 tangdiforx

Yeah, I do, but unfortunately without any ETA.

StrikerRUS avatar Feb 12 '22 01:02 StrikerRUS

@StrikerRUS, are there plans to implement this functionality in 2023? 🙂

mirecl avatar Feb 13 '23 13:02 mirecl