m2cgen
Support for LightGBM Booster and XGBoost Booster
We're training our LightGBM model outside of Python (in Spark), so we need to load it from a model file before passing it to m2cgen. I don't believe LightGBM can load a model file directly into LGBMRegressor, though; it must be loaded into lgb.Booster.
It would be nice if m2cgen supported lgb.Booster.
Example
import lightgbm as lgb
import m2cgen as m2c
model = lgb.Booster(model_file='model.txt')
# this fails
# m2c.export_to_java(model)
# This works but is awkward
from lightgbm.sklearn import LGBMRegressor
r = LGBMRegressor()
r._Booster = model
code = m2c.export_to_java(r)
Hey @chris-smith-zocdoc, thanks for reporting this!
I think support for the Booster object is worth adding to m2cgen. As part of this effort, I'd also suggest adding direct Booster instance support for XGBoost models.
Btw, a PR is very welcome if you're up to it :)
I can give it a shot. Can you point me to the appropriate files that would need to be changed?
Thanks, @chris-smith-zocdoc! You can begin with the following lines:
https://github.com/BayesWitnesses/m2cgen/blob/master/m2cgen/assemblers/boosting.py#L139 - for LightGBM
https://github.com/BayesWitnesses/m2cgen/blob/master/m2cgen/assemblers/boosting.py#L86 - for XGBoost
This is where we access the underlying Booster instances from the scikit-learn compatible wrappers. I believe we can check what's being passed to us, a wrapper or a Booster instance, and if it's a wrapper, retrieve the underlying Booster instance from it.
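The check described above could look roughly like the sketch below. This is not the m2cgen implementation: `unwrap_booster` and the stand-in classes are hypothetical names used only for illustration. The sketch assumes that the LightGBM scikit-learn wrappers expose the fitted booster through the `booster_` property and the XGBoost wrappers through the `get_booster()` method.

```python
def unwrap_booster(model):
    """Return the underlying Booster for either a wrapper or a raw Booster.

    Hypothetical helper, not part of m2cgen.
    """
    cls = type(model)
    # A raw lightgbm.Booster or xgboost.Booster: nothing to unwrap.
    if cls.__name__ == "Booster":
        return model
    # LightGBM scikit-learn wrappers expose the booster as `booster_`.
    # Looking the attribute up on the class avoids evaluating the property
    # before we know it exists.
    if hasattr(cls, "booster_"):
        return model.booster_
    # XGBoost scikit-learn wrappers expose it through `get_booster()`.
    if callable(getattr(cls, "get_booster", None)):
        return model.get_booster()
    raise TypeError(f"Cannot find a Booster in {cls.__name__}")


# Stand-in classes so the sketch runs without LightGBM or XGBoost installed.
class Booster:
    pass


class FakeLGBMWrapper:
    def __init__(self, booster):
        self._Booster = booster

    @property
    def booster_(self):
        return self._Booster


class FakeXGBWrapper:
    def __init__(self, booster):
        self._Booster = booster

    def get_booster(self):
        return self._Booster


raw = Booster()
print(unwrap_booster(raw) is raw)                   # raw Booster passes through
print(unwrap_booster(FakeLGBMWrapper(raw)) is raw)  # LightGBM-style wrapper
print(unwrap_booster(FakeXGBWrapper(raw)) is raw)   # XGBoost-style wrapper
```

With a helper like this, the assemblers could call it once on the incoming model and work with the returned Booster in both cases.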
Classifiers don't work.
Also, for LGBMRegressor I observe issues with the comparison operation type:
model_name = data_path + "LightGBM_model1.txt"
model = lgb.Booster(model_file=model_name)
from lightgbm.sklearn import LGBMRegressor
r = LGBMRegressor()
r._Booster = model
code = m2c.export_to_java(r)
results in
File " ../venv/lib/python3.8/site-packages/m2cgen/assemblers/boosting.py", line 318, in _assemble_tree
assert op == ast.CompOpType.LTE, "Unexpected comparison op"
AssertionError: Unexpected comparison op
I debugged it and saw an EQ operation coming from the model.
EQ operations seem to appear only if categorical_feature was specified in the training parameters.
Also, sadly, there is no ranking objective support:
File " ../venv/lib/python3.8/site-packages/m2cgen/assemblers/boosting.py", line 293, in _single_convert_output
raise ValueError(
ValueError: Unsupported objective function 'lambdarank'
@alexeymaksakov-tomtom
EQ operations seem to appear only if categorical_feature was specified in the training parameters.
Yeah, you are right. Categorical features are not supported yet, unfortunately (see #102).
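Until categorical features are supported, one way to fail early is to inspect the dumped model for equality splits before calling m2cgen. The sketch below is an assumption-laden illustration: `has_categorical_splits` is a hypothetical helper, and it assumes the dictionary returned by `Booster.dump_model()` has a `tree_info` list whose `tree_structure` nodes carry `decision_type`, `left_child`, and `right_child` keys, with `"=="` marking a categorical split. The toy dictionaries only mimic that layout.

```python
def has_categorical_splits(dumped_model):
    """Return True if any tree node uses an equality ("==") split.

    Hypothetical helper; assumes the dump_model() layout described above.
    """
    stack = [tree["tree_structure"] for tree in dumped_model.get("tree_info", [])]
    while stack:
        node = stack.pop()
        # Leaf nodes carry no decision_type, so there is nothing to check.
        if "decision_type" not in node:
            continue
        if node["decision_type"] == "==":
            return True
        stack.append(node["left_child"])
        stack.append(node["right_child"])
    return False


# Toy dumps that mimic the assumed layout (not taken from a real model).
numeric_only = {
    "tree_info": [{
        "tree_structure": {
            "decision_type": "<=", "threshold": 0.5,
            "left_child": {"leaf_value": 1.0},
            "right_child": {"leaf_value": 2.0},
        }
    }]
}
with_categorical = {
    "tree_info": [{
        "tree_structure": {
            "decision_type": "<=", "threshold": 0.5,
            "left_child": {"leaf_value": 1.0},
            "right_child": {
                "decision_type": "==", "threshold": "0||2",
                "left_child": {"leaf_value": 3.0},
                "right_child": {"leaf_value": 4.0},
            },
        }
    }]
}

print(has_categorical_splits(numeric_only))
print(has_categorical_splits(with_categorical))
```

For a model loaded as in the snippets above, the input would be `model.dump_model()`.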
When using m2cgen v0.9.0 to convert a pickled model to JS, we get the error message below:
packages/m2cgen/assemblers/__init__.py", line 141, in get_assembler_cls
raise NotImplementedError(f"Model '{model_name}' is not supported")
NotImplementedError: Model 'xgboost_Booster' is not supported
So, XGBoost Booster is not supported yet?
@tangdiforx
So, XGBoost Booster is not supported yet?
Unfortunately no, the Booster class is not supported yet.
@tangdiforx
So, XGBoost Booster is not supported yet?
Unfortunately no, the Booster class is not supported yet.
Got it, and thanks for your reply. Do you plan to add it?
Yeah, I do, but unfortunately without any ETA.
@StrikerRUS, are there plans to implement this functionality in 2023? 🙂