ml-flask-api
Add support for LightGBM's native Data Structure API
Training LightGBM models on weighted datasets requires the native Data Structure API. The template, however, implements only the Scikit-learn API (LGBMClassifier and LGBMRegressor), not the native API (lightgbm.Dataset and lightgbm.Booster). Maybe there is an analogous way to weight Pandas dataframes. If not, support for LightGBM's Data Structure API would allow weighting datasets prior to model training.
I'm working on it. To clarify, you want to use a model produced with the LightGBM training API, right? More info here: https://lightgbm.readthedocs.io/en/latest/Python-API.html#training-api
Yes. I want to train a model on a previously weighted LightGBM dataset, i.e. with a weight added for each instance:
training_data = lgb.Dataset(X, label=y, weight=w)
I currently don't know how to do this with Pandas and Scikit-learn in order to use your API with weighted data.
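For reference, the Scikit-learn estimators in LightGBM also accept per-instance weights through the `sample_weight` argument of `fit`, which may cover this case without the native API. A minimal sketch, assuming `X`, `y` and `w` are aligned row by row; `fit_weighted` is a hypothetical helper and not part of the template:

```python
def fit_weighted(model, X, y, weights):
    """Fit a Scikit-learn style estimator with one weight per training row.

    `model` is any estimator whose fit() accepts `sample_weight`, for example
    lightgbm.LGBMClassifier or lightgbm.LGBMRegressor.
    """
    if not (len(X) == len(y) == len(weights)):
        raise ValueError("X, y and weights must have the same number of rows")
    # Scikit-learn API counterpart of lgb.Dataset(X, label=y, weight=w).
    model.fit(X, y, sample_weight=weights)
    return model
```

With LightGBM installed this would be called as `fit_weighted(lgb.LGBMClassifier(), X, y, w)`, where `X` can be a Pandas dataframe.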
I've created a branch lopezco/lightgbm-support to code this. Later I'll create a pull request.
You can take a look. I need to create a wrapper class in model/lgbm.py and add the support in factory.py.
You can also help me in this branch if you have time.
Once ready I'll merge to master.
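A rough idea of what such a wrapper could look like is sketched below. This is only an assumption about the design, not the code in the branch: the class name `BoosterWrapper` and the `threshold` parameter are hypothetical, and the sketch assumes a binary objective, for which `Booster.predict` returns one positive-class probability per row.

```python
class BoosterWrapper:
    """Hypothetical adapter that gives a native LightGBM Booster a
    Scikit-learn style predict / predict_proba interface.

    Assumes a binary objective: booster.predict(X) yields one
    positive-class probability per row.
    """

    def __init__(self, booster, threshold=0.5):
        self._booster = booster      # e.g. the result of lgb.train(...)
        self._threshold = threshold  # cut-off used to derive class labels

    def predict_proba(self, X):
        # Expand each positive-class probability into [P(class 0), P(class 1)].
        positive = self._booster.predict(X)
        return [[1.0 - float(p), float(p)] for p in positive]

    def predict(self, X):
        # Label a row 1 when its positive-class probability reaches the threshold.
        return [int(row[1] >= self._threshold) for row in self.predict_proba(X)]
```

Because the wrapper only relies on the booster's `predict` method, the rest of the API could treat it like the existing Scikit-learn models; a multiclass or regression objective would need a different `predict_proba`.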