
Random forest in LightGBM

Open arfangeta opened this issue 4 years ago • 12 comments

I want to clarify: does hummingbird currently not support random forest in LightGBM? Is support planned?

When I convert this model from LightGBM to ONNX, I get an error: lgb.LGBMClassifier(boosting_type='rf', n_estimators=128, max_depth=5, subsample=0.3, bagging_freq=1)

File "/venv/lib/python3.6/site-packages/hummingbird/ml/parse.py", line 242
    this_operator.inputs = [scope.variables[in_] for in_ in input_names]
KeyError: 'col_index'

arfangeta avatar Aug 17 '20 13:08 arfangeta

Thanks @arfangeta for reporting this. Yes, this is supposed to work. I will look at it.

interesaaat avatar Aug 17 '20 16:08 interesaaat

Hi @arfangeta, how are you calling the converter? I added a test for your code above and it works (#238).

interesaaat avatar Aug 17 '20 17:08 interesaaat

import lightgbm as lgb
import numpy as np
from onnxmltools import convert_lightgbm
from onnxconverter_common.data_types import FloatTensorType
from hummingbird.ml import convert
import onnxruntime as ort


if __name__ == '__main__':
    X = np.random.rand(100, 200)
    X = np.array(X, dtype=np.float32)
    y = np.random.randint(2, size=100)
    model = lgb.LGBMClassifier(boosting_type='rf', n_estimators=128, max_depth=5, bagging_freq=1, subsample=0.3)
    model.fit(X, y)

    initial_types = [("input", FloatTensorType([X.shape[0], X.shape[1]]))]  # Define the input types for the ONNX conversion
    onnx_ml_model = convert_lightgbm(model, initial_types=initial_types, target_opset=9)
    onnx_model = convert(onnx_ml_model, "onnx", X)

arfangeta avatar Aug 18 '20 02:08 arfangeta

I see, you were converting to ONNX. That should work too; let me look at why it is returning the error.

interesaaat avatar Aug 18 '20 04:08 interesaaat

I am able to reproduce this error. I am working on a fix.

interesaaat avatar Aug 18 '20 19:08 interesaaat

Ok, in the end the error on master was different. I have a branch here showing my progress. Unfortunately I am blocked on onnxruntime.capi.onnxruntime_pybind11_state.NotImplemented: [ONNXRuntimeError] : 9 : NOT_IMPLEMENTED : Could not find an implementation for the node ArgMax_56:ArgMax(11). I will ask the ONNX folks for help.

In the meantime I have two suggestions to unblock you:

  • please pull from master, I think you were using an older version of HB (we will soon upload v0.0.5 on PyPI)
  • you can convert the LightGBM model directly to ONNX without using the onnxmltools converter, as here. This should work while we fix the problem with ArgMax.

interesaaat avatar Aug 18 '20 22:08 interesaaat

The ArgMax problem was fixed in this PR. Will wait for ORT 1.5 before closing this issue.

interesaaat avatar Aug 26 '20 01:08 interesaaat

ORT 1.5 has been out for a few months now, so I revisited this. It's available on PyPI.

I pulled from main in a fresh container with pip install -e .[onnx], and I still get ORT==1.4.0.

Collecting onnxruntime>=1.0.0; extra == "onnx" (from hummingbird-ml==0.1.0)
  Downloading https://files.pythonhosted.org/packages/52/99/b6618fbf5c9dde961bc4d555ce924f0a9cf0d12b3945b7a328c1b9592d11/onnxruntime-1.4.0-cp37-cp37m-manylinux2010_x86_64.whl (4.4MB)

In setup.py, we have "onnxruntime>=1.0.0", so I'm a little surprised I didn't get the latest from PyPI. Should we pin to "onnxruntime>=1.5.0" @interesaaat ?
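If we do pin, the change would look roughly like this (a sketch; the surrounding extras_require layout in setup.py is assumed):

```python
# setup.py (sketch): raise the minimum onnxruntime for the "onnx" extra
extras_require = {
    "onnx": [
        "onnxruntime>=1.5.0",  # was >=1.0.0; 1.5 carries the ArgMax fix
    ],
}
```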

ksaur avatar Nov 11 '20 00:11 ksaur

Thanks for doing this @ksaur, I completely forgot about this issue.

In setup.py, we have "onnxruntime>=1.0.0", so I'm a little surprised I didn't get the latest from PyPI. Should we pin to "onnxruntime>=1.5.0" @interesaaat ?

Yeah, it is strange that it didn't pull the latest ORT, and this is not even from the cache since it's a fresh container. In general I prefer to have the lowest supported version in the requirements, but I am ok with pinning to 1.5 if we cannot find any other workaround.

BTW does the test work if we use ORT 1.5?

interesaaat avatar Nov 11 '20 00:11 interesaaat

I learned that the reason I was getting ORT==1.4.0 is that 1.5.1 requires a newer version of pip. That makes me a bit cautious about pinning to onnxruntime==1.5.x just yet, because I don't want to break other things... but maybe it's ok.

All of our existing tests pass with the new ORT, but I still get an error with the above code snippet.

Investigating. :)

ksaur avatar Nov 11 '20 01:11 ksaur

We have tests for boosting_type='rf' in test_lightgbm_converter.py but not test_onnxml_lightgbm_converter.py, and for this example we need the latter (to call convert_lightgbm).

In this test case I wrote, it fails with:

name: "shape_tensor"
}, 'tree_implementation': 'tree_trav', 'post_transform': <function convert_gbdt_common.<locals>.apply_sigmoid at 0x7f2ca3f7af28>}.
It usually means the pipeline being converted contains a
transformer or a predictor with no corresponding converter implemented.
Please fill an issue at https://github.com/microsoft/hummingbird.

Also, I had to add a check on the way we determine shapes in tree_ensemble.py (in the same test case above). Without a guard before n_classes = t_values.shape[1], we get:

  File "/root/hummingbird/hummingbird/ml/operator_converters/onnx/tree_ensemble.py", line 190, in _get_tree_infos_from_tree_ensemble
    tree_infos, classes, post_transform = _get_tree_infos_from_onnx_ml_operator(operator)
  File "/root/hummingbird/hummingbird/ml/operator_converters/onnx/tree_ensemble.py", line 119, in _get_tree_infos_from_onnx_ml_operator
    n_classes = t_values.shape[1]
IndexError: tuple index out of range
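The guard can be sketched like this (a standalone sketch; the function name is hypothetical, mirroring the t_values handling in tree_ensemble.py):

```python
import numpy as np

def infer_n_classes(t_values):
    """Guarded version of n_classes = t_values.shape[1].

    ONNX-ML tree ensembles can store per-node values as a 1-D array
    (e.g. for binary classifiers), in which case indexing shape[1]
    raises IndexError: tuple index out of range.
    """
    t_values = np.asarray(t_values)
    if t_values.ndim > 1:
        return t_values.shape[1]
    return 1  # fall back to a single output column for 1-D values

# 2-D values: one column per class
assert infer_n_classes(np.zeros((10, 3))) == 3
# 1-D values: previously crashed, now falls back to 1
assert infer_n_classes(np.zeros(10)) == 1
```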

Having not worked with LightGBM before, I dug around for more info on boosting_type='rf' but didn't find much. (pointers?) However, I see that a test with this passes for torch.

Are we missing a converter in onnx/lgbm somewhere?

ksaur avatar Nov 11 '20 01:11 ksaur

I don't think we are missing a converter. Maybe the current ONNX converter gets misled by the fact that LightGBM is a gradient boosting algorithm but here generates a random forest model. The tree implementations are fine, since the conversion works if we pass the LightGBM model directly as input, so I think it is something in how the onnxmltools LightGBM converter translates the LightGBM model into an ONNX one. These problems are super tricky to solve.

interesaaat avatar Nov 11 '20 01:11 interesaaat