onnxmltools
`XGBRFClassifier` & `XGBRFRegressor` are not supported
Hi,
I'm trying to convert both `XGBRFClassifier` and `XGBRFRegressor` models into ONNX, but unfortunately there is no support for these models, as I can also see from the source code. Are there any plans to extend support to these models in the future?
Not yet. If only the training differs from `XGBRegressor`, it should not be too difficult to add. Let me investigate.
@xadupre That would be great if there's a workaround for this!
I created a PR to support these models. It seems to work with the same converter but it could be helpful to check on your side as well.
@xadupre Thanks! Would you mind sending a link to the PR? I'm interested to know when it might make it into a release.
EDIT: I found it, no worries. Let me know when/how to test this; I'd be glad to do so.
https://github.com/onnx/onnxmltools/pull/665 (it appears above but maybe that's just for me).
@xadupre yep saw that
Hi @xadupre! Do you have an estimate of when this is going to make it into the next release?
We plan to release a new version before December 14th.
Thanks, much appreciated!
The new version was released and fixes this issue. Feel free to reopen the issue if it does not work.
I'll try and let you know, thank you!
Hi @xadupre, I've tested the newest version with `XGBRFClassifier`, and it seems the ONNX probabilities do not match those of the original model. I have a simple `XGBRFClassifier` trained as follows:
```python
from sklearn.datasets import load_iris
from xgboost import XGBRFClassifier

dataset = load_iris()
X, y = dataset.data, dataset.target
model = XGBRFClassifier()
model.fit(X, y)
```
and ran the following code to get predictions and probabilities from both the original model and the ONNX equivalent:
```python
import numpy as np
import onnx
import onnxruntime as rt

# tmp_path: a pathlib.Path set up elsewhere in the test
onnx_model = onnx.load((tmp_path / "model.onnx").as_posix())
session = rt.InferenceSession((tmp_path / "model.onnx").as_posix())
data = np.random.rand(2, 4).astype(np.float32)
model_predictions = model.predict(data)
model_probabilities = model.predict_proba(data)
onnx_predictions, onnx_probabilities = session.run(
    ["label", "probabilities"], {"input": data}
)
```
However, I'm getting different probabilities in the two cases. I'm using TARGET_OPSET=15. Also, shouldn't all ONNX probabilities sum up to 1, as they do for the original model?
I'll have a look today.
I was able to replicate the bug. I assumed XGBClassifier and XGBRFClassifier were the same when it comes to prediction, since the code is the same. It turns out they are the same if n_estimators=1
and different for any value above. I did not find out where the difference is or how to retrieve that information at conversion time. I just know that the method get_num_boosting_rounds is different, and it is only used at training. Maybe if you tell me why you are using XGBRFClassifier and not XGBClassifier, I would be able to guess where I should look.
@xadupre I'm using the default values for `XGBRFClassifier` during training, just for testing purposes. Thanks for having a look at this, much appreciated!
Hi @xadupre! Is it something trivial to fix, or does it seem like a bigger issue? Anything I can help you with?
Sorry, I did not have time to look into this this week. It is not trivial to find (at least for me). Since the conversion of trees is working, I'm usually looking for wrong base values or a wrong number of estimators (early stopping). I parsed the dumped trees to look for additional information but nothing was obvious. My next step is to compare the dump between XGBClassifier and XGBRFClassifier on the same data to understand what the differences are. If the dump is different, then it is a converting issue. If the dump is the same, then the code for inference is different and I need to reflect that somehow in the onnx graph. Here is my current status.
I see, no worries! Please let me know if there are any developments on this bug fix :)
Hi @xadupre, is there any progress on this bug?
No sorry, I was busy with something else. I'll try to do it this month.
@xadupre is there any progress on this?
I did not have time to work on this; I've been doing some work with pytorch and onnx. I have 2 or 3 issues to fix on skl2onnx and onnxmltools. I plan to spend 2 or 3 days on them by the end of the month. Sorry for the delay again.