`XGBRFClassifier` & `XGBRFRegressor` are not supported

Open pvardanis opened this issue 1 year ago • 22 comments

Hi,

I'm trying to convert both XGBRFClassifier & XGBRFRegressor models into ONNX, but unfortunately there's no support for such models as I can also see from the source code here. Are there any plans for extended support for such models in the future?

pvardanis avatar Nov 10 '23 11:11 pvardanis

Not yet. If only the training differs from XGBRegressor, it should not be too difficult to add. Let me investigate.

xadupre avatar Nov 17 '23 14:11 xadupre

@xadupre That would be great if there's a workaround on this!

pvardanis avatar Nov 17 '23 14:11 pvardanis

I created a PR to support these models. It seems to work with the same converter but it could be helpful to check on your side as well.

xadupre avatar Nov 17 '23 16:11 xadupre

@xadupre Thanks! Would you mind sending a link to the PR? I'd like to know when it might make it into a release.

EDIT: I found it, no worries. Let me know when and how to test this; I'd be glad to do so.

pvardanis avatar Nov 17 '23 16:11 pvardanis

https://github.com/onnx/onnxmltools/pull/665 (it appears above but maybe that's just for me).

xadupre avatar Nov 17 '23 16:11 xadupre

@xadupre yep saw that

pvardanis avatar Nov 17 '23 16:11 pvardanis

Hi @xadupre! Do you have an estimate of when this will make it into the next release?

pvardanis avatar Nov 20 '23 10:11 pvardanis

We plan to release a new version before December 14th.

xadupre avatar Nov 20 '23 10:11 xadupre

Thanks, much appreciated!

pvardanis avatar Nov 20 '23 11:11 pvardanis

The new version was released and fixes this issue. Feel free to reopen the issue if it does not work.

xadupre avatar Dec 18 '23 12:12 xadupre

I'll try and let you know, thank you!

pvardanis avatar Dec 18 '23 12:12 pvardanis

Hi @xadupre, I've tested the newest version with XGBRFClassifier, and it seems the ONNX probabilities do not match those of the original model.

I have a simple XGBRFClassifier trained as follows:

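from sklearn.datasets import load_iris
from xgboost import XGBRFClassifier

# Train on the iris data with default hyperparameters
dataset = load_iris()
X, y = dataset.data, dataset.target
model = XGBRFClassifier()
model.fit(X, y)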

and run the following code to get predictions and probabilities from both the original and the ONNX equivalent:

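import numpy as np
import onnx
import onnxruntime as rt

# tmp_path: a pathlib.Path to the directory holding the converted model
model_path = (tmp_path / "model.onnx").as_posix()
onnx_model = onnx.load(model_path)
session = rt.InferenceSession(model_path)

# Random inputs with the same number of features as the iris data
data = np.random.rand(2, 4).astype(np.float32)
model_predictions = model.predict(data)
model_probabilities = model.predict_proba(data)
onnx_predictions, onnx_probabilities = session.run(
    ["label", "probabilities"], {"input": data}
)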

However, I'm getting different probabilities in the two cases: [image]

I'm using TARGET_OPSET=15. Also, shouldn't all ONNX probabilities sum up to 1 as with the original model?
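
Both points raised here (agreement with the original model and rows summing to 1) can be checked programmatically. The following is only a minimal sketch: the helper name `compare_probabilities` and the tolerance are assumptions, the sample numbers are made up rather than taken from the screenshot, and both inputs are assumed to be row-per-sample sequences of floats.

```python
import math


def compare_probabilities(expected, actual, tol=1e-5):
    """Return a list of human-readable problems; an empty list means the outputs agree."""
    problems = []
    for i, (exp_row, act_row) in enumerate(zip(expected, actual)):
        # Each row of class probabilities should sum to 1.
        total = sum(act_row)
        if not math.isclose(total, 1.0, abs_tol=tol):
            problems.append(f"row {i}: probabilities sum to {total:.6f}, not 1")
        # Each probability should match the original model within the tolerance.
        for j, (e, a) in enumerate(zip(exp_row, act_row)):
            if not math.isclose(e, a, abs_tol=tol):
                problems.append(f"row {i}, class {j}: expected {e:.6f}, got {a:.6f}")
    return problems


# Made-up values for illustration only (hypothetical, not from the issue).
model_probabilities = [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]]
matching = [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]]
mismatching = [[0.9, 0.4, 0.2], [0.1, 0.3, 0.6]]

print(compare_probabilities(model_probabilities, matching))
for problem in compare_probabilities(model_probabilities, mismatching):
    print(problem)
```

With the variables from the snippet above, the call would be `compare_probabilities(model_probabilities, onnx_probabilities)`, provided the ONNX output is a plain 2-D array of probabilities.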

pvardanis avatar Jan 04 '24 13:01 pvardanis

I'll have a look today.

xadupre avatar Jan 04 '24 14:01 xadupre

I was able to replicate the bug. I assumed XGBClassifier and XGBRFClassifier were the same when it comes to prediction since the code is the same. It turns out they are the same if n_estimators=1 and different for any value above. I have not found where the difference comes from or how to retrieve that information at conversion time. I only know that the method get_num_boosting_rounds is different, and it is only used at training. If you tell me why you are using XGBRFClassifier rather than XGBClassifier, I may be able to guess where I should look.
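
One way to picture why a mismatch can vanish with a single tree and appear with several is a toy model in which the converted graph aggregates a different number of trees than the original. This is a hypothetical illustration, not a diagnosis of the bug: the function names, the per-tree outputs, and the simplification that each tree contributes one value per class are all assumptions.

```python
import math


def softmax(margins):
    """Convert raw per-class margins into probabilities that sum to 1."""
    peak = max(margins)
    exps = [math.exp(m - peak) for m in margins]
    total = sum(exps)
    return [e / total for e in exps]


def predict_proba(tree_outputs, n_trees_used=None):
    """Sum the per-class outputs of the first n_trees_used trees, then apply softmax.

    n_trees_used=None means every tree is used.
    """
    used = tree_outputs[:n_trees_used]
    n_classes = len(tree_outputs[0])
    margins = [sum(tree[c] for tree in used) for c in range(n_classes)]
    return softmax(margins)


# Made-up per-class outputs for three trees (hypothetical values).
trees = [[0.5, -0.2, -0.3], [0.4, -0.1, -0.3], [0.6, -0.4, -0.2]]
single = trees[:1]

# With one tree, using "all trees" and using "the first tree" are the same thing.
print(predict_proba(single) == predict_proba(single, 1))
# With several trees, dropping some of them changes the probabilities.
print(predict_proba(trees) == predict_proba(trees, 1))
```

Any aggregation difference of this kind is invisible at n_estimators=1, which is consistent with the observation above but does not establish the cause.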

xadupre avatar Jan 04 '24 19:01 xadupre

@xadupre I'm using the default values for XGBRFClassifier during training, just for testing purposes. Thanks for having a look at this, much appreciated!

pvardanis avatar Jan 05 '24 10:01 pvardanis

Hi @xadupre! Is it something trivial to fix, or does it seem like a bigger issue? Anything I can help you with?

pvardanis avatar Jan 10 '24 09:01 pvardanis

Sorry, I did not have time to look into this during the week. It is not trivial to find (at least for me). Since the conversion of trees is working, I'm usually looking for wrong base values or a wrong number of estimators (early stopping). I parsed the dumped trees to look for additional information but nothing was obvious. My next step is to compare the dump between XGBClassifier and XGBRFClassifier on the same data to understand what the differences are. If the dump is different, then it is a conversion issue. If the dump is the same, then the code for inference is different and I need to reflect that somehow in the ONNX graph. That is my current status.
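
The dump comparison described above could be sketched roughly as follows. This is an illustration under stated assumptions: the helper `diff_tree_dumps` is hypothetical, and the two dumps are hand-written stand-ins. With real models, the text dumps would come from `model.get_booster().get_dump()`, which returns one string per tree.

```python
import difflib


def diff_tree_dumps(dump_a, dump_b, name_a="XGBClassifier", name_b="XGBRFClassifier"):
    """Return unified-diff lines between two lists of text tree dumps."""
    lines = []
    # A different number of trees is itself a useful clue.
    if len(dump_a) != len(dump_b):
        lines.append(f"tree count differs: {len(dump_a)} vs {len(dump_b)}")
    # Compare the trees pairwise, in order.
    for index, (tree_a, tree_b) in enumerate(zip(dump_a, dump_b)):
        lines.extend(
            difflib.unified_diff(
                tree_a.splitlines(),
                tree_b.splitlines(),
                fromfile=f"{name_a} tree {index}",
                tofile=f"{name_b} tree {index}",
                lineterm="",
            )
        )
    return lines


# Hand-written stand-ins for real dumps (hypothetical values).
dump_a = ["0:[f2<2.45] yes=1,no=2,missing=1\n\t1:leaf=0.43\n\t2:leaf=-0.22"]
dump_b = ["0:[f2<2.45] yes=1,no=2,missing=1\n\t1:leaf=0.14\n\t2:leaf=-0.07"]

for line in diff_tree_dumps(dump_a, dump_b):
    print(line)
```

An empty result means the dumps are textually identical, which in the reasoning above would point at the inference code rather than the converter.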

xadupre avatar Jan 10 '24 10:01 xadupre

I see, no worries! Please let me know if there are any developments on this bug fix :)

pvardanis avatar Jan 11 '24 19:01 pvardanis

Hi @xadupre, is there any progress on this bug?

pvardanis avatar Feb 13 '24 09:02 pvardanis

No sorry, I was busy with something else. I'll try to do it this month.

xadupre avatar Feb 13 '24 10:02 xadupre

@xadupre is there any progress on this?

pvardanis avatar Mar 18 '24 12:03 pvardanis

I did not have time to work on this, doing some work with pytorch and onnx. I have 2 or 3 issues to fix on skl2onnx and onnxmltools. I planned to spend 2 or 3 days on them by the end of the month. Sorry for the delay again.

xadupre avatar Mar 18 '24 15:03 xadupre