Scikit-learn model converted to ONNX results in different output shapes between Python and Java environments
Describe the bug
I was trying to train a scikit-learn model in Python, export it to ONNX and then use the model for prediction in a Java environment. The scikit-learn model was converted to ONNX using the skl2onnx Python package and loaded using ai.onnxruntime in Java.
However, the output was not complete when making predictions in Java. The predicted probabilities did not contain the entire probability array but only the first index of the array. When testing the ONNX model output in Python, the probability array was complete.
This issue persisted with different models (SVM, Logistic Regression, MLP) as well as different scikit-learn objects (a single classifier, a classifier as part of a Pipeline).
System information
- macOS 11.6.6
- JDK version: 11.0
- Python version: 3.9.5
- ONNX Runtime installed from (source or binary): pip
- ONNX Runtime version: 1.10.0
- ONNX version: 1.11.0
- skl2onnx version: 1.11.0
To Reproduce
- Export a scikit-learn model to ONNX using skl2onnx in Python
- Load ONNX model in Java using ai.onnxruntime
- Make a prediction
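On the Python side, those steps might look like the following. This is a hedged sketch, not the reporter's actual code: the iris dataset and a bare `LogisticRegression` stand in for the unspecified model and data, and the input name `"X"` is skl2onnx's default for array inputs. The ONNX round-trip is guarded so the sketch still runs where skl2onnx/onnxruntime are absent.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

# Train a scikit-learn classifier (stand-in for the reporter's model).
X, y = load_iris(return_X_y=True)
clf = LogisticRegression(max_iter=1000).fit(X, y)
proba = clf.predict_proba(X[:5])  # Python-side ground truth: shape (5, 3)

try:
    from skl2onnx import to_onnx
    import onnxruntime as ort

    # Export to ONNX; 'zipmap': False keeps probabilities as a plain tensor.
    onx = to_onnx(clf, X.astype(np.float32), target_opset=12,
                  options={type(clf): {"zipmap": False}})

    # The report does inference in Java; onnxruntime's Python API stands in
    # here to show the expected outputs: labels plus the full probabilities.
    sess = ort.InferenceSession(onx.SerializeToString())
    labels, probabilities = sess.run(None, {"X": X[:5].astype(np.float32)})
    print(probabilities.shape)  # expected to match proba.shape, i.e. (5, 3)
except ImportError:
    print("skl2onnx/onnxruntime not installed; skipping the ONNX round-trip")
```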
Expected behavior
- The expected output is a list of length 2
- The first index is the predicted class
- The second index should be the array of predicted probabilities for every class in the target variable. However, in Java, only the first index of the probability array is returned
How are you accessing the outputs in Java? And can you supply an example model?
I checked one of the test models (a logistic regression exported from scikit-learn), and I got the expected output of a tensor containing the predicted labels, and a sequence of maps containing the labels and probabilities for each example. There might be some oddities in the handling of the zipmap, so how was the model exported from scikit-learn?
The model was exported using the to_onnx method of skl2onnx, with "zipmap": False specified in the options. When I do inference with the model in Python, I do get the full probability array.
On the Java side, I am mapping the probabilities output to an OnnxSequence. When I look at the info of the sequence, length=1. If I print it out, I get an array with only one value.
It looks like:

```java
OrtSession.Result pred = session.run(onnxInputMap);
OnnxSequence probabilities = (OnnxSequence) pred.get(1);
System.out.println(probabilities);
System.out.println(probabilities.getValue());
```
Ok, I'll look at it. The spec for the behaviour of the ONNX sequence seems underspecified. Currently the Java code is written to pull out the first element of each tensor in the sequence, because in the test examples I had, all sequences contained single-element tensors. If a sequence can in fact contain non-scalar tensors, then I'll need to refactor the logic in Java and in the native bridge.
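The failure mode described here can be illustrated with plain NumPy (hypothetical values, not the reporter's data): if each sequence element is a non-scalar probability tensor, taking only the first element of each tensor truncates the output.

```python
import numpy as np

# A sequence of probability tensors, as an ONNX sequence output might hold.
seq = [np.array([0.7, 0.2, 0.1]), np.array([0.1, 0.8, 0.1])]

# Behaviour described above: pull out only the first element of each tensor.
# Correct when every tensor holds a single element, lossy otherwise.
first_only = [float(t[0]) for t in seq]   # [0.7, 0.1] -- truncated

# Keeping each tensor whole preserves the full probability arrays.
whole = [t.tolist() for t in seq]
print(first_only)
print(whole)
```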
I exported a logistic regression from scikit-learn using

```python
onx = to_onnx(lr, x.astype(numpy.float32), target_opset=12,
              options={type(lr): {'zipmap': False}})
```

and when inspecting that model in Java I see that the output is a tensor:
```
jshell> session.getOutputInfo()
$11 ==> {
  label=NodeInfo(name=label,info=TensorInfo(javaType=INT64,onnxType=ONNX_TENSOR_ELEMENT_DATA_TYPE_INT64,shape=[-1])),
  probabilities=NodeInfo(name=probabilities,info=TensorInfo(javaType=FLOAT,onnxType=ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT,shape=[-1, 3]))
}
```
and those probabilities are all returned as outputs.
Can you provide more information so I can replicate this? Either an ONNX model or the exact series of commands you use to generate it?
I tried to share the model here but that file type does not seem to be supported. If there is an alternative place I can share it, please let me know.
However, the model is structured as follows:
- the classifier is a logistic regression
- it is wrapped in a sklearn.multioutput.MultiOutputClassifier object
- it is then wrapped in a sklearn.pipeline.Pipeline object (the first step was a vectorizer)
When I examined the output of the model in Java, I got the following:
```
{
  label=NodeInfo(name=label,info=TensorInfo(javaType=INT64,onnxType=ONNX_TENSOR_ELEMENT_DATA_TYPE_INT64,shape=[-1, 1])),
  probabilities=NodeInfo(name=probabilities,info=SequenceInfo(length=UNKNOWN,type=FLOAT))
}
```
However, when I tried exporting only a logistic regression, I got a similar output type as you for the probabilities.
I suspect the difference comes from the MultiOutputClassifier and/or the Pipeline objects, which lead to the probabilities being exposed as an OnnxSequence rather than an OnnxTensor.
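One likely source of the sequence typing: MultiOutputClassifier.predict_proba returns a Python list of per-target probability arrays rather than a single array, which maps naturally onto an ONNX sequence of tensors. A sketch under assumptions (CountVectorizer and toy data stand in for the unspecified vectorizer and dataset):

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multioutput import MultiOutputClassifier
from sklearn.pipeline import Pipeline

# Toy multi-output setup: two binary target columns.
X = ["good movie", "bad film", "great plot", "awful acting"]
Y = np.array([[1, 0], [0, 1], [1, 0], [0, 1]])

pipe = Pipeline([
    ("vec", CountVectorizer()),
    ("clf", MultiOutputClassifier(LogisticRegression())),
])
pipe.fit(X, Y)

# Unlike a plain classifier, predict_proba here returns a *list* of arrays,
# one (n_samples, n_classes) array per target column.
proba = pipe.predict_proba(X)
print(type(proba).__name__)  # list
print(len(proba))            # 2, one entry per target column
print(proba[0].shape)        # (4, 2)
```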
You can email the model to me at [email protected], but I'm out on vacation next week and won't look at it until the week after.
Thanks! I sent you an email.
Could you test out this branch: https://github.com/Craigacp/onnxruntime/tree/java-sequence-fix. I've modified OnnxSequence.getValue() so it now returns either List&lt;OnnxTensor&gt; or List&lt;Map&lt;Object,Object&gt;&gt; depending on the sequence element type. It passes the ported C# test for operating on sequences of tensors, but I'm not entirely happy with the semantics: I think it should return either pure Java-side values or ONNX values, rather than a mixture. I'm leaning towards making it return List&lt;OnnxTensor&gt; and List&lt;OnnxMap&gt;, because forcing the creation of multidimensional Java arrays ruins performance, but that's a bit more work than I have time for at the moment.
I found a little more time and moved OnnxSequence over so that getValue now returns List&lt;? extends OnnxValue&gt;, which means both element types supported in a sequence now behave the same way.