sklearn-onnx
Converted RandomForestClassifier has wrong prob when having multiple outputs
Description
When RandomForestClassifier is fit on multiple outputs (a 2-D y), the output probabilities from the converted ONNX model are incorrect: they do not even sum to 100%.
Repro
Code
import numpy as np
import sklearn.ensemble  # binds `sklearn` and loads the ensemble submodule explicitly
import skl2onnx
import onnxruntime
print(np.__version__)
print(sklearn.__version__)
print(skl2onnx.__version__)
print(onnxruntime.__version__)
np.random.seed(0)
model = sklearn.ensemble.RandomForestClassifier().fit(
    X=np.random.randint(0, 3, size=(64, 2)),
    y=np.random.randint(0, 3, size=(64, 2)),  # (64, 1) is fine
)
print(model.predict_proba([[1, 1]]))
onnx = skl2onnx.convert_sklearn(
    model=model,
    initial_types=[('X', skl2onnx.common.data_types.Int64TensorType(shape=[None, 2]))],
    options={'zipmap': False},
)
sess = onnxruntime.InferenceSession(onnx.SerializeToString())
print(sess.run(None, {'X': [[1, 1]]}))
Output
1.21.6
1.0.2
1.11.1
1.9.0
[array([[0.44302889, 0.4279672 , 0.12900391]]), array([[0.44302889, 0.42658636, 0.13038475]])]
[array([[0, 0]], dtype=int64), array([[[0.97, 0.94, 0.59]], [[0.97, 0.97, 0.62]]], dtype=float32)]
As you can see, the first output (from the sklearn model) has probabilities that correctly sum to 100%, but the second one (from the ONNX model) does not.
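
To make the mismatch explicit, here is a minimal sketch that sums the class probabilities for each target. The arrays are copied by hand from the Output section above, and the helper name `rows_sum_to_one` is only illustrative, not part of sklearn, skl2onnx, or onnxruntime.

```python
import numpy as np

# Probabilities copied from the Output section above (one entry per target).
sklearn_probs = [
    np.array([[0.44302889, 0.4279672, 0.12900391]]),
    np.array([[0.44302889, 0.42658636, 0.13038475]]),
]
onnx_probs = np.array(
    [[[0.97, 0.94, 0.59]], [[0.97, 0.97, 0.62]]], dtype=np.float32
)


def rows_sum_to_one(probs, atol=1e-6):
    """Illustrative helper: True if every row of class probabilities sums to 1."""
    probs = np.asarray(probs)
    return bool(np.allclose(probs.sum(axis=-1), 1.0, atol=atol))


sklearn_ok = all(rows_sum_to_one(p) for p in sklearn_probs)
onnx_ok = rows_sum_to_one(onnx_probs)
onnx_sums = onnx_probs.sum(axis=-1)  # per-target sums for the ONNX output

print("sklearn rows sum to 1:", sklearn_ok)
print("ONNX rows sum to 1:", onnx_ok)
print("ONNX per-target sums:", onnx_sums.ravel())
```

With the values above, the sklearn rows pass the check, while the ONNX rows sum to roughly 2.50 and 2.56.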