onnx_remove_node_identity causes graphs with supbgraphs to break
The ONNX operators `Loop`, `Scan`, and `If` can add subgraphs to an ONNX graph. These subgraphs can access variables from the enclosing scope, which is important because `If` provides no way to pass parameters to the subgraphs of its `then` and `else` branches.
Unfortunately, the optimization function `onnx_remove_node_identity` does not handle variables referenced from the enclosing scope: it removes `Identity` nodes in the top-level graph without renaming the corresponding inputs in the subgraphs.
I think this is especially bad because `onnx_remove_node_identity` is used by default when calling `convert_sklearn`. The optimization is only skipped when `intermediate=False` is set, which is also not documented.
I ran into this problem because I use custom converters that use the `If` operator.
Example as Python code:

```python
def example(X):
    one = 1
    zero = 0
    identity_one = one
    identity_zero = zero
    y = None
    if X:
        y = identity_one
    else:
        y = identity_zero
    return y
```
Example as Python code for ONNX graph creation:

```python
import onnx
from onnx import checker
from onnx import helper
from onnx import TensorProto as tp

then_branch = helper.make_graph(
    [helper.make_node('Identity', inputs=["identity_one"], outputs=["then_result"])],
    'then_branch',
    [],
    [helper.make_tensor_value_info('then_result', tp.INT64, [1])])
else_branch = helper.make_graph(
    [helper.make_node('Identity', inputs=["identity_zero"], outputs=["else_result"])],
    'else_branch',
    [],
    [helper.make_tensor_value_info('else_result', tp.INT64, [1])])
nodes = [
    helper.make_node('Constant', inputs=[], outputs=["one"],
                     value=helper.make_tensor(name='', data_type=tp.INT64, dims=[1], vals=[1])),
    helper.make_node('Constant', inputs=[], outputs=["zero"],
                     value=helper.make_tensor(name='', data_type=tp.INT64, dims=[1], vals=[0])),
    helper.make_node('Identity', inputs=["one"], outputs=["identity_one"]),
    helper.make_node('Identity', inputs=["zero"], outputs=["identity_zero"]),
    helper.make_node('If', inputs=["X"], outputs=["y"],
                     then_branch=then_branch, else_branch=else_branch),
]
g = helper.make_graph(nodes, 'if_test',
                      [helper.make_tensor_value_info('X', tp.BOOL, [1])],
                      [helper.make_tensor_value_info('y', tp.INT64, [1])])

# Create the model and check
m = helper.make_model(g)
checker.check_model(m)
onnx.save(m, 'if_test.onnx')
```
Loading this model with `onnxruntime` works just fine:

```python
import onnxruntime
onnxruntime.InferenceSession("if_test.onnx")
```
Optimising the model:

```python
from skl2onnx.common.onnx_optimisation_identity import onnx_remove_node_identity
optimized_model = onnx_remove_node_identity(m)
```
will output (at debug log level):

```
DEBUG:skl2onnx:[NodeId-a] remove input: "one"
output: "identity_one"
op_type: "Identity"
DEBUG:skl2onnx:[NodeId-a] remove input: "zero"
output: "identity_zero"
op_type: "Identity"
```
But the model is now broken:

```python
with open("optimized_if.onnx", "wb") as f:
    f.write(optimized_model.SerializeToString())
onnxruntime.InferenceSession("optimized_if.onnx")
```

This fails with:

```
InvalidGraph: [ONNXRuntimeError] : 10 : INVALID_GRAPH : Load model from optimized_if.onnx failed:This is an invalid model. In Node, ("", If, "", -1) : ("X": tensor(bool),) -> ("y": tensor(int64),) , Error Nodes in a graph must be topologically sorted, however input 'identity_zero' of node: name: OpType: Identity is not output of any previous nodes.
```
Sorry for the delay. I'll look into that this week.
Changes have been merged and will be released in the next release.