onnx_remove_node_identity causes graphs with subgraphs to break
The ONNX operators Loop, Scan, and If can add subgraphs to an ONNX graph. These subgraphs can access variables from the enclosing scope, which is important because If provides no way to pass parameters to the subgraphs of its then and else branches.
Unfortunately, the optimization function onnx_remove_node_identity doesn't seem to handle variables referenced from the enclosing scope: it removes Identity nodes from the top-level graph, but does not rename the corresponding variables inside the subgraphs.
I think this is especially bad because onnx_remove_node_identity is used by default when calling convert_sklearn. The optimization is skipped only when intermediate=False is set, and this behaviour is not documented.
I ran into this problem because I use custom converters that use the If operator.
Example as Python code
def example(X):
    one = 1
    zero = 0
    identity_one = one
    identity_zero = zero
    y = None
    if X:
        y = identity_one
    else:
        y = identity_zero
    return y
Example as Python code for ONNX graph creation
import onnx
from onnx import checker
from onnx import helper
from onnx import TensorProto as tp
then_branch = helper.make_graph(
    [helper.make_node('Identity', inputs=["identity_one"], outputs=["then_result"])],
    'then_branch',
    [],
    [helper.make_tensor_value_info('then_result', tp.INT64, [1])])

else_branch = helper.make_graph(
    [helper.make_node('Identity', inputs=["identity_zero"], outputs=["else_result"])],
    'else_branch',
    [],
    [helper.make_tensor_value_info('else_result', tp.INT64, [1])])

nodes = [
    helper.make_node('Constant', inputs=[], outputs=["one"],
                     value=helper.make_tensor(name='', data_type=tp.INT64, dims=[1], vals=[1])),
    helper.make_node('Constant', inputs=[], outputs=["zero"],
                     value=helper.make_tensor(name='', data_type=tp.INT64, dims=[1], vals=[0])),
    helper.make_node('Identity', inputs=["one"], outputs=["identity_one"]),
    helper.make_node('Identity', inputs=["zero"], outputs=["identity_zero"]),
    helper.make_node('If', inputs=["X"], outputs=["y"],
                      then_branch=then_branch, else_branch=else_branch),
]

g = helper.make_graph(nodes, 'if_test',
                      [helper.make_tensor_value_info('X', tp.BOOL, [1])],
                      [helper.make_tensor_value_info('y', tp.INT64, [1])])

# Create the model and check it
m = helper.make_model(g)
checker.check_model(m)
onnx.save(m, 'if_test.onnx')
Loading this model with onnxruntime works just fine:
import onnxruntime
onnxruntime.InferenceSession("if_test.onnx")
Optimising the model:

import onnx
from skl2onnx.common.onnx_optimisation_identity import onnx_remove_node_identity

onnx_model = onnx.load("if_test.onnx")
optimized_model = onnx_remove_node_identity(onnx_model)

will output the following at debug log level:
DEBUG:skl2onnx:[NodeId-a] remove input: "one"
output: "identity_one"
op_type: "Identity"
DEBUG:skl2onnx:[NodeId-a] remove input: "zero"
output: "identity_zero"
op_type: "Identity"
But the model is broken now:
with open("optimized_if.onnx", "wb") as f:
f.write(optimized_model.SerializeToString())
onnxruntime.InferenceSession("optimized_if.onnx")
This will output:
InvalidGraph: [ONNXRuntimeError] : 10 : INVALID_GRAPH : Load model from optimized_if.onnx failed:This is an invalid model. In Node, ("", If, "", -1) : ("X": tensor(bool),) -> ("y": tensor(int64),) , Error Nodes in a graph must be topologically sorted, however input 'identity_zero' of node: name: OpType: Identity is not output of any previous nodes.
Sorry for the delay. I'll look into that this week.
Changes have been merged and will be included in the next release.