Model conversion infinite loop in tf_utils.compute_const_folding_using_tf

Open hgong-snap opened this issue 3 years ago • 1 comments

Describe the bug

I tried to convert my model (GraphDef) with the tf2onnx tool, but the conversion never finishes. After debugging, the code appears to be stuck in tf2onnx.tf_utils.compute_const_folding_using_tf, specifically in its while loop.

Urgency

We are experimenting with and benchmarking onnxruntime in our service and may switch to it if it performs better than TF, so we would prefer to get this solved soon.

System information

  • MacOS
  • Tensorflow Version: 2.7.0
  • Python version: 3.8.13

To Reproduce

Download the zip file and run python convert.py. (The file is hosted on Google Drive because it is 37 MB, which exceeds GitHub's 25 MB maximum file size.)

Screenshots

Cannot infer shape for LookupTableExportV2_11: LookupTableExportV2_11:0,LookupTableExportV2_11:1
Cannot infer shape for LookupTableExportV2_12: LookupTableExportV2_12:0,LookupTableExportV2_12:1
Cannot infer shape for LookupTableExportV2_13: LookupTableExportV2_13:0,LookupTableExportV2_13:1
Cannot infer shape for LookupTableExportV2_14: LookupTableExportV2_14:0,LookupTableExportV2_14:1
Cannot infer shape for LookupTableExportV2_15: LookupTableExportV2_15:0,LookupTableExportV2_15:1
Cannot infer shape for LookupTableExportV2_16: LookupTableExportV2_16:0,LookupTableExportV2_16:1
Cannot infer shape for LookupTableExportV2_17: LookupTableExportV2_17:0,LookupTableExportV2_17:1
Cannot infer shape for LookupTableExportV2_18: LookupTableExportV2_18:0,LookupTableExportV2_18:1

After these warnings, the conversion runs indefinitely.

hgong-snap, Jun 13 '22 19:06

Not sure if there's some corner case here. strided_slice nodes are added to outputs_to_values, which sets progress=True, and then those same strided_slice nodes are deleted later in the loop body. Because progress=True, the next iteration starts and the same thing happens again, so the loop never terminates.
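
The pattern described above can be reproduced in isolation. The following is a minimal toy sketch, not the tf2onnx implementation: `fold_constants`, `NODES`, `CONSTS`, the `drop` set, and the `strict_progress` flag are all hypothetical names invented for illustration. It assumes the hang comes from a fixed-point loop that counts re-folding an already-deleted node as progress, and shows one possible guard (only count a node the first time it is folded, plus a pass limit).

```python
def fold_constants(nodes, consts, drop=frozenset(),
                   strict_progress=True, max_passes=1000):
    """Toy fixed-point constant folding (hypothetical, not tf2onnx code).

    nodes:  dict mapping node name -> (function, list of input names)
    consts: dict mapping name -> already-known constant value
    drop:   names whose folded values are deleted at the end of each pass,
            standing in for the cleanup step described in the comment above
    """
    known = dict(consts)
    folded_once = set()   # names that have been folded at least once
    passes = 0
    progress = True
    while progress:
        passes += 1
        if passes > max_passes:
            # Safety net so a non-converging loop fails loudly instead of hanging.
            raise RuntimeError("constant folding did not converge")
        progress = False
        for name, (fn, inputs) in nodes.items():
            if name in known or not all(i in known for i in inputs):
                continue
            known[name] = fn(*[known[i] for i in inputs])
            # With strict_progress=False, every fold counts as progress,
            # even if the node was folded (and deleted) in an earlier pass.
            if not strict_progress or name not in folded_once:
                progress = True
            folded_once.add(name)
        # Cleanup step: the dropped nodes become foldable again next pass.
        for name in drop:
            known.pop(name, None)
    return known, passes


# Example graph: "slice" is folded and then deleted on every pass.
CONSTS = {"c": [1, 2, 3]}
NODES = {
    "slice": (lambda x: x[1:], ["c"]),
    "sum": (sum, ["slice"]),
}

result, n_passes = fold_constants(NODES, CONSTS, drop={"slice"})
```

With `strict_progress=False` and the same `drop` set, the loop keeps re-folding `slice` until the pass limit raises `RuntimeError`, which mirrors the reported behaviour. Whether the actual fix in tf2onnx should track already-folded nodes or avoid deleting them is a design choice this sketch does not settle.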

hgong-snap, Jun 15 '22 06:06

Any follow-up on this issue?

hgong-snap, Oct 05 '22 05:10