Got NaN output from converted model

Open leqiao-1 opened this issue 2 years ago • 0 comments

Describe the bug: A model converted from a TensorFlow GraphDef produces NaN outputs.

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Windows
  • tf2onnx 1.11.1
  • tensorflow 2.9.1
  • onnx 1.12.0
  • onnxruntime 1.11.1
  • python 3.7

To Reproduce full_doran_frozen.zip

Model conversion command:

python -m tf2onnx.convert --graphdef full_doran_frozen.pb --output model.onnx --inputs title_lengths:0,title_encoder:0,ratings:0,query_lengths:0,passage_lengths:0,features:0,encoder:0,decoder:0,Placeholder:0 --outputs output_identity:0,loss_identity:0

To run inference on both the ONNX and TensorFlow models with the given input data, run python inference.py

Screenshots: [image]

leqiao-1, Jul 01 '22 04:07

The root cause lies mainly in how the Select op is converted. The converter first tries an implementation built from Mul/Add ops, which is faster than the Where op. Debugging shows where the model gets the wrong NaN in the pb file: [image]

The Less op condition should be replaced in the ONNX file: [image]
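To illustrate why a Mul/Add-based select can yield NaN where a Where-style select does not, here is a minimal pure-Python sketch. It is not the tf2onnx implementation: the formula cond * x + (1 - cond) * y, the helper names, and the sample values are assumptions chosen for illustration. The point is only that 0 * inf evaluates to NaN under IEEE 754, so a non-finite value in the unselected branch can leak into the result.

```python
import math

def select_mul_add(cond, x, y):
    # Hypothetical arithmetic select: cond is 1.0 to pick x, 0.0 to pick y.
    # The unselected branch is still multiplied by 0, so inf or NaN there
    # propagates into the sum.
    return [c * a + (1.0 - c) * b for c, a, b in zip(cond, x, y)]

def select_where(cond, x, y):
    # Where-style select: the unselected value is never touched.
    return [a if c else b for c, a, b in zip(cond, x, y)]

cond = [1.0, 0.0, 1.0]
x = [1.0, 2.0, 3.0]
y = [10.0, 20.0, math.inf]  # assumed: the unselected branch holds an inf

mul_add_out = select_mul_add(cond, x, y)
where_out = select_where(cond, x, y)

print("Mul/Add select:", mul_add_out)
print("Where select:  ", where_out)
```

In the last element the condition picks x, yet the Mul/Add form computes 3.0 + 0.0 * inf, which is NaN, while the Where form returns 3.0. Whether the attached model hits exactly this pattern is not confirmed here.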

hwangdeyu, Sep 02 '22 07:09