tensorflow-onnx

CTCGreedyDecoder leads to error if blank_index has highest activation in all steps

Open tigermeet28 opened this issue 2 years ago • 1 comment

Describe the bug

If the blank index has the highest activation at every time step of the greedy decoder input, the ReduceMax node in the tf2onnx CTCGreedyDecoder implementation raises the following error:

onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running ReduceMax node. Name:'ReduceMax__506' Status Message: D:\a\_work\1\s\onnxruntime\core\providers\cpu\reduction\reduction_ops.cc:792 onnxruntime::ValidateKeepDims keepdims was false. Can't reduce on dim with value of 0 if 'keepdims' is false. Invalid output shape would be produced. input_shape:{0,2}

The original tf.nn.ctc_greedy_decoder doesn't produce any error in this case. It just outputs an empty Tensor as one would expect.
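The error message reports input_shape:{0,2}, i.e. zero rows of two-column data, and a maximum over zero rows has no defined value. The following is a minimal pure-Python sketch of that situation, not the tf2onnx or onnxruntime code; `column_max` and its `default` parameter are hypothetical names used only for illustration.

```python
def column_max(rows, num_cols, default=None):
    """Column-wise maximum over a list of equal-length rows.

    Hypothetical helper: with zero rows there is nothing to take a
    maximum of, so the caller has to supply a fallback value.
    """
    result = []
    for c in range(num_cols):
        column = [row[c] for row in rows]
        if column:
            result.append(max(column))
        elif default is not None:
            result.append(default)
        else:
            raise ValueError("cannot reduce over an axis of size 0 without a default")
    return result


# Two decoded entries: the reduction is well defined.
print(column_max([[0, 1], [2, 0]], num_cols=2))

# No decoded entries (shape {0,2}): only works with an explicit fallback.
print(column_max([], num_cols=2, default=-1))
```

Under this reading, a fix would need to handle the empty case explicitly before the reduction, in the same way the fallback is supplied here.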

Urgency

I have implemented a workaround in my model that avoids CTCGreedyDecoder when only blank indices are recognized, so this is not very urgent for me. Still, as I understand it, this is a bug and should be fixed asap.

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 18.04*): Tested on both Windows 10 and Ubuntu 20.04
  • TensorFlow Version: 2.7.0
  • Python version: 3.8
  • ONNX version (if applicable, e.g. 1.11*): 1.13.1
  • ONNXRuntime version (if applicable, e.g. 1.11*): 1.14.1

To Reproduce

Run the CTCGreedyDecoder unit test with an input where the blank index has the highest activation in all steps. Adjust the inputs in tests.test_backend.BackendTests.test_ctc_greedy_decoder like this:

x_val = np.zeros((3, 4, 5)).astype(np.float32)
x_val[:, :, 4] = 1  # Give the blank index (4, the last class) the highest activation in every step

tigermeet28, Mar 21 '23 14:03

It's more complicated to work around than I first thought. It's not enough to check whether blank_index has the highest activation in all steps. The check has to cover all valid steps only. So, if a sample has a smaller-than-max sequence length, let's say 1 in this example, another class (class 0 here) can have a higher activation than blank in a later step and the test still fails:

x_val = np.zeros((4, 2, 3)).astype(np.float32)
x_val[:, :, 2] = 1  # Set blank_index 2 to high activation 1
x_val[3, 0, 0] = 1.1  # Set class 0 to even higher activation 1.1 in step 3 of the first sample
s_val = np.array([1, 4], np.int32)  # Set seq lengths so that the higher activation 1.1 is outside the valid range

leads to the same error:

onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running ReduceMax node. Name:'ReduceMax__72' Status Message: D:\a\_work\1\s\onnxruntime\core\providers\cpu\reduction\reduction_ops.cc:792 onnxruntime::ValidateKeepDims keepdims was false. Can't reduce on dim with value of 0 if 'keepdims' is false. Invalid output shape would be produced. input_shape:{0,2}
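The "all valid steps" condition can be sketched as a small reference greedy decoder that honours the sequence lengths. This is a plain-Python illustration, not the tf2onnx implementation; `greedy_decode` and `needs_guard` are hypothetical names. It also assumes that the failure occurs when no sample in the batch decodes to any label, which is what input_shape:{0,2} suggests but which the thread does not confirm.

```python
def greedy_decode(x, seq_lens, blank_index, merge_repeated=True):
    """Reference CTC greedy decoding over time-major input x[time][batch][class].

    Only the first seq_lens[b] steps of sample b are considered.
    """
    decoded = []
    for b, length in enumerate(seq_lens):
        labels = []
        prev = None
        for t in range(length):
            scores = x[t][b]
            # Index of the highest activation (first one wins on ties).
            best = max(range(len(scores)), key=lambda c: scores[c])
            if best != blank_index and not (merge_repeated and best == prev):
                labels.append(best)
            prev = best
        decoded.append(labels)
    return decoded


# Same values as the example above, written as nested lists:
# 4 steps, 2 samples, 3 classes, blank_index 2.
x = [[[0.0, 0.0, 1.0] for _ in range(2)] for _ in range(4)]
x[3][0][0] = 1.1  # class 0 beats blank in step 3 of the first sample
seq_lens = [1, 4]  # step 3 is outside the valid range of the first sample

decoded = greedy_decode(x, seq_lens, blank_index=2)
# Hypothetical guard: skip the converted decoder when nothing is decoded.
needs_guard = all(len(labels) == 0 for labels in decoded)
print(decoded, needs_guard)  # [[], []] True
```

Because the function only indexes `x[t][b]`, it should also accept NumPy arrays such as x_val. With seq_lens set to [4, 4] the first sample decodes to class 0, so a check that ignores the sequence lengths would wrongly conclude that the guard is not needed.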

tigermeet28, Mar 23 '23 11:03