Variations in onnx export (input and inference issues): pruned_transducer_stateless7_streaming
Hi,
I have a model trained using the pruned_transducer_stateless7_streaming recipe.
I have tried exporting the model with both export-onnx.py and export.py (with --onnx set to 1).
I then used onnx_pretrained.py to test the exported models. The model exported by export-onnx.py works fine with this test, but the one exported by export.py fails. The issue was the missing meta_data in the export.py export. Following the approach in export-onnx.py, I modified export.py to also write out the meta_data.
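For reference, a rough sketch of what that meta_data attachment looks like, following the add_meta_data pattern from export-onnx.py (the file name and the key/value pairs below are placeholders, not the exact set that onnx_pretrained.py checks for):

```python
import onnx

def add_meta_data(filename: str, meta_data: dict) -> None:
    """Attach key/value metadata to an already-exported ONNX model."""
    model = onnx.load(filename)
    for key, value in meta_data.items():
        meta = model.metadata_props.add()
        meta.key = key
        meta.value = str(value)
    onnx.save(model, filename)

# Example call; the actual keys must match what the consumer
# (onnx_pretrained.py, sherpa-onnx, ...) reads from the model.
add_meta_data("encoder.onnx", {"model_type": "zipformer", "version": "1"})
```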
However, even after re-exporting the model with the meta_data addition in export.py, I get the following error when running it through onnx_pretrained.py:
```
Traceback (most recent call last):
  File "./pruned_transducer_stateless7_streaming/onnx_pretrained.py", line 512, in <module>
    main()
  File "/eph/nvme0/azureml/cr/j/516411a3479842ed943f65f1a697581c/exe/wd/vasista/k2_env/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "./pruned_transducer_stateless7_streaming/onnx_pretrained.py", line 486, in main
    encoder_out = model.run_encoder(frames)
  File "./pruned_transducer_stateless7_streaming/onnx_pretrained.py", line 299, in run_encoder
    out = self.encoder.run(encoder_output_names, encoder_input)
  File "/eph/nvme0/azureml/cr/j/516411a3479842ed943f65f1a697581c/exe/wd/vasista/k2_env/lib/python3.8/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 216, in run
    self._validate_input(list(input_feed.keys()))
  File "/eph/nvme0/azureml/cr/j/516411a3479842ed943f65f1a697581c/exe/wd/vasista/k2_env/lib/python3.8/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 198, in _validate_input
    raise ValueError(
ValueError: Required inputs (['x_lens', 'len_cache', 'avg_cache', 'attn_cache', 'cnn_cache']) are missing from input feed (['x', 'cached_len_0', 'cached_len_1', 'cached_len_2', 'cached_len_3', 'cached_len_4', 'cached_avg_0', 'cached_avg_1', 'cached_avg_2', 'cached_avg_3', 'cached_avg_4', 'cached_key_0', 'cached_key_1', 'cached_key_2', 'cached_key_3', 'cached_key_4', 'cached_val_0', 'cached_val_1', 'cached_val_2', 'cached_val_3', 'cached_val_4', 'cached_val2_0', 'cached_val2_1', 'cached_val2_2', 'cached_val2_3', 'cached_val2_4', 'cached_conv1_0', 'cached_conv1_1', 'cached_conv1_2', 'cached_conv1_3', 'cached_conv1_4', 'cached_conv2_0', 'cached_conv2_1', 'cached_conv2_2', 'cached_conv2_3', 'cached_conv2_4']).
```
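As a quick check of which input-naming convention a given .onnx file expects (and therefore which export script produced it), onnxruntime can list the model's inputs directly; a minimal diagnostic sketch, assuming the encoder file is called encoder.onnx:

```python
import onnxruntime

# List the input names, shapes, and types the exported encoder expects, e.g.
# x, x_lens, len_cache, ... for export.py versus
# x, cached_len_0, cached_avg_0, ... for export-onnx.py.
session = onnxruntime.InferenceSession(
    "encoder.onnx", providers=["CPUExecutionProvider"]
)
for node in session.get_inputs():
    print(node.name, node.shape, node.type)
```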
I have tried online-decode-files.py from sherpa-onnx on the model exported using export.py; that resulted in a segmentation fault without any logs. The model exported using export-onnx.py worked fine with online-decode-files.py, though.
Just to be sure, I repeated the above steps with an available English model trained using the same architecture. The results are exactly the same as with the model I trained.
I have also tried exporting with the export.py from the current commit and with the one from release v1.1. They differ in whether they take --tokens or --bpe_model at export time, but both resulted in the same error shown above.
Questions at this point:
- How do I run inference on the .onnx models exported with export.py? I ask because our Triton deployment uses the regular export.py.
- Why are there two different exports to the same format (.onnx), with each of the resulting models requiring different inputs?
Sorry for the confusion.
Models exported using export.py work with triton in k2-fsa/sherpa.
Models exported using export-onnx.py work with onnx_pretrained.py and with k2-fsa/sherpa-onnx.
Is there some way (a script) to test the models exported using export.py? While we are working on deploying them on Triton in k2-fsa/sherpa, we would also like to try some CLI inference with the same models.
I am afraid there is no script to test it.
Can you follow onnx_pretrained.py to write one?
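As a rough starting point, here is a minimal sketch that follows the spirit of run_encoder in onnx_pretrained.py but builds the input feed from the names and shapes that an export.py-style encoder reports about itself. The zero-filled inputs and the substitution of 1 for symbolic dimensions are assumptions, so this only demonstrates how to wire up the inputs (x, x_lens, len_cache, avg_cache, attn_cache, cnn_cache); it does not perform real streaming decoding:

```python
import numpy as np
import onnxruntime

# Load the encoder exported by export.py --onnx 1 (file name is a placeholder).
session = onnxruntime.InferenceSession(
    "encoder.onnx", providers=["CPUExecutionProvider"]
)

# Zero-initialize every input from the shape the model itself reports,
# replacing any symbolic dimension with 1 (an assumption; real fbank features
# and properly propagated caches are needed for actual decoding).
inputs = {}
for node in session.get_inputs():
    shape = [d if isinstance(d, int) else 1 for d in node.shape]
    dtype = np.int64 if "int64" in node.type else np.float32
    inputs[node.name] = np.zeros(shape, dtype=dtype)

# x_lens must reflect the number of feature frames fed in x.
if "x_lens" in inputs and "x" in inputs:
    inputs["x_lens"] = np.array([inputs["x"].shape[1]], dtype=np.int64)

outputs = session.run(None, inputs)
for node, out in zip(session.get_outputs(), outputs):
    print(node.name, out.shape)
```

From there, the feature extraction, chunk-by-chunk cache propagation, and greedy search from onnx_pretrained.py would need to be ported over to the export.py input/output names.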