Unsupported ONNX type 10 for FP16
from deepsparse import Pipeline
sa_pipeline = Pipeline.create(
    task="sentiment-analysis",
    model_path="/content/bert-sentiment-onnx-fp16-opset",
)

inference = sa_pipeline("Aku suka itu")
print(inference)
/usr/local/lib/python3.10/dist-packages/deepsparse/engine.py in run(self, inp, val_inp)
    530         self._validate_inputs(inp)
    531
--> 532         return self._eng_net.execute_list_out(inp)
    533
    534     def timed_run(
RuntimeError: NM: error: output[0]: 'logits' has unsupported type '<unsupported ONNX type 10>'
I ran inference with a Hugging Face transformer model for a sentiment-analysis task. But when I convert the transformer model to ONNX with the fp16 option, the error above appears. Is this a bug?
This is the command I used to export the transformer model to ONNX:

!optimum-cli export onnx --model /content/bert-base-indonesian-1.5G-sentiment-analysis-smsa bert-sentiment-onnx-fp16-opset/ --opset 13 --task text-classification --optimize 'O1' --device 'cuda' --fp16
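For context, "unsupported ONNX type 10" refers to FLOAT16 in the ONNX TensorProto DataType enum, so the exported 'logits' output really is half precision. A quick inspection of the exported graph's output dtypes confirms this; the sketch below assumes Optimum's default model.onnx filename inside the export directory:

import onnx

# Load the exported model and print the declared dtype of each graph output.
model = onnx.load("/content/bert-sentiment-onnx-fp16-opset/model.onnx")
for out in model.graph.output:
    elem_type = out.type.tensor_type.elem_type
    # onnx.TensorProto.FLOAT == 1, onnx.TensorProto.FLOAT16 == 10
    print(out.name, elem_type, onnx.TensorProto.DataType.Name(elem_type))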
Hi @farizalmustaqim, fp16 ONNX models aren't supported in DeepSparse, or in CPU runtimes generally, so please try your command with these edits:
optimum-cli export onnx --model /content/bert-base-indonesian-1.5G-sentiment-analysis-smsa bert-sentiment-onnx-fp32-opset/ --opset 13 --task text-classification
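Once the fp32 export finishes, the original pipeline code should work unchanged against the new directory. A minimal sketch reusing the call from above, just with the fp32 output path (the /content prefix assumes the export is run from the same working directory as before):

from deepsparse import Pipeline

# Same pipeline as before, now pointed at the fp32 export directory.
sa_pipeline = Pipeline.create(
    task="sentiment-analysis",
    model_path="/content/bert-sentiment-onnx-fp32-opset",
)

inference = sa_pipeline("Aku suka itu")
print(inference)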
Oh really? But I was able to run YOLOv8 inference in the DeepSparse pipeline using an ONNX model exported with the fp16 option to reduce the model size. Is that not possible for NLP?
@farizalmustaqim That is interesting to hear. I guess it might be possible, but it would just be running in a naive (non-optimized) backend. Even in the Optimum codebase, they raise an exception if you try to export fp16 on a CPU device: https://github.com/huggingface/optimum/blob/5017d06603488f396537e69ff77055907fae79d0/optimum/exporters/onnx/main.py#L295
Hi @farizalmustaqim, as some time has passed with no further updates, I am going to go ahead and close out this issue. Please re-open if you want to continue the conversation. Best, Jeannie / Neural Magic