intel-extension-for-transformers
Inference with ONNX format models
Hi guys, are you planning to support inference with ONNX format models soon?
ONNX is already supported; please refer to: https://github.com/intel/intel-extension-for-transformers/blob/main/intel_extension_for_transformers/llm/runtime/deprecated/docs/deploy_and_integration.md
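Per that doc, the (now deprecated) Neural Engine runtime compiles an ONNX graph and runs it on numpy inputs. A minimal sketch, assuming a BERT-style model with three int32 inputs of shape (1, 128); the import path and input layout are assumptions that vary by version and model:

```python
# Minimal sketch of ONNX inference via the deprecated Neural Engine runtime.
# ASSUMPTIONS: the import path and the BERT-style input layout below; check
# the linked deploy_and_integration doc for your installed version.
import numpy as np
from intel_extension_for_transformers.llm.runtime.deprecated.compile import compile

# Compile the ONNX graph into the Neural Engine's internal representation.
model = compile("./model.onnx")

# Dummy inputs; shapes and dtypes must match your exported model.
input_ids = np.zeros((1, 128), dtype=np.int32)
segment_ids = np.zeros((1, 128), dtype=np.int32)
input_mask = np.ones((1, 128), dtype=np.int32)

# Run inference; inputs are passed as a list in graph input order.
output = model.inference([input_ids, segment_ids, input_mask])
print(output)
```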
If you don't have any other questions, I will close this issue.
Hi bmtuan, were you able to use the example?
C:\Windows\system32>D:\o\1\run_whisper.exe -l zh -m D:\o\1\whisper_gpu_int8_gpu-cuda_model.onnx -f D:\o\1\1.wav -osrt
whisper_init_from_file_no_state: loading model from 'D:\o\1\whisper_gpu_int8_gpu-cuda_model.onnx'
whisper_model_load: loading model
NE_ASSERT: E:\whisper_opt\intel_extension_for_transformers\llm\runtime\graph\core\ne_layers.c:643: wtype != NE_TYPE_COUNT
This method uses the cpp model for inference; the ONNX model is not supported for the time being. If you want to run an ONNX model, please refer to: https://github.com/intel/intel-extension-for-transformers/blob/main/intel_extension_for_transformers/llm/runtime/deprecated/docs/deploy_and_integration.md
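For completeness: the flow in that doc compiles the ONNX file into the Neural Engine's own IR (a conf.yaml plus model.bin), which is what that engine executes. run_whisper.exe instead expects the cpp graph runtime's own binary weight format, which is why the NE_ASSERT above fires when it is pointed at an ONNX file. A hedged sketch of the compile-and-save step, assuming the same deprecated compile API and that this particular quantized Whisper graph compiles at all (not verified here):

```python
# Sketch: compile an ONNX model into Neural Engine IR and save it to disk.
# ASSUMPTIONS: the import path below, and that this CUDA-quantized Whisper
# ONNX graph is supported by the compiler, are not verified here.
from intel_extension_for_transformers.llm.runtime.deprecated.compile import compile

graph = compile("whisper_gpu_int8_gpu-cuda_model.onnx")
graph.save("./whisper_ir")  # writes conf.yaml and model.bin for the engine
```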