FunASR
How to choose the quantization width when exporting? 16-bit vs 8-bit (for streaming)
Notice: In order to resolve issues more efficiently, please raise your issue following the template and fill in the details.
❓ Questions and Help
Before asking:
- search the issues.
- search the docs.
What is your question?
I am trying to quantize a streaming model that exports as a decoder.onnx and a model.onnx. Is there a way to change how they are quantized (8-bit vs 16-bit) so I can trade off the size of the resulting model_quant.onnx and decoder_quant.onnx?
Code
What have you tried?
I tried onnxruntime.quantization.quantize_dynamic(), but it doesn't work when there are both a model and a decoder. I have also looked into the funasr model.export() function but couldn't find an option for changing the quantization.
What's your environment?
- OS (e.g., Linux):
- FunASR Version (e.g., 1.0.0):
- ModelScope Version (e.g., 1.11.0):
- PyTorch Version (e.g., 2.0.0):
- How you installed funasr (pip, source):
- Python version:
- GPU (e.g., V100M32):
- CUDA/cuDNN version (e.g., cuda11.7):
- Docker version (e.g., funasr-runtime-sdk-cpu-0.4.1):
- Any other relevant information: