
GPU inference on an A10 card is no faster than CPU; unclear where the problem lies

Open lanyuer opened this issue 1 year ago • 1 comments

Notice: To help resolve issues more efficiently, please raise issues following the template and fill in the details.

❓ Questions and Help

Before asking:

  1. search the issues.
  2. search the docs.

What is your question?

Reference: https://github.com/modelscope/FunASR/blob/e8f535f53320780cd8ed6f3b8588b187935d3ae5/runtime/onnxruntime/readme.md I built the onnxruntime binaries following this readme, with GPU=ON enabled.

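With quantization enabled, the speedup ratio peaks at only about 300, which is very close to the CPU version. GPU utilization is indeed around 70%, so why is this happening?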
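For readers unfamiliar with the metric, here is a minimal sketch of how a speedup ratio relates to the real-time factor (RTF). It assumes the reported speedup is simply the inverse of RTF; the function name and the numbers are illustrative, not taken from the benchmark above.

```python
def rtf_and_speedup(audio_seconds, processing_seconds):
    """Return (RTF, speedup) for one benchmark run.

    Assumption: speedup is defined as the inverse of RTF, i.e. seconds of
    audio transcribed per second of wall-clock time.
    """
    rtf = processing_seconds / audio_seconds
    speedup = audio_seconds / processing_seconds
    return rtf, speedup


# Hypothetical run: one hour of audio transcribed in 12 seconds.
rtf, speedup = rtf_and_speedup(3600.0, 12.0)
print(f"rtf={rtf:.5f} speedup={speedup:.1f}")
```

Under that definition, a speedup of about 300 on both CPU and GPU means both runs took roughly the same wall-clock time for the same audio.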

Code

Build command: cmake -DCMAKE_BUILD_TYPE=release .. -DONNXRUNTIME_DIR=/home/ubuntu/github/FunASR/onnxruntime-linux-x64-1.14.0 -DFFMPEG_DIR=/home/ubuntu/github/FunASR/ffmpeg-master-latest-linux64-gpl-shared -DGPU=on

Model export command:

funasr-export ++model=damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch ++quantize=true ++device=cuda ++type=torchscript

Inference command:

funasr-onnx-offline-rtf --model-dir /home/ubuntu/.cache/modelscope/hub/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch --vad-dir /home/ubuntu/.cache/modelscope/hub/damo/speech_fsmn_vad_zh-cn-16k-common-pytorch --punc-dir /home/ubuntu/.cache/modelscope/hub/damo/punc_ct-transformer_cn-en-common-vocab471067-large --gpu --thread-num 20 --batch-size 48 --quantize true --wav-path ./test100.scp

What have you tried?

What's your environment?

  • OS (e.g., Linux):
  • FunASR Version (e.g., 1.0.0):
  • ModelScope Version (e.g., 1.11.0):
  • PyTorch Version (e.g., 2.0.0):
  • How you installed funasr (pip, source):
  • Python version:
  • GPU (e.g., V100M32)
  • CUDA/cuDNN version (e.g., cuda11.7):
  • Docker version (e.g., funasr-runtime-sdk-cpu-0.4.1)
  • Any other relevant information:

python=3.8; funasr and modelscope are both the latest versions.

lanyuer, Aug 27 '24 13:08

For GPU deployment, please refer to https://github.com/modelscope/FunASR/blob/main/runtime/docs/SDK_advanced_guide_offline_gpu_zh.md

lyblsgo, Sep 26 '24 07:09

This is because of the CIF module in Paraformer: its dynamic loop is very slow on a GPU.

hanfanggithub, Jun 24 '25 02:06
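To illustrate the last comment, here is a simplified NumPy sketch of a sequential continuous integrate-and-fire (CIF) loop. It is not FunASR's actual implementation. The function name, the threshold of 1.0, and the assumption that each weight lies in [0, 1] (so at most one token fires per frame) are all choices made for this example.

```python
import numpy as np


def cif(hidden, alphas, threshold=1.0):
    """Sequential CIF over one utterance (illustrative only).

    hidden: (T, D) encoder frames; alphas: (T,) weights in [0, 1].
    Returns an (N, D) array with one row per fired token.
    """
    integrate = 0.0                        # weight accumulated so far
    frame = np.zeros(hidden.shape[1])      # weighted sum of frames so far
    fired = []
    for t in range(hidden.shape[0]):       # step t depends on step t-1
        alpha = float(alphas[t])
        if integrate + alpha < threshold:
            # Not enough weight yet: keep integrating.
            integrate += alpha
            frame = frame + alpha * hidden[t]
        else:
            # Boundary reached: spend just enough weight to hit the
            # threshold, emit a token, carry the remainder forward.
            used = threshold - integrate
            fired.append(frame + used * hidden[t])
            integrate = alpha - used
            frame = integrate * hidden[t]
    if not fired:
        return np.zeros((0, hidden.shape[1]))
    return np.stack(fired)
```

Each iteration does very little arithmetic and branches on a value computed in the previous iteration, so the time axis cannot be processed in parallel. Run step by step on a GPU, a loop of this shape tends to be dominated by launch and synchronization overhead rather than compute. That is consistent with seeing substantial GPU utilization but little speedup over CPU.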