FunASR

Cannot run Whisper with a spk model: frontend is None.

Open zhengxingmao opened this issue 1 year ago • 1 comment

Notice: To help us resolve issues more efficiently, please follow the template when raising an issue and fill in the details.

🐛 Bug

To Reproduce

Steps to reproduce the behavior (always include the command you ran):

  1. Run the following script:
from funasr import AutoModel

model = AutoModel(
    model="Whisper-large-v3",
    kwargs={"model_path": "/data/llvm/whisper/"},
    vad_model="iic/speech_fsmn_vad_zh-cn-16k-common-pytorch",
    vad_kwargs={"max_single_segment_time": 30000},
    punc_model="iic/punc_ct-transformer_zh-cn-common-vocab272727-pytorch",
    # return_spk_res=True,
    word_timestamps=True,
    spk_model="iic/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch",
    # spk_model="cam++",
    hub="openai",
)

res = model.generate(
    task="transcribe",
    batch_size_s=0,
    input="/root/桌面/音频/output_30.mp3",
    # input="/root/桌面/音频/asr_example_zh.wav",
    # sentence_timestamp=True,
    is_final=True,
)

print(res)
  2. See error
  File "/data/llvm/whisper_t.py", line 16, in <module>
    res = model.generate(
  File "/usr/local/lib/python3.10/dist-packages/funasr/auto/auto_model.py", line 224, in generate
    return self.inference_with_vad(input, input_len=input_len, **cfg)
  File "/usr/local/lib/python3.10/dist-packages/funasr/auto/auto_model.py", line 358, in inference_with_vad
    spk_res = self.inference(speech_b, input_len=None, model=self.spk_model, kwargs=kwargs, **cfg)
  File "/usr/local/lib/python3.10/dist-packages/funasr/auto/auto_model.py", line 257, in inference
    res = model.inference(**batch, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/funasr/models/bicif_paraformer/model.py", line 253, in inference
    speech, speech_lengths = extract_fbank(audio_sample_list, data_type=kwargs.get("data_type", "sound"),
  File "/usr/local/lib/python3.10/dist-packages/funasr/utils/load_utils.py", line 131, in extract_fbank
    data, data_len = frontend(data, data_len, **kwargs)
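
The traceback above is cut off before the exception line, so the exact error text is not shown here. Going by the issue title, the `frontend` passed into the last frame appears to be `None`. The following is only a minimal, self-contained sketch of that failure mode. It does not import FunASR, and `unguarded_call`, `call_frontend`, and `identity_frontend` are hypothetical names made up for illustration. It contrasts calling a `None` frontend directly with an explicit check that fails with a clearer message.

```python
def unguarded_call(data, data_len, frontend=None, **kwargs):
    # Same call shape as the last traceback line, with no None check.
    # If frontend is None, Python raises TypeError here.
    return frontend(data, data_len, **kwargs)


def call_frontend(data, data_len, frontend=None, **kwargs):
    # Hypothetical guard (not FunASR code): fail early with a readable message.
    if frontend is None:
        raise ValueError(
            "frontend is None: the model used for this step has no frontend configured"
        )
    return frontend(data, data_len, **kwargs)


def identity_frontend(data, data_len, **kwargs):
    # Toy frontend for illustration only; returns its inputs unchanged.
    return data, data_len


if __name__ == "__main__":
    samples, lengths = [0.0, 0.1], [2]

    try:
        unguarded_call(samples, lengths)
    except TypeError as exc:
        print("unguarded:", exc)

    try:
        call_frontend(samples, lengths)
    except ValueError as exc:
        print("guarded:", exc)

    print(call_frontend(samples, lengths, frontend=identity_frontend))
```

Whether a missing frontend is really the root cause, and why the model given as `spk_model` would end up without one, is not confirmed anywhere in this report.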

Code sample

Expected behavior

Environment

  • OS (e.g., Linux): Linux
  • FunASR Version (e.g., 1.0.0): 1.0.19
  • ModelScope Version (e.g., 1.11.0): 1.13.2
  • PyTorch Version (e.g., 2.0.0): 2.2.1
  • How you installed funasr (pip, source): pip
  • Python version: 3.10.12
  • GPU (e.g., V100M32)
  • CUDA/cuDNN version (e.g., cuda11.7):
  • Docker version (e.g., funasr-runtime-sdk-cpu-0.4.1)
  • Any other relevant information:

Additional context

zhengxingmao, Mar 29 '24 08:03

@LauraGPT Can you give some advice on how to solve this problem?

zhengxingmao, Apr 07 '24 02:04