FunASR
Error when punc_model is specified and the audio contains no speech
To reproduce, pass in any audio that contains no speech:
    from funasr import AutoModel

    audio_model = AutoModel(
        model="paraformer-zh",
        model_revision="v2.0.4",
        vad_model="fsmn-vad",
        vad_model_revision="v2.0.4",
        punc_model="ct-punc-c",
        punc_model_revision="v2.0.4",
    )
    # audio: any input that contains no speech
    audio_model.generate(
        input=[audio],
        batch_size_s=200,
        is_final=True,
        sentence_timestamp=True,
    )
Error:
RuntimeError: Expected tensor for argument #1 'indices' to have one of the following scalar types: Long, Int; but got torch.cuda.DoubleTensor instead (while checking arguments for embedding)
After debugging, I traced this to the inference_with_vad method in auto_model.py:
    if self.punc_model is not None:
        if not len(result["text"]):
            if return_raw_text:
                result['raw_text'] = ''
        else:
            self.punc_kwargs.update(cfg)
            punc_res = self.inference(result["text"], model=self.punc_model, kwargs=self.punc_kwargs, **cfg)
            raw_text = copy.copy(result["text"])
            if return_raw_text:
                result['raw_text'] = raw_text
            result["text"] = punc_res[0]["text"]
    else:
        raw_text = None
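When the audio contains no speech, result["text"] is ' ' (a single space), so its length is 1 and execution still takes the inner else branch and calls the punctuation model. This could be fixed either by calling strip() before the length check, or by changing the logic that generates the empty text.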
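To illustrate the strip() option, here is a minimal standalone sketch, not the actual FunASR code. The names apply_punc and fake_punc are hypothetical stand-ins for the punctuation step and the punctuation model; the only point is that a whitespace-only text is treated as empty and never reaches the model.

```python
import copy
from typing import Callable, Dict


def apply_punc(result: Dict[str, str],
               punc_fn: Callable[[str], str],
               return_raw_text: bool = False) -> Dict[str, str]:
    """Hypothetical stand-in for the punctuation step in inference_with_vad."""
    text = result["text"]
    if not text.strip():
        # Whitespace-only text (e.g. ' ' for silent audio): skip the model.
        if return_raw_text:
            result["raw_text"] = ""
        return result
    raw_text = copy.copy(text)
    if return_raw_text:
        result["raw_text"] = raw_text
    result["text"] = punc_fn(text)
    return result


def fake_punc(text: str) -> str:
    # Placeholder for the real punctuation model; it only appends a full stop.
    return text + "。"


silent = apply_punc({"text": " "}, fake_punc, return_raw_text=True)
spoken = apply_punc({"text": "你好"}, fake_punc, return_raw_text=True)
print(repr(silent))
print(repr(spoken))
```

With the original len() check, the ' ' input would be passed on to the model; with strip() it is skipped, while normal text is handled as before.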