
Specifying punc_model raises an error on audio with no speech

Open OoKyleoO opened this issue 1 year ago • 0 comments

Pass in any audio clip that contains no speech:

    from funasr import AutoModel

    audio_model = AutoModel(
        model="paraformer-zh",
        model_revision="v2.0.4",
        vad_model="fsmn-vad",
        vad_model_revision="v2.0.4",
        punc_model="ct-punc-c",
        punc_model_revision="v2.0.4",
    )
    # audio: any clip with no speech in it
    audio_model.generate(input=[audio], batch_size_s=200, is_final=True, sentence_timestamp=True)

This raises:

    RuntimeError: Expected tensor for argument #1 'indices' to have one of the following scalar types: Long, Int; but got torch.cuda.DoubleTensor instead (while checking arguments for embedding)

After debugging, I traced it to the inference_with_vad method in auto_model.py:

    if self.punc_model is not None:
        if not len(result["text"]):
            if return_raw_text:
                result['raw_text'] = ''
        else:
            self.punc_kwargs.update(cfg)
            punc_res = self.inference(result["text"], model=self.punc_model, kwargs=self.punc_kwargs, **cfg)
            raw_text = copy.copy(result["text"])
            if return_raw_text:
                result['raw_text'] = raw_text
            result["text"] = punc_res[0]["text"]
    else:
        raw_text = None

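When the audio has no speech, result["text"] is ' ' (a single space), so its length is 1. The not len(result["text"]) check therefore fails, and execution still takes the inner else branch and calls the punctuation model. Could this be fixed, either by using strip in the check or by changing how the empty text is generated?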
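A minimal sketch of the strip-based fix, written as a standalone function so it runs without FunASR. The names apply_punctuation and punc_fn are hypothetical stand-ins for the punctuation branch of inference_with_vad and for the call to self.inference; this is not the library's actual code, only an illustration of the guard.

```python
from typing import Callable, Optional


def apply_punctuation(
    result: dict,
    punc_fn: Optional[Callable[[str], str]],
    return_raw_text: bool = False,
) -> dict:
    """Hypothetical stand-in for the punctuation branch of inference_with_vad.

    punc_fn plays the role of the punctuation model call (assumption).
    """
    if punc_fn is None:
        # No punctuation model configured: leave the result untouched.
        return result

    text = result["text"]
    # strip() treats "" and whitespace-only text (such as " ") the same way,
    # so the punctuation model is never called on silent audio.
    if not text.strip():
        if return_raw_text:
            result["raw_text"] = ""
        return result

    if return_raw_text:
        result["raw_text"] = text
    result["text"] = punc_fn(text)
    return result


if __name__ == "__main__":
    def fake_punc(text: str) -> str:
        # Placeholder punctuation model: appends a full stop.
        return text + "。"

    # Silent audio: text is a single space, so fake_punc is skipped.
    print(apply_punctuation({"text": " "}, fake_punc, return_raw_text=True))
```

The only change from the original check is not text.strip() in place of not len(result["text"]). Fixing the place where the single-space text is produced would address the same problem at the source.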

OoKyleoO, Apr 17 '24 08:04