FunASR: Model cannot be fetched correctly when using paraformer for ASR


Open TristanLiu0101 opened this issue 2 years ago • 1 comment


OS: Linux

Python/C++ Version: python 3.9.17

Package Version: pytorch 2.0.1, torchaudio 2.0.2, modelscope 1.9.0, funasr (from pip list) 0.7.6

Model: speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch

Command: python paraformer_infer.py --wav_dir xxxx --output_dir xxxx (this simply calls inference)

Details:
[code]:
    inference_pipeline = pipeline(
        task=Tasks.auto_speech_recognition,
        model='damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch',
        model_revision="v1.2.4",
        output_dir=result_dir,
    )
    inference_pipeline(audio_in=wav_scp_path)

(PS: the same code used to run, but when the wav.scp file listed too many videos, the timestamps would overflow.)

Error log:
[Current problem: recognition fails]
2023-09-13 23:02:54,876 - modelscope - INFO - PyTorch version 2.0.1 Found.
2023-09-13 23:02:54,876 - modelscope - INFO - Loading ast index from /home/code/liuzhechen/.cache/modelscope/ast_indexer
2023-09-13 23:02:54,982 - modelscope - INFO - Loading done! Current index file version is 1.9.0, with md5 6aeb508bfe7a12737daa49bb605d404b and a total number of 921 components indexed
Traceback (most recent call last):
  File "/home/code/liuzhechen/miniconda3/envs/pyt2.0/lib/python3.9/site-packages/requests/models.py", line 971, in json
    return complexjson.loads(self.text, **kwargs)
  File "/home/code/liuzhechen/miniconda3/envs/pyt2.0/lib/python3.9/site-packages/simplejson/__init__.py", line 514, in loads
    return _default_decoder.decode(s)
  File "/home/code/liuzhechen/miniconda3/envs/pyt2.0/lib/python3.9/site-packages/simplejson/decoder.py", line 386, in decode
    obj, end = self.raw_decode(s)
  File "/home/code/liuzhechen/miniconda3/envs/pyt2.0/lib/python3.9/site-packages/simplejson/decoder.py", line 416, in raw_decode
    return self.scan_once(s, idx=_w(s, idx).end())
simplejson.errors.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/code/liuzhechen/miniconda3/envs/pyt2.0/lib/python3.9/site-packages/modelscope/pipelines/util.py", line 35, in is_official_hub_impl
    _ = HubApi().get_model(path, revision=revision)
  File "/home/code/liuzhechen/miniconda3/envs/pyt2.0/lib/python3.9/site-packages/modelscope/hub/api.py", line 229, in get_model
    if is_ok(r.json()):
  File "/home/code/liuzhechen/miniconda3/envs/pyt2.0/lib/python3.9/site-packages/requests/models.py", line 975, in json
    raise RequestsJSONDecodeError(e.msg, e.doc, e.pos)
requests.exceptions.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/code/liuzhechen/paraformer_test/paraformer_infer.py", line 46, in <module>
    inference_pipeline = pipeline(
  File "/home/code/liuzhechen/miniconda3/envs/pyt2.0/lib/python3.9/site-packages/modelscope/pipelines/builder.py", line 106, in pipeline
    model = normalize_model_input(
  File "/home/code/liuzhechen/miniconda3/envs/pyt2.0/lib/python3.9/site-packages/modelscope/pipelines/builder.py", line 26, in normalize_model_input
    if isinstance(model, str) and is_official_hub_path(model, model_revision):
  File "/home/code/liuzhechen/miniconda3/envs/pyt2.0/lib/python3.9/site-packages/modelscope/pipelines/util.py", line 41, in is_official_hub_path
    return is_official_hub_impl(path)
  File "/home/code/liuzhechen/miniconda3/envs/pyt2.0/lib/python3.9/site-packages/modelscope/pipelines/util.py", line 38, in is_official_hub_impl
    raise ValueError(f'invalid model repo path {e}')
ValueError: invalid model repo path Expecting value: line 1 column 1 (char 0)
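
A note on reading the chained traceback above: "Expecting value: line 1 column 1 (char 0)" is the message a JSON decoder gives when the text it receives is empty or does not start like JSON at all (for example an HTML error page). That suggests, though the log alone does not prove it, that the hub request behind HubApi().get_model returned a non-JSON body, and that the final "invalid model repo path" ValueError only wraps this decode failure. The sketch below reproduces the message with the standard-library json module as a stand-in for simplejson; describe_body is a hypothetical helper name and the sample bodies are made up for illustration.

```python
import json


def describe_body(body: str) -> str:
    """Return 'json' if the body parses as JSON, otherwise the decoder's error message.

    Hypothetical helper for illustration only; it is not part of modelscope or requests.
    """
    try:
        json.loads(body)
        return "json"
    except json.JSONDecodeError as exc:
        return str(exc)


# Made-up example bodies: an empty response, an HTML error page, and valid JSON.
for sample in ["", "<html>Bad Gateway</html>", '{"Code": 200}']:
    print(repr(sample), "->", describe_body(sample))
```

If the same message shows up for a plain HTTP request to the hub from the affected machine, the problem is more likely network or proxy related than a wrong model id.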

[Previous problem: index out of range]
Traceback (most recent call last):
  File "/home/code/liuzhechen/paraformer_test/paraformer_infer.py", line 49, in <module>
    rec_result = inference_pipeline(audio_in=wav_scp_path)
  File "/home/code/liuzhechen/miniconda3/envs/pyt2.0/lib/python3.9/site-packages/modelscope/pipelines/audio/asr_inference_pipeline.py", line 256, in __call__
    output = self.forward(output, **kwargs)
  File "/home/code/liuzhechen/miniconda3/envs/pyt2.0/lib/python3.9/site-packages/modelscope/pipelines/audio/asr_inference_pipeline.py", line 505, in forward
    inputs['asr_result'] = self.run_inference(self.cmd, **kwargs)
  File "/home/code/liuzhechen/miniconda3/envs/pyt2.0/lib/python3.9/site-packages/modelscope/pipelines/audio/asr_inference_pipeline.py", line 580, in run_inference
    asr_result = self.funasr_infer_modelscope(cmd['name_and_type'],
  File "/home/code/liuzhechen/miniconda3/envs/pyt2.0/lib/python3.9/site-packages/funasr/bin/asr_inference_launch.py", line 690, in _forward
    postprocessed_result = postprocess_utils.sentence_postprocess(token, time_stamp)
  File "/home/code/liuzhechen/miniconda3/envs/pyt2.0/lib/python3.9/site-packages/funasr/utils/postprocess_utils.py", line 231, in sentence_postprocess
    word_lists, ts_lists = abbr_dispose(word_lists, ts_lists)
  File "/home/code/liuzhechen/miniconda3/envs/pyt2.0/lib/python3.9/site-packages/funasr/utils/postprocess_utils.py", line 127, in abbr_dispose
    begin = time_stamp[ts_nums[num]][0]
IndexError: list index out of range
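
To tell whether this IndexError depends on the number of entries in wav.scp or on particular audio files, one option is to run the entries one at a time and record which ones fail. The following is only a sketch under assumptions: it assumes wav.scp uses the Kaldi-style "utt_id path" layout, and parse_scp, find_failing_entries and run_one are hypothetical names. run_one stands for whatever callable performs recognition on a single path (for instance a small wrapper around the inference_pipeline above); it is not a FunASR or modelscope API.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def parse_scp(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Parse 'utt_id path' lines (assumed wav.scp layout), skipping blank lines."""
    entries = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        utt_id, path = line.split(maxsplit=1)
        entries.append((utt_id, path))
    return entries


def find_failing_entries(
    entries: List[Tuple[str, str]],
    run_one: Callable[[str], object],
) -> Dict[str, str]:
    """Call run_one on each path and map utt_id -> error text for entries that raise."""
    failures = {}
    for utt_id, path in entries:
        try:
            run_one(path)
        except Exception as exc:  # broad on purpose: this is a diagnostic loop
            failures[utt_id] = f"{type(exc).__name__}: {exc}"
    return failures


if __name__ == "__main__":
    # Stand-in for real recognition: fails only for one made-up file name.
    def fake_run_one(path: str) -> str:
        if path.endswith("bad.wav"):
            raise IndexError("list index out of range")
        return "ok"

    demo = parse_scp(["utt1 /data/a.wav", "", "utt2 /data/bad.wav"])
    print(find_failing_entries(demo, fake_run_one))
```

If only a few entries fail when run alone, the problem is tied to those files rather than to the size of the list; if none fail individually, the batch run itself (or concurrent use of the pipeline) becomes the more likely suspect.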

Sep 14 '23 03:09 TristanLiu0101

Are you using multiple threads to decode with the pipeline?

Sep 25 '23 09:09 LauraGPT
