FunASR
The model is initialized once and two audio files are transcribed back to back; the first runs on the GPU, the second does not.
❓ Questions and Help
After initializing the model once, I recognize two files back to back. Both are audio files a bit over one hour long and roughly 100 MB each. For the first file, server monitoring looks normal and the GPU is used; the hour-plus recording finishes transcribing in about 2 minutes. For the second file, monitoring shows the work running almost entirely on the CPU, and transcription takes about 20 minutes.

Code:
```python
import json
from datetime import datetime

from funasr import AutoModel

funasr_model = AutoModel(
    model="paraformer-zh", model_revision="v2.0.4",
    vad_model="fsmn-vad", vad_model_revision="v2.0.4",
    punc_model="ct-punc-c", punc_model_revision="v2.0.4",
    spk_model="cam++", spk_model_revision="v2.0.2",
    disable_update=True,
    # spk_threshold=0.7,
    # max_spk_num=2,  # limit to at most 2 speakers
    # maximum single-segment duration in ms
    # vad_kwargs={"max_single_segment_time": 30000},
    device="cuda:5",
)


def demo1():
    path = '/home/lingrui/speech-voice/1915686488729849856_1745570226125.wav'
    path2 = '/home/lingrui/speech-voice/test2-2024年7月.mp3'

    begin_time = datetime.now()
    res = funasr_model.generate(input=path, use_itn=True, language="auto",
                                merge_vad=True, hotword='', batch_size_s=600)
    end_time = datetime.now()
    print(f"{end_time - begin_time}")

    res = funasr_model.generate(input=path2, use_itn=True, language="auto",
                                merge_vad=True, hotword='', batch_size_s=600)
    json_str = json.dumps(res, ensure_ascii=False)
    end_time = datetime.now()
    print(f"{end_time - begin_time}")

    conv = ''
    sentence_info = res[0]['sentence_info']
    for sentence in sentence_info:
        conv = conv + f"spk {sentence['spk']} : {sentence['text']}\n"
    print(conv)
```