FireRedASR

Open-source industrial-grade ASR models supporting Mandarin, Chinese dialects and English, achieving a new SOTA on public Mandarin ASR benchmarks, while also offering outstanding singing lyrics recogn...

Results 65 FireRedASR issues

speech2text.py --asr_type llm --model_dir /root/share/FireRedASR/examples/pretrained_models/FireRedASR-LLM-L --batch_size 1 --beam_size 3 --decode_max_len 0 --decode_min_len 0 --repetition_penalty 3.0 --llm_length_penalty 1.0 --temperature 1.0 --wav_scp wav/wav.scp --output out/llm-l-asr.txt
Namespace(asr_type='llm', model_dir='/root/share/FireRedASR/examples/pretrained_models/FireRedASR-LLM-L', wav_path=None, wav_paths=None, wav_dir=None, wav_scp='wav/wav.scp', output='out/llm-l-asr.txt',...

Will the model config.yaml and fine-tuning code be open-sourced? Different scenarios may need fine-tuning on different data. Hoping for that!

What is the LoRA configuration used for FireRedASR-LLM training?

$ speech2text.py --wav_path examples/wav/BAC009S0764W0121.wav --asr_type "aed" --model_dir pretrained_models/FireRedASR-AED-L
Namespace(asr_type='aed', model_dir='pretrained_models/FireRedASR-AED-L', wav_path='examples/wav/BAC009S0764W0121.wav', wav_paths=None, wav_dir=None, wav_scp=None, output=None, use_gpu=1, batch_size=1, beam_size=1, decode_max_len=0, nbest=1, softmax_smoothing=1.0, aed_length_penalty=0.0, eos_penalty=1.0, decode_min_len=0, repetition_penalty=1.0, llm_length_penalty=0.0, temperature=1.0)
Starting execution: model loading and transcription procedure
Checkpoint OK: model file exists at pretrained_models/FireRedASR-AED-L/model.pth...

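As the title says: are there plans to support returning sentence-level timestamps? Also, if one audio file is split into multiple smaller audio files, will that affect recognition quality, similar to how quality drops significantly in Whisper's generation mode?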