xyx361100238
Not Yet!
pip3 install bandmat
A model fine-tuned from whisper-base on the WenetSpeech dataset.
Using the Hugging Face model "whisper-base" with the test file [common_voice_zh-CN_18662117.mp3](https://huggingface.co/corner/whisper-base-zh/blob/main/common_voice_zh-CN_18662117.mp3), I got the same error.
`processor = WhisperProcessor.from_pretrained(model_path)`
`asr_pipeline = pipeline(task="automatic-speech-recognition", model=model_path, device="cpu")`
`transcription = processor.batch_decode("common_voice_zh-CN_18524189.wav", generate_kwargs={"language": lang, "task": "transcribe"})`
This raises an error: 
According to #137, I set -ac to 750, but the result contains a lot of noise words such as “[buzzer]”, “[static]”, and “[AUDIO OUT]”. How can I remove them? BTW, it works well...
Yes, yes! Setting -ac 768 works much better, and I will replace the strings too. Thanks again!
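For anyone else who wants to do the string replacement, here is a minimal sketch of how the bracketed noise words could be stripped from a transcript after decoding. The function name `strip_annotations` is hypothetical, and it assumes every unwanted token is wrapped in square brackets and that real speech never is.

```python
import re

# Matches a square-bracketed annotation such as "[buzzer]" or "[AUDIO OUT]".
# Assumption: anything inside square brackets is a non-speech tag.
_ANNOTATION = re.compile(r"\[[^\[\]]*\]")


def strip_annotations(text: str) -> str:
    """Remove bracketed tags, then collapse the whitespace they leave behind."""
    without_tags = _ANNOTATION.sub(" ", text)
    return re.sub(r"\s+", " ", without_tags).strip()


if __name__ == "__main__":
    print(strip_annotations("[buzzer] hello [static] world [AUDIO OUT]"))
```

This only post-processes the text, so it does not change what the model hears or decodes; lines that consisted solely of a tag come back as empty strings and can be dropped by the caller.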
Yes, I have the same question: `x_lp[0] is Nan detected after pitch_downsample, so not filtered. total count:1` Why does the NaN occur? I have checked RNNoise with the same data, and it never happened.
Hello. When I run concurrent requests with the latest refactored code, everything still runs on a single GPU.