Thanks for your great work! I know the model code is based on WeNet. Could you tell me which pipeline was used?
When I transcribed one minute of Chinese audio, I found that the output had no punctuation.
warning: The current model is English-only but the language parameter is set to 'zh'; using 'en' instead.
When I fine-tune following https://github.com/huggingface/distil-whisper/tree/main/training, I get this error: NotImplementedError: The model type whisper is not yet supported to be used with BetterTransformer.
Accelerate is a tool for multi-machine training, so why do you use it on a single GPU?
I think this is great work! But it has limitations, as noted in the introduction. Is there any better recent work based on SpeechGPT?
Thanks for your excellent work! How does the discrete tokenizer perform on ASR? Could you share your understanding? Thanks!