How to train a pretrained real-time AV-ASR model
🚀 The feature
There is an example for HuBERT training here, but there is no example for training real-time AV-ASR for other languages.
Motivation, pitch
I'm working on lipreading, but I have no pretrained model to continue training from, like the pretrained model used for real-time AV-ASR.
Alternatives
No response
Additional context
No response