Wav2Vec2_PyCTCDecode
Word Level or Char Level language model?
Thanks @patrickvonplaten for this repo, it really helped a lot!
Just a question: which kind of language model works best for CTC decoding, a character-level or a word-level one? I assumed a character-level model would be the natural choice, since wav2vec decodes characters. However, in many repos and posts the common practice seems to be a word-level one. Please correct me if I am wrong. If that is right, could you elaborate on why word-level language models are preferred over character-level ones?
Did you find out? I'm facing the same question.