self-supervised-speech-recognition
Tuning lm_weight, word_score and beam_size
Hey
How do you recommend we tune the parameters of the transcribe function: lm_weight, word_score, and beam_size? Normally, with something like DeepSpeech 2, we tune these against its logits, but what about wav2vec?
Thanks
You can prepare a test set and run a simple grid search over these hyperparameters.
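A minimal sketch of such a grid search, assuming a hypothetical wrapper `transcribe(audio, lm_weight=..., word_score=..., beam_size=...)` that returns a transcript string. The wrapper signature, the grid values, and the choice of corpus-level word error rate (WER) as the metric are assumptions for illustration, not this repository's actual API:

```python
import itertools


def edit_distance(ref_words, hyp_words):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp_words) + 1))
    for i, r in enumerate(ref_words, start=1):
        curr = [i]
        for j, h in enumerate(hyp_words, start=1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (r != h)))  # substitution or match
        prev = curr
    return prev[-1]


def grid_search(transcribe, dataset, grid):
    """Try every parameter combination and keep the one with the lowest WER.

    transcribe: hypothetical callable, transcribe(audio, **params) -> str
    dataset:    list of (audio, reference_transcript) pairs
    grid:       dict mapping parameter name -> list of candidate values
    """
    names = list(grid)
    total_words = max(sum(len(ref.split()) for _, ref in dataset), 1)
    best_params, best_wer = None, float("inf")
    for values in itertools.product(*(grid[n] for n in names)):
        params = dict(zip(names, values))
        errors = sum(
            edit_distance(ref.split(), transcribe(audio, **params).split())
            for audio, ref in dataset
        )
        wer = errors / total_words
        if wer < best_wer:
            best_params, best_wer = params, wer
    return best_params, best_wer
```

It would be called with something like `grid_search(transcribe, dataset, {"lm_weight": [1.0, 2.0], "word_score": [-1.0, 0.0], "beam_size": [100, 500]})`. Note that this calls `transcribe` once per utterance for every combination.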
Yes, but that's a last resort. It will be very slow, because every trial reruns inference with the acoustic model. With DS2's grid search, they freeze the logits and search only over the LM params, which makes the whole thing a lot faster. I was wondering if we can do something like that here.
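The frozen-logits idea can be sketched as below, assuming the pipeline can be split into two hypothetical callables: `compute_emissions(audio)` for the acoustic model forward pass and `decode(emissions, **params)` for the LM-weighted beam search. Whether this repository's transcribe function exposes those two stages separately is an assumption; `score` is likewise a placeholder for whatever error metric you use (lower is better):

```python
import itertools


def tune_decoder(compute_emissions, decode, score, dataset, grid):
    """Search decoder parameters while running the acoustic model only once.

    compute_emissions: hypothetical callable, audio -> emissions (logits)
    decode:            hypothetical callable, (emissions, **params) -> str
    score:             callable, (references, hypotheses) -> float, lower is better
    dataset:           list of (audio, reference_transcript) pairs
    grid:              dict mapping parameter name -> list of candidate values
    """
    # The expensive step: one acoustic forward pass per utterance, then frozen.
    cached = [(compute_emissions(audio), ref) for audio, ref in dataset]
    refs = [ref for _, ref in cached]

    results = []
    names = list(grid)
    for values in itertools.product(*(grid[n] for n in names)):
        params = dict(zip(names, values))
        # Only the decoder is rerun for each combination.
        hyps = [decode(emissions, **params) for emissions, _ in cached]
        results.append((score(refs, hyps), params))

    # Return (best_score, best_params); the key avoids comparing dicts on ties.
    return min(results, key=lambda item: item[0])
```

With this split, the acoustic model cost is paid once per utterance instead of once per utterance per parameter combination, so the search cost is dominated by the decoder.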
I'm running into this issue as well.