self-supervised-speech-recognition

Tuning lm_weight, word_score and beam_size

Open mohitsharma29 opened this issue 4 years ago • 3 comments

Hey

How do you recommend we tune the parameters of the transcribe function: lm_weight, word_score, and beam_size? Normally, with something like DeepSpeech 2, we use its logits to tune these, but how about wav2vec?

Thanks

mohitsharma29, Jan 24 '21 05:01

You can prepare a test set with reference transcripts and run a simple grid search over the hyperparameters, keeping the combination that gives the lowest error rate.
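A minimal sketch of such a grid search, in case it helps. It assumes a hypothetical callable `transcribe_fn(audio, lm_weight=..., word_score=..., beam_size=...)` that returns a transcript string; that signature is an assumption for illustration, not necessarily this repo's actual interface, so adapt it to however you call transcribe. The WER helper is a plain word-level edit distance.

```python
import itertools


def word_errors(reference, hypothesis):
    """Word-level Levenshtein distance between two transcripts."""
    ref, hyp = reference.split(), hypothesis.split()
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(prev[j] + 1,               # deletion
                            curr[j - 1] + 1,           # insertion
                            prev[j - 1] + (r != h)))   # substitution or match
        prev = curr
    return prev[-1]


def corpus_wer(references, hypotheses):
    """Total word errors divided by total reference words."""
    errors = sum(word_errors(r, h) for r, h in zip(references, hypotheses))
    words = sum(len(r.split()) for r in references)
    return errors / max(words, 1)


def grid_search(transcribe_fn, test_set, grid):
    """Try every parameter combination and return (best_wer, best_params).

    transcribe_fn: hypothetical callable (audio, **params) -> str
    test_set:      list of (audio, reference_transcript) pairs
    grid:          dict mapping parameter name -> list of candidate values
    """
    names = sorted(grid)
    references = [ref for _, ref in test_set]
    best = None
    for values in itertools.product(*(grid[n] for n in names)):
        params = dict(zip(names, values))
        hypotheses = [transcribe_fn(audio, **params) for audio, _ in test_set]
        score = corpus_wer(references, hypotheses)
        if best is None or score < best[0]:
            best = (score, params)
    return best
```

You would call it with something like `grid_search(my_transcribe, test_set, {"lm_weight": [...], "word_score": [...], "beam_size": [...]})`. The cost grows with the product of the list lengths, so a coarse grid first and a finer one around the best point is usually the practical choice.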

mailong25, Jan 25 '21 14:01

Yes, but that's a last resort. It will be very slow, because every point in the grid reruns inference through the acoustic model. DS2's grid search freezes the logits and searches only over the LM parameters, which makes it much faster. I was wondering whether we can do something similar here.
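To make the idea concrete, here is a rough sketch of what I mean by freezing the acoustic output. It assumes the pipeline can be split into two hypothetical callables: `acoustic_fn(audio)`, which returns the per-frame emissions, and `decode_fn(emissions, lm_weight=..., word_score=..., beam_size=...)`, which runs only the LM beam search. Whether the transcribe function here can be split this way is exactly the open question, so both names are placeholders.

```python
import itertools


def tune_decoder(acoustic_fn, decode_fn, score_fn, test_set, grid):
    """Search decoder parameters over cached acoustic outputs.

    acoustic_fn: hypothetical callable audio -> emissions (the expensive step)
    decode_fn:   hypothetical callable (emissions, **params) -> transcript
    score_fn:    callable (reference, hypothesis) -> error count, lower is better
    test_set:    list of (audio, reference_transcript) pairs
    grid:        dict mapping parameter name -> list of candidate values
    """
    # Run the acoustic model exactly once per utterance and keep the result.
    cached = [(acoustic_fn(audio), ref) for audio, ref in test_set]

    names = sorted(grid)
    results = []
    for values in itertools.product(*(grid[n] for n in names)):
        params = dict(zip(names, values))
        # Only the decoding step is repeated for each parameter combination.
        total = sum(score_fn(ref, decode_fn(emissions, **params))
                    for emissions, ref in cached)
        results.append((total, params))
    return min(results, key=lambda item: item[0])
```

With this structure the acoustic model runs once per utterance regardless of grid size, and only the beam search is repeated. For a large test set the cached emissions could be written to disk instead of held in memory.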

mohitsharma29, Jan 25 '21 17:01

I'm running into this issue as well.

DatNgoBK, Feb 18 '21 03:02