Tatiana Likhomanenko
Ohh, you're running test.cpp, not decode.cpp. For decode.cpp, "show" prints the word transcription while "showletters" prints the token transcription. For test.cpp we always print only the word transcription, so showletters is...
@vineelpratap the config here uses the CTC criterion, not ASG.
First, is this audio really 285476 ms ~= 5 min with such a short transcription? Because then it makes total sense for a model trained on this one sample to predict all blanks/silence,...
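To make the two points above concrete, here is a minimal sketch in plain Python (not wav2letter code). It converts the duration from milliseconds to minutes and shows why an all-blank CTC output decodes to an empty transcription. The helper names `ms_to_minutes` and `ctc_greedy_collapse`, and the choice of index 0 for the blank token, are assumptions made only for illustration.

```python
def ms_to_minutes(duration_ms):
    """Convert a duration in milliseconds to minutes."""
    return duration_ms / 60_000


def ctc_greedy_collapse(frame_labels, blank=0):
    """Standard CTC collapse: merge consecutive repeats, then drop blanks.

    `frame_labels` is the per-frame best token index; `blank` is the blank
    index (0 here is an assumption, the real index depends on the token set).
    """
    collapsed = []
    previous = None
    for label in frame_labels:
        # Emit a label only when it differs from the previous frame
        # and is not the blank token.
        if label != previous and label != blank:
            collapsed.append(label)
        previous = label
    return collapsed


if __name__ == "__main__":
    print(f"{ms_to_minutes(285476):.2f} min")
    # A model that predicts blank on every frame yields an empty output.
    print(ctc_greedy_collapse([0, 0, 0, 0]))
    # Repeats separated by a blank are kept as separate tokens.
    print(ctc_greedy_collapse([0, 1, 1, 0, 1, 2, 2, 0]))
```

With a roughly 5 minute input and only a handful of target tokens, nearly every frame is aligned to blank, so predicting blank everywhere is already close to the training target.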
Yep, correct, you can use any number. I was just wondering whether the problem is with training on very long input. No idea for now why it doesn't work. First I would...
Hi @alkazap, thanks for finding the bug. Would you mind sending a PR with the fix?
For most papers we release pre-trained models (acoustic and language models), so you can check here: https://github.com/facebookresearch/wav2letter/tree/master/recipes/models. In each folder, mainly in the readme, there could be links to the...
One more thing: are you fine-tuning the wav2vec features together with the whole net or not? Start with frozen wav2vec features first.
I think it's better to ask Mr Mai Long directly how he reproduced it, then. As far as I know, the original paper uses frozen wav2vec features.
Yep, you can implement a Transformer library with Flashlight, and we already have several implementations of it. What do you mean exactly by "Is such a feature planned to be released...
I think we don't support this option cc @jacobkahn