wav2letter
Empty predictions/100 WER after training conv_glu on a different language
I have been trying to train conv_glu on a custom dataset. The training process starts and continues without errors, but the WER/TER stays at 100 for a large number of epochs while the loss decreases. When I use the saved acoustic model with the Test binary, the predictions are empty, which means the 100% error rate comes from all the characters being deleted! To narrow the problem down to a particular source, I tried using a single example with the alphabet file modified accordingly. The problem still persists! I'm sure I'm missing something quite simple; can anyone point me in the right direction?
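As an aside on why an empty hypothesis pins WER at exactly 100: every reference word counts as a deletion, so the edit distance equals the reference length. A minimal toy sketch (my own Levenshtein-based WER, not wav2letter's implementation):

```python
def edit_distance(ref, hyp):
    """Classic Levenshtein distance over two token lists."""
    m, n = len(ref), len(hyp)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i  # i deletions
    for j in range(n + 1):
        d[0][j] = j  # j insertions
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,      # deletion
                          d[i][j - 1] + 1,      # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[m][n]

def wer(ref, hyp):
    """Word error rate in percent: 100 * (S + D + I) / N."""
    ref_words, hyp_words = ref.split(), hyp.split()
    return 100.0 * edit_distance(ref_words, hyp_words) / len(ref_words)

# An empty hypothesis deletes all 6 reference words: WER is exactly 100.
print(wer("the cat sat on the mat", ""))  # 100.0
```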
Attachments
- Token file for single toy example: alphabet.txt
- File and transcription list (same list for Train, Dev and Test): dummy.txt
- Lexicon file (created using the whole corpus, but contains the required words): lexicon (1).txt
- LM (same as the lexicon file, created using the whole corpus): https://drive.google.com/file/d/1qsFyA3DpoHr9F39zIuc5JEsZRfT71eZS/view?usp=sharing
- Test log (contains params and empty transcription): test_log.txt
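For readers without the attachments, the general shape of the token and lexicon files (illustrative made-up entries, not the actual attached files; `|` is the default word-separator token in wav2letter recipes) is roughly:

```
# tokens file: one acoustic token per line
a
b
c
|

# lexicon file: each word followed by its token spelling
cat c a t |
sat s a t |
```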
Could you try running with --showletters=true and post the log here? Could you also post your training log/config?
@tlikhomanenko Similar results with --showletters=true! Log file: test_log (1).txt. On a side note, isn't --show=true the same thing?
- Training Log: 001_log.txt
- Training Config: 001_config.txt
Ohh, you ran Test.cpp, not Decode.cpp. For Decode.cpp, "show" prints the word transcriptions while "showletters" prints the token transcriptions. Test.cpp always prints only the word transcription, so showletters is ignored.
Ok, could you run your Test.cpp with --uselexicon=true? Could you also post the file wav2letter/recipes/models/conv_glu/librispeech/train_ssnl.cfg? So far your config looks fine, so the problem is probably with the training itself: your loss goes down and then gets stuck. One thing I would try is stopping the process after 3 epochs and running Test.cpp, to check that this empty-output state only appears after some training.
@tlikhomanenko Train_ssnl.cfg : train_ssnl.txt
Same results with --uselexicon=true and after stopping the training process after 3 epochs. Just to clarify, is there a difference between the test and decode options when passed to Test.cpp?
Hi,
It looks like you are not using the --linseg= flag for training. You might want to use --linseg=10000 to make sure WER goes below 100. Tuning --lr and --lrcrit might also help.
(In older versions of wav2letter, you should use linseg=1 instead, as we used to count in epochs before.)
@vineelpratap the config here uses the CTC criterion, not ASG.
Oops, nevermind then !
Yeah, it's CTC. Any idea why this is happening though?
First, is this audio really 285476 ms ≈ 5 min with this short transcription? If so, it makes total sense for a model trained on this one sample to predict all blanks/silence. Could you confirm that the audio duration and the short transcription are correct?
That's the size of the audio file, not the duration. The duration is much smaller, about 4 s. But is it required that the number be the time in milliseconds? The documentation says it is just a real number used to sort the data (which can be the audio duration)!
From Data Preparation
size - a real number used for sorting the dataset (typically audio duration in milliseconds).
Yep, correct, you can use any number. I was just wondering whether the problem was training with a very long input.
No idea for now why it doesn't work. First I would take one LibriSpeech sample and try to train on it. If you have the same problem, post the sample and all your config files here and I will try to run the same thing and debug what is not working.