Here is the log before the error: `2022-09-13 00:21:30,784 INFO [train_sh.py:890] Epoch 4, batch 8350, loss[loss=0.3493, simple_loss=0.3172, pruned_loss=0.1907, over 1959.00 frames. utt_duration=872 frames, utt_pad_proportion=0.06438, over 9.00 utterances.], tot_loss[loss=0.3648, simple_loss=0.3565, pruned_loss=0.1865,...
Is any other info needed?
@csukuangfj Is this what you mean (under supervisions)? 'text': ['yeah', 'yeah', 'right', 'yes', 'yes', 'okay', 'mhm', 'really', 'yeah']
My version is: 'env_info': {'k2-version': '1.17', ..., 'k2-git-date': 'Mon Jul 25 02:11:54 2022', ...}. So I need to update my k2 to get the fix you mentioned, right?
@csukuangfj Thank you very much; the fix that you proposed seems to work. I still see peculiar behaviour while training. The loss starts at 0.8, goes down to 0.5,...
@csukuangfj Unfortunately, I'm still getting the inf loss after a few epochs (this time in epoch 3!). Here is the log: 2022-10-23 14:53:29,352 INFO [train.py:907] (0/8) Epoch 3, batch 115350, loss[loss=0.2266,...
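For anyone trying to pin down the first batch where the loss stops being finite, a rough sketch along these lines may help. It assumes the training log follows the same format as the lines quoted in this thread; `parse_loss_line` and `first_non_finite` are hypothetical helper names for illustration, not part of icefall or k2.

```python
import math
import re

# Pattern inferred only from the log lines quoted in this thread (an assumption);
# adjust it if your log format differs.
LINE_RE = re.compile(
    r"Epoch (?P<epoch>\d+), batch (?P<batch>\d+), "
    r"loss\[loss=(?P<loss>[^,\]]+), "
    r"simple_loss=(?P<simple>[^,\]]+), "
    r"pruned_loss=(?P<pruned>[^,\]]+)"
)


def parse_loss_line(line):
    """Return epoch, batch and per-batch losses from one log line, or None."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    return {
        "epoch": int(m["epoch"]),
        "batch": int(m["batch"]),
        # float() also accepts "inf" and "nan", so bad values parse too.
        "loss": float(m["loss"]),
        "simple_loss": float(m["simple"]),
        "pruned_loss": float(m["pruned"]),
    }


def first_non_finite(lines):
    """Return the first parsed record whose losses are not all finite."""
    for line in lines:
        rec = parse_loss_line(line)
        if rec is None:
            continue  # not a per-batch loss line
        values = (rec["loss"], rec["simple_loss"], rec["pruned_loss"])
        if not all(math.isfinite(v) for v in values):
            return rec
    return None


if __name__ == "__main__":
    # Log line quoted earlier in this thread (truncated after pruned_loss).
    line = (
        "2022-09-13 00:21:30,784 INFO [train_sh.py:890] Epoch 4, batch 8350, "
        "loss[loss=0.3493, simple_loss=0.3172, pruned_loss=0.1907,"
    )
    print(parse_loss_line(line))
```

To scan a whole log, something like `first_non_finite(open(path))` should work, where `path` points at your own log file. Knowing whether `simple_loss` or `pruned_loss` goes non-finite first may narrow down where to look.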
It seems that it was not produced! Here is the list of files in the "egs/librispeech/ASR/conv_emformer_transducer_stateless2/exp" folder: -rw-r--r--. 1 root root 1208562503 Oct 24 12:11 best-train-loss.pt -rw-r--r--. 1 root root...
I restarted the training from the start of epoch 3 ("--start-epoch 3"), and it is now in the middle of epoch 6. It seems very peculiar to me that it...
@csukuangfj This time I got the inf loss error in epoch 7. 2022-10-29 05:41:34,234 INFO [train_sh.py:907] (3/8) Epoch 7, batch 124300, loss[loss=0.1475, simple_loss=0.2006, pruned_loss=0.04714, over 2279.00 frames. utt_duration=1014 frames, utt_pad_proportion=0.0452,...
One thing I'm confused about: I have "--max-duration 280", but for this batch the total duration is more than 31 secs. Shouldn't it be less than 28...