Umberto Cappellazzo
^CTraceback (most recent call last):
  File "/cappellazzo/icefall_forked/icefall/egs/librispeech/ASR/./conformer_ctc/train.py", line 819, in <module>
    main()
  File "/cappellazzo/icefall_forked/icefall/egs/librispeech/ASR/./conformer_ctc/train.py", line 810, in main
    mp.spawn(run, args=(world_size, args), nprocs=world_size, join=True)
  File "/opt/conda/lib/python3.10/site-packages/torch/multiprocessing/spawn.py", line 240, in spawn
    return start_processes(fn,...
No additional logs, unfortunately. I'll try with py-spy and let you know. Btw, are there any requirements in icefall for running DDP? I can check if the server complies with...
Nope, I'll try with a simple PyTorch DDP script then.
Hi FangJun, I managed to solve the DDP problem. Basically, the A40 GPUs cause problems by default, and a certain command must be prepended to make it work....
Also, Dan mentioned that it could be useful to turn off the MMI loss during the very first epochs and use only CTC, then switch to MMI. I remember this...
> > Also, Dan mentioned that it could be useful to turn off mmi loss during the very first epochs and just using ctc, and then switching to mmi. I...
Quick update: the conformer_mmi recipe, started from a checkpoint trained with CTC loss (5 epochs), seems to work fine now. There are no strange errors and the curves look reasonable.
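For anyone who wants to try the same idea inside a single training run, here is a minimal sketch of a CTC warm-up followed by a switch to MMI. It is not the icefall implementation: the function names (`use_ctc_only`, `training_loss`) and the hard switch at a fixed epoch are my own assumptions, and the default of 5 warm-up epochs is simply the number I used above. The two loss computations are passed in as callables so that the MMI loss is not computed at all during the warm-up.

```python
from typing import Callable


def use_ctc_only(epoch: int, ctc_warmup_epochs: int = 5) -> bool:
    """Return True while the (0-based) epoch is still in the CTC-only warm-up.

    Hypothetical helper; the 5-epoch default is just the value from the
    experiment above, not a recommended setting.
    """
    if ctc_warmup_epochs < 0:
        raise ValueError("ctc_warmup_epochs must be non-negative")
    return epoch < ctc_warmup_epochs


def training_loss(
    epoch: int,
    compute_ctc: Callable[[], float],
    compute_mmi: Callable[[], float],
    ctc_warmup_epochs: int = 5,
) -> float:
    """Pick the loss for this epoch: CTC during warm-up, MMI afterwards.

    Only the selected callable is invoked, so the MMI graph intersection
    is skipped entirely while the model is still warming up with CTC.
    """
    if use_ctc_only(epoch, ctc_warmup_epochs):
        return compute_ctc()
    return compute_mmi()


if __name__ == "__main__":
    # Dummy losses stand in for the real CTC / MMI computations.
    for epoch in range(7):
        loss = training_loss(epoch, lambda: 1.0, lambda: 2.0)
        name = "ctc" if use_ctc_only(epoch) else "mmi"
        print(f"epoch {epoch}: {name} loss = {loss}")
```

In a real recipe the callables would wrap the actual loss computations on the current batch. A gradual interpolation between the two losses is another option, but I have not tried it.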
> > guys, I noticed dense_intersect has max_states and max_arcs options, that may not be used in the MMI recipe, but it seems to me we could solve this problem...
I used the following command for installing k2 but it does not install the latest version. ``` $ conda install -c k2-fsa -c pytorch -c nvidia k2 pytorch=1.13.0 pytorch-cuda=11.7 python=3.8...