Cesc
I tested my streaming model trained on 1000 h of data. The performance is bad because the lower decoder layers produce very confused alignments. Now I get the idea of the so-called "attention heads pruning in...
definitively
@bo-son I think you're right
> Also, would you mind creating a PR to fix the outdated code?

Sure thing, I'll do it later.
Thanks for the reply. So I can just skip the fbank stages in prepare.sh?
Thanks mate.
@csukuangfj I found that on-the-fly feature computation makes training much slower. For example, it takes 20 seconds for 50 batch iterations using precomputed Kaldi fbank features, and it took about...
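To tell whether the slowdown comes from feature extraction rather than from the model itself, one option is to time the data pipeline alone, with no forward or backward pass. Below is a minimal sketch; `time_batches` is a hypothetical helper (not part of any toolkit mentioned here), and it assumes the dataloader is an ordinary Python iterable.

```python
import time
from itertools import islice


def time_batches(batches, n_batches=50):
    """Iterate over at most `n_batches` items of `batches`.

    Returns (elapsed_seconds, batches_consumed). Only data loading is
    timed, because nothing is done with each batch.
    """
    start = time.perf_counter()
    consumed = 0
    for _ in islice(batches, n_batches):
        consumed += 1
    return time.perf_counter() - start, consumed


if __name__ == "__main__":
    # Stand-in for a real dataloader: each "batch" takes about 10 ms to produce.
    def slow_batches():
        while True:
            time.sleep(0.01)
            yield None

    elapsed, n = time_batches(slow_batches(), n_batches=50)
    print(f"{n} batches in {elapsed:.2f} s")
```

Running this once with precomputed features and once with on-the-fly features would show how much of the gap is in loading rather than in training.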
> Are you using raw waves? Also, is your disk fast?

Yes, I'm using raw waves. How can I check whether my disk is fast or slow?
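One rough way to check the disk is to measure sequential read throughput on a large file stored on it. The sketch below uses only the standard library; `read_throughput_mb_s` is a hypothetical helper name, and the temporary file in the demo is just a stand-in for a real audio file. Note the assumption that the file is not already in the OS page cache: a file that was just written or recently read will look much faster than the disk really is.

```python
import os
import tempfile
import time


def read_throughput_mb_s(path, chunk_size=1024 * 1024):
    """Read `path` sequentially and return the observed throughput in MB/s."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    # Guard against a zero elapsed time on very small files.
    return total / (1024 * 1024) / max(elapsed, 1e-9)


if __name__ == "__main__":
    # Demo only: this file was just written, so it is likely cached and the
    # number printed here overstates real disk speed. Point the function at
    # a large file from the training set instead.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(os.urandom(64 * 1024 * 1024))
        demo_path = tmp.name
    try:
        print(f"{read_throughput_mb_s(demo_path):.1f} MB/s")
    finally:
        os.remove(demo_path)
```

Training on raw waves mostly involves many small random reads, so a sequential number like this is an upper bound rather than a direct prediction of dataloader speed.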
BTW, I've trained on raw waves with ESPnet, and the GPU utilization is around 70%, which I think is normal. The difference is that in ESPnet I implement Fbank as a...
> Can you try increasing the number of dataloader workers? Perhaps that’s the bottleneck.
>
> If you want to use fbank as a layer you can modify the code...