Pawel Swietojanski
Hi, apologies for the delay with this. We did not release these data augmentation RIRs; instead, you may use the 16kHz RIRs available from the OpenSLR page. The results...
How much data do you train on? In our case, training for an epoch on the 50-hour variant of LibriSpeech took around 30 minutes (or perhaps less) on a...
Not sure how Google's Colab assigns resources, but ```--num-workers 16``` only makes sense if you have access to that many CPU cores (on top of a GPU). In that case...
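To make the point about worker count concrete, here is a minimal sketch that caps a requested number of workers at the cores the process can actually use. The helper name `suggest_num_workers` and the `reserve` parameter are hypothetical and not part of the training script; only the `--num-workers` flag comes from the discussion above.

```python
import os


def suggest_num_workers(requested: int, reserve: int = 1) -> int:
    """Cap a requested worker count at the number of usable CPU cores.

    `reserve` cores are left free for the main training process
    (an assumption for illustration, tune it for your setup).
    """
    try:
        # Linux only: the set of cores this process is allowed to run on.
        available = len(os.sched_getaffinity(0))
    except AttributeError:
        # Other platforms: fall back to the total core count.
        available = os.cpu_count() or 1
    return max(1, min(requested, available - reserve))


if __name__ == "__main__":
    print("suggested --num-workers:", suggest_num_workers(16))
```

On a hosted notebook with only a couple of cores, this would suggest a much smaller value than 16.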
If it speeds things up, then sure (for our setup 16 was about OK). See what seems to be the best setting in your case (this is an overall balancing...
Well, it's clear there is a large bottleneck somewhere. It's most likely IO related, due to slow disk access (i.e. reading the wave files, rather than augmenting them later). Where do you...
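One rough way to check whether disk reads are the bottleneck is to time raw file reads on a sample of the training files, with no decoding or augmentation involved. The sketch below is an illustration only; the function name `read_throughput_mb_s` is hypothetical and the list of paths is whatever sample of audio files you choose.

```python
import time


def read_throughput_mb_s(paths):
    """Read each file fully and return the average throughput in MB/s."""
    total_bytes = 0
    start = time.perf_counter()
    for path in paths:
        with open(path, "rb") as f:
            total_bytes += len(f.read())
    elapsed = time.perf_counter() - start
    # Guard against a zero elapsed time on very small inputs.
    return total_bytes / (1024 * 1024) / max(elapsed, 1e-9)
```

Note that a second run over the same files may be served from the operating system's page cache and look much faster than the first, so the first pass over cold files is the more informative number.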
Thanks for reporting back on this. Do you have any way to get stats on how the machine is being used during a training session? Ideally something along screen shot...
Thanks. So one more thing you may want to try is to limit each data loading thread to one CPU core (at the math/linear algebra library level). Now it looks like each thread...
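As a sketch of what limiting threads at the math library level can look like, the snippet below sets the common thread-count environment variables read by OpenMP, MKL and OpenBLAS. This assumes the data loading code ultimately calls into libraries backed by one of these; the helper name `limit_math_threads` is hypothetical, and the variables must be set before the numeric libraries are imported to take effect.

```python
import os

# Environment variables honoured by OpenMP, Intel MKL and OpenBLAS respectively.
_THREAD_ENV_VARS = ("OMP_NUM_THREADS", "MKL_NUM_THREADS", "OPENBLAS_NUM_THREADS")


def limit_math_threads(n: int = 1) -> dict:
    """Ask the underlying math libraries to use at most `n` threads each."""
    for var in _THREAD_ENV_VARS:
        os.environ[var] = str(n)
    return {var: os.environ[var] for var in _THREAD_ENV_VARS}


# Call this at the very top of the entry script, before importing
# numpy, torch or similar, so the libraries pick the values up at load time.
limit_math_threads(1)
```

The same effect can be had without code changes by exporting these variables in the shell before launching training. With 16 loader workers each restricted to one math thread, the workers should no longer compete with each other for the same cores.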
Looks like the overall system is much better balanced now (no race conditions, well loaded cores). How much data do you pretrain on in this setup, 50 hours? You can see...