Samuele Cornell
Hi, your tag did not work so I've missed this issue, sorry; thanks to @anautsch who pinged me. Does it happen only with Dynamic Mixing? Or also with...
I can take a look at the architecture; I am not really familiar with ST.
I am not aware of any native k-means implementation in PyTorch, but there are tons out there. E.g. look at https://github.com/lucidrains/vector-quantize-pytorch/blob/d380862ea2c01093f72b6c2e884e2d611e6b8552/vector_quantize_pytorch/vector_quantize_pytorch.py#L51
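For reference, a minimal k-means sketch in plain PyTorch (not the linked implementation, just an illustration of the standard assign/update loop; function name and signature are my own):

```python
import torch

def kmeans(x, k, iters=10):
    """Minimal k-means sketch. x: (n, d) samples.
    Returns (k, d) centroids and (n,) cluster labels."""
    # Initialize centroids from k random samples.
    idx = torch.randperm(x.shape[0])[:k]
    centroids = x[idx].clone()
    for _ in range(iters):
        # Assignment step: nearest centroid per sample.
        dists = torch.cdist(x, centroids)  # (n, k) pairwise distances
        labels = dists.argmin(dim=1)
        # Update step: each centroid becomes the mean of its samples.
        for j in range(k):
            mask = labels == j
            if mask.any():
                centroids[j] = x[mask].mean(dim=0)
    return centroids, labels
```

Empty clusters are simply left in place here; real implementations (like the linked one) typically re-seed them.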
> But the WERs are just bad, I think this is because of the way we constructed the batches, filling up by max_batch_len generates much fewer batches and so steps....
I also encountered this error. I tried to fix it, and it is quite easy to do so. However, I got much worse performance for the vanilla transformer. In my understanding...
@vadimkantorov the model didn't converge, and around the maximum LR of the LR schedule I got NaNs.
Yes, this is unfortunately a common problem with AudioSet. Some videos have been pulled, and some were deleted by the original uploaders...
I see there is also a rather convenient FFTConv layer, which is handy IMO. Apart from this, is it the same as torch_audiomentations' `convolve`? https://github.com/asteroid-team/torch-audiomentations/blob/71c27d8206a0850640525b97a7241eb4625d755b/torch_audiomentations/utils/convolution.py#L40
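For context, FFT-based convolution is just pointwise multiplication in the frequency domain after zero-padding to the full output length; a minimal sketch of the idea (my own toy function, not the linked torch_audiomentations code):

```python
import torch

def fft_convolve(signal, kernel):
    """Linear convolution of two 1-D tensors via the FFT."""
    # Full linear-convolution output length; padding to n avoids
    # circular wrap-around artifacts.
    n = signal.shape[-1] + kernel.shape[-1] - 1
    f_sig = torch.fft.rfft(signal, n=n)  # rfft zero-pads to length n
    f_ker = torch.fft.rfft(kernel, n=n)
    # Multiply spectra, then invert back to the time domain.
    return torch.fft.irfft(f_sig * f_ker, n=n)
```

E.g. `fft_convolve(torch.tensor([1., 2., 3.]), torch.tensor([1., 1.]))` matches direct linear convolution, `[1, 3, 5, 3]`.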
Did you download the embeddings and set the paths accordingly? Can you check whether the Dataset object runs correctly without any DataLoader?
Any news on this issue ?