Samuele Cornell
I think they may still occur, because some arrays in CHiME-6 also contain clipping. The only way to prevent it is to use peak normalization when the peak is...
@kamo-naoyuki, did you have a chance to try? For me the results are the same as with the old code. But it may be because my GPUs are 40GB...
Yes, but then we also need to disable the channel selection inside the beamformer when the option is disabled. Or IDK, the beamforming part needs to be changed quite a lot...
Hi, did you try changing the PyTorch version with conda?
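A minimal sketch of what peak normalization could look like here. The condition is an assumption: the signal is rescaled only when its absolute peak exceeds full scale (`max_peak`, a hypothetical parameter, default 1.0). The function name is illustrative and is not taken from the CHiME-6 or ESPnet code.

```python
import numpy as np


def peak_normalize(audio: np.ndarray, max_peak: float = 1.0) -> np.ndarray:
    """Scale `audio` down only if its absolute peak exceeds `max_peak`.

    Assumption: float audio where values outside [-max_peak, max_peak]
    would be clipped when written to disk.
    """
    # Guard against empty input, where np.max would raise.
    peak = float(np.max(np.abs(audio))) if audio.size else 0.0
    if peak > max_peak:
        # Rescale so the largest sample lands exactly on max_peak.
        return audio * (max_peak / peak)
    # Signals already within range are returned unchanged.
    return audio
```

This only avoids clipping introduced by later processing (for example when channels are summed); samples that were already clipped at recording time cannot be recovered this way.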
This correctly throws an error, however. @simpleoier, can you point to the exact recipe you were using? kaldiio version: '2.17.2' ```python from kaldiio import ReadHelper import torchaudio import soundfile...
I also discovered that this change breaks the ESPnet1 transformer encoder, `espnet/nets/pytorch_backend/transformer/encoder.py`, since the mask would be different. Should I go ahead?
Then it is probably safer to just retarget the PR at the ESPnet3 development branch.
> There are not so many recipes using dynamic data augmentation, and I think this will just remove some bugs.
> It would not affect many others.

True but even...
> I see.
> Then, in the general inference case, we also need to implement online normalization processing?

Yes and no? I mean, realistically I think this is an...
Yes, but it is trained on 10-second chunks and the model is not causal. You would need to use 10-second windows and advance by a certain stride each...
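The windowing described above can be sketched as follows. Everything except the 10-second window is an assumption: the 5-second stride, the 16 kHz sample rate, the function names, and the idea that the model (`model_fn`, a hypothetical callable) returns an output of the same length as its input chunk, so that overlapping outputs can simply be averaged.

```python
import numpy as np


def sliding_windows(num_samples, sample_rate=16000, window_sec=10.0, stride_sec=5.0):
    """Return (start, end) sample indices of overlapping windows."""
    win = int(round(window_sec * sample_rate))
    hop = int(round(stride_sec * sample_rate))
    if win <= 0 or hop <= 0:
        raise ValueError("window and stride must be positive")
    starts = list(range(0, max(num_samples - win, 0) + 1, hop))
    # Add a final window flush with the end so the tail is always covered.
    if starts[-1] + win < num_samples:
        starts.append(num_samples - win)
    return [(s, min(s + win, num_samples)) for s in starts]


def apply_in_windows(signal, model_fn, sample_rate=16000, window_sec=10.0, stride_sec=5.0):
    """Run `model_fn` on each window and average the overlapping outputs."""
    out = np.zeros(len(signal), dtype=np.float64)
    weight = np.zeros(len(signal), dtype=np.float64)
    for start, end in sliding_windows(len(signal), sample_rate, window_sec, stride_sec):
        out[start:end] += model_fn(signal[start:end])
        weight[start:end] += 1.0
    # Every sample is covered at least once; the guard only matters for empty input.
    return out / np.maximum(weight, 1.0)
```

For example, `apply_in_windows(x, lambda chunk: chunk)` returns `x` unchanged, which is a quick way to check that the windows cover the whole signal. Because the model is not causal, each output still depends on the full 10-second window it was computed from.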