Samuele Cornell comments

Results: 70 comments by Samuele Cornell

[Fix] Use number of channels when calculating BAN

I think they may still occur because there is also clipping in some arrays in CHiME-6. The only way to prevent it is to use peak normalization when the peak is...
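As a rough illustration of the peak normalization mentioned above, the sketch below rescales a signal only when its absolute peak exceeds a limit. The comment is truncated before it states the actual condition, so the `max_peak` threshold and the helper name `peak_normalize` are assumptions made for this example, not the approach taken in the PR.

```python
import numpy as np


def peak_normalize(audio: np.ndarray, max_peak: float = 1.0) -> np.ndarray:
    """Scale `audio` down so its absolute peak does not exceed `max_peak`.

    Hypothetical helper: the threshold is an assumption, not taken from the PR.
    """
    peak = float(np.max(np.abs(audio))) if audio.size else 0.0
    if peak > max_peak:
        # Rescale the whole array jointly so the largest sample equals max_peak.
        return audio * (max_peak / peak)
    # Signals already within range are returned unchanged.
    return audio
```

Note that scaling like this only keeps a processed signal within range; it cannot restore samples that were already clipped in the original recording.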

do not chunk when performing beamforming

@kamo-naoyuki, did you have a chance to try it? For me the results are the same as with the old code. But that may be because my GPUs are 40GB...

do not chunk when performing beamforming

Yes, but then we also need to disable the channel selection inside the beamformer when the option is disabled. Or, IDK, the beamforming part needs to be changed quite a lot...

chime7_task1:diar_asr

Hi, did you try changing the PyTorch version with conda?

Raise an error for non int16 encoded .wav files with kaldiio feature extraction

This correctly throws an error, however. @simpleoier, can you point to the exact recipe you were using? kaldiio version: '2.17.2' ```python from kaldiio import ReadHelper import torchaudio import soundfile...
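For readers unfamiliar with the check under discussion, here is a minimal sketch of rejecting .wav files that are not 16-bit PCM. It uses only Python's standard `wave` module and is not the kaldiio or ESPnet implementation; the function name `assert_int16_wav` is hypothetical.

```python
import wave


def assert_int16_wav(path: str) -> None:
    """Raise ValueError unless `path` is a PCM .wav file with 16-bit samples."""
    try:
        with wave.open(path, "rb") as wav_file:
            sample_width = wav_file.getsampwidth()  # bytes per sample
    except wave.Error as err:
        # The stdlib reader rejects non-PCM encodings such as IEEE float.
        raise ValueError(f"{path}: not a PCM .wav file ({err})") from err
    if sample_width != 2:
        raise ValueError(
            f"{path}: expected 16-bit samples, got {8 * sample_width}-bit"
        )
```

Failing early like this avoids silently reading samples with the wrong scale further down a feature-extraction pipeline.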

Update attention.py, using SDPA by default

I also discovered that this change breaks the espnet1 transformer encoder, `espnet/nets/pytorch_backend/transformer/encoder.py`, since the mask would be different. Should I go forward?

Update attention.py, using SDPA by default

Then it is probably safer to just move the PR to point to the ESPnet3 development branch.

Mean normalization issue with commonpreprocessor when using data augmentation

> There are not so many recipes using dynamic data augmentation, and I think this will just remove some bugs.
> It would not affect many others.

True, but even...

Mean normalization issue with commonpreprocessor when using data augmentation

> I see.
> Then, in the general inference case, we also need to implement online normalization processing?

Yes and no? I mean, realistically I think this is an...

Test baseline on audio stream

Yes, but it is trained on 10-second chunks and the model is not causal. You would need to use 10-second windows and advance by a certain stride each...
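The windowing described above can be sketched as follows. This only computes the sample ranges of overlapping windows; the 5-second default stride is an assumption (the comment only says "a certain stride"), and `sliding_windows` is a hypothetical helper, not part of the baseline.

```python
from typing import Iterator, Tuple


def sliding_windows(
    num_samples: int,
    sample_rate: int,
    window_s: float = 10.0,
    stride_s: float = 5.0,  # assumed value; the comment does not specify it
) -> Iterator[Tuple[int, int]]:
    """Yield (start, end) sample indices of fixed-length windows over a signal."""
    window = int(round(window_s * sample_rate))
    stride = int(round(stride_s * sample_rate))
    if window <= 0 or stride <= 0:
        raise ValueError("window_s and stride_s must be positive")
    if num_samples <= 0:
        return
    start = 0
    while True:
        # The final window is truncated at the end of the signal.
        end = min(start + window, num_samples)
        yield start, end
        if end >= num_samples:
            break
        start += stride
```

Each `(start, end)` pair would be sliced from the buffered stream and passed to the model. The last window can be shorter than 10 seconds and may need padding, and how overlapping outputs are merged is left open here.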