Samuele Cornell

Results 47 comments of Samuele Cornell

@lminer did you use VoxCeleb?

I have observed the same actually. Also, according to https://arxiv.org/abs/2202.00733, the use of speaker ID info does not in fact really help.

I recall that we decided to offer some generic, multi-purpose architectures (e.g. ConvTasNet, DPRNN) in the toolkit, but more specific ones only in the egs (e.g. WHAMR stacked Bi-LSTM...

It could be interesting to try, even if simulated RIRs are usually much more realistic than DSP-based artificial reverbs (at least open-source ones, commercial ones are another story). I think...

> But the first option is definitely possible and a preload_wavs flag or something like that could completely do the job. also maybe caching as they are read.
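A rough sketch of what the preload flag plus caching-on-read could look like. Everything here is hypothetical: `CachedWavDataset` and `load_fn` are illustrative names rather than existing toolkit classes, and only the `preload_wavs` flag name comes from the quote above. `load_fn` stands in for whatever actually reads a wav file.

```python
from typing import Any, Callable, Dict, Sequence


class CachedWavDataset:
    """Hypothetical dataset wrapper: optional preloading plus caching on first read."""

    def __init__(
        self,
        paths: Sequence[str],
        load_fn: Callable[[str], Any],
        preload_wavs: bool = False,
    ) -> None:
        self.paths = list(paths)
        self.load_fn = load_fn  # assumed: any function mapping a path to audio data
        self._cache: Dict[int, Any] = {}
        if preload_wavs:
            # Read every file once up front and keep it in memory.
            for idx, path in enumerate(self.paths):
                self._cache[idx] = self.load_fn(path)

    def __len__(self) -> int:
        return len(self.paths)

    def __getitem__(self, idx: int) -> Any:
        # Lazy path: read from disk only the first time an item is requested.
        if idx not in self._cache:
            self._cache[idx] = self.load_fn(self.paths[idx])
        return self._cache[idx]
```

With `preload_wavs=False` the memory cost grows only as items are actually visited, so the first epoch pays the disk reads and later epochs hit the cache. This only makes sense when the whole dataset fits in RAM.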

What do you mean by temporal masking? Could something like this be useful (basically dropout over only the last dimension)?

```python
class StepDrop(nn.Module):
    def __init__(self, p=0.5):
        self.p...
```
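Since the snippet above is cut off, here is a self-contained sketch of one possible reading of "dropout over only the last dimension". It is not the original `StepDrop` code. It assumes inputs shaped `(batch, ..., time)` and draws one keep/drop decision per example and per time step, shared across all the dimensions in between, so whole time steps are zeroed. The class name `StepDropSketch` is hypothetical.

```python
import torch
import torch.nn as nn


class StepDropSketch(nn.Module):
    """Hypothetical sketch: drop entire steps along the last dimension."""

    def __init__(self, p: float = 0.5) -> None:
        super().__init__()
        if not 0.0 <= p < 1.0:
            raise ValueError("p must be in [0, 1)")
        self.p = p

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Behave like standard dropout: identity at eval time.
        if not self.training or self.p == 0.0:
            return x
        if x.dim() < 2:
            raise ValueError("expected input of shape (batch, ..., time)")
        # One mask value per (example, time step), broadcast over the middle dims.
        mask_shape = (x.shape[0],) + (1,) * (x.dim() - 2) + (x.shape[-1],)
        keep_prob = 1.0 - self.p
        mask = torch.bernoulli(
            torch.full(mask_shape, keep_prob, device=x.device, dtype=x.dtype)
        )
        # Rescale so the expected value of the output matches the input.
        return x * mask / keep_prob
```

For a `(batch, channels, time)` input this zeroes the same time steps in every channel of a given example, which is the main difference from plain `nn.Dropout`, where each element is dropped independently.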

What is the difference wrt just resampling? It should be pretty much the same, no?

For multi-frame MWF as defined in `https://arxiv.org/abs/1911.07953`, I think there won't be a problem, as the other frames are basically treated as additional microphone channels. I think it should be...
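To make the "frames as extra channels" idea concrete, here is a minimal NumPy sketch. It assumes an STFT array of shape `(channels, freq, time)` and uses only past frames as context, zero-padded at the start. The context window used in the paper may differ, and `stack_frames_as_channels` is a hypothetical helper name.

```python
import numpy as np


def stack_frames_as_channels(stft: np.ndarray, n_frames: int) -> np.ndarray:
    """Stack delayed copies of an STFT along the channel axis.

    stft: complex array of shape (channels, freq, time).
    Returns an array of shape (channels * n_frames, freq, time), where block k
    holds the input delayed by k frames (zero-padded at the start).
    """
    n_ch, n_freq, n_time = stft.shape
    if not 1 <= n_frames <= n_time:
        raise ValueError("n_frames must be between 1 and the number of time frames")
    stacked = []
    for delay in range(n_frames):
        shifted = np.zeros_like(stft)
        # Frame t of the shifted copy holds frame t - delay of the original.
        shifted[:, :, delay:] = stft[:, :, : n_time - delay]
        stacked.append(shifted)
    return np.concatenate(stacked, axis=0)
```

The stacked array can then be fed to any multichannel filter estimation routine that expects `(channels, freq, time)`, since each delayed copy simply looks like another microphone.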

I've encountered the same problem actually. I've issued a pull request which seems to have fixed that. The problem is due to the fact that in Python 3 the range...
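The comment above is truncated, so the exact failure is not shown. As a generic illustration only, and not necessarily the bug in question, here is one common pitfall: in Python 3 `range` returns a lazy, immutable range object rather than a list, so code that mutates it in place breaks.

```python
import random

indices = range(5)

# In Python 3 this is a range object, not a list.
is_list = isinstance(indices, list)

# In-place shuffling needs item assignment, which range does not support.
try:
    random.shuffle(indices)
    shuffle_failed = False
except TypeError:
    shuffle_failed = True

# Typical fix: materialise the range as a list first.
indices_list = list(range(5))
random.shuffle(indices_list)
```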

@AsuMagic's observations are correct IMO. The Transformer code has a lot of repetition right now, unfortunately, and it is probably better to refactor it to reuse as much as possible. Otherwise such stuff...