Parcollet Titouan
It's not slow ... it is slow AS FUCK. But I believe this is expected from Python-only RNN-T decoders.
Hello @egaznep, I am not sure I understand the issue here. Could you provide a code snippet showing the error explicitly? The function length_to_mask() is expected to provide masks containing...
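To make the expected behavior concrete, here is a minimal pure-Python sketch of what a length-to-mask helper computes (an illustration of the concept, not SpeechBrain's actual implementation): one boolean row per sequence, with True marking valid positions and False marking padding.

```python
# Illustrative sketch only; not SpeechBrain's real length_to_mask code.
def length_to_mask(lengths, max_len=None):
    """Return one boolean row per sequence.

    lengths : list of absolute (int) sequence lengths.
    max_len : pad target; defaults to the longest sequence in the batch.
    """
    if max_len is None:
        max_len = max(lengths)
    # True where the position is inside the sequence, False where it is padding.
    return [[pos < length for pos in range(max_len)] for length in lengths]

mask = length_to_mask([2, 4, 3])
# Row 0 has its first 2 positions valid, the remaining 2 are padding.
```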
Hello, thanks! SpeechBrain padding is relative to the batch, not the dataset. The max length in wav_lens is the max length of the batch.
@Gastron, correct me if I am wrong, but as far as I know, the DDP sampler is per-process, hence the padding should be relative to the batch of each process. @egaznep...
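A tiny sketch of the per-batch convention discussed above (an assumption for illustration: each relative length is the signal's length divided by the longest signal in that batch, so the same utterance gets different relative lengths in different batches):

```python
# Sketch of per-batch relative lengths (illustrative convention, not
# SpeechBrain's actual dataio code).
def relative_lengths(abs_lengths):
    """Divide each absolute length by the longest one in the batch."""
    batch_max = max(abs_lengths)
    return [length / batch_max for length in abs_lengths]

# The same 16000-sample utterance, seen in two different batches:
batch_a = relative_lengths([16000, 32000])
batch_b = relative_lengths([16000, 64000])
# Its relative length differs per batch, because the batch max differs.
```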
Hi, it's important that the total batch size corresponds to roughly 1.6h of audio. You can adjust this by changing the gradient accumulation factor.
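As a back-of-the-envelope sketch, the accumulation factor can be derived from how much audio one batch holds. The numbers below (seconds of audio per batch, GPU count) are hypothetical, only the 1.6h target comes from the comment above:

```python
import math

# Target: roughly 1.6 hours of audio per optimizer step.
TARGET_SECONDS = 1.6 * 3600

def grad_accum_factor(seconds_per_batch, n_gpus=1):
    """Smallest accumulation factor whose effective batch reaches the target.

    seconds_per_batch : hypothetical amount of audio in one batch, per GPU.
    n_gpus            : hypothetical number of data-parallel processes.
    """
    return math.ceil(TARGET_SECONDS / (seconds_per_batch * n_gpus))

# e.g. batches holding ~240 s of audio on 4 GPUs:
factor = grad_accum_factor(240, n_gpus=4)
```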
@Adel-Moumen, I see that the gradient accumulation factor is missing from this recipe. Could you add it? (No need for a PR imho, push directly to develop.) @GasserElbanna have a look...
fp16 or bf16 would make the training much faster if you have a compatible GPU.
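For illustration, a hypothetical hyperparameter fragment showing where such a setting would live; the exact key name and accepted values depend on the SpeechBrain version you run, so treat this as a sketch rather than the documented interface:

```yaml
# Hypothetical hparams fragment; verify the key name against your
# SpeechBrain version before using it.
precision: bf16   # bf16 on Ampere+ GPUs; fp16 on older hardware
```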
It does allow stop and restart, because you are altering the object, i.e. the checkpointer keeps track of it! The only problem is indeed that you store the whole...
You mean depend on another Huggingface library?
If you could give me a neat example of an integration of PEFT, I could be convinced.