Bofeng Huang

Results: 10 issues by Bofeng Huang

Hello @anautsch, as discussed in https://github.com/speechbrain/speechbrain/issues/1399, this is the PR for the new version of `DynamicBatchSampler`: - [x] Distribute data into buckets by a `log-norm` distribution fitted on the datasets...

Hi, I'm trying to understand the implementation of `DynamicBatchSampler`. I would like to know, in [`_get_boundaries_through_warping`](https://github.com/speechbrain/speechbrain/blob/develop/speechbrain/dataio/sampler.py#L497), why you would use a lognorm distribution with `s=1` to get the quantiles and linearly...
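For context, a minimal sketch of the kind of quantile warping the question refers to, assuming the boundaries are obtained by mapping equally spaced probabilities through a `lognorm(s=1)` distribution and rescaling the result linearly into the observed length range. This is an illustration only, not SpeechBrain's actual `_get_boundaries_through_warping` code, and the helper name is made up:

```python
# Illustrative sketch only (not SpeechBrain's implementation): warp equally
# spaced probabilities through a lognorm(s) distribution and rescale the
# resulting quantiles linearly into the observed length range.
import numpy as np
from scipy import stats

def lognorm_bucket_boundaries(lengths, num_buckets=10, s=1.0):
    lengths = np.asarray(lengths, dtype=float)
    lo, hi = lengths.min(), lengths.max()
    probs = np.linspace(0.0, 1.0, num_buckets + 1)[1:-1]  # drop 0 and 1
    q = stats.lognorm.ppf(probs, s)                       # warped quantiles
    q = (q - q.min()) / (q.max() - q.min())               # normalize to [0, 1]
    return lo + q * (hi - lo)                             # linear rescale

lengths = np.random.lognormal(mean=9.0, sigma=1.0, size=10_000)  # fake durations
print(lognorm_bucket_boundaries(lengths, num_buckets=8))
```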

Hi, there's a missing unpacked value here :) cc: @kahne


Hi @TParcollet, I have a question about the layer normalization used in wav2vec for both the HF and Fairseq implementations. Why is it applied over all `(batch_size, sequence_length, hidden_size)` dimensions of...
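A quick way to see the difference the question is about, sketched with illustrative shapes (not taken from the actual wav2vec code):

```python
# Contrast of the two normalization scopes (illustrative shapes only).
import torch
import torch.nn.functional as F

x = torch.randn(4, 100, 768)  # (batch_size, sequence_length, hidden_size)

# Common choice: normalize each time step independently over hidden_size.
per_step = F.layer_norm(x, normalized_shape=(768,))

# Normalizing over all three dimensions shares a single mean/variance across
# the whole (batch, time, hidden) tensor, which is what the question asks about.
global_norm = F.layer_norm(x, normalized_shape=tuple(x.shape))

print(per_step[0, 0].mean().item(), per_step[0, 0].std().item())  # ~0, ~1 per step
print(global_norm.mean().item(), global_norm.std().item())        # ~0, ~1 overall
```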


# What does this PR do? Hi @sanchit-gandhi @ArthurZucker, Thanks for pointing out the flaw in the other PR (https://github.com/huggingface/transformers/pull/21063)! Here I will add [SpecAugment](https://arxiv.org/abs/1904.08779) to [modeling_whisper.py](https://github.com/huggingface/transformers/blob/main/src/transformers/models/whisper/modeling_whisper.py). Several things have...
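For readers unfamiliar with the technique: SpecAugment masks random blocks of time frames and feature bins in the input spectrogram. A rough, self-contained sketch of that idea (not the code added to `modeling_whisper.py`) might look like this:

```python
# Rough SpecAugment-style masking on a log-mel spectrogram (illustration only).
import torch

def spec_augment(features, num_time_masks=2, max_time_width=50,
                 num_feat_masks=2, max_feat_width=10, mask_value=0.0):
    """features: (batch, num_mel_bins, num_frames) log-mel spectrogram."""
    batch, n_mels, n_frames = features.shape
    out = features.clone()
    for b in range(batch):
        # mask random blocks of consecutive frames (time masking)
        for _ in range(num_time_masks):
            width = int(torch.randint(0, max_time_width + 1, (1,)))
            start = int(torch.randint(0, max(n_frames - width, 1), (1,)))
            out[b, :, start:start + width] = mask_value
        # mask random blocks of consecutive mel bins (frequency masking)
        for _ in range(num_feat_masks):
            width = int(torch.randint(0, max_feat_width + 1, (1,)))
            start = int(torch.randint(0, max(n_mels - width, 1), (1,)))
            out[b, start:start + width, :] = mask_value
    return out

masked = spec_augment(torch.randn(2, 80, 3000))  # Whisper-sized input features
```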

Hi @awni 👋 I've tried to add several new features to the Whisper implementation through this PR, following the implementation of the original repository: - word-level timestamps (https://github.com/openai/whisper/pull/869) - word-level...

Hi @ylacombe, Thank you for the new blog post about fine-tuning w2v-BERT. However, I have some doubts about the "average duration seen by each token", or perhaps I might be...
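For reference, my reading of "average duration seen by each token" is total audio duration divided by total number of label tokens; the blog post's exact definition may differ. A back-of-the-envelope check with made-up numbers:

```python
# Made-up numbers, only to illustrate the quantity being discussed.
total_audio_seconds = 10 * 3600     # e.g. 10 hours of training audio
total_label_tokens = 1_500_000      # tokens in the corresponding transcripts

avg_seconds_per_token = total_audio_seconds / total_label_tokens
print(f"~{avg_seconds_per_token * 1000:.0f} ms of audio per label token")
```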

Hi! Thank you for your excellent work on the LLM evaluation! I'm inspired to create a French version of MT-Bench. Currently, I'm in the process of generating reference answers for...

Hi, I noticed that the BOS token is always duplicated when running with the OpenAI-compatible API server. As shown in the console output below when launching Meta-Llama-3-8B-Instruct, there are two...
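One way the duplication can arise (a hedged reproduction, not necessarily the server's exact code path): the Llama 3 chat template already emits `<|begin_of_text|>` in the prompt string, and tokenizing that string again with `add_special_tokens=True` prepends a second BOS. The gated checkpoint below is used purely for illustration:

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")

messages = [{"role": "user", "content": "Hello!"}]
prompt = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt.startswith("<|begin_of_text|>"))         # True: BOS is already in the text

ids = tok(prompt, add_special_tokens=True).input_ids  # default behaviour adds BOS again
print(ids[:2], tok.bos_token_id)                      # first two ids are both the BOS id
```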

Hi 👋, Thank you for continuously adding more features to the Whisper distillation code! As I reviewed the section on prepending previous text during the preparation of training data, I...
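For context, a minimal sketch (my own illustration, not the distillation repo's data-preparation code) of what prepending the previous segment's text looks like with the Whisper tokenizer's prompt format:

```python
from transformers import WhisperTokenizer

tokenizer = WhisperTokenizer.from_pretrained(
    "openai/whisper-small", language="english", task="transcribe"
)

prev_text = "the previous segment's transcript"
curr_text = "the current segment's transcript"

# <|startofprev|> followed by the previous text's tokens
prompt_ids = tokenizer.get_prompt_ids(prev_text, return_tensors="np").tolist()
# <|startoftranscript|><|en|><|transcribe|> ... current text ... <|endoftext|>
label_ids = tokenizer(curr_text).input_ids

full_ids = prompt_ids + label_ids
# In training, the prompt portion is typically masked out of the loss (e.g. set to -100).
print(tokenizer.decode(full_ids))
```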