Bofeng Huang

Results: 10 issues by Bofeng Huang

Hello @anautsch, as discussed in https://github.com/speechbrain/speechbrain/issues/1399, this is the PR for the new version of `DynamicBatchSampler`: - [x] Distribute data into buckets by a `log-norm` distribution fitted on the datasets...

Hi, I'm trying to understand the implementation of `DynamicBatchSampler`. I would like to know, in [`_get_boundaries_through_warping`](https://github.com/speechbrain/speechbrain/blob/develop/speechbrain/dataio/sampler.py#L497), why you would use a lognorm distribution with `s=1` to get the quantiles and linearly...
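For context, a minimal sketch of the kind of quantile warping the question refers to, assuming the boundaries are obtained by mapping equally spaced probabilities through a `lognorm(s=1)` distribution and rescaling the result linearly into the observed length range. This is an illustration only, not SpeechBrain's actual `_get_boundaries_through_warping` code, and the helper name is made up:

```python
# Illustrative sketch only (not SpeechBrain's implementation): warp equally
# spaced probabilities through a lognorm(s) distribution and rescale the
# resulting quantiles linearly into the observed length range.
import numpy as np
from scipy import stats

def lognorm_bucket_boundaries(lengths, num_buckets=10, s=1.0):
    lengths = np.asarray(lengths, dtype=float)
    lo, hi = lengths.min(), lengths.max()
    probs = np.linspace(0.0, 1.0, num_buckets + 1)[1:-1]  # drop 0 and 1
    q = stats.lognorm.ppf(probs, s)                       # warped quantiles
    q = (q - q.min()) / (q.max() - q.min())               # normalize to [0, 1]
    return lo + q * (hi - lo)                             # linear rescale

lengths = np.random.lognormal(mean=9.0, sigma=1.0, size=10_000)  # fake durations
print(lognorm_bucket_boundaries(lengths, num_buckets=8))
```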

Hi, there's a missing unpacked value here :) cc: @kahne


Hi @TParcollet, I have a question about the layer normalization used in wav2vec for both the HF and Fairseq implementations. Why is it applied over all `(batch_size, sequence_length, hidden_size)` dimensions of...
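A quick way to see the difference the question is about, sketched with illustrative shapes (not taken from the actual wav2vec code):

```python
# Contrast of the two normalization scopes (illustrative shapes only).
import torch
import torch.nn.functional as F

x = torch.randn(4, 100, 768)  # (batch_size, sequence_length, hidden_size)

# Common choice: normalize each time step independently over hidden_size.
per_step = F.layer_norm(x, normalized_shape=(768,))

# Normalizing over all three dimensions shares a single mean/variance across
# the whole (batch, time, hidden) tensor, which is what the question asks about.
global_norm = F.layer_norm(x, normalized_shape=tuple(x.shape))

print(per_step[0, 0].mean().item(), per_step[0, 0].std().item())  # ~0, ~1 per step
print(global_norm.mean().item(), global_norm.std().item())        # ~0, ~1 overall
```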


# What does this PR do? Hi @sanchit-gandhi @ArthurZucker, Thanks for pointing out the flaw in the other PR (https://github.com/huggingface/transformers/pull/21063)! Here I will add [SpecAugment](https://arxiv.org/abs/1904.08779) to [modeling_whisper.py](https://github.com/huggingface/transformers/blob/main/src/transformers/models/whisper/modeling_whisper.py). Several things have...
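For readers unfamiliar with the technique: SpecAugment masks random blocks of time frames and feature bins in the input spectrogram. A rough, self-contained sketch of that idea (not the code added to `modeling_whisper.py`) might look like this:

```python
# Rough SpecAugment-style masking on a log-mel spectrogram (illustration only).
import torch

def spec_augment(features, num_time_masks=2, max_time_width=50,
                 num_feat_masks=2, max_feat_width=10, mask_value=0.0):
    """features: (batch, num_mel_bins, num_frames) log-mel spectrogram."""
    batch, n_mels, n_frames = features.shape
    out = features.clone()
    for b in range(batch):
        # mask random blocks of consecutive frames (time masking)
        for _ in range(num_time_masks):
            width = int(torch.randint(0, max_time_width + 1, (1,)))
            start = int(torch.randint(0, max(n_frames - width, 1), (1,)))
            out[b, :, start:start + width] = mask_value
        # mask random blocks of consecutive mel bins (frequency masking)
        for _ in range(num_feat_masks):
            width = int(torch.randint(0, max_feat_width + 1, (1,)))
            start = int(torch.randint(0, max(n_mels - width, 1), (1,)))
            out[b, start:start + width, :] = mask_value
    return out

masked = spec_augment(torch.randn(2, 80, 3000))  # Whisper-sized input features
```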

Hi @awni 👋 I've tried to add several new features to the Whisper implementation through this PR, following the implementation of the original repository: - word-level timestamps (https://github.com/openai/whisper/pull/869) - word-level...

Hi @ylacombe, Thank you for the new blog post about fine-tuning w2v-BERT. However, I have some doubts about the "average duration seen by each token", or perhaps I might be...
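For reference, my reading of "average duration seen by each token" is total audio duration divided by total number of label tokens; the blog post's exact definition may differ. A back-of-the-envelope check with made-up numbers:

```python
# Made-up numbers, only to illustrate the quantity being discussed.
total_audio_seconds = 10 * 3600     # e.g. 10 hours of training audio
total_label_tokens = 1_500_000      # tokens in the corresponding transcripts

avg_seconds_per_token = total_audio_seconds / total_label_tokens
print(f"~{avg_seconds_per_token * 1000:.0f} ms of audio per label token")
```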

Hi! Thank you for your excellent work on the LLM evaluation! I'm inspired to create a French version of MT-Bench. Currently, I'm in the process of generating reference answers for...

Hi, I noticed that the BOS token is always duplicated when running with the OpenAI-compatible API server. As shown in the console output below when launching Meta-Llama-3-8B-Instruct, there are two...
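One way the duplication can arise (a hedged reproduction, not necessarily the server's exact code path): the Llama 3 chat template already emits `<|begin_of_text|>` in the prompt string, and tokenizing that string again with `add_special_tokens=True` prepends a second BOS. The gated checkpoint below is used purely for illustration:

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")

messages = [{"role": "user", "content": "Hello!"}]
prompt = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt.startswith("<|begin_of_text|>"))         # True: BOS is already in the text

ids = tok(prompt, add_special_tokens=True).input_ids  # default behaviour adds BOS again
print(ids[:2], tok.bos_token_id)                      # first two ids are both the BOS id
```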

Hi 👋, Thank you for continuously adding more features to the Whisper distillation code! As I reviewed the section on prepending previous text during the preparation of training data, I...
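For context, a minimal sketch (my own illustration, not the distillation repo's data-preparation code) of what prepending the previous segment's text looks like with the Whisper tokenizer's prompt format:

```python
from transformers import WhisperTokenizer

tokenizer = WhisperTokenizer.from_pretrained(
    "openai/whisper-small", language="english", task="transcribe"
)

prev_text = "the previous segment's transcript"
curr_text = "the current segment's transcript"

# <|startofprev|> followed by the previous text's tokens
prompt_ids = tokenizer.get_prompt_ids(prev_text, return_tensors="np").tolist()
# <|startoftranscript|><|en|><|transcribe|> ... current text ... <|endoftext|>
label_ids = tokenizer(curr_text).input_ids

full_ids = prompt_ids + label_ids
# In training, the prompt portion is typically masked out of the loss (e.g. set to -100).
print(tokenizer.decode(full_ids))
```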