BakerBunker

Results: 11 issues from BakerBunker

### 🐛 Describe the bug

Batch inference with WavLM triggers an AssertionError in the `WavLMSelfAttention` module.

```python
import torch
import torchaudio

wavlm = torchaudio.pipelines.WAVLM_LARGE.get_model().cuda()
wavlm.extract_features(
    torch.randn(2, 16000, device='cuda'),
    lengths=torch.tensor([2000, 3000], device='cuda'),
    num_layers=1,
)
```

Log:

```
AssertionError                            Traceback (most recent call last)
...
```

Probably because of https://github.com/pytorch/pytorch/issues/96742
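Until that upstream issue is resolved, one possible workaround is to run each utterance unbatched, trimmed to its valid length, so no padding mask has to be passed to the attention module. This is only a minimal sketch, assuming the assertion is hit solely on the padded/batched attention path:

```python
import torch
import torchaudio

# Sketch of a per-utterance workaround (assumption: the AssertionError only
# occurs when a lengths/padding mask is used in batched inference).
wavlm = torchaudio.pipelines.WAVLM_LARGE.get_model().cuda().eval()

waveforms = torch.randn(2, 16000, device="cuda")
lengths = torch.tensor([2000, 3000], device="cuda")

features = []
with torch.inference_mode():
    for wav, length in zip(waveforms, lengths):
        # Trim to the valid length and add a batch dimension of 1,
        # so no lengths argument (and hence no padding mask) is needed.
        feats, _ = wavlm.extract_features(wav[: int(length)].unsqueeze(0), num_layers=1)
        features.append(feats[0])  # features of the single requested layer
```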

## ❓ Questions In the [paper](https://arxiv.org/abs/2210.13438), Section 3.4 "Discriminative Loss", the adversarial loss is constructed as $l_g(\hat{x})=\mathbb{E}[\max(0,1-D_k(\hat{x}))]$, but in the [original hinge loss paper](http://arxiv.org/abs/1705.02894v2) the generator's adversarial loss is constructed as $-\mathbb{E}[D(\hat{x})]$. So...
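For reference, a minimal PyTorch sketch contrasting the two generator-loss formulations mentioned above; `disc_fake` is a hypothetical tensor of discriminator scores $D_k(\hat{x})$ on generated audio (names are illustrative, not from either paper's code):

```python
import torch

def hinge_paper_generator_loss(disc_fake: torch.Tensor) -> torch.Tensor:
    # Generator loss from the hinge-GAN paper: -E[D(x_hat)]
    return -disc_fake.mean()

def encodec_style_generator_loss(disc_fake: torch.Tensor) -> torch.Tensor:
    # Generator loss as written in the Encodec paper (Sec. 3.4):
    # E[max(0, 1 - D_k(x_hat))], i.e. the hinge is kept on the generator term too.
    return torch.relu(1.0 - disc_fake).mean()
```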

question

Song share: sharing ピノキオピー (PinocchioP) / 初音ミク (Hatsune Miku)'s single 《ヨヅリナ (YOZURINA)》: http://music.163.com/song/1347529118/?userid=93852964 (via @网易云音乐 / NetEase Cloud Music). 太极 (TaiChi) 5.1.10, 音量增强器 (Volume Booster) 0.0.33. The song can be listened to for free on QQ Music, QQ Music id: 213515367. Could this be resolved?

## 🐛 Bug

Failed to run this [demo](https://github.com/alibaba-damo-academy/FunASR/blob/main/examples/industrial_data_pretraining/whisper/demo.py) in the Google Colab environment.

### To Reproduce

Steps to reproduce the behavior:

1. Open Colab
2. Run the following code

#### Code...

bug

I have noticed that there are two tokenizer dicts in utils/g2p, bpe_1024 and bpe_69. Which one is more suitable for the generation task in your actual training practice? Thank you.

What is the difference between the GradNorm paper and the [Encodec loss balancer](https://github.com/facebookresearch/encodec/blob/main/encodec/balancer.py)? A [recent NVIDIA paper](https://arxiv.org/pdf/2402.00892.pdf) confirmed the effectiveness of the Encodec loss balancer in Section 4.4.5. And could this library directly...
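For context, here is a rough sketch of the core idea behind an Encodec-style loss balancer: each loss's gradient with respect to the model output is rescaled so that its norm is proportional to that loss's weight, and only then backpropagated through the rest of the network. This is a simplification (the actual balancer linked above additionally tracks an exponential moving average of per-loss gradient norms), and all names here are illustrative:

```python
import torch

def balanced_backward(losses, weights, model_output, ref_norm=1.0, eps=1e-12):
    # losses: dict of scalar loss tensors computed from `model_output`
    # weights: dict of non-negative weights with the same keys
    # model_output: tensor produced by the model, e.g. the decoded waveform
    total_weight = sum(weights[name] for name in losses)

    grads, norms = {}, {}
    for name, loss in losses.items():
        # Gradient of this loss w.r.t. the model output only.
        (g,) = torch.autograd.grad(loss, [model_output], retain_graph=True)
        grads[name] = g
        norms[name] = g.norm() + eps

    # Rescale each gradient so its norm is proportional to its weight,
    # then combine them into a single gradient for the model output.
    combined = torch.zeros_like(model_output)
    for name in losses:
        scale = ref_norm * weights[name] / total_weight
        combined += scale * grads[name] / norms[name]

    # One backward pass from the model output through the rest of the model.
    model_output.backward(combined)
```

With weights such as `{"reconstruction": 1.0, "adversarial": 3.0}`, the adversarial term would then contribute roughly three times the gradient magnitude of the reconstruction term at the model output, regardless of the raw scale of either loss.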

Hello, thank you for sharing this nice work. Have you tried using a shared codebook, like the method used in [VAR](https://github.com/FoundationVision/VAR)? And here is the discussion about the codebook in...