SLAM-LLM
Speech, Language, Audio, Music Processing with Large Language Model
### System Info torch 2.0.1 torchaudio 2.0.2 torchvision 0.15.2 ### Information - [ ] The official example scripts - [ ] My own modified scripts ### 🐛 Describe the bug...
Hello, I don't quite understand why bos is not added here "(example = prompt + answer # FIX(MZY): avoid putting a bos token before answer.)". How can autoregressive training be...
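For context, the pattern behind `example = prompt + answer` is the common SFT packing where no extra bos is inserted between prompt and answer, so the model trains on one contiguous autoregressive sequence; loss is usually masked on the prompt positions. A minimal sketch, with illustrative token IDs (bos=1, eos=2 and the helper name are assumptions, not SLAM-LLM's actual values):

```python
# Hypothetical sketch of prompt+answer packing without a bos before the answer.
IGNORE_INDEX = -100  # positions with this label are excluded from the loss

def build_example(prompt_ids, answer_ids, eos_id=2):
    # example = prompt + answer: the answer continues the prompt directly,
    # so autoregressive training stays a single left-to-right sequence.
    input_ids = prompt_ids + answer_ids + [eos_id]
    # loss is computed only on the answer (and eos); prompt tokens are masked
    labels = [IGNORE_INDEX] * len(prompt_ids) + answer_ids + [eos_id]
    return input_ids, labels

ids, labels = build_example([1, 15, 27], [301, 302])
# ids    -> [1, 15, 27, 301, 302, 2]
# labels -> [-100, -100, -100, 301, 302, 2]
```

An extra bos before the answer would split the sequence in two, which is exactly what the comment in the code avoids.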
This open-source project is fantastic. Could you share which data was used to train vallex?
### 🚀 The feature, motivation and pitch Tasks like adaptive noise suppression, acoustic echo cancellation, and speech separation, thanks! ### Alternatives _No response_ ### Additional context _No response_
### 🚀 The feature, motivation and pitch As we all know, GPT-4o is an end-to-end multi-modal model that supports Speech-to-Text/Speech. I have some ideas about it: 1. Speech...
Great work! More info about the data would be appreciated!
On the LLM repetitive-generation problem

Inference-side mitigations:
- repetition penalty

Training-side mitigations:
- eos_token: https://github.com/QwenLM/Qwen2/issues/779#issuecomment-2229890369
- no_speech token: https://github.com/X-LANCE/SLAM-LLM/issues/113
- Model frame rate: raising the frame rate can reduce the "broken record" looping on short audio
- The LLM's text distribution
- Incorporating CTC results: https://arxiv.org/abs/2408.09491
- From an NLP perspective: https://zhuanlan.zhihu.com/p/672261242?utm_psn=1807773013061558274
- Repetitive generation tends to be triggered when training data contains many short or repetitive texts, i.e. insufficient data diversity
- The smaller the model, the more prone it is to repetitive generation

Additions welcome!
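The repetition penalty mentioned above rescales the logits of tokens that have already been generated before sampling. A minimal sketch of the commonly used CTRL-style rule (positive logits divided by the penalty, negative logits multiplied); the function name and token IDs here are illustrative, not part of any library API:

```python
def apply_repetition_penalty(logits, generated_ids, penalty=1.2):
    # Penalize every token that already appeared in the generated sequence:
    # divide positive logits / multiply negative logits by the penalty factor,
    # which lowers the token's probability either way.
    out = list(logits)
    for tok in set(generated_ids):
        if out[tok] > 0:
            out[tok] /= penalty
        else:
            out[tok] *= penalty
    return out

logits = [2.0, -1.0, 0.5]
penalized = apply_repetition_penalty(logits, generated_ids=[0, 1], penalty=2.0)
# token 0: 2.0 -> 1.0; token 1: -1.0 -> -2.0; token 2 untouched
```

With penalty > 1.0 repeated tokens become less likely on every step, which is why it helps against looping but can hurt outputs that legitimately repeat words.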
Hi there, I am interested in using more than one encoder for my speech tasks; does this framework support that? Currently in the SLAM paper I see only one speech...
### System Info Pytorch 2.3.1+cu121 CUDA 12.2 GPU Nvidia H100 2 machines * 8, DDP only, FP16 ### Information - [ ] The official example scripts - [X] My own...