Snowdar
## ❓ Questions and Help I do not plan to use srun; I just want to start the training on two machines by hand. But how do I use fairseq-hydra-train across multiple nodes? Configure...
Feel free to discuss training strategies here. There are two typical training strategies, "SGD + Reduce Learning Rate on Plateau" and "Adam + Warm Restarts". ## SGD + Reduce Learning...
I have some dialogue data generated by the model, and I want to edit some bad sentences to make the dialogue more reasonable. But the Paragraph tag does not support...
As the title says: because the model dispatch attention part is not yet compatible with newer versions of transformers, transformers cannot be upgraded to train llama3.1.
### Checklist - [X] 1. If the issue you raised is not a feature but a question, please raise a discussion at https://github.com/sgl-project/sglang/discussions/new/choose Otherwise, it will be closed. - [X]...
### Checklist - [x] 1. I have searched related issues but cannot get the expected help. - [x] 2. The bug has not been fixed in the latest version. -...