Chunjiang Ge (葛春江)

39 comments by Chunjiang Ge (葛春江)

> 8B model, zero3 could run.

Sorry, I mistook that for 8× 40G GPUs. Maybe you should try it on 8 GPUs or with the 2B/4B model. Otherwise, you could try LoRA, as sketched below.
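For reference, a minimal LoRA sketch using the peft library; the checkpoint path and target modules are placeholders, not this repo's actual config, so adjust them for your model.

from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Hypothetical checkpoint path; replace with your 8B model.
model = AutoModelForCausalLM.from_pretrained("your-8b-checkpoint")

lora_config = LoraConfig(
    r=16,                                 # low-rank adapter dimension
    lora_alpha=32,                        # scaling factor
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt (model-dependent)
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the small adapter weights train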

Hello, do you still hit this bug with the latest code? It does not seem to be a bug related to our code.

The LoRA training code will be updated; please wait a bit.

If you want to finetune the model to align with tasks that did not exist in the pretraining stage, you should set a higher learning rate. If you do not want to...
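A minimal sketch of where that knob lives, assuming the HuggingFace Trainer; the numbers are illustrative, not recommendations from this thread.

from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="outputs",
    learning_rate=1e-4,   # raised (e.g. from 2e-5) for tasks absent from pretraining
    warmup_ratio=0.03,
    num_train_epochs=1,
)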

Yes, you can finetune the 8B model with ZeRO stage 3. DeepSpeed does not support MoE models with ZeRO-3.
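For context, a minimal ZeRO-3 configuration passed to the HuggingFace Trainer as a dict; the "auto" entries let the Trainer fill values from TrainingArguments, and the CPU offload lines are an optional assumption for memory-constrained setups.

from transformers import TrainingArguments

ds_config = {
    "train_micro_batch_size_per_gpu": "auto",
    "gradient_accumulation_steps": "auto",
    "zero_optimization": {
        "stage": 3,  # shard params, gradients, and optimizer states
        "offload_param": {"device": "cpu"},      # optional: offload params to CPU
        "offload_optimizer": {"device": "cpu"},  # optional: offload optimizer states
        "stage3_gather_16bit_weights_on_model_save": True,
    },
    "bf16": {"enabled": "auto"},
}

args = TrainingArguments(output_dir="outputs", deepspeed=ds_config)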

1. Just make sure the parsed output format is consistent. 2. Yes, you can. 3. Add the special tokens to the vocabulary, then check that they are tokenized correctly.
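A short sketch of point 3 with the HuggingFace tokenizer API; the checkpoint path and token names are hypothetical placeholders.

from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("your-checkpoint")  # hypothetical path
model = AutoModelForCausalLM.from_pretrained("your-checkpoint")

# Register the new special tokens, then resize embeddings to match.
new_tokens = ["<task_start>", "<task_end>"]  # hypothetical token names
tokenizer.add_special_tokens({"additional_special_tokens": new_tokens})
model.resize_token_embeddings(len(tokenizer))

# Verify each token maps to a single id instead of being split apart.
for tok in new_tokens:
    ids = tokenizer.encode(tok, add_special_tokens=False)
    assert len(ids) == 1, f"{tok} was split into {ids}"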

https://github.com/TideDra/lmm-r1 This repo supports vision-language model training with OpenRLHF, so it may not require many modifications.

Did you set max output tokens too low?

generated_ids = model.generate(**inputs, max_new_tokens=128)
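A self-contained version of the snippet above, assuming a generic HuggingFace causal LM; the checkpoint path and prompt are placeholders, and the point is simply that a larger max_new_tokens prevents answers from being cut off.

from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("your-checkpoint")  # hypothetical path
model = AutoModelForCausalLM.from_pretrained("your-checkpoint")

inputs = tokenizer("Describe the image.", return_tensors="pt")
# Raise max_new_tokens if outputs stop mid-sentence; 128 is often
# too few for long answers.
generated_ids = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(generated_ids[0], skip_special_tokens=True))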