Neo Zhang

Results 3 comments of Neo Zhang

You can try disabling the KL loss by setting `actor_rollout_ref.actor.use_kl_loss=False`. When I tried it, this greatly reduced System Memory Utilization.

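I'd like to ask: can the base model being trained be merged directly with the LoRA under global_step_xx/actor/lora_adapter? It shouldn't be necessary to convert the .pt files to HF format first and then merge the LoRA weights, right?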
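
For context on what "merging" means here, below is a minimal sketch of the standard LoRA merge for a single weight matrix, W' = W + (lora_alpha / r) * B A. It uses plain Python lists so it runs without any libraries. The function names, shapes, and values are illustrative assumptions, and it does not confirm whether the files under global_step_xx/actor/lora_adapter are stored in a format that can be loaded directly.

```python
# Minimal sketch of a LoRA merge on one weight matrix, using plain Python lists.
# All names and the tiny shapes below are illustrative assumptions.

def matmul(a, b):
    """Multiply matrix a (m x k) by matrix b (k x n)."""
    inner = len(b)
    cols = len(b[0])
    return [
        [sum(row[t] * b[t][j] for t in range(inner)) for j in range(cols)]
        for row in a
    ]


def merge_lora(base_weight, lora_a, lora_b, lora_alpha, rank):
    """Return W + (lora_alpha / rank) * (B @ A) without modifying the inputs."""
    scale = lora_alpha / rank
    delta = matmul(lora_b, lora_a)  # shape: out_features x in_features
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(base_weight, delta)
    ]


# Toy example: a 2x2 base weight and a rank-1 adapter.
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 2.0]]      # rank x in_features
B = [[0.5], [1.0]]    # out_features x rank
merged = merge_lora(W, A, B, lora_alpha=2, rank=1)
print(merged)
```

The merge only needs the base weights and the adapter's A and B matrices plus its scaling, which is why the question of checkpoint format comes down to whether both sides can be loaded with matching parameter names.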

I would like to ask about the latest progress on this project. Does R3 support training with Megatron + vLLM? And does Megatron need to be the version from the PR you submitted: https://github.com/NVIDIA/Megatron-LM/pull/2101/files?