yu_wang comments


After GRPO training with LoRA finishes, do the .pt files saved under the actor path contain model parameters with the LoRA weights already merged in?

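I'm confused about the same thing. My understanding is that you need to run the model merger in the verl framework to merge the weight files into safetensors format. I still have two questions, though: (1) Can LoRA and FSDP be used together? (2) Is the merged safetensors file a LoRA adapter model? If so, I don't see an adapter_config.json file next to it.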
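One way to narrow down question (2) is to look at what the export actually contains. Below is a minimal sketch, not anything from verl itself: `classify_checkpoint` is a hypothetical helper, and it relies only on two PEFT conventions (adapters ship with an `adapter_config.json`, and LoRA tensors carry `lora_A` / `lora_B` in their names). It cannot tell a merged model apart from an untouched base model, since both contain only ordinary weight names.

```python
def classify_checkpoint(file_names, tensor_names):
    """Guess what kind of checkpoint this is (hypothetical helper, not a verl API).

    file_names:   names of the files in the exported directory.
    tensor_names: parameter names read from the weights file.
    """
    # PEFT stores the adapter settings in adapter_config.json.
    has_adapter_config = "adapter_config.json" in file_names
    # PEFT names the low-rank matrices ...lora_A... and ...lora_B...
    lora_names = [n for n in tensor_names if "lora_A" in n or "lora_B" in n]

    if has_adapter_config and lora_names:
        return "lora_adapter"
    if lora_names:
        # LoRA tensors are present, but there is no PEFT config beside them.
        return "unmerged_lora_weights"
    # Only ordinary weight names: full weights (merged or plain base model).
    return "full_weights"


if __name__ == "__main__":
    print(classify_checkpoint(
        ["config.json", "model.safetensors"],
        ["model.layers.0.self_attn.q_proj.weight"],
    ))
```

The tensor names can be listed without loading the weights, for example with `safe_open(path, framework="pt")` from the `safetensors` package and its `keys()` method. If the result is `full_weights`, the absence of adapter_config.json is expected, because the directory is not an adapter.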

After GRPO training with LoRA finishes, do the .pt files saved under the actor path contain model parameters with the LoRA weights already merged in?

Has the original poster noticed that LoRA + GRPO/REINFORCE++ training is very slow? This issue mentions it: https://github.com/volcengine/verl/issues/3115