Zhoujian Sun
I am currently using verl for multi-turn interaction RL training and have identified two potential issues.

- There might be a problem with the usage of _req.add_assistant_message in the script...