edc3000

12 comments by edc3000

@haolinyan I think this might be the code in async_reward_agent/main_ppo.py. In the function run(), fsdp_workers is imported from verl, not from your code. I changed it to something like ```from .fsdp_workers import...

Can you share some scripts or examples for using genRM in the reward loop?