Results: 12 comments by edc3000
@haolinyan I think this might be the code in `async_reward_agent/main_ppo.py`. In the function `run()`, `fsdp_workers` is imported from verl, not from your code. I changed it to something like `from .fsdp_workers import...`
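To illustrate why the import form matters, here is a minimal, self-contained sketch. It assumes the original line was an absolute import that resolved to the installed package instead of the sibling module. The package names `upstream_pkg` and `local_pkg` and the `ORIGIN` constant are hypothetical stand-ins, not verl's actual layout or the contents of `main_ppo.py`.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_demo_packages(root: Path) -> None:
    """Create two toy packages that each ship a module named fsdp_workers."""
    for pkg, origin in (("upstream_pkg", "upstream"), ("local_pkg", "local")):
        pkg_dir = root / pkg
        pkg_dir.mkdir()
        (pkg_dir / "__init__.py").write_text("")
        (pkg_dir / "fsdp_workers.py").write_text(f"ORIGIN = {origin!r}\n")

    # Absolute import: resolves to the (hypothetical) upstream package.
    (root / "local_pkg" / "main_absolute.py").write_text(
        "from upstream_pkg.fsdp_workers import ORIGIN\n"
    )
    # Relative import: resolves to the sibling module in the same package.
    (root / "local_pkg" / "main_relative.py").write_text(
        "from .fsdp_workers import ORIGIN\n"
    )


def resolve_origins() -> dict:
    """Import both entry modules and report which fsdp_workers each one used."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        build_demo_packages(root)
        sys.path.insert(0, str(root))
        try:
            importlib.invalidate_caches()
            absolute = importlib.import_module("local_pkg.main_absolute").ORIGIN
            relative = importlib.import_module("local_pkg.main_relative").ORIGIN
        finally:
            # Undo the sys.path change and drop the toy modules again.
            sys.path.remove(str(root))
            for name in list(sys.modules):
                if name.split(".")[0] in ("upstream_pkg", "local_pkg"):
                    del sys.modules[name]
    return {"absolute": absolute, "relative": relative}


origins = resolve_origins()
print(origins)
```

In this toy setup the module using the relative import picks up the local `fsdp_workers`, while the absolute import keeps using the upstream one. Whether that matches the real repository depends on how its packages are laid out.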
Could you share some scripts or examples showing how to use genRM in the reward loop?