Stewart Slocum
Results: 2 comments by Stewart Slocum
Thanks for the response! I'm relatively new to RL and didn't know that infinite-horizon environments could cause issues with learning and with the standard logger and eval setup. My...
Yes, the rl-teacher repo is in line with the RLHF paper in that it explicitly turns MuJoCo environments into fixed-length episodes. As far as I can tell, the original paper...