Stewart Slocum

2 comments by Stewart Slocum

Thanks for the response! I'm relatively new to RL and didn't know that infinite-horizon environments could cause issues with learning and with the standard logger and eval setup. My...

Yes, the rl-teacher repo is in line with the RLHF paper in that they explicitly turn MuJoCo environments into fixed-length episodes. As far as I can tell, the original paper...