ARS
Training on BipedalWalkerHardcore seems to result in a negative reward
Hi and thanks for sharing the code.
I tried running the training process on a different environment, BipedalWalkerHardcore-v2,
but it does not seem to learn anything. I even tried different shift
values, as noted in the code comments, but I still end up with a negative reward. Should we train for longer, or are there any hyperparameters that we are missing?
Hey @kirk86, I am having a similar issue. Did you solve it? Please look at this thread for my exact issue.