RLzoo
RLzoo copied to clipboard
Results on Box2D environments
I tried to benchmark the follwing environments ['BipedalWalker-v2', 'BipedalWalkerHardcore-v2', 'CarRacing-v0', 'LunarLander-v2', 'LunarLanderContinuous-v2'] using ['A3C', 'DDPG', 'TD3', 'SAC', 'PG', 'TRPO', 'PPO', 'DPPO'] algorithms. Most of the combinations failed to learn the task and didn't converge. Only (SAC, LunarLanderContinuous-v2) and (TD3, LunarLanderContinuous-v2) learnt the task sub-optimally. . Can someone address this issue?
Hi, Did you use the default hyper-parameters provided in RLzoo? If so, we will take a look into this problem.