hindsight-experience-replay
This is the PyTorch implementation of Hindsight Experience Replay (HER), with experiments on all Fetch robotic environments.
Hi, thanks for sharing the code. I'm wondering if I can train the DDPG agent in the HandManipulate envs, since they belong to the same robotics environment group.
In the Fetchxxx-Env, the env's `distance_threshold` defaults to 0.05, which determines whether a task counts as successfully completed. I tried to modify it using `env.distance_threshold = 0.01` (or other...
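A likely cause, sketched below with toy stand-in classes (no MuJoCo required; `FetchLike` and `TimeLimitLike` are hypothetical names): `gym.make` wraps the raw environment in a `TimeLimit` wrapper, and assigning an attribute on the wrapper object does not change the wrapped env, so the threshold should be set on `env.unwrapped`.

```python
# Toy stand-ins for the gym.make wrapper chain; class names are hypothetical.
class FetchLike:
    def __init__(self):
        self.distance_threshold = 0.05  # default success radius (metres)

class TimeLimitLike:
    """Stand-in for gym's TimeLimit wrapper around the raw env."""
    def __init__(self, env):
        self.unwrapped = env

env = TimeLimitLike(FetchLike())
env.distance_threshold = 0.01            # lands on the wrapper object only
inner = env.unwrapped.distance_threshold  # still 0.05
env.unwrapped.distance_threshold = 0.01   # reaches the actual env
```

In short: setting the attribute on the outer object silently creates a new attribute on the wrapper, while the env's reward function keeps reading the old value.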
Thanks a lot! This project works well with my own robotic environment. But I am confused about `her.her_sampler.sample_her_transitions`, because it seems quite different from the "future" strategy as I understand it. ![Screenshot...
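For reference, here is a hedged sketch of the "future" strategy as described in the HER paper. This is my own minimal version, not the repo's exact `sample_her_transitions`; the array shapes and the `future_p` default are assumptions.

```python
import numpy as np

def sample_her_transitions(achieved_goals, goals, t_samples, T,
                           future_p=0.8, rng=np.random):
    """achieved_goals: (batch, T+1, goal_dim) achieved goals per timestep;
    goals: (batch, goal_dim) original desired goals;
    t_samples: (batch,) timesteps already drawn uniformly from [0, T)."""
    batch = len(t_samples)
    goals = goals.copy()
    # choose which transitions get relabelled
    her_mask = rng.uniform(size=batch) < future_p
    # draw a future timestep uniformly from (t, T] within the same episode
    future_offset = (rng.uniform(size=batch) * (T - t_samples)).astype(int) + 1
    future_t = t_samples + future_offset
    # replace the desired goal with the achieved goal at that future step
    goals[her_mask] = achieved_goals[np.arange(batch)[her_mask],
                                     future_t[her_mask]]
    return goals
```

With probability `future_p`, each sampled transition's goal is replaced by an achieved goal from a uniformly drawn later timestep of the same episode; the reward would then be recomputed against the new goal (omitted here).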
During training, is the experience data stored on the hard disk or in memory? If it's in memory, and the state space contains images, won't it fill up very quickly? --- a question from an RL beginner
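For what it's worth, in most DDPG+HER implementations the replay buffer is held in RAM as pre-allocated NumPy arrays. A back-of-envelope sketch with my own illustrative numbers (84x84 RGB uint8 frames and a 10^6-transition buffer are assumptions, not this repo's settings) shows why raw image states overwhelm memory:

```python
# Back-of-envelope memory estimate; all numbers are illustrative.
buffer_size = 10**6              # transitions kept in the replay buffer
img_bytes = 84 * 84 * 3          # one uint8 RGB observation
per_transition = 2 * img_bytes   # obs and next_obs both stored
total_gib = buffer_size * per_transition / 1024**3
print(f'{total_gib:.1f} GiB')    # tens of GiB for raw image transitions
```

Common workarounds are keeping frames as uint8, shrinking the buffer, or using disk-backed storage; the low-dimensional Fetch state vectors used here are tiny by comparison.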
Hi, Tianhong, thanks for sharing the code. I've tried to run your code following the guidance in the README: `mpirun -np 8 python -u train.py --env-name='FetchPush-v1' 2>&1 | tee...`
```
actor_loss = -self.critic_network(inputs_norm_tensor, actions_real).mean()
actor_loss += self.args.action_l2 * (actions_real / self.env_params['action_max']).pow(2).mean()
```
I think the output of critic_network is enough to be the actor_loss. So is it a regularizer...
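It does read as an L2 regularizer on the normalised actions, discouraging the actor from saturating at the action bounds. A toy NumPy illustration with made-up values, mirroring the `.pow(2).mean()` computation:

```python
import numpy as np

# Toy illustration of the action L2 term; all values are made up.
action_max = 1.0
action_l2 = 1.0
actions_real = np.array([[0.9, -1.0],
                         [0.1,  0.2]])
# same computation as (actions_real / action_max).pow(2).mean() in torch
penalty = action_l2 * ((actions_real / action_max) ** 2).mean()
# penalty is close to 0.465 -- larger when actions sit near the +/-1 limits
```

With only `-Q` as the loss, the actor often drives actions to the limits; the quadratic term trades a little return for smoother, smaller actions.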
Hello! I'm sorry to bother you again. I got a sub-optimal policy when running `mpirun -oversubscribe -np 16 python -u train.py --env-name='FetchPickAndPlace-v1' 2>&1 | tee pick.log`. When the...
Hi, when I ran `python demo.py --env-name='FetchReach-v1'` on an Ubuntu server, I hit a problem: `Creating window glfw ERROR: OpenGL version 1.5 or higher required. Press Enter to exit`...
To plot the training results I can simply use the mean value logged in the .log file. However, the standard deviation is not computed in `_eval_agent`. How did you get it?
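One common approach (my guess at the usual practice, not something this repo necessarily logs): keep the per-episode success flags from the evaluation rollouts instead of only their mean, then take mean and standard deviation over them.

```python
import numpy as np

def eval_stats(per_episode_success):
    """per_episode_success: 0/1 outcomes from the evaluation rollouts
    (here a hypothetical list; _eval_agent would collect these)."""
    arr = np.asarray(per_episode_success, dtype=float)
    return arr.mean(), arr.std()

mean, std = eval_stats([1, 1, 0, 1])  # toy outcomes from 4 eval episodes
```

Alternatively, papers often report the std across several training runs with different seeds rather than across episodes within one evaluation.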
Hi, I have a doubt. In this distributed RL setup, won't every process, because of OS scheduling, end up in a nearly identical state and do the same thing? So all processes will update...
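They generally do not collapse onto identical trajectories: MPI-based DDPG+HER code usually seeds each rank with a rank-dependent offset, so exploration noise and env resets differ per process even while gradients and normalizer statistics are averaged across ranks. A minimal sketch (`seed_for_rank` is my own hypothetical helper, not this repo's API):

```python
import numpy as np

def seed_for_rank(base_seed, rank):
    # hypothetical helper: rank-offset seeding so workers explore differently
    return base_seed + rank

# four simulated "ranks" drawing their own exploration noise
rngs = [np.random.default_rng(seed_for_rank(123, r)) for r in range(4)]
noises = [rng.normal(size=3) for rng in rngs]
# different seeds -> different noise, so trajectories diverge across ranks
```

The weights stay synchronised because every rank applies the same averaged gradient, but the data each rank collects is decorrelated by its own seed.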