hindsight-experience-replay
This is the PyTorch implementation of Hindsight Experience Replay (HER), with experiments on all Fetch robotic environments.
Hi, thanks for sharing the code. I'm wondering if I can train the DDPG agent in the HandManipulate envs, since they belong to the same robotics environment group.
In the Fetchxxx-Env, the env's `distance_threshold` defaults to 0.05, which determines whether a task counts as successfully completed. I tried to modify it using `env.distance_threshold = 0.01` (or other...
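A likely cause, sketched below with toy stand-in classes (no MuJoCo required; `FetchLike` and `TimeLimitLike` are hypothetical names): `gym.make` wraps the raw environment in a `TimeLimit` wrapper, and assigning an attribute on the wrapper object does not change the wrapped env, so the threshold should be set on `env.unwrapped`.

```python
# Toy stand-ins for the gym.make wrapper chain; class names are hypothetical.
class FetchLike:
    def __init__(self):
        self.distance_threshold = 0.05  # default success radius (metres)

class TimeLimitLike:
    """Stand-in for gym's TimeLimit wrapper around the raw env."""
    def __init__(self, env):
        self.unwrapped = env

env = TimeLimitLike(FetchLike())
env.distance_threshold = 0.01            # lands on the wrapper object only
inner = env.unwrapped.distance_threshold  # still 0.05
env.unwrapped.distance_threshold = 0.01   # reaches the actual env
```

In short: setting the attribute on the outer object silently creates a new attribute on the wrapper, while the env's reward function keeps reading the old value.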
Thanks a lot! This project works well with my own robotic environment. But I am confused about `her.her_sampler.sample_her_transitions`, because it seems quite different from the "future" strategy as I understand it. ![Screenshot...
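For reference, here is a hedged sketch of the "future" strategy as described in the HER paper. This is my own minimal version, not the repo's exact `sample_her_transitions`; the array shapes and the `future_p` default are assumptions.

```python
import numpy as np

def sample_her_transitions(achieved_goals, goals, t_samples, T,
                           future_p=0.8, rng=np.random):
    """achieved_goals: (batch, T+1, goal_dim) achieved goals per timestep;
    goals: (batch, goal_dim) original desired goals;
    t_samples: (batch,) timesteps already drawn uniformly from [0, T)."""
    batch = len(t_samples)
    goals = goals.copy()
    # choose which transitions get relabelled
    her_mask = rng.uniform(size=batch) < future_p
    # draw a future timestep uniformly from (t, T] within the same episode
    future_offset = (rng.uniform(size=batch) * (T - t_samples)).astype(int) + 1
    future_t = t_samples + future_offset
    # replace the desired goal with the achieved goal at that future step
    goals[her_mask] = achieved_goals[np.arange(batch)[her_mask],
                                     future_t[her_mask]]
    return goals
```

With probability `future_p`, each sampled transition's goal is replaced by an achieved goal from a uniformly drawn later timestep of the same episode; the reward would then be recomputed against the new goal (omitted here).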
During training, is the experience data stored on the hard disk or in memory? If it's in memory, and the state space contains images, won't it fill up very quickly? --- a question from an RL beginner
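For what it's worth, in most DDPG+HER implementations the replay buffer is held in RAM as pre-allocated NumPy arrays. A back-of-envelope sketch with my own illustrative numbers (84x84 RGB uint8 frames and a 10^6-transition buffer are assumptions, not this repo's settings) shows why raw image states overwhelm memory:

```python
# Back-of-envelope memory estimate; all numbers are illustrative.
buffer_size = 10**6              # transitions kept in the replay buffer
img_bytes = 84 * 84 * 3          # one uint8 RGB observation
per_transition = 2 * img_bytes   # obs and next_obs both stored
total_gib = buffer_size * per_transition / 1024**3
print(f'{total_gib:.1f} GiB')    # tens of GiB for raw image transitions
```

Common workarounds are keeping frames as uint8, shrinking the buffer, or using disk-backed storage; the low-dimensional Fetch state vectors used here are tiny by comparison.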
Hi, Tianhong, thanks for sharing the code. I've tried to run your code following the guidance in the README: `mpirun -np 8 python -u train.py --env-name='FetchPush-v1' 2>&1 | tee...`
```
actor_loss = -self.critic_network(inputs_norm_tensor, actions_real).mean()
actor_loss += self.args.action_l2 * (actions_real / self.env_params['action_max']).pow(2).mean()
```
I think the output of critic_network is enough to be the actor_loss. So is it a regularizer...
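It does read as an L2 regularizer on the normalised actions, discouraging the actor from saturating at the action bounds. A toy NumPy illustration with made-up values, mirroring the `.pow(2).mean()` computation:

```python
import numpy as np

# Toy illustration of the action L2 term; all values are made up.
action_max = 1.0
action_l2 = 1.0
actions_real = np.array([[0.9, -1.0],
                         [0.1,  0.2]])
# same computation as (actions_real / action_max).pow(2).mean() in torch
penalty = action_l2 * ((actions_real / action_max) ** 2).mean()
# penalty is close to 0.465 -- larger when actions sit near the +/-1 limits
```

With only `-Q` as the loss, the actor often drives actions to the limits; the quadratic term trades a little return for smoother, smaller actions.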
Hello! I'm sorry to bother you again. I got a sub-optimal policy when running `mpirun -oversubscribe -np 16 python -u train.py --env-name='FetchPickAndPlace-v1' 2>&1 | tee pick.log`. When the...
Hi, when I ran `python demo.py --env-name='FetchReach-v1'` on an Ubuntu server, I hit a problem: `Creating window glfw ERROR: OpenGL version 1.5 or higher required. Press Enter to exit`...
To plot the training results I can simply use the mean value logged in the .log file. However, the standard deviation is not computed in `_eval_agent`. How did you get it?
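One common approach (my guess at the usual practice, not something this repo necessarily logs): keep the per-episode success flags from the evaluation rollouts instead of only their mean, then take mean and standard deviation over them.

```python
import numpy as np

def eval_stats(per_episode_success):
    """per_episode_success: 0/1 outcomes from the evaluation rollouts
    (here a hypothetical list; _eval_agent would collect these)."""
    arr = np.asarray(per_episode_success, dtype=float)
    return arr.mean(), arr.std()

mean, std = eval_stats([1, 1, 0, 1])  # toy outcomes from 4 eval episodes
```

Alternatively, papers often report the std across several training runs with different seeds rather than across episodes within one evaluation.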
Hi, I have a doubt. In this distributed RL setup, won't every process, because of OS scheduling, end up in a nearly identical state and do the same thing? So all processes will update...
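They generally do not collapse onto identical trajectories: MPI-based DDPG+HER code usually seeds each rank with a rank-dependent offset, so exploration noise and env resets differ per process even while gradients and normalizer statistics are averaged across ranks. A minimal sketch (`seed_for_rank` is my own hypothetical helper, not this repo's API):

```python
import numpy as np

def seed_for_rank(base_seed, rank):
    # hypothetical helper: rank-offset seeding so workers explore differently
    return base_seed + rank

# four simulated "ranks" drawing their own exploration noise
rngs = [np.random.default_rng(seed_for_rank(123, r)) for r in range(4)]
noises = [rng.normal(size=3) for rng in rngs]
# different seeds -> different noise, so trajectories diverge across ranks
```

The weights stay synchronised because every rank applies the same averaged gradient, but the data each rank collects is decorrelated by its own seed.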