Alireza Kazemipour
Hey! I noticed that the Prioritized_Replay_DQN code does not work: as written, it behaves as vanilla DQN, and if you force the condition `if self.prioritized` to `if True`, the following error appears:...
Hi, first of all, thank you very much for your easy-to-follow implementation! It is very intuitive and simple. :+1: My question is about your use of LSTMCell to implement the recurrent version...
According to the DQN Nature paper and the [PPO1 implementation](https://github.com/openai/baselines/blob/ea25b9e8b234e6ee1bca43083f8f3cf974143998/baselines/ppo1/cnn_policy.py#L30), [this line](https://github.com/openai/random-network-distillation/blob/f75c0f1efa473d5109d487062fd8ed49ddce6634/policies/cnn_policy_param_matched.py#L104): ```python X = activ(conv(X, 'c3', nf=64, rf=4, stride=1, init_scale=np.sqrt(2), data_format=data_format)) ``` should be changed to: ```python X = activ(conv(X,...
Hi, is there a way to make sure that changes to the main repo's code introduced by a PR do not disrupt the functionality of the project? Can you add...
Hi. Could you please explain how you came up with the changes that you applied to the reward signal? Specifically, I mean this...