ReinforcementLearning.jl
R2D2
Implementing the recurrent and distributed RL algorithm R2D2 (https://openreview.net/pdf?id=r1lyTjAqYX).
- In Prioritized_DQN, where are the new states, rewards, and terminals extracted from the env?
- Where is the batch sampled from the replay buffer?
- To implement multiple actors for distributed DQN, should I use a structure similar to the multi-threaded envs used for A2C?
- https://github.com/JuliaReinforcementLearning/ReinforcementLearningCore.jl/blob/df518b60103277adfb0a49f30902b96dcc16b9c2/src/components/agents/agent.jl#L103-L116 (a self-contained sketch of this pattern follows after this list)
- https://github.com/JuliaReinforcementLearning/ReinforcementLearningZoo.jl/blob/e28836194fcd8256d2bb28d64f8bee0a788f2e69/src/algorithms/dqns/common.jl#L7 (see the sampling sketch below)
- No. There will be a general implementation provided in https://github.com/JuliaReinforcementLearning/DistributedReinforcementLearning.jl. It will be very different from the MultiThreadEnv used in A2C; in short, it will be a combination of rllib and acme. The goal is to make it easy to scale an experiment from one machine to a large cluster of machines. I'm still working on it, so please be patient. I'd suggest focusing on the recurrent part first (see the last sketch below); the distributed part will be supported out of the box later.
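To the first question: the pattern in the linked agent.jl lines is that the agent acts on the env and then reads the new state, reward, and terminal flag back from the env before pushing the transition into its trajectory. Below is a minimal, self-contained sketch of that pattern; `ToyEnv`, `act!`, `observe`, and `run_episode!` are illustrative names, not the actual RLCore API.

```julia
# Toy sketch (hypothetical names, not the actual RLCore API) of the pattern
# in the linked agent.jl lines: after acting, the env is queried again for
# the new state, reward, and terminal flag, and the transition is stored.
mutable struct ToyEnv
    pos::Int
end
act!(env::ToyEnv, a) = (env.pos += a)            # advance the env one step
observe(env::ToyEnv) = (state = env.pos,
                        reward = Float64(env.pos == 5),
                        terminal = env.pos >= 5)

function run_episode!(policy, env, buffer)
    s = observe(env).state
    while true
        a = policy(s)
        act!(env, a)
        # This is the step the question asks about: the new state, reward,
        # and terminal are read back from the env *after* acting, then pushed
        # into the trajectory buffer.
        o = observe(env)
        push!(buffer, (state = s, action = a, reward = o.reward,
                       terminal = o.terminal, next_state = o.state))
        o.terminal && break
        s = o.state
    end
    return buffer
end

buffer = NamedTuple[]
run_episode!(s -> 1, ToyEnv(0), buffer)   # always step right until pos == 5
```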
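To the second question: the batch is sampled inside the `update!` path reachable from the common.jl line linked above, once the trajectory holds enough transitions. Here is a hypothetical, simplified version of prioritized sampling with importance-sampling correction; `sample_batch` and its arguments are illustrative, not the Zoo's actual API.

```julia
using StatsBase: sample, Weights

# Simplified prioritized sampling: transitions are drawn with probability
# proportional to their priority (e.g. |TD-error|), and importance-sampling
# weights correct the bias introduced by non-uniform sampling.
function sample_batch(buffer::Vector, priorities::Vector{Float64},
                      batch_size::Int; β = 0.4)
    probs = priorities ./ sum(priorities)
    inds  = sample(1:length(buffer), Weights(probs), batch_size; replace = true)
    w = (length(buffer) .* probs[inds]) .^ (-β)
    w ./= maximum(w)                      # normalize weights for stability
    return buffer[inds], inds, w
end

buffer     = collect(1:100)               # stand-in transitions
priorities = rand(100) .+ 1e-3            # e.g. |TD-error| + small constant
batch, inds, w = sample_batch(buffer, priorities, 32)
```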
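For the recurrent part the reply suggests starting with, here is a sketch of the core idea from the R2D2 paper (not existing Zoo code): a stateful LSTM Q-network is evaluated over a stored sequence, with the first few observations used only to warm up the hidden state ("burn-in"). It assumes a Flux version with stateful recurrent layers (`LSTM(in, out)`, `Flux.reset!`); all sizes are arbitrary.

```julia
using Flux

# LSTM-based Q-network: 4 observation dims, 2 actions (arbitrary sizes).
q_net = Chain(Dense(4, 32, relu), LSTM(32, 32), Dense(32, 2))

function q_values(q_net, obs_seq; burnin = 2)
    Flux.reset!(q_net)                         # start from a zero hidden state
    foreach(o -> q_net(o), obs_seq[1:burnin])  # burn-in: outputs discarded
    return [q_net(o) for o in obs_seq[burnin+1:end]]  # Q-values for training
end

obs_seq = [rand(Float32, 4) for _ in 1:10]     # toy length-10 sequence
qs = q_values(q_net, obs_seq)                  # 8 vectors of 2 Q-values each
```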