ReinforcementLearning.jl
Flux as a service
In the current design of distributed RL, each worker creates an independent model and makes predictions separately. A better solution might be for workers on the same node to share some common models. The potential benefits are (see the sketch after this list):
- Fewer models to update once the LoadParamsMsg is received
- Batch evaluation (on GPU) will be much faster
- This module would be useful for general Flux models (especially MCTS- and DeepCFR-related algorithms in RL)
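Below is a minimal sketch of what such a shared-model service could look like, assuming one serving task that collects prediction requests from worker tasks over a `Channel`, evaluates them in a single batched forward pass, and sends each result back. All names here (`ModelService`, `Request`, `predict`, `serve!`) are hypothetical and not part of ReinforcementLearning.jl or Flux.

```julia
using Flux

# A worker's request: an observation plus a one-slot channel for the reply.
struct Request
    obs::Vector{Float32}
    reply::Channel{Vector{Float32}}
end

# One model shared by all workers on the node, fed through a request queue.
struct ModelService
    model::Chain
    queue::Channel{Request}
end

ModelService(model; buffer = 1024) = ModelService(model, Channel{Request}(buffer))

# Blocking call a worker uses instead of owning a private model copy.
function predict(service::ModelService, obs::Vector{Float32})
    reply = Channel{Vector{Float32}}(1)
    put!(service.queue, Request(obs, reply))
    take!(reply)
end

# Drain up to `max_batch` pending requests, run one batched forward pass,
# and send each output column back to its requester.
function serve!(service::ModelService; max_batch = 32)
    while true
        batch = Request[take!(service.queue)]  # block until at least one request
        while isready(service.queue) && length(batch) < max_batch
            push!(batch, take!(service.queue))
        end
        xs = reduce(hcat, [r.obs for r in batch])  # obs_dim × batch_size
        ys = service.model(xs)                     # single batched evaluation
        for (i, r) in enumerate(batch)
            put!(r.reply, ys[:, i])
        end
    end
end

# Usage: many worker tasks share one service on the same node.
service = ModelService(Chain(Dense(4 => 32, relu), Dense(32 => 2)))
@async serve!(service)
tasks = [@async predict(service, rand(Float32, 4)) for _ in 1:100]
results = fetch.(tasks)
```

The point of this design is that the forward pass runs once per batch rather than once per request, which is where the GPU speedup mentioned in the second bullet would come from; it also means only the single `service.model` needs updating when new parameters arrive.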
It's a good idea.
There's an implementation called GA3C:
- Code: https://github.com/NVlabs/GA3C
- Paper:
  - https://arxiv.org/abs/1611.06256
  - https://on-demand.gputechconf.com/gtc/2017/presentation/s7169-Iuki-Frosio-a3C-for-deep-reinforcement-learning.pdf