
Flux as service

findmyway opened this issue 5 years ago • 1 comment

In the current design of distributed RL, each worker creates an independent model and makes predictions separately. A better solution might be for workers on the same node to share some common models. The potential benefits are:

  • Fewer models to update once a LoadParamsMsg is received
  • Batch evaluation (on GPU) will be much faster
  • Such a module would be useful for general Flux models (especially MCTS- and DeepCFR-related algorithms in RL)
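The shared-model idea could be sketched as a small "model service": workers submit observations through a `Channel`, and a single evaluator task drains pending requests, runs one batched forward pass on the shared model, and routes the outputs back. The names below (`ModelService`, `predict`, `Request`) are illustrative, not part of ReinforcementLearning.jl; the model is any callable (a plain closure here, but a Flux model works the same way).

```julia
# Hypothetical sketch of a per-node shared model service.
struct Request
    obs::Vector{Float32}
    resp::Channel{Vector{Float32}}   # per-request reply channel
end

struct ModelService
    model                             # any callable: features × batch -> outputs × batch
    queue::Channel{Request}
end

function ModelService(model; capacity = 1024, max_batch = 32)
    queue = Channel{Request}(capacity)
    @async begin
        while isopen(queue)
            # block for the first request, then drain whatever else
            # is already waiting, up to max_batch
            reqs = Request[take!(queue)]
            while isready(queue) && length(reqs) < max_batch
                push!(reqs, take!(queue))
            end
            batch = reduce(hcat, (r.obs for r in reqs))   # features × batch
            out = model(batch)                            # one batched forward pass
            for (i, r) in enumerate(reqs)
                put!(r.resp, out[:, i])
            end
        end
    end
    ModelService(model, queue)
end

# Called from each worker task; blocks until the evaluator answers.
function predict(s::ModelService, obs::Vector{Float32})
    resp = Channel{Vector{Float32}}(1)
    put!(s.queue, Request(obs, resp))
    take!(resp)
end
```

With this layout, a `LoadParamsMsg` only needs to update the one model held by the service, and concurrent requests from workers on the same node are naturally batched for GPU evaluation.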

findmyway avatar Nov 24 '20 02:11 findmyway

It's a good idea.

There's an existing implementation of this idea, GA3C, which centralizes prediction and training in dedicated GPU threads that batch requests from many agents:

code

  • https://github.com/NVlabs/GA3C

paper

  • https://arxiv.org/abs/1611.06256
  • https://on-demand.gputechconf.com/gtc/2017/presentation/s7169-Iuki-Frosio-a3C-for-deep-reinforcement-learning.pdf

norci avatar Dec 12 '20 14:12 norci