
rl-agents compatible with continuous action spaces

SHITIANYU-hue opened this issue 3 years ago · 3 comments

I am wondering: is the cross-entropy method the only one that is compatible with continuous action spaces?

I tried the CEM agent, but I found it runs very slowly (the animation update is very slow). How could I increase the running speed? Thanks 😁

SHITIANYU-hue avatar Oct 13 '20 04:10 SHITIANYU-hue

Hi, yes, unfortunately CEM is the only implemented method that handles continuous actions, since my own work focuses on discrete actions. To increase the running speed, the first step would be to use parallel computing: CEM is very easy to parallelize, so that should help if you have many CPUs available. Another possibility is to use the GPU, see e.g. this tutorial, in which a forward dynamical model is trained in pytorch. The advantage is that this model can then be used to roll out 100 trajectories in parallel, in a single GPU forward pass, which is much faster (see the CEM reimplementation at the bottom of the notebook).
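For illustration, here is a minimal sketch of such a batched CEM planner. Everything in it is hypothetical: the `dynamics` and `reward` callables stand in for a learned forward model and reward function (both assumed to operate on batched tensors), and none of the names come from the rl-agents API.

```python
import torch

def cem_plan(state, dynamics, reward, horizon=10, n_candidates=100,
             n_elites=10, n_iterations=5, action_dim=2, device="cuda"):
    """Plan an action sequence with the cross-entropy method, batched on GPU.

    Illustrative sketch: `dynamics(states, actions) -> next_states` and
    `reward(states, actions) -> rewards` are assumed batched PyTorch callables.
    """
    # Sampling distribution over action sequences, assuming actions roughly in [-1, 1]
    mean = torch.zeros(horizon, action_dim, device=device)
    std = torch.ones(horizon, action_dim, device=device)
    for _ in range(n_iterations):
        # Sample candidate action sequences: (n_candidates, horizon, action_dim)
        actions = mean + std * torch.randn(n_candidates, horizon, action_dim,
                                           device=device)
        # Roll out all candidates in parallel through the learned model
        states = state.to(device).expand(n_candidates, -1)
        returns = torch.zeros(n_candidates, device=device)
        for t in range(horizon):
            returns += reward(states, actions[:, t])
            states = dynamics(states, actions[:, t])
        # Refit the sampling distribution on the elite candidates
        elite_idx = returns.topk(n_elites).indices
        mean = actions[elite_idx].mean(dim=0)
        std = actions[elite_idx].std(dim=0)
    return mean[0]  # first action of the planned sequence
```

The key point is that the inner rollout touches all candidates at once, so each planning iteration costs only `horizon` forward passes regardless of how many trajectories are sampled.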

Another possibility is to use policy gradient algorithms (DDPG, PPO, etc.) that handle continuous actions, from other libraries like stable baselines (example).
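As a rough illustration, a minimal stable-baselines3 DDPG setup on highway-env might look like the sketch below. The `ContinuousAction` configuration and all hyperparameters here are assumptions for the example, not recommended settings.

```python
import gym
import highway_env  # assumed installed; registers the highway-v0 environment
import numpy as np
from stable_baselines3 import DDPG
from stable_baselines3.common.noise import NormalActionNoise

# Switch the environment to continuous actions (assumption: highway-env's
# documented "ContinuousAction" action type).
env = gym.make("highway-v0")
env.configure({"action": {"type": "ContinuousAction"}})
env.reset()

# Exploration noise on the continuous action dimensions
n_actions = env.action_space.shape[-1]
action_noise = NormalActionNoise(mean=np.zeros(n_actions),
                                 sigma=0.1 * np.ones(n_actions))

model = DDPG("MlpPolicy", env, action_noise=action_noise, verbose=1)
model.learn(total_timesteps=100_000)
```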

Finally, I would like to try implementing tree-based planning algorithms with continuous actions (which are more sample-efficient than CEM), such as MCTS with progressive widening or SOPC, but I won't have the time in the next few weeks.

eleurent avatar Oct 14 '20 09:10 eleurent

Thanks for your reply and valuable suggestions!

SHITIANYU-hue avatar Oct 14 '20 14:10 SHITIANYU-hue

Hello, I tried stable baselines with the DDPG algorithm, but I found that the agent can't learn a reasonable policy: it just drives around in a circle. Here is the code (I just use their package directly):

[screenshot of the training script; not recoverable]
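The screenshot itself is lost; for reference, a circling policy like the one described would typically be observed with a hypothetical evaluation loop along these lines, reusing `model` and `env` from the sketch above:

```python
# Illustrative rollout to watch the trained agent; `model` and `env`
# come from the hedged stable-baselines3 sketch earlier in the thread.
obs = env.reset()
done = False
while not done:
    action, _states = model.predict(obs, deterministic=True)
    obs, reward, done, info = env.step(action)
    env.render()
```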

SHITIANYU-hue avatar Oct 26 '20 04:10 SHITIANYU-hue