DQN implementation that supports continuous action spaces (NAF)
I would like to modify DQN.py so that it works with a continuous action space (`spaces.Box` from the Gym library). This looks like a huge project to me, and I would welcome any advice or ideas that could help me better understand how stable-baselines is built.
Hello, the paper you are looking for is probably Continuous Deep Q-Learning with Model-based Acceleration, which introduces the Normalized Advantage Function (NAF); see the notes in Keras-RL.
You should also know that DDPG was designed to be DQN with continuous actions.
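To make the NAF idea concrete: it keeps Q-learning tractable with continuous actions by parameterizing Q(s, a) = V(s) + A(s, a), where the advantage is a quadratic A(s, a) = -1/2 (a - mu(s))^T P(s) (a - mu(s)), so the greedy action is just mu(s) in closed form. Here is a minimal NumPy sketch of that computation (the function name and the toy values of `mu`, `L`, and `v` are illustrative; in a real implementation they would come from network heads):

```python
import numpy as np

def naf_q_value(a, mu, L, v):
    """Q(s, a) = V(s) + A(s, a) under the NAF parameterization.

    A(s, a) = -0.5 * (a - mu)^T P (a - mu), with P = L @ L.T positive
    semi-definite (L lower-triangular), so argmax_a Q(s, a) = mu.
    """
    P = L @ L.T          # build the precision-like matrix from its factor
    diff = a - mu
    advantage = -0.5 * diff @ P @ diff
    return v + advantage

# Toy example (values are made up for illustration):
mu = np.array([0.2, -0.1])        # predicted greedy action mu(s)
L = np.array([[1.0, 0.0],
              [0.3, 0.5]])        # Cholesky-style factor of P(s)
v = 1.5                           # state value V(s)

print(naf_q_value(mu, mu, L, v))                    # at a = mu, A = 0, so Q = 1.5
print(naf_q_value(np.array([1.0, 1.0]), mu, L, v))  # any other action scores lower
```

The key design point is that, unlike DDPG (which needs a separate actor to find argmax_a Q), the maximizing action falls out of the quadratic form for free.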
> any advice / ideas that could help me better understand how stable-baselines is built.
Well, for now: read the source code, read the paper (several times), and read some implementations that can already be found on GitHub ;) You can find some more advice here.
Thank you for your feedback @araffin, I will check it soon and let you know how it goes.