fast-reinforcement-learning
DDPG / HER
Discussion
Now that PER and many of the basic DQNs have been added, we move on to DDPG. Primary goals:
- [x] DDPG algorithm for continuous envs
- [x] A way to get DDPG to work in a Discrete env like the maze for debugging.
- [ ] HER: Hindsight Experience Replay is useful for speeding up learning and improving exploration. This will possibly involve some considerations regarding data bunches and the MDPSlices.
- [x] Once these are done, is there a way to unify this under a single fit function?
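
For the open HER item, here is a minimal sketch of the relabeling step using the "final" strategy, where every transition of an episode is stored a second time with the goal replaced by the state the episode actually ended in. All names here (`Transition`, `sparse_reward`, `her_relabel_final`) are hypothetical and written in plain Python for illustration. They are not part of this library's API, and how this would map onto data bunches and MDPSlices is left open.

```python
from typing import Callable, List, NamedTuple, Tuple

State = Tuple[int, int]  # assumption: a maze-style (row, col) position


class Transition(NamedTuple):
    """Hypothetical goal-conditioned transition, not a library type."""
    state: State
    action: int
    reward: float
    next_state: State
    goal: State
    done: bool


def sparse_reward(achieved: State, goal: State) -> float:
    # Assumed sparse reward: 0 when the goal is reached, -1 otherwise.
    return 0.0 if achieved == goal else -1.0


def her_relabel_final(
    episode: List[Transition],
    reward_fn: Callable[[State, State], float] = sparse_reward,
) -> List[Transition]:
    """Return copies of the episode with the goal swapped for the final achieved state."""
    if not episode:
        return []
    new_goal = episode[-1].next_state
    relabeled = []
    for t in episode:
        reached = t.next_state == new_goal
        relabeled.append(
            t._replace(
                goal=new_goal,
                reward=reward_fn(t.next_state, new_goal),
                done=reached,
            )
        )
    return relabeled


# A failed episode: the agent never reaches the original goal (3, 3).
episode = [
    Transition((0, 0), 1, -1.0, (0, 1), (3, 3), False),
    Transition((0, 1), 1, -1.0, (0, 2), (3, 3), False),
    Transition((0, 2), 0, -1.0, (1, 2), (3, 3), False),
]

# Store both the original and the relabeled transitions in the replay buffer.
replay_buffer = episode + her_relabel_final(episode)
```

The relabeled copies turn a failed episode into a successful one for the substituted goal, so the agent sees a non-negative reward even when the original goal was never reached. An off-policy learner such as DDPG can train on both sets of transitions, provided the policy and critic take the goal as part of their input.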