Policy-Gradient-Methods
Implementation of Algorithms from the Policy Gradient Family. Currently includes: A2C, A3C, DDPG, TD3, SAC
Results: 2
Policy-Gradient-Methods issues
I see that you write self.noise = OUNoise...., but you never add the noise to the action?
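For context on this question, here is a minimal sketch of how exploration noise is usually added to a deterministic policy's action in DDPG-style agents. It is not the repository's code: OUNoiseSketch, noisy_action, the default theta and sigma values, and the [-1, 1] action bounds are all assumptions made for illustration.

```python
import random


class OUNoiseSketch:
    """Hypothetical Ornstein-Uhlenbeck process (not the repository's OUNoise)."""

    def __init__(self, dim, mu=0.0, theta=0.15, sigma=0.2, seed=None):
        self.dim = dim
        self.mu = mu          # long-run mean the process reverts to
        self.theta = theta    # strength of the pull back toward mu
        self.sigma = sigma    # scale of the Gaussian increments
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        # Start every episode from the mean.
        self.state = [self.mu] * self.dim

    def sample(self):
        # x <- x + theta * (mu - x) + sigma * N(0, 1), per dimension.
        self.state = [
            x + self.theta * (self.mu - x) + self.sigma * self.rng.gauss(0.0, 1.0)
            for x in self.state
        ]
        return list(self.state)


def noisy_action(action, noise, low=-1.0, high=1.0):
    """Add one noise sample to the policy's action, then clip to the bounds."""
    perturbation = noise.sample()
    return [min(high, max(low, a + n)) for a, n in zip(action, perturbation)]


# Usage: perturb the actor's output before stepping the environment.
noise = OUNoiseSketch(dim=2, seed=0)
exploratory = noisy_action([0.5, -0.5], noise)
```

If the noise object is constructed but sample() is never called in the action-selection path, the agent acts greedily during training, which is the situation the issue describes.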
Could you give a reference to the paper explaining why you chose to use two soft-Q networks? They work independently and you take the minimum of both while...
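To make the question concrete, here is a minimal sketch of how the minimum of two Q estimates typically enters a SAC-style target, a trick often called clipped double-Q learning and also used by TD3. The function name soft_q_target, its scalar inputs, and the default gamma and alpha values are assumptions for illustration, not the repository's implementation.

```python
def soft_q_target(reward, done, q1_next, q2_next, log_prob_next,
                  gamma=0.99, alpha=0.2):
    """Hypothetical scalar version of a SAC-style Bellman target.

    q1_next and q2_next are the two target critics' estimates for the next
    state-action pair; log_prob_next is the policy's log-probability of the
    sampled next action.
    """
    # Take the more pessimistic of the two estimates to limit overestimation.
    min_q = min(q1_next, q2_next)
    # Entropy-regularised ("soft") value of the next state.
    soft_value = min_q - alpha * log_prob_next
    # No bootstrapping past a terminal transition.
    return reward + gamma * (1.0 - float(done)) * soft_value


# Usage: both critics would be regressed toward this same target.
target = soft_q_target(reward=1.0, done=False, q1_next=2.0, q2_next=3.0,
                       log_prob_next=-1.0, gamma=0.5, alpha=0.2)
```

The two networks are trained independently but share the target, so whichever estimate is lower at a given next state bounds the value that both learn from.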