Policy-Gradient-Methods

Implementations of algorithms from the policy gradient family. Currently includes: A2C, A3C, DDPG, TD3, and SAC.

2 Policy-Gradient-Methods issues

I see that you write: self.noise = OUNoise.... but you never add the noise to the action?
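For context on what this issue is asking for, here is a minimal sketch of how Ornstein-Uhlenbeck exploration noise is typically added to a deterministic action in DDPG-style agents. It is not the repository's code: the `OUNoise` class below is a hypothetical stand-in written for illustration, and `noisy_action`, the parameter defaults, and the [-1, 1] action bounds are all assumptions.

```python
import random


class OUNoise:
    """Hypothetical minimal Ornstein-Uhlenbeck process (not the repo's class)."""

    def __init__(self, size, mu=0.0, theta=0.15, sigma=0.2, seed=None):
        self.size = size
        self.mu = mu
        self.theta = theta
        self.sigma = sigma
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        # Restart the process at its long-run mean.
        self.state = [self.mu] * self.size

    def sample(self):
        # Discrete update per dimension: x <- x + theta * (mu - x) + sigma * N(0, 1)
        self.state = [
            x + self.theta * (self.mu - x) + self.sigma * self.rng.gauss(0.0, 1.0)
            for x in self.state
        ]
        return list(self.state)


def noisy_action(policy_action, noise, low=-1.0, high=1.0):
    """Add exploration noise to the policy's action, then clip to the bounds."""
    sample = noise.sample()
    return [min(high, max(low, a + n)) for a, n in zip(policy_action, sample)]


noise = OUNoise(size=2, seed=0)
action = noisy_action([0.5, -0.5], noise)  # use this action in env.step during training
```

The point of the sketch is the last step: creating the noise object alone does nothing unless its sample is added to the action that is sent to the environment.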

Could you give a reference to the paper explaining why you chose to use two soft-Q networks? They work independently, and you take the minimum of both while...
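As background for this question, taking the minimum of two independently trained Q estimates when forming the bootstrap target is the trick commonly called clipped double-Q learning, used to reduce overestimation bias. The sketch below shows the target on plain floats. The function name, arguments, and defaults are hypothetical and not taken from the repository; the `alpha * next_log_prob` entropy term is the usual soft (SAC-style) variant, and setting `alpha=0.0` gives the TD3-style target.

```python
def clipped_double_q_target(reward, done, q1_next, q2_next,
                            gamma=0.99, alpha=0.0, next_log_prob=0.0):
    """Bootstrap target using the smaller of two next-state Q estimates.

    All names here are illustrative assumptions, not the repo's API.
    """
    # The pessimistic estimate counters overestimation from a single critic.
    min_q = min(q1_next, q2_next)
    # Soft value: subtract the entropy term (no effect when alpha == 0).
    soft_value = min_q - alpha * next_log_prob
    # No bootstrapping past a terminal transition.
    return reward + gamma * (1.0 - float(done)) * soft_value


target = clipped_double_q_target(reward=1.0, done=False, q1_next=2.0, q2_next=3.0, gamma=0.5)
```

Both critics are regressed toward this same target, so they stay independent estimators while sharing a single pessimistic bootstrap value.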