maddpg

Q divergence

Open rbrigden opened this issue 6 years ago • 0 comments

Hello! I am working to implement MADDPG in PyTorch based on the details of this TensorFlow implementation. I have followed the implementation to a tee, but when I remove the regularization on the policy logits, my Q values diverge. When I remove the same regularization term in your implementation, this does not occur. Did you experience this divergence issue? Was it a matter of tuning to fix, or does this indicate an issue with my implementation? Thank you.
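For context, the kind of term being discussed can be sketched in PyTorch as below. This is a minimal illustration and not code from either implementation: the function name `actor_loss`, the coefficient `reg_coef=1e-3`, and the toy tensor shapes are all assumptions made for the example.

```python
import torch


def actor_loss(q_values: torch.Tensor,
               logits: torch.Tensor,
               reg_coef: float = 1e-3) -> torch.Tensor:
    """Policy loss plus an L2 penalty on the raw (pre-softmax) policy logits.

    q_values: critic estimates for the actor's actions, shape (batch, 1)
    logits:   unnormalized policy outputs, shape (batch, n_actions)
    reg_coef: penalty weight (hypothetical value; 0.0 disables the term)
    """
    pg_loss = -q_values.mean()        # ascend the critic's estimate
    logit_reg = logits.pow(2).mean()  # mean squared logit
    return pg_loss + reg_coef * logit_reg


# Toy batch of two samples with two actions each (made-up numbers).
logits = torch.tensor([[2.0, -1.0], [0.5, 0.0]], requires_grad=True)
q_values = torch.tensor([[1.0], [3.0]])

loss = actor_loss(q_values, logits)
loss.backward()  # here only the penalty contributes to logits.grad
```

Calling `actor_loss(..., reg_coef=0.0)` corresponds to the "regularization removed" case described above. In the toy example the Q values are constants, so the gradient on `logits` comes only from the penalty; in a real actor update the Q values would also depend on the logits through the critic.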

rbrigden, Jun 05 '18 22:06