maddpg
NaN
After a sufficient number of steps, I get an array full of NaNs when running your sample script. Have you seen this before?
Action: [array([ nan, nan, nan, nan, nan]), array([ nan, nan, nan, nan, nan]), array([ nan, nan, nan, nan, nan]), array([ nan, nan, nan, nan, nan])]
Yes, that's the list of actions going to each agent, which I printed out for debugging purposes. I have seen the actions all go to NaN values, and I haven't yet found a reason for (or solution to) it. If you do, please update this thread!
I determined that something is going wrong numerically with the relu activations.
Change all instances of relu to tanh or sigmoid for the hidden-layer activations and you won't see that error any longer.
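For reference, here is a minimal sketch of what that change could look like, assuming the network is defined roughly like the default `mlp_model` in the reference MADDPG implementation (the exact file and signature in your copy may differ):

```python
import tensorflow as tf
import tensorflow.contrib.layers as layers

def mlp_model(input, num_outputs, scope, reuse=False, num_units=64):
    # Sketch: same two-hidden-layer MLP structure, but with tanh hidden
    # activations instead of relu. tanh is bounded, so the hidden outputs
    # cannot grow without limit and blow up into NaNs.
    with tf.variable_scope(scope, reuse=reuse):
        out = input
        out = layers.fully_connected(out, num_outputs=num_units, activation_fn=tf.nn.tanh)
        out = layers.fully_connected(out, num_outputs=num_units, activation_fn=tf.nn.tanh)
        out = layers.fully_connected(out, num_outputs=num_outputs, activation_fn=None)
        return out
```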
Did you try using relu together with batch normalization?
It looks like the NaNs have disappeared now.
@farice Changing relu to tanh does not improve the situation for me; rewards still go to NaN after a while. Did you make any other changes as well?
After changing relu to tanh, training has been running for almost 3 days and NaNs have not appeared (in either the actions or the rewards).
It also seems to work when I keep relu, but only after adding batch normalization.
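In case it helps anyone else, a minimal sketch of the relu + batch normalization variant, again assuming a model definition similar to the default `mlp_model` (the `normalizer_fn` hook is from `tf.contrib.layers`; your setup may wire batch norm differently):

```python
import tensorflow as tf
import tensorflow.contrib.layers as layers

def mlp_model(input, num_outputs, scope, reuse=False, num_units=64):
    # Sketch: keep relu activations but normalize each hidden layer with
    # batch norm, which keeps the pre-activations in a range where relu
    # outputs do not explode during training.
    with tf.variable_scope(scope, reuse=reuse):
        out = input
        out = layers.fully_connected(out, num_outputs=num_units,
                                     activation_fn=tf.nn.relu,
                                     normalizer_fn=layers.batch_norm)
        out = layers.fully_connected(out, num_outputs=num_units,
                                     activation_fn=tf.nn.relu,
                                     normalizer_fn=layers.batch_norm)
        out = layers.fully_connected(out, num_outputs=num_outputs, activation_fn=None)
        return out
```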