NaN

Open farice opened this issue 7 years ago • 5 comments

After a sufficient number of steps, I get arrays full of NaNs when running your sample script. Have you seen this before?

Action: [array([ nan,  nan,  nan,  nan,  nan]), array([ nan,  nan,  nan,  nan,  nan]), array([ nan,  nan,  nan,  nan,  nan]), array([ nan,  nan,  nan,  nan,  nan])] 

farice avatar Dec 20 '17 02:12 farice

Yes, that's a list of the actions going to each agent, which I printed out for debugging purposes. I have seen the actions all go to NaN values, and haven't yet found a reason (or solution) to that. If you do, please update this thread!

agakshat avatar Dec 20 '17 17:12 agakshat
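To narrow down the step at which the actions first go to NaN, a small check on the printed action list can help. This is a minimal sketch and not code from the repository: the helper name `agents_with_nonfinite_actions` and the example values are hypothetical, and it assumes the actions are a list with one sequence of floats per agent, as in the printout above.

```python
import math


def agents_with_nonfinite_actions(actions):
    """Return indices of agents whose action vector contains NaN or inf."""
    bad = []
    for agent_index, action in enumerate(actions):
        # float() lets this work for plain Python floats and array scalars alike.
        if any(not math.isfinite(float(value)) for value in action):
            bad.append(agent_index)
    return bad


# Hypothetical per-agent actions: agent 1 has already diverged.
example_actions = [
    [0.1, -0.3, 0.0, 0.7, 0.2],
    [float("nan")] * 5,
]
bad_agents = agents_with_nonfinite_actions(example_actions)
```

Calling a check like this on every step and stopping at the first non-empty result gives the earliest step at which an agent diverged, which is usually easier to debug than a run that is already full of NaNs.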

I determined that something is going wrong with the numerical computation of relu.

Replace every relu activation in the hidden layers with tanh or sigmoid and you won't see that error any longer.

farice avatar Dec 20 '17 17:12 farice
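One possible reason a tanh swap can help is that tanh is bounded while relu is not. The toy example below is only an illustration under that assumption, not a diagnosis of this issue: the scalar "network", the exaggerated weight of `1e100`, and the names `forward`, `relu_out`, and `tanh_out` are all hypothetical.

```python
import math


def relu(x):
    # Unbounded above: large inputs pass through unchanged.
    return x if x > 0.0 else 0.0


def forward(x, weight, depth, activation):
    """Apply `depth` scalar layers of the form activation(weight * x)."""
    for _ in range(depth):
        x = activation(weight * x)
    return x


# With relu the value grows each layer until it overflows to inf.
relu_a = forward(1.0, 1e100, 4, relu)
relu_b = forward(2.0, 1e100, 4, relu)
relu_out = relu_a - relu_b  # inf - inf is NaN

# With tanh every layer output stays inside [-1, 1].
tanh_a = forward(1.0, 1e100, 4, math.tanh)
tanh_b = forward(2.0, 1e100, 4, math.tanh)
tanh_out = tanh_a - tanh_b  # finite
```

Real weights are nowhere near this large, so in practice the growth would happen over many updates rather than four layers. Saturated tanh units can also slow learning, so the swap trades one problem for another.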

Did you try using relu together with batch normalization?

With that change, the NaNs seem to have disappeared.

emanuelepesce avatar Jan 25 '18 14:01 emanuelepesce

@farice Changing relu to tanh does not improve the situation for me; rewards still go to NaN after a while. Did you make any other changes as well?

agakshat avatar Jan 25 '18 16:01 agakshat

With relu changed to tanh, training has been running for almost 3 days and no NaNs have appeared (in neither the actions nor the rewards).

It also seems to work with relu, but only once batch normalization is added.

emanuelepesce avatar Jan 25 '18 16:01 emanuelepesce
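For readers unfamiliar with the batch normalization fix mentioned above, the sketch below shows the core idea in plain Python: pre-activations are rescaled to zero mean and unit variance across the batch before relu is applied. It is a simplified illustration, not the change made in the repository. It omits the learned scale and shift parameters and the running statistics that a framework layer keeps, and the function name `batch_norm`, the `eps` value, and the example batch are assumptions.

```python
import math


def batch_norm(values, eps=1e-5):
    """Normalize one feature across a batch to zero mean and unit variance."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(variance + eps) for v in values]


def relu(x):
    return x if x > 0.0 else 0.0


# Hypothetical batch of very large pre-activations for one hidden unit.
pre_activations = [1e6, 2e6, 3e6, 4e6]

normalized = batch_norm(pre_activations)
activated = [relu(v) for v in normalized]
```

After normalization the inputs to relu stay on the order of 1 no matter how large the raw pre-activations were, which is one plausible reason relu stops producing NaNs once batch normalization is added.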