
question about p_reg in p_train

Open yeshenpy opened this issue 4 years ago • 0 comments

I went through the code and found something I don't understand. I think of p_reg as a regularization term, and a regularization term is normally a constraint on the learned parameters. However, in the line p_reg = tf.reduce_mean(tf.square(act_pd.flatparam())), the call act_pd.flatparam() returns the network output. In other words, flatparam() does not return the learned parameters; it returns the output of the network. How should this regularization be interpreted? This confuses me, and I look forward to your advice. For example, the value of act_pd.flatparam():

Jun 05 '20 14:06 yeshenpy
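To make the distinction in the question concrete, here is a minimal pure-Python sketch. It is not the MADDPG code: the one-layer linear "policy", the weights, and the observation are all made-up, and it assumes, as the question describes, that flatparam() exposes the raw network output. It contrasts a mean-square penalty on the output (the analogue of p_reg) with a mean-square penalty on the parameters themselves (classic L2 weight regularization).

```python
# Illustrative sketch only: a hypothetical one-layer linear policy, not MADDPG itself.

def policy_output(weights, obs):
    """Raw network output for one observation (assumed to be what flatparam() exposes)."""
    return [sum(w * x for w, x in zip(row, obs)) for row in weights]

def mean_square(values):
    """Mean of squared values, mirroring reduce_mean(square(...))."""
    return sum(v * v for v in values) / len(values)

weights = [[0.5, -1.0], [2.0, 0.0]]  # hypothetical learnable parameters
obs = [1.0, 2.0]                     # hypothetical observation

output = policy_output(weights, obs)

# Analogue of p_reg: penalizes large network outputs for this observation.
output_penalty = mean_square(output)

# Classic L2 regularization: penalizes large parameter values directly.
weight_penalty = mean_square([w for row in weights for w in row])

print(output, output_penalty, weight_penalty)
```

In this sketch the two penalties are different quantities: the output penalty changes with the observation fed to the network, while the weight penalty depends only on the parameters.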
