supervised-reptile
Reptile seems to produce gradients similar to vanilla SGD
I ran the sine code and printed the outer-loop updated weights without multiplying by outerstepsize,
along with the normal SGD weights, and they are the same, even when I set innerepochs
larger than 1.
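And from the Reptile algorithm, we can see that the outer update is
$\phi \leftarrow \phi + \epsilon(\tilde{\phi} - \phi)$, where $\tilde{\phi}$ is the weights after the inner-loop SGD steps,
so the only difference is that the outer loop multiplies by an epsilon; it differs from SGD only in the learning rate.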
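
To make the comparison concrete, here is a minimal sketch of what I mean. It is not the repo's sine code: it assumes a single task, a small linear model with a mean-squared-error loss, and hypothetical helper names (`inner_sgd`, `reptile_step`) that I made up for illustration. It only checks that an outer step with epsilon = 1 lands on the inner-loop SGD weights, and that a smaller epsilon interpolates toward them.

```python
import numpy as np

def inner_sgd(phi, x, y, lr, k):
    """Run k plain gradient steps on MSE for a linear model y ~ x @ w (illustrative only)."""
    w = phi.copy()
    for _ in range(k):
        grad = 2.0 * x.T @ (x @ w - y) / len(y)  # gradient of the mean squared error
        w -= lr * grad
    return w

def reptile_step(phi, x, y, lr, k, epsilon):
    """One Reptile-style outer update on a single task: phi + epsilon * (phi_tilde - phi)."""
    phi_tilde = inner_sgd(phi, x, y, lr, k)
    return phi + epsilon * (phi_tilde - phi)

# Synthetic data for one task (assumption: any fixed batch works for this check).
rng = np.random.default_rng(0)
x = rng.normal(size=(16, 3))
y = rng.normal(size=16)
phi0 = rng.normal(size=3)

sgd_weights = inner_sgd(phi0, x, y, lr=0.02, k=5)
reptile_full = reptile_step(phi0, x, y, lr=0.02, k=5, epsilon=1.0)
reptile_half = reptile_step(phi0, x, y, lr=0.02, k=5, epsilon=0.5)

# With epsilon = 1 the outer update reproduces the inner-loop SGD weights.
same_as_sgd = np.allclose(reptile_full, sgd_weights)
# With epsilon = 0.5 it moves halfway from phi0 toward those weights.
halfway = np.allclose(reptile_half, phi0 + 0.5 * (sgd_weights - phi0))
print(same_as_sgd, halfway)
```

This only covers the single-task case I printed; it says nothing about what happens when the outer loop alternates between different tasks.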