
Question about Figure 3 in the paper

Open · JayC1208 opened this issue 6 months ago · 1 comment

Hi, I am walking through the experiments in the code, and I find it hard to understand the result in Figure 3 of "Penalizing Gradient Norm for Efficiently Improving Generalization in Deep Learning".

In this figure, the gradient norm for $\alpha = 0.8$ stays above zero and seems rather large compared to the other cases, while the testing error rate remains low. Your paper suggests that a smaller gradient norm is preferable because it indicates a flat minimum, so this result seems counterintuitive.

Could you explain this in more detail?

JayC1208, Aug 07 '24 05:08