PyHessian

Hi, a question about the inconsistency between the dynamics of tr(FIM) and tr(H).

Open wizard1203 opened this issue 4 years ago • 0 comments

Hi, thanks for your awesome work!

I noticed that in the paper "PyHessian: Neural Networks Through the Lens of the Hessian", tr(H) keeps increasing during training.

image

In the paper "Hessian-based Analysis of Large Batch Training and Robustness to Adversaries", however, the dominant eigenvalue of the Hessian w.r.t. the weights can decrease during small-batch training.

image

And in the paper "Critical Learning Periods in Deep Networks", the trace of the FIM first increases and then decreases.

image

Is there a relationship between these three observations? Or are they inconsistent with one another?
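To pin down which quantities are being compared, here is a minimal sketch that tracks tr(H), the dominant eigenvalue of H, and the trace of the empirical FIM on a toy logistic regression. The toy model, the synthetic data, the learning rate, and the helper name `curvature_stats` are all assumptions made for illustration. None of them comes from the three papers, and the toy setup is not expected to reproduce their curves.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def curvature_stats(w, X, y):
    """Return (tr(H), lambda_max(H), tr(empirical FIM)) for logistic regression.

    Hypothetical helper for illustration only.
    """
    n = X.shape[0]
    p = sigmoid(X @ w)
    # Hessian of the mean negative log-likelihood: X^T diag(p(1-p)) X / n
    H = (X * (p * (1.0 - p))[:, None]).T @ X / n
    # Per-sample gradients of the negative log-likelihood: (p_i - y_i) x_i
    G = (p - y)[:, None] * X
    # Empirical FIM = mean outer product of per-sample gradients,
    # so its trace is the mean squared gradient norm.
    tr_fim = np.mean(np.sum(G ** 2, axis=1))
    lam_max = np.linalg.eigvalsh(H)[-1]  # eigenvalues come back in ascending order
    return np.trace(H), lam_max, tr_fim

# Synthetic data (assumed, for illustration).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
w_true = rng.normal(size=5)
y = (rng.random(200) < sigmoid(X @ w_true)).astype(float)

# Plain full-batch gradient descent, logging the three quantities.
w = np.zeros(5)
history = []
for step in range(201):
    if step % 50 == 0:
        tr_h, lam_max, tr_fim = curvature_stats(w, X, y)
        history.append((step, tr_h, lam_max, tr_fim))
        print(f"step {step:3d}  tr(H)={tr_h:.4f}  "
              f"lambda_max={lam_max:.4f}  tr(FIM)={tr_fim:.4f}")
    grad = X.T @ (sigmoid(X @ w) - y) / len(y)
    w -= 0.5 * grad
```

In this toy case H is positive semidefinite, so the dominant eigenvalue can never exceed tr(H), and at `w = 0` the trace of the empirical FIM coincides with tr(H). Once training starts the two traces are free to move apart, because the empirical FIM depends on the residuals `p - y` while H does not.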

wizard1203 · Dec 11 '21 02:12