PapersAnalysis
Paper Read - Negative eigenvalues of the Hessian in deep neural networks
Overview
Reading Negative eigenvalues of the Hessian in deep neural networks
Abstract
The loss function of deep networks is known to be non-convex but the precise nature of this nonconvexity is still an active area of research. In this work, we study the loss landscape of deep networks through the eigendecompositions of their Hessian matrix. In particular, we examine how important the negative eigenvalues are and the benefits one can observe in handling them appropriately.
- DNN explainability requires understanding the loss function.
- What tools can be used to learn more about the loss function?
- The tool proposed in this paper is eigendecomposition of the Hessian matrix, used to infer connections between the eigenvalues and eigenvectors and the dynamics of training.
- The abstract also implicitly says that currently used training algorithms are suboptimal for the actual DNN loss function because they do not handle negative eigenvalues properly; the paper therefore proposes a strategy for dealing with them appropriately.
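The key object in these notes, the eigendecomposition of the Hessian, can be illustrated on a toy non-convex loss. The sketch below is not the paper's method: the loss function and the finite-difference helper are assumptions for illustration only, chosen so the Hessian is tiny enough to compute exactly. At a saddle point the Hessian has a negative eigenvalue, and its eigenvector gives a descent direction, which is exactly the kind of information the paper argues training algorithms should exploit.

```python
import numpy as np

def loss(w):
    # Hypothetical toy non-convex loss with a saddle point at the origin:
    # convex along w[0], concave along w[1].
    return w[0] ** 2 - w[1] ** 2 + 0.1 * w[0] * w[1]

def numerical_hessian(f, w, eps=1e-4):
    """Finite-difference Hessian of f at w (only feasible for tiny
    parameter vectors; real DNNs need Hessian-vector products)."""
    n = len(w)
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            w_pp = w.copy(); w_pp[i] += eps; w_pp[j] += eps
            w_pm = w.copy(); w_pm[i] += eps; w_pm[j] -= eps
            w_mp = w.copy(); w_mp[i] -= eps; w_mp[j] += eps
            w_mm = w.copy(); w_mm[i] -= eps; w_mm[j] -= eps
            H[i, j] = (f(w_pp) - f(w_pm) - f(w_mp) + f(w_mm)) / (4 * eps ** 2)
    return H

w = np.zeros(2)                       # the saddle point of the toy loss
H = numerical_hessian(loss, w)
eigvals, eigvecs = np.linalg.eigh(H)  # eigendecomposition of the symmetric Hessian
print(eigvals)                        # one negative and one positive eigenvalue

# A negative eigenvalue means w is not a local minimum: moving along the
# corresponding eigenvector (here, roughly the w[1] axis) decreases the loss.
escape_direction = eigvecs[:, np.argmin(eigvals)]
print(escape_direction)
```

For a real network the Hessian is far too large to form explicitly; its extreme eigenvalues are instead estimated with Hessian-vector products (e.g. via automatic differentiation) combined with iterative methods such as Lanczos or power iteration.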