Gradient analysis from named parameters without backpack

Open jogardi opened this issue 2 years ago • 1 comments

I have cockpit mostly working, except that I get warnings like the one below because I have not implemented BackPACK's batch grad for all my modules:

/usr/local/lib/python3.9/site-packages/backpack/extensions/backprop_extension.py:106: UserWarning: Extension saving to grad_batch does not have an extension for Module <class 'pycompress.model_parts.PredHist'> although the module has parameters 

As a result, I am not able to get all the quantities without errors. I understand BackPACK is central to the design of cockpit, but I am fine with some things not working when I just want to prototype quickly.

My questions/requests are:

  1. I still see the gradient histogram. Does that histogram only include the gradients of modules where the BackPACK extension worked, or does it include all of them?
  2. Have you thought about integrating the much smaller project gradflow-check (https://github.com/alwynmathew/gradflow-check)? It provides a chart that lets you check for vanishing gradients, and it does so without BackPACK, based only on named_parameters.
  3. Unrelated, but have you thought about including the ESD (empirical spectral density) of the weight matrices, as done in WeightWatcher (https://github.com/CalculatedContent/WeightWatcher)?

Thank you for your excellent work on this project!

jogardi avatar Apr 01 '22 03:04 jogardi

Hi @jogardi, thanks for your questions!

One way to deal with parameterized layers that are unsupported by BackPACK is to not extend them and not pass their parameters to the Cockpit constructor. This workaround will fix all quantities that use first-order information. Quantities that rely on BackPACK's second-order extensions still won't work, so you have to configure your Cockpit without them.
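A minimal sketch of the parameter-selection part of this workaround, assuming the unsupported layer types are known up front. The helper name `supported_params` is hypothetical and not part of cockpit or BackPACK; it only relies on `modules()` and `parameters(recurse=False)` as provided by `torch.nn.Module`. The commented usage assumes the `Cockpit(params, quantities=...)` constructor and a `quantities` list without second-order quantities.

```python
def supported_params(model, unsupported_types):
    """Collect parameters owned directly by modules of supported types.

    Hypothetical helper for illustration. `unsupported_types` is a tuple of
    module classes to leave out, e.g. the custom PredHist layer.
    """
    selected = []
    seen = set()  # guards against shared parameters being added twice
    for module in model.modules():
        if isinstance(module, unsupported_types):
            # Skip parameters registered on the unsupported module itself.
            # Its children are still visited by model.modules() and are kept
            # if their own type is supported.
            continue
        # recurse=False: only parameters registered on this module itself.
        for p in module.parameters(recurse=False):
            if id(p) not in seen:
                seen.add(id(p))
                selected.append(p)
    return selected


# Usage sketch (assumes cockpit is installed and `quantities` is already
# configured without second-order quantities):
#   params = supported_params(model, (PredHist,))
#   cockpit = Cockpit(params, quantities=quantities)
```

The unsupported layers also need to stay un-extended for this to help; how to arrange that depends on how the model is assembled, so it is left out of the sketch.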

  • Re 1.: My expectation would be that, if you passed the parameters of PredHist to the Cockpit(...) constructor via params, the gradient histogram cannot be evaluated because those parameters won't get individual gradients during backpropagation. Could you post an MWE that demonstrates the behavior you get?

  • Re 2 & 3: Thanks for the interesting pointers! We will take a look. If you're interested in submitting a PR, we can provide the relevant pointers in cockpit's code base.
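For reference, a BackPACK-free check in the spirit of question 2 can be sketched from `named_parameters` alone. This is not gradflow-check's implementation and not a cockpit quantity: the function names and the threshold are hypothetical, and it summarizes the aggregated mini-batch gradient in `.grad`, not the individual gradients mentioned above.

```python
def grad_flow_summary(named_parameters):
    """Return (name, mean |grad|, max |grad|) for each trainable parameter.

    Call after loss.backward(). Parameters whose .grad is None (for example
    because they were unused in the forward pass) are reported with None
    values rather than silently skipped.
    """
    rows = []
    for name, p in named_parameters:
        if not p.requires_grad:
            continue  # frozen parameters never receive gradients
        if p.grad is None:
            rows.append((name, None, None))
            continue
        abs_grad = p.grad.detach().abs()
        rows.append((name, abs_grad.mean().item(), abs_grad.max().item()))
    return rows


def vanishing(rows, threshold=1e-7):
    """Names whose mean |grad| is below a hypothetical threshold."""
    return [name for name, mean_abs, _ in rows
            if mean_abs is not None and mean_abs < threshold]


# Usage sketch with a PyTorch model, after loss.backward():
#   rows = grad_flow_summary(model.named_parameters())
#   print(vanishing(rows))
```

Because it only reads `.grad`, this works for any module, including layers BackPACK has no extension for, at the cost of not seeing the per-sample spread.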

Best, Felix

f-dangel avatar Apr 13 '22 10:04 f-dangel

I am closing this issue as it is stale. Please feel free to open it again if you have further questions.

fsschneider avatar Sep 16 '22 10:09 fsschneider