backpack
AttributeError: 'BatchNorm2d' object has no attribute 'output'
I post the full error below. The MWE is a bit long (currently hundreds of lines) and I am still working on it, but is there any specific direction I should be looking in, given this error? It looks like BatchNorm is somehow mixed up in the gradient calculation (judging from the error message)?
Traceback (most recent call last):
File "/Users/qiyaowei/DEQ-BNN/mwe.py", line 575, in <module>
model(torch.rand(1,3,32,32)).sum().backward()
File "/Users/qiyaowei/miniconda3/envs/jax/lib/python3.8/site-packages/torch/_tensor.py", line 363, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
File "/Users/qiyaowei/miniconda3/envs/jax/lib/python3.8/site-packages/torch/autograd/__init__.py", line 173, in backward
Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
File "/Users/qiyaowei/miniconda3/envs/jax/lib/python3.8/site-packages/torch/utils/hooks.py", line 110, in hook
res = user_hook(self.module, grad_input, self.grad_outputs)
File "/Users/qiyaowei/miniconda3/envs/jax/lib/python3.8/site-packages/backpack/__init__.py", line 209, in hook_run_extensions
backpack_extension(module, g_inp, g_out)
File "/Users/qiyaowei/miniconda3/envs/jax/lib/python3.8/site-packages/backpack/extensions/backprop_extension.py", line 127, in __call__
module_extension(self, module, g_inp, g_out)
File "/Users/qiyaowei/miniconda3/envs/jax/lib/python3.8/site-packages/backpack/extensions/module_extension.py", line 97, in __call__
delete_old_quantities = not self.__should_retain_backproped_quantities(module)
File "/Users/qiyaowei/miniconda3/envs/jax/lib/python3.8/site-packages/backpack/extensions/module_extension.py", line 162, in __should_retain_backproped_quantities
is_a_leaf = module.output.grad_fn is None
File "/Users/qiyaowei/miniconda3/envs/jax/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1185, in __getattr__
raise AttributeError("'{}' object has no attribute '{}'".format(
AttributeError: 'BatchNorm2d' object has no attribute 'output'
Hi Qiyao,
Beware of BatchNorm; most of the quantities returned by BackPACK are not defined when there's a BatchNorm layer in the middle (see e.g. https://github.com/f-dangel/backpack/issues/239).
Easy checks for things that can cause an error like this would be calling backward twice (where the first backward clears the graph and the second backward then crashes), or a missing call to backpack.extend(model). But that doesn't seem to be the case here.
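As a plain-PyTorch illustration of the first check (no BackPACK involved), a second backward on the same graph fails because the first one already freed it:

import torch

x = torch.rand(4, 3, requires_grad=True)
loss = (x ** 2).sum()
loss.backward()  # runs fine, then frees the saved tensors of the graph
loss.backward()  # RuntimeError: Trying to backward through the graph a second time

With BackPACK's hooks attached the symptom can look different, but the cause would be the same: the first backward already cleaned everything up.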
Only a rough guess looking at the stack, but the error might be specific to BatchNorm.
The error occurs after the computation of the backward pass, during cleanup (delete_old_quantities = not self.__should_retain_backproped_quantities(module)). The error 'BatchNorm2d' object has no attribute 'output' indicates that the extension needed to store additional quantities (the output of the layer) during the forward pass, but did not. This is weird; I would expect it to crash much earlier. What extension are you running with BatchNorm?
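To illustrate what I mean by a quantity that should have been stored in the forward pass, here is a rough sketch with plain PyTorch hooks (not BackPACK's actual code, just the shape of the problem): a backward hook expects a module.output attribute that a forward hook was supposed to stash.

import torch
import torch.nn as nn

def store_output(module, inputs, output):
    module.output = output  # what the backward-side code expects to find later

def backward_cleanup(module, grad_input, grad_output):
    # mimics the cleanup step: it looks at module.output.grad_fn
    print(module.output.grad_fn is None)

bn = nn.BatchNorm2d(3)
bn.register_full_backward_hook(backward_cleanup)
# bn.register_forward_hook(store_output)  # "forgetting" this reproduces the AttributeError

x = torch.rand(2, 3, 8, 8, requires_grad=True)
bn(x).sum().backward()  # AttributeError: 'BatchNorm2d' object has no attribute 'output'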
- hmmm, is there currently an alternative to BatchNorm? I guess it would be safest to just stick to linear and conv layers + activations, although the accuracy will for sure decrease in that case
- Yeah I don't think I am calling backward twice, and I made sure to add model = extend(model). BTW, the documentation page also recommended trying use_converter=True, but I guess that one has its own bugs so I did not dig deeper.
- Even though I don't have the full MWE ready, the code that triggers the error is easy to share:
import torch
from backpack import backpack, extend
from backpack.extensions import BatchGrad

model = get_cls_net()  # the (large) classification model from the MWE
model = extend(model)
with backpack(BatchGrad()):
    model(torch.rand(1, 3, 32, 32)).sum().backward()
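For context, when this runs through (it does on the smaller model mentioned below), the individual gradients end up on each parameter as grad_batch:

for name, p in model.named_parameters():
    print(name, p.grad_batch.shape)  # leading dimension is the batch size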
The weird thing is, BatchNorm worked with this code when I was trying it on a smaller model. So what I am doing right now is sorting out the structural differences between these two models to see if I can find anything useful.
is there currently an alternative to BatchNorm?
There are, for example, GroupNorm or LayerNorm (see https://pytorch.org/docs/stable/nn.html#normalization-layers). The problem with BatchNorm is that there is no "individual gradient": it is not possible to isolate the contribution of one sample to the loss, because BatchNorm mixes the samples.
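To make the mixing concrete, here is a small standalone comparison (plain PyTorch, arbitrary layer sizes, nothing BackPACK-specific):

import torch
import torch.nn as nn

x = torch.rand(4, 8, 5, 5)
gn = nn.GroupNorm(num_groups=2, num_channels=8)  # statistics computed within each sample
bn = nn.BatchNorm2d(8)                           # statistics computed across the batch (training mode)

# GroupNorm: sample 0's output is identical whether it is processed alone
# or together with the rest of the batch, so its gradient is its own.
print(torch.allclose(gn(x)[0], gn(x[:1])[0]))    # True

# BatchNorm: sample 0's output depends on the other samples in the batch,
# so an "individual gradient" for sample 0 is not well-defined.
print(torch.allclose(bn(x)[0], bn(x[:1])[0]))    # False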
What's the model (get_cls_net)?
Oh, I thought BackPACK doesn't support GroupNorm.
BTW, I might have figured out the issue: it goes away when I add an eval, i.e. extend(model).eval(). Not sure why, but I guess that is a fix!
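For reference, the fix end-to-end as a minimal sketch (same get_cls_net and BatchGrad as above). In eval mode, BatchNorm normalizes with its running statistics instead of the batch statistics, so the samples no longer interact, which is presumably why the individual gradients become well-defined:

import torch
from backpack import backpack, extend
from backpack.extensions import BatchGrad

model = extend(get_cls_net()).eval()  # eval(): BatchNorm uses running stats, no cross-sample mixing
with backpack(BatchGrad()):
    model(torch.rand(1, 3, 32, 32)).sum().backward()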