Support for LayerNorm
I was trying to extend a Vision Transformer model using BackPACK. However, I encountered the following error:
UserWarning: Extension saving to grad_batch does not have an extension for Module <class 'torch.nn.modules.normalization.LayerNorm'> although the module has parameters
I know that torch.nn.BatchNormNd leads to ill-defined first-order quantities and hence it is not implemented here. Does the same hold for Layer Normalization?
Thank you in advance!
Hi,
thanks for your question. The warning you get for LayerNorm is raised because BackPACK currently does not support it. In contrast to BatchNorm, however, this layer treats each sample in a mini-batch independently: the mean and variance used to normalize a sample are computed along its feature dimensions, whereas for BatchNorm they are computed along the batch dimension. Hence, first-order quantities like individual gradients are well-defined.
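As a minimal illustration of the difference (a sketch with arbitrary shapes):

```python
import torch

x = torch.randn(4, 8)  # (batch, features); shapes are just for illustration

# LayerNorm statistics: one mean/variance per sample, over its features
ln_mean = x.mean(dim=1)  # shape (4,) -- samples stay independent

# BatchNorm statistics: one mean/variance per feature, over the batch
bn_mean = x.mean(dim=0)  # shape (8,) -- couples the samples in the batch
```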
To add support for LayerNorm, the following example from the documentation is a good starting point. It describes how to write BackPACK extensions for new layers (the "Custom module extension" is the most relevant part).
I'd be happy to help merge a PR.
Best, Felix
Any update on this?
No progress, and I don't have capacities to work on this feature.
To break things down further, adding limited support for LayerNorm, e.g. only the BatchGrad extension, would be a feasible starting point. This can be achieved by following the above example in the docs.