
The number of parameters of the BatchNormalization module

Open • hideakikuratsu opened this issue 3 years ago • 2 comments

The number of parameters of each module is calculated by the following code:
https://github.com/sovrasov/flops-counter.pytorch/blob/5f2a45f8ff117ce5ad34a466270f4774edd73379/ptflops/pytorch_engine.py#L110-L112

I ran the same counting on torch.nn.BatchNorm2d like this:

```python
import torch

bn = torch.nn.BatchNorm2d(10)
sum(p.numel() for p in bn.parameters() if p.requires_grad)
```

The last line returns 20, but torch.nn.BatchNorm2d also holds a running (moving) mean and variance, doesn't it? So I thought the correct number of parameters for torch.nn.BatchNorm2d(10) would be:

- weight parameters: 10
- bias parameters: 10
- running mean parameters: 10
- running variance parameters: 10

that is, 10 * 4 = 40. I'd appreciate it if you could explain this. Thank you!

hideakikuratsu • Feb 28 '22 05:02

Hi @hello-friend1242954. Weight and bias are parameters of the BN layer (they are updated during back-propagation). The running mean and variance are computed during the forward pass, which is why, I think, they are not counted as parameters (they do not require gradients). https://d2l.ai/chapter_convolutional-modern/batch-norm.html#training-deep-networks

morkovka1337 • Jun 15 '22 13:06
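A minimal sketch (independent of ptflops, using only standard PyTorch APIs) that makes this distinction visible: BatchNorm registers its running statistics as buffers, so they are returned by `named_buffers()` rather than by `parameters()`:

```python
import torch

bn = torch.nn.BatchNorm2d(10)

# Learnable parameters (gamma/weight and beta/bias), updated by the optimizer.
for name, p in bn.named_parameters():
    print(name, p.numel())   # weight: 10, bias: 10

# Running statistics are buffers: updated during the forward pass, no gradients.
for name, b in bn.named_buffers():
    print(name, b.numel())   # running_mean: 10, running_var: 10, num_batches_tracked: 1
```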

Thank you for the reply! I agree that whether something counts as a parameter should be judged by whether it requires a gradient, but the running statistics also undoubtedly take up some static memory/storage, right? So I feel the definition of 'parameters' is somewhat ambiguous. Thank you for the clear explanation!

hideakikuratsu • Jun 15 '22 15:06
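For readers who also want to account for the storage the running statistics occupy, a small hypothetical helper (not part of ptflops) could report trainable parameters and buffers separately:

```python
import torch

def count_params_and_buffers(module: torch.nn.Module):
    # Hypothetical helper: trainable parameter count vs. buffer element count.
    n_params = sum(p.numel() for p in module.parameters() if p.requires_grad)
    n_buffers = sum(b.numel() for b in module.buffers())
    return n_params, n_buffers

print(count_params_and_buffers(torch.nn.BatchNorm2d(10)))
# (20, 21): 20 learnable elements, 21 buffer elements
# (running_mean: 10, running_var: 10, num_batches_tracked: 1)
```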