
After pruning, the ResNet18 has negative channels (RuntimeError: Given groups=1, expected weight to be at least 1 at dimension 0, but got weight of size [0, 119, 3, 3] instead)

Open Coderx7 opened this issue 5 years ago • 5 comments

Hi, this is a follow-up to issue #7, which was fixed earlier. However, after the model is successfully pruned (the parameter count drops from the initial 21.8M to 4.7M), the model fails to do a forward pass and I get this error:

Traceback (most recent call last):
  File "d:\Codes\face\python\FV\Pruning\prune.py", line 67, in <module>
    out = model(img_fake)
  File "C:\Users\User\Anaconda3\Lib\site-packages\torch\nn\modules\module.py", line 726, in _call_impl
    result = self.forward(*input, **kwargs)
  File "d:\codes\face\python\FV\models.py", line 212, in forward
    x = self.layer4(x)
  File "C:\Users\User\Anaconda3\Lib\site-packages\torch\nn\modules\module.py", line 726, in _call_impl
    result = self.forward(*input, **kwargs)
  File "C:\Users\User\Anaconda3\Lib\site-packages\torch\nn\modules\container.py", line 117, in forward
    input = module(input)
  File "C:\Users\User\Anaconda3\Lib\site-packages\torch\nn\modules\module.py", line 726, in _call_impl
    result = self.forward(*input, **kwargs)
  File "d:\codes\face\python\FV\models.py", line 139, in forward
    out = self.conv1(out)
  File "C:\Users\User\Anaconda3\Lib\site-packages\torch\nn\modules\module.py", line 726, in _call_impl
    result = self.forward(*input, **kwargs)
  File "C:\Users\User\Anaconda3\Lib\site-packages\torch\nn\modules\conv.py", line 419, in forward
    return self._conv_forward(input, self.weight)
  File "C:\Users\User\Anaconda3\Lib\site-packages\torch\nn\modules\conv.py", line 416, in _conv_forward
    self.padding, self.dilation, self.groups)
RuntimeError: Given groups=1, expected weight to be at least 1 at dimension 0, but got weight of size [0, 119, 3, 3] instead

Looking at the model after pruning, I noticed:

      (bn0): BatchNorm2d(119, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv1): Conv2d(119, -105, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(-105, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (prelu): PReLU(num_parameters=1)
      (conv2): Conv2d(-105, 151, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(151, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (downsample): Sequential(
        (0): Conv2d(119, 151, kernel_size=(1, 1), stride=(2, 2), bias=False)
        (1): BatchNorm2d(151, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)

The Conv2d has negative output/input channels, and so does the BatchNorm! This seems to be what is causing the issue.
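For intuition, here is a minimal pure-Python sketch (the channel counts below are invented for illustration, not taken from this model) of how a recorded channel count can go negative: if two pruning plans created from different dependency paths both remove index lists from the same layer, and the pruner subtracts both lists without deduplicating the overlap, the bookkeeping drops below zero.

```python
# Hypothetical illustration: two overlapping pruning plans both touch the
# same conv layer, and the naive bookkeeping double-counts the overlap.

def remaining_channels(original, plans):
    """Subtract the size of every plan's index list, the way a pruner
    that does not deduplicate overlapping plans would."""
    removed = sum(len(idxs) for idxs in plans)
    return original - removed

# One plan created from one layer's perspective, another from a residual
# connection's perspective, with a large overlap between the two:
plan_a = list(range(0, 180))   # 180 indices
plan_b = list(range(90, 270))  # 180 indices, 90 of them shared with plan_a

print(remaining_channels(256, [plan_a, plan_b]))  # 256 - 360 = -104
```

A deduplicating pruner would take the union of the two index sets (270 unique indices would still be too many here, but the point is that the overlap must only be counted once).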

Coderx7 avatar Jul 21 '20 04:07 Coderx7

I noticed the channel numbers are governed by torch_pruning/prune/structured.py, and simply wrapping them in abs() would seem to fix this. However, after doing so, I get an IndexError from the BatchNorm handling in prune_batchnorm():

Traceback (most recent call last):
  File "d:\Codes\face\python\FV\Pruning\prune.py", line 62, in <module>
    model = prune_model(model)
  File "d:\Codes\face\python\FV\Pruning\prune.py", line 50, in prune_model
    prune_conv( m.conv2, block_prune_probs[blk_id] )
  File "d:\Codes\face\python\FV\Pruning\prune.py", line 42, in prune_conv
    plan.exec()
  File "C:\Users\User\Anaconda3\Lib\site-packages\torch_pruning\dependency.py", line 247, in exec
    _, n = dep(idxs, dry_run=dry_run)
  File "C:\Users\User\Anaconda3\Lib\site-packages\torch_pruning\dependency.py", line 209, in __call__
    result = self.handler(self.broken_node.module, idxs, dry_run=dry_run)
  File "C:\Users\User\Anaconda3\Lib\site-packages\torch_pruning\prune\structured.py", line 138, in prune_batchnorm
    layer.running_mean = layer.running_mean.data.clone()[keep_idxs]
IndexError: index 33 is out of bounds for dimension 0 with size 31
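This kind of IndexError is consistent with keep indices being computed against the layer's original size but applied to a buffer that an earlier plan has already shrunk, which is also why abs() only masks the symptom. A minimal pure-Python sketch (sizes chosen to mirror the error message; not Torch-Pruning's actual code):

```python
# Hypothetical illustration: keep indices computed against the ORIGINAL
# 64-channel layout, applied to a running_mean buffer that an earlier
# overlapping plan has already shrunk to 31 entries.

original_size = 64
running_mean = list(range(original_size))

# An earlier plan already pruned this BatchNorm down to 31 entries:
running_mean = running_mean[:31]

# A second plan still builds its keep list from the original layout,
# so it contains stale indices such as 33:
keep_idxs = [i for i in range(original_size) if i % 2 == 1]

try:
    pruned = [running_mean[i] for i in keep_idxs]
except IndexError as e:
    print(e)  # list index out of range
```

In tensor terms this is the same failure as `running_mean.data.clone()[keep_idxs]` with a stale `keep_idxs`, which is why the traceback points at that line.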

Coderx7 avatar Jul 21 '20 05:07 Coderx7

Hi @Coderx7, please try a lower pruning rate. Due to the complicated dependencies in your model, pruning one layer may affect many layers. I will add a warning for negative channels in the next version.

VainF avatar Jul 21 '20 06:07 VainF

@VainF Thanks a lot for the prompt response, really appreciate it. Lowering the pruning rate for the last two layers actually solved this. But could you kindly tell me what's going wrong here that causes this? And if I wanted larger pruning rates, what would I need to do? Where should I be fixing/changing things? Really appreciate your kind help.

Side note: it seems that, other than the first two layers, where I can freely set any pruning ratio (I tested up to 0.5), all other layers can't go any further than 0.2!
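One hedged workaround while the dependency issue is open: cap each layer's pruning amount so that, even if several overlapping plans hit the same layer, at least one channel survives. This is a pure-Python sketch, not a Torch-Pruning API; `plans_touching_layer` is a hypothetical count you would have to supply yourself.

```python
# Hedged sketch: clamp the per-plan pruning amount so `min_keep`
# channels survive even after `plans_touching_layer` overlapping plans.

def safe_num_to_prune(out_channels, ratio, plans_touching_layer=1, min_keep=1):
    """Largest number of channels each plan may remove while still
    leaving at least `min_keep` channels once every plan has run."""
    requested = int(out_channels * ratio)
    budget = (out_channels - min_keep) // plans_touching_layer
    return min(requested, max(budget, 0))

# A 64-channel conv hit by two overlapping plans at ratio 0.5:
print(safe_num_to_prune(64, 0.5, plans_touching_layer=2))  # 31, not 32
```

This would explain the asymmetry you observed: layers inside residual blocks are typically reached by more than one plan, so their effective safe ratio is lower than for the first few layers.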

Coderx7 avatar Jul 21 '20 06:07 Coderx7

It's a little confusing. I'm going to manually check the correctness of the pruning plan to see what's wrong. Maybe there are still some unknown issues in the dependency detection.

VainF avatar Jul 22 '20 02:07 VainF

Thanks, really appreciate your time and professionalism. Best regards

Coderx7 avatar Jul 22 '20 05:07 Coderx7