Pruning model speed up error
After using the pruning API on the model, when I try to perform the speedup, it throws this error:
Error: TypeError: forward() missing 1 required positional argument: input
Traceback:
  m_speedup.speedup_model()
  File "/pruning/lib/python3.6/site-packages/nni/compression/pytorch/speedup/compressor.py", line 503, in speedup_model
    self.infer_modules_masks()
  File "/pruning/Python-test/lib/python3.6/site-packages/nni/compression/pytorch/speedup/compressor.py", line 349, in infer_modules_masks
    self.update_direct_sparsity(curnode)
  File "/pruning/Python-test/lib/python3.6/site-packages/nni/compression/pytorch/speedup/compressor.py", line 219, in update_direct_sparsity
    state_dict=copy.deepcopy(module.state_dict()), batch_dim=self.batch_dim)
  File "/pruning/Python-test/lib/python3.6/site-packages/nni/compression/pytorch/speedup/infer_mask.py", line 80, in __init__
    self.output = self.module(*dummy_input)
  File "/pruning/Python-test/lib/python3.6/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
Hello @hitesh-hitu, which model do you prune? We recently encountered some similar issues and will focus on fixing them in the next release.
Hello @J-shang thank you so much for the speedy response.
I am trying to prune a PyTorch image-enhancing model with an input shape of (1, 3, 580, 580). The pruning API works fine and sets the pruned weights to zero; the error comes up when I try to speed up the model. Can you please help me with this soon?
PR #4149 has fixed the infer-mask forward missing a required positional argument, which is caused by a constant value being passed as a forward input. You could give it a try based on this PR, and if the issue still occurs, information about the model structure you used would help us fix it.
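To illustrate the failure mode described here, a minimal sketch (the `ScaledConv` module below is made up for illustration, not taken from the original report): a module whose forward() takes an extra positional argument that is a constant at call sites raises exactly the TypeError from the traceback when the mask-inference code calls it with only the tensor input.

```python
import torch
import torch.nn as nn

class ScaledConv(nn.Module):
    """Hypothetical module whose forward() takes a constant extra argument."""

    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, 3, padding=1)

    def forward(self, x, scale):  # 'scale' is a constant at call sites
        return self.conv(x) * scale

m = ScaledConv()
# Normal use works because the caller supplies the constant explicitly:
out = m(torch.rand(1, 3, 8, 8), 0.5)

# Calling forward() with only the tensor input, as mask inference effectively
# does, raises a "missing 1 required positional argument" TypeError:
try:
    m(torch.rand(1, 3, 8, 8))
except TypeError as e:
    print(e)
```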
@J-shang just one more clarification: as per the nni docs page, model speedup is still in beta and one limitation is that it doesn't support speeding up all models. Is this error related to that? Does the latest version of nni support speedup of all PyTorch models after pruning?
@hitesh-hitu, your error is most likely because you have some special modules in your model whose forward input arguments include constants. But we can't locate the real cause unless you show your model.
For your second question, speedup needs replacement logic implemented for each specific PyTorch module, and nni has implemented most common modules (Conv2d, Linear, ...). If you meet a module that nni does not support, you can report an issue and we will try to implement it.
@J-shang thank you very much for the clarification. I'll check for the custom layers in my model.
Also, I made the changes from PR #4149 on the latest version of NNI, but the error persists. I'll look for the special layer in the model. Can you please let me know where the changes have to be made/added to support a special layer in the speedup module?
For a new module, you need to add a replace function in https://github.com/microsoft/nni/blob/51d261e7256e2344f8d4cf270bff439819945c9a/nni/compression/pytorch/speedup/compress_modules.py#L11
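As a rough sketch of what such a replace function does (this is an illustration in plain PyTorch, not NNI's actual API; `replace_conv2d` and `kept_out` are made-up names): given the indices of output channels that survive pruning, build a smaller module and copy over the surviving parameters.

```python
import torch
import torch.nn as nn

def replace_conv2d(conv: nn.Conv2d, kept_out: torch.Tensor) -> nn.Conv2d:
    """Build a smaller Conv2d keeping only the output channels in kept_out."""
    new_conv = nn.Conv2d(conv.in_channels, len(kept_out),
                         conv.kernel_size, conv.stride,
                         conv.padding, bias=conv.bias is not None)
    with torch.no_grad():
        # Weight layout is [out_channels, in_channels, kh, kw],
        # so indexing the first dimension selects output channels.
        new_conv.weight.copy_(conv.weight[kept_out])
        if conv.bias is not None:
            new_conv.bias.copy_(conv.bias[kept_out])
    return new_conv

conv = nn.Conv2d(3, 8, 3)
small = replace_conv2d(conv, torch.tensor([0, 2, 5, 7]))
print(small.out_channels)  # 4
```

The replaced module produces exactly the selected channels of the original's output, which is why a per-module-type replace function is needed: each module type has its own parameter layout to slice.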
But your error occurs before module replacement. You can check which module causes this error, i.e., which module this forward function belongs to.
I don't understand the issue. Could you please elaborate and also point out where to fix it in the speedup module?
@hitesh-hitu do you have logs before this error occurs like the following?
[2022-03-23 13:12:05] INFO (nni.compression.pytorch.speedup.compressor/MainThread) Update mask for layers.0
[2022-03-23 13:12:05] INFO (nni.compression.pytorch.speedup.compressor/MainThread) Update mask for downsample
[2022-03-23 13:12:05] INFO (nni.compression.pytorch.speedup.compressor/MainThread) Update mask for skip
[2022-03-23 13:12:05] INFO (nni.compression.pytorch.speedup.compressor/MainThread) Update mask for layers.1
[2022-03-23 13:12:05] INFO (nni.compression.pytorch.speedup.compressor/MainThread) Update mask for layers.2
[2022-03-23 13:12:05] INFO (nni.compression.pytorch.speedup.compressor/MainThread) Update mask for layers.3
[2022-03-23 13:12:05] INFO (nni.compression.pytorch.speedup.compressor/MainThread) Update mask for .aten::cat.6
[2022-03-23 13:12:05] INFO (nni.compression.pytorch.speedup.compressor/MainThread) Update the indirect sparsity for the .aten::cat.6
[2022-03-23 13:12:05] INFO (nni.compression.pytorch.speedup.compressor/MainThread) Update the indirect sparsity for the skip
[2022-03-23 13:12:05] INFO (nni.compression.pytorch.speedup.compressor/MainThread) Update the indirect sparsity for the layers.3
[2022-03-23 13:12:05] INFO (nni.compression.pytorch.speedup.compressor/MainThread) Update the indirect sparsity for the layers.2
[2022-03-23 13:12:05] INFO (nni.compression.pytorch.speedup.compressor/MainThread) Update the indirect sparsity for the layers.1
[2022-03-23 13:12:05] INFO (nni.compression.pytorch.speedup.compressor/MainThread) Update the indirect sparsity for the downsample
[2022-03-23 13:12:05] INFO (nni.compression.pytorch.speedup.compressor/MainThread) Update the indirect sparsity for the layers.0
[2022-03-23 13:12:05] INFO (nni.compression.pytorch.speedup.compressor/MainThread) resolve the mask conflict
You could find which layer failed to update its mask from these logs, or give us simple code with which we can reproduce your error. We can't diagnose this error or how to fix it without knowing your model implementation.
Yes, I'll provide the logs soon, please keep this issue open.
@J-shang I use nni to prune a model trained with mmdetection and mmrazor. My code is:
config_list = [{
    'sparsity_per_layer': 0.5,
    'op_types': ['Conv2d'],
}]
pruner = L1NormPruner(model, config_list)
# compress the model and generate the masks
_, masks = pruner.compress()
print(masks)
# show the masks sparsity
for name, mask in masks.items():
print(name, ' sparsity : ', '{:.2}'.format(mask['weight'].sum() / mask['weight'].numel()))
# need to unwrap the model, if the model is wrapped before speedup
pruner._unwrap_model()
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
ModelSpeedup(model, torch.rand(1, 3, 1333, 800).to(device), masks).speedup_model()
The error is
Please give me some advice, thanks very much.
@Audrey528, did model(torch.rand(1, 3, 1333, 800)) work with your model? Could you find where your model uses img_metas?
@J-shang Thanks very much for your reply. (1333, 800) is the input image size used in training. The data looks like the following picture when doing inference on an image. Should I change the input data format?
Yes, the dummy input is used for tracing the model graph. Please refer to the usage of example_inputs in torch.jit.trace:
https://pytorch.org/docs/stable/generated/torch.jit.trace.html
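A minimal sketch of what torch.jit.trace expects as example_inputs (the tiny model here is made up for illustration): a concrete tensor, or tuple of tensors, rather than a dict of metadata such as mmdetection's img_metas. ModelSpeedup's dummy input serves the same tracing purpose.

```python
import torch
import torch.nn as nn

# A toy model standing in for the real network; trace needs a tensor whose
# shape matches what the model's forward() actually consumes.
model = nn.Sequential(nn.Conv2d(3, 4, 3, padding=1), nn.ReLU()).eval()
dummy = torch.rand(1, 3, 16, 16)

# example_inputs is run through the model once to record the graph.
traced = torch.jit.trace(model, dummy)
print(traced(dummy).shape)  # torch.Size([1, 4, 16, 16])
```

If the model's forward() requires non-tensor arguments (dicts, metadata), tracing fails, which is why the dummy input format matters here.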
Do not place the model on multiple GPUs when performing pruning~