pytorch-pruning

Out of memory when testing the pruned model

Open · Tianxiaomo opened this issue on May 15 '18 · 4 comments

I use 4× Tesla K80 (12 GB each). While pruning, training and its tests ran normally, but testing the pruned model runs out of memory:

THCudaCheck FAIL file=/pytorch/aten/src/THC/generic/THCStorage.cu line=58 error=2 : out of memory
Traceback (most recent call last):
  File "/home/b418-xiwei/.pycharm_helpers/pydev/pydevd.py", line 1664, in <module>
    main()
  File "/home/b418-xiwei/.pycharm_helpers/pydev/pydevd.py", line 1658, in main
    globals = debugger.run(setup['file'], None, None, is_module)
  File "/home/b418-xiwei/.pycharm_helpers/pydev/pydevd.py", line 1068, in run
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "/home/b418-xiwei/.pycharm_helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/home/b418-xiwei/hgh/prune/finetune.py", line 343, in <module>
    fine_tuner.prune()
  File "/home/b418-xiwei/hgh/prune/finetune.py", line 267, in prune
    self.test()
  File "/home/b418-xiwei/hgh/prune/finetune.py", line 187, in test
    output = model(Variable(batch))
  File "/home/b418-xiwei/anaconda3/envs/distiller/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/b418-xiwei/hgh/prune/finetune.py", line 68, in forward
    x = self.features(x)
  File "/home/b418-xiwei/anaconda3/envs/distiller/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/b418-xiwei/anaconda3/envs/distiller/lib/python3.6/site-packages/torch/nn/modules/container.py", line 91, in forward
    input = module(input)
  File "/home/b418-xiwei/anaconda3/envs/distiller/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/b418-xiwei/anaconda3/envs/distiller/lib/python3.6/site-packages/torch/nn/modules/pooling.py", line 142, in forward
    self.return_indices)
  File "/home/b418-xiwei/anaconda3/envs/distiller/lib/python3.6/site-packages/torch/nn/functional.py", line 360, in max_pool2d
    ret = torch._C._nn.max_pool2d(input, kernel_size, stride, padding, dilation, ceil_mode)
RuntimeError: cuda runtime error (2) : out of memory at /pytorch/aten/src/THC/generic/THCStorage.cu:58

I use batch_size = 16, so the batch size itself should not be the issue.

Tianxiaomo · May 15 '18
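
The traceback above shows the OOM inside a plain forward pass in test(), where output = model(Variable(batch)) builds a full autograd graph. A frequent cause of exactly this failure, independent of pruning, is evaluating with gradients enabled, so every intermediate activation is retained for a backward pass that never comes. A minimal no-grad evaluation loop for comparison (PyTorch >= 0.4; test_loader and the accuracy bookkeeping are illustrative assumptions, not code from this repo):

    import torch

    def evaluate(model, test_loader, device="cuda"):
        model.eval()                   # inference behaviour for dropout/batchnorm
        correct, total = 0, 0
        with torch.no_grad():          # do not record the autograd graph
            for batch, label in test_loader:
                output = model(batch.to(device))
                correct += (output.argmax(dim=1) == label.to(device)).sum().item()
                total += label.size(0)
        return correct / total

On PyTorch 0.3 and earlier, the equivalent is constructing the input as Variable(batch, volatile=True).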

What GPU are you using?

cc94226 · Jun 02 '18

Have you solved this problem? I found that the two backward() calls on lines 172 and 174 of finetune.py each double the memory usage, taking mine from 3200 MB to 7000 MB and then to 11000 MB. The first increase happens while computing the pruning plan, so the gradients it produces are useless for fine-tuning, but I haven't found any way to clear that gradient cache.

CodePlay2016 · Jun 05 '18
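
One possible mitigation for the leftover gradients described above, sketched as a guess rather than a fix verified against this repo (model stands for the network that finetune.py trains):

    import torch

    def free_ranking_gradients(model):
        """Drop gradients accumulated while computing the pruning plan."""
        for param in model.parameters():
            param.grad = None      # release the gradient tensors themselves
        torch.cuda.empty_cache()   # return cached allocator blocks to the driver

Calling this between the ranking pass and fine-tuning should at least release the .grad tensors; whether it recovers the full increments measured above would need to be confirmed.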

@Tianxiaomo Hi, can you tell me the command to test the model? I can't find it. Thanks.

weixia1 · Jul 13 '18

@CodePlay2016 I am facing an almost identical out-of-memory problem.

Could you comment on this? Do you have any actual, working countermeasure so far?

[phung@archlinux pytorch-pruning]$ python finetune.py --prune
/usr/lib/python3.7/site-packages/torchvision/transforms/transforms.py:187: UserWarning: The use of the transforms.Scale transform is deprecated, please use transforms.Resize instead.
  warnings.warn("The use of the transforms.Scale transform is deprecated, " +
/usr/lib/python3.7/site-packages/torchvision/transforms/transforms.py:562: UserWarning: The use of the transforms.RandomSizedCrop transform is deprecated, please use transforms.RandomResizedCrop instead.
  warnings.warn("The use of the transforms.RandomSizedCrop transform is deprecated, " +
Accuracy:  0.5848
Number of prunning iterations to reduce 67% filters 5
Ranking filters.. 
Traceback (most recent call last):
  File "finetune.py", line 270, in <module>
    fine_tuner.prune()
  File "finetune.py", line 217, in prune
    prune_targets = self.get_candidates_to_prune(num_filters_to_prune_per_iteration)
  File "finetune.py", line 184, in get_candidates_to_prune
    self.train_epoch(rank_filters = True)
  File "finetune.py", line 179, in train_epoch
    self.train_batch(optimizer, batch.cuda(), label.cuda(), rank_filters)
  File "finetune.py", line 172, in train_batch
    self.criterion(output, Variable(label)).backward()
  File "/usr/lib/python3.7/site-packages/torch/tensor.py", line 96, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/usr/lib/python3.7/site-packages/torch/autograd/__init__.py", line 90, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: CUDA error: out of memory
[phung@archlinux pytorch-pruning]$ 

buttercutter · Oct 08 '18
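
To pin down where the memory actually grows (for example, to reproduce the 3200 MB → 7000 MB → 11000 MB steps reported above), PyTorch's allocator counters can be printed before and after each backward() call. A small sketch, with the caveat that memory_cached() was renamed memory_reserved() in later PyTorch releases:

    import torch

    def log_cuda_memory(tag):
        # allocated: memory occupied by live tensors
        # cached: memory held by the caching allocator (a superset of allocated)
        alloc_mb = torch.cuda.memory_allocated() / 2**20
        cached_mb = torch.cuda.memory_cached() / 2**20
        print("[%s] allocated: %.0f MB, cached: %.0f MB" % (tag, alloc_mb, cached_mb))

Logging this around the two backward() calls on lines 172 and 174 of finetune.py would show whether the second jump comes from the saved graph or from retained gradients.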