torch2trt
CUDA out of memory
Hi there. I see many people have talked about this problem — is there any solution? Many thanks!
Traceback (most recent call last):
  File "model2trt.py", line 134, in <module>
    tokenizer, eod_id, sep_id, unk_id, model = load_model()
  File "model2trt.py", line 108, in load_model
    model_trt = torch2trt(model, [x], max_batch_size=1)
  File "/home/kingsoft/anaconda3/lib/python3.7/site-packages/torch2trt-0.3.0-py3.7.egg/torch2trt/torch2trt.py", line 528, in torch2trt
    outputs = module(*inputs)
  File "/home/kingsoft/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/kingsoft/anaconda3/lib/python3.7/site-packages/transformers/models/gpt2/modeling_gpt2.py", line 1057, in forward
    return_dict=return_dict,
  File "/home/kingsoft/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/kingsoft/anaconda3/lib/python3.7/site-packages/transformers/models/gpt2/modeling_gpt2.py", line 895, in forward
    output_attentions=output_attentions,
  File "/home/kingsoft/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/kingsoft/anaconda3/lib/python3.7/site-packages/transformers/models/gpt2/modeling_gpt2.py", line 432, in forward
    feed_forward_hidden_states = self.mlp(hidden_states)
  File "/home/kingsoft/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/kingsoft/anaconda3/lib/python3.7/site-packages/transformers/models/gpt2/modeling_gpt2.py", line 360, in forward
    hidden_states = self.act(hidden_states)
  File "/home/kingsoft/anaconda3/lib/python3.7/site-packages/transformers/activations.py", line 42, in gelu_new
    return 0.5 * x * (1.0 + torch.tanh(math.sqrt(2.0 / math.pi) * (x + 0.044715 * torch.pow(x, 3.0))))
RuntimeError: CUDA out of memory. Tried to allocate 1.72 GiB (GPU 0; 15.90 GiB total capacity; 14.22 GiB already allocated; 979.44 MiB free; 14.27 GiB reserved in total by PyTorch)
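For context on why the failure lands on that exact line: the error message says only 979.44 MiB was free, but the allocation request was 1.72 GiB. The tanh-based GELU in `transformers.activations.gelu_new` is written as one big tensor expression, so each sub-operation (`pow`, the additions, the multiplies, `tanh`) materializes a temporary the size of the MLP hidden states, and one of those temporaries is what triggers the request. A minimal scalar sketch of the same formula (pure Python, just to show the computation — the real version operates on tensors):

```python
import math

def gelu_new(x: float) -> float:
    # Scalar form of the tanh-approximation GELU from the traceback.
    # In the tensor version, each sub-expression below allocates a
    # full-size temporary on the GPU, which is where the OOM occurs.
    return 0.5 * x * (1.0 + math.tanh(math.sqrt(2.0 / math.pi)
                                      * (x + 0.044715 * x ** 3)))

# GELU is ~0 for large negative inputs, ~x for large positive ones.
print(gelu_new(0.0))    # → 0.0
print(gelu_new(10.0))   # ≈ 10.0
```

This is only an illustration of the math; the fix for the OOM itself is about freeing or reducing GPU memory during conversion, not about the formula.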
Try adding torch.cuda.empty_cache() after https://github.com/NVIDIA-AI-IOT/torch2trt/blob/2732b35ac4dbe3d6e93cd74910af4e4729f1d93b/torch2trt/torch2trt.py#L553
Could you tell me how to use torch2trt to speed up GPT-2? I tried the example code, but it doesn't work.
It seems that torch2trt does not support GPT-2 at the moment.