torch2trt
CUDA out of memory
Hi there. I see many people have talked about this problem — is there any solution? Many thanks!
Traceback (most recent call last):
  File "model2trt.py", line 134, in <module>
    tokenizer, eod_id, sep_id, unk_id, model = load_model()
  File "model2trt.py", line 108, in load_model
    model_trt = torch2trt(model, [x], max_batch_size=1)
  File "/home/kingsoft/anaconda3/lib/python3.7/site-packages/torch2trt-0.3.0-py3.7.egg/torch2trt/torch2trt.py", line 528, in torch2trt
    outputs = module(*inputs)
  File "/home/kingsoft/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/kingsoft/anaconda3/lib/python3.7/site-packages/transformers/models/gpt2/modeling_gpt2.py", line 1057, in forward
    return_dict=return_dict,
  File "/home/kingsoft/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/kingsoft/anaconda3/lib/python3.7/site-packages/transformers/models/gpt2/modeling_gpt2.py", line 895, in forward
    output_attentions=output_attentions,
  File "/home/kingsoft/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/kingsoft/anaconda3/lib/python3.7/site-packages/transformers/models/gpt2/modeling_gpt2.py", line 432, in forward
    feed_forward_hidden_states = self.mlp(hidden_states)
  File "/home/kingsoft/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/kingsoft/anaconda3/lib/python3.7/site-packages/transformers/models/gpt2/modeling_gpt2.py", line 360, in forward
    hidden_states = self.act(hidden_states)
  File "/home/kingsoft/anaconda3/lib/python3.7/site-packages/transformers/activations.py", line 42, in gelu_new
    return 0.5 * x * (1.0 + torch.tanh(math.sqrt(2.0 / math.pi) * (x + 0.044715 * torch.pow(x, 3.0))))
RuntimeError: CUDA out of memory. Tried to allocate 1.72 GiB (GPU 0; 15.90 GiB total capacity; 14.22 GiB already allocated; 979.44 MiB free; 14.27 GiB reserved in total by PyTorch)
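For context on why the failure lands on that exact line: the error message says only 979.44 MiB was free, but the allocation request was 1.72 GiB. The tanh-based GELU in `transformers.activations.gelu_new` is written as one big tensor expression, so each sub-operation (`pow`, the additions, the multiplies, `tanh`) materializes a temporary the size of the MLP hidden states, and one of those temporaries is what triggers the request. A minimal scalar sketch of the same formula (pure Python, just to show the computation — the real version operates on tensors):

```python
import math

def gelu_new(x: float) -> float:
    # Scalar form of the tanh-approximation GELU from the traceback.
    # In the tensor version, each sub-expression below allocates a
    # full-size temporary on the GPU, which is where the OOM occurs.
    return 0.5 * x * (1.0 + math.tanh(math.sqrt(2.0 / math.pi)
                                      * (x + 0.044715 * x ** 3)))

# GELU is ~0 for large negative inputs, ~x for large positive ones.
print(gelu_new(0.0))    # → 0.0
print(gelu_new(10.0))   # ≈ 10.0
```

This is only an illustration of the math; the fix for the OOM itself is about freeing or reducing GPU memory during conversion, not about the formula.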
Try adding torch.cuda.empty_cache() after https://github.com/NVIDIA-AI-IOT/torch2trt/blob/2732b35ac4dbe3d6e93cd74910af4e4729f1d93b/torch2trt/torch2trt.py#L553
Could you tell me how to use torch2trt to speed up GPT-2? I tried the example code, but it doesn't work.
It seems that torch2trt does not support GPT-2 at the moment.