torch2trt
                        CUDA out of memory
Hi there. I see that many people have discussed this problem; is there a solution? Many thanks!
Traceback (most recent call last):
  File "model2trt.py", line 134, in <module>
    tokenizer, eod_id, sep_id, unk_id, model = load_model()
  File "model2trt.py", line 108, in load_model
    model_trt = torch2trt(model, [x], max_batch_size=1)
  File "/home/kingsoft/anaconda3/lib/python3.7/site-packages/torch2trt-0.3.0-py3.7.egg/torch2trt/torch2trt.py", line 528, in torch2trt
    outputs = module(*inputs)
  File "/home/kingsoft/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/kingsoft/anaconda3/lib/python3.7/site-packages/transformers/models/gpt2/modeling_gpt2.py", line 1057, in forward
    return_dict=return_dict,
  File "/home/kingsoft/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/kingsoft/anaconda3/lib/python3.7/site-packages/transformers/models/gpt2/modeling_gpt2.py", line 895, in forward
    output_attentions=output_attentions,
  File "/home/kingsoft/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/kingsoft/anaconda3/lib/python3.7/site-packages/transformers/models/gpt2/modeling_gpt2.py", line 432, in forward
    feed_forward_hidden_states = self.mlp(hidden_states)
  File "/home/kingsoft/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/kingsoft/anaconda3/lib/python3.7/site-packages/transformers/models/gpt2/modeling_gpt2.py", line 360, in forward
    hidden_states = self.act(hidden_states)
  File "/home/kingsoft/anaconda3/lib/python3.7/site-packages/transformers/activations.py", line 42, in gelu_new
    return 0.5 * x * (1.0 + torch.tanh(math.sqrt(2.0 / math.pi) * (x + 0.044715 * torch.pow(x, 3.0))))
RuntimeError: CUDA out of memory. Tried to allocate 1.72 GiB (GPU 0; 15.90 GiB total capacity; 14.22 GiB already allocated; 979.44 MiB free; 14.27 GiB reserved in total by PyTorch)
Try adding torch.cuda.empty_cache() after https://github.com/NVIDIA-AI-IOT/torch2trt/blob/2732b35ac4dbe3d6e93cd74910af4e4729f1d93b/torch2trt/torch2trt.py#L553
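If you would rather not patch the installed package, the same idea can be sketched as a small wrapper around the conversion call. This is a minimal sketch, not part of the torch2trt API: it assumes torch2trt 0.3.0 and a CUDA-enabled PyTorch install, and the helper name convert_with_cache_clear is made up for illustration.

```python
def convert_with_cache_clear(model, inputs, **kwargs):
    """Convert a model with torch2trt, clearing PyTorch's CUDA cache
    before and after so TensorRT has more free memory for its workspace.

    Hypothetical helper, not part of torch2trt itself.
    """
    import torch
    from torch2trt import torch2trt

    # Release cached-but-unused GPU blocks held by PyTorch's allocator
    # before TensorRT starts building the engine.
    torch.cuda.empty_cache()

    # kwargs are forwarded unchanged, e.g. max_batch_size=1, fp16_mode=True
    model_trt = torch2trt(model, inputs, **kwargs)

    # Free the cache again after the build, mirroring the suggestion of
    # adding empty_cache() inside torch2trt.py after engine creation.
    torch.cuda.empty_cache()
    return model_trt
```

Note that empty_cache() only returns memory that PyTorch has cached but is not using; if the model's activations themselves fill the 16 GiB card, you would still need to reduce the input size or try fp16_mode.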
Could you tell me how to use torch2trt to speed up GPT2? I tried the example code, but it doesn't work.
It seems that torch2trt does not support GPT2 at the moment.