
openai/clip-vit-large-patch14 cannot be traced with torch_tensorrt.compile

Open • kct22aws opened this issue 1 year ago • 1 comment

I am able to trace the model with torch.jit.trace, and I know the shapes of the input tensors. I don't think I am using any tensors outside the GPU, but I keep getting this error message when compiling the traced model with Torch-TensorRT:

Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

Below is my code:

import requests
import torch
import torch_tensorrt
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')

model = CLIPModel.from_pretrained("openai/clip-vit-large-patch14",  return_dict=False, torchscript=True)
processor = CLIPProcessor.from_pretrained("openai/clip-vit-large-patch14")

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

inputs = processor(text=["a photo of a cat", "a photo of a dog"], images=image, return_tensors="pt", padding=True)

batch_size = 1
batch_encoded = (
    torch.repeat_interleave(inputs["input_ids"], batch_size, 0),
    torch.repeat_interleave(inputs["pixel_values"], batch_size, 0),
)

### jit trace model 
jit_model = torch.jit.trace(model, [batch_encoded[0], batch_encoded[1]])

### use tensorrt to trace jit model
new_level = torch_tensorrt.logging.Level.Error
torch_tensorrt.logging.set_reportable_log_level(new_level)

enabled_precisions = {torch.float, torch.half}

inputs = [
    torch_tensorrt.Input(
        shape=[2, 7], dtype=torch.int32,
    ),
    torch_tensorrt.Input(
        shape=[1, 3, 224, 224], dtype=torch.float32,
    )
]


trt_model = torch_tensorrt.compile(
    jit_model.to(device),
    inputs=inputs,
    enabled_precisions={torch.float32},  # Run with 32-bit precision
    workspace_size=2000000000,
    truncate_long_and_double=True,
)

I am using a single NVIDIA A10G GPU on an AWS g5.2xlarge instance.

kct22aws, Jun 03 '23 00:06

My code works for other models. Could the failure be caused by some intermediate tensors in this model being placed on the CPU?

kct22aws, Jun 03 '23 14:06
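
One possible explanation, not confirmed anywhere in this thread, is that torch.jit.trace can record the device of tensors created inside forward as constants. In the code above the model is traced on the CPU and only moved to the GPU afterwards, so such constants might stay on the CPU. The sketch below shows two things under that assumption: a check of which devices a module's parameters and buffers live on, and tracing only after the model and example inputs are already on the target device. The helper names `devices_in_module` and `trace_on_device` are hypothetical, and a tiny `nn.Linear` stands in for CLIPModel so the example runs without downloading weights. Note that `devices_in_module` only sees registered parameters and buffers; it cannot detect constants baked into the traced graph.

```python
import torch
from torch import nn


def devices_in_module(module: nn.Module) -> set:
    """Collect the device of every parameter and buffer registered on the module."""
    tensors = list(module.parameters()) + list(module.buffers())
    return {str(t.device) for t in tensors}


def trace_on_device(model: nn.Module, example_inputs, device: torch.device):
    """Move the model and example inputs to `device` first, then trace there."""
    model = model.to(device).eval()
    example_inputs = tuple(t.to(device) for t in example_inputs)
    with torch.no_grad():
        traced = torch.jit.trace(model, example_inputs)
    return traced, example_inputs


device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Hypothetical stand-in for CLIPModel so the sketch runs without a download.
tiny = nn.Linear(4, 2)
traced, example = trace_on_device(tiny, (torch.randn(1, 4),), device)

print(devices_in_module(traced))  # devices of the traced module's weights
print(traced(*example).shape)     # output shape for the example input
```

For the CLIP code above, the equivalent step would be calling `trace_on_device(model, batch_encoded, device)` and passing the returned traced module to torch_tensorrt.compile instead of `jit_model.to(device)`. Whether that resolves the reported error has not been verified here.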