CLIP
openai/clip-vit-large-patch14 cannot be traced with torch_tensorrt.compile
I am able to trace the model with torch.jit.trace, and I know the shapes of the input tensors. I don't think I am using any tensors outside the GPU, but I keep getting this error message when I try to compile the traced model with Torch-TensorRT:
Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
Below is my code:
from PIL import Image
from transformers import AutoTokenizer, AutoModel
import transformers
import torch
import time
import torch_tensorrt
from transformers import CLIPProcessor, CLIPModel
import requests
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
model = CLIPModel.from_pretrained("openai/clip-vit-large-patch14", return_dict=False, torchscript=True)
processor = CLIPProcessor.from_pretrained("openai/clip-vit-large-patch14")
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)
inputs = processor(text=["a photo of a cat", "a photo of a dog"], images=image, return_tensors="pt", padding=True)
batch_size = 1
batch_encoded = (
    torch.repeat_interleave(inputs["input_ids"], batch_size, 0),
    torch.repeat_interleave(inputs["pixel_values"], batch_size, 0),
)
### jit trace model
jit_model = torch.jit.trace(model, [batch_encoded[0], batch_encoded[1]])
### use tensorrt to trace jit model
new_level = torch_tensorrt.logging.Level.Error
torch_tensorrt.logging.set_reportable_log_level(new_level)
enabled_precisions = {torch.float, torch.half}
inputs = [
    torch_tensorrt.Input(
        shape=[2, 7], dtype=torch.int32,
    ),
    torch_tensorrt.Input(
        shape=[1, 3, 224, 224], dtype=torch.float32,
    ),
]
trt_model = torch_tensorrt.compile(
    jit_model.to(device),
    inputs=inputs,
    enabled_precisions={torch.float32},  # Run with 32-bit precision
    workspace_size=2000000000,
    truncate_long_and_double=True,
)
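Note that in the code above the model and the example inputs are still on the CPU when torch.jit.trace runs, and the traced module is only moved to the GPU afterwards. I have not confirmed that this matters for CLIP, but one variation worth trying is to move everything to the target device before tracing. The sketch below shows that ordering on a toy module; `trace_on_device` and `TinyModel` are hypothetical names used only for illustration, not part of transformers or Torch-TensorRT.

```python
import torch
from torch import nn


def trace_on_device(model: nn.Module, example_inputs, device: torch.device):
    """Hypothetical helper: move model and inputs to `device`, then trace."""
    model = model.to(device).eval()
    example_inputs = tuple(t.to(device) for t in example_inputs)
    with torch.no_grad():
        return torch.jit.trace(model, example_inputs)


class TinyModel(nn.Module):
    """Toy stand-in for CLIPModel, used only to exercise the helper."""

    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 2)

    def forward(self, x):
        return self.linear(x)


device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
traced = trace_on_device(TinyModel(), (torch.randn(3, 4),), device)
out = traced(torch.randn(3, 4, device=device))
print(tuple(out.shape), out.device)
```

For the CLIP case this would mean calling the helper with `model` and `(batch_encoded[0], batch_encoded[1])` in place of the toy module and tensor, and then passing the result to torch_tensorrt.compile.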
I am using a single NVIDIA A10G GPU on an AWS g5.2xlarge instance.
My code works for other models. Is it possible that this model maps some intermediate tensors to the CPU?
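One way to check this might be to look for hard-coded devices in the traced graph. As far as I understand, torch.jit.trace records a value such as `x.device` as a constant, so a module traced on the CPU can keep creating tensors on the CPU even after `.to(device)`. Whether that is what happens inside CLIP is only my assumption. The sketch below uses a hypothetical helper, `find_device_constants`, and a toy function, `make_index`, to show what such a constant looks like in a graph.

```python
import torch


def find_device_constants(graph) -> list:
    """Return the lines of a TorchScript graph that define a constant Device."""
    return [
        line.strip()
        for line in str(graph).splitlines()
        if "Device = prim::Constant" in line
    ]


def make_index(x):
    # x.device is read as a plain Python value while tracing, so it is
    # stored in the graph as a constant instead of following the input.
    return torch.arange(x.shape[0], device=x.device)


# Trace with a CPU example input, as in the code above.
traced = torch.jit.trace(make_index, torch.zeros(2, 7))
device_constants = find_device_constants(traced.graph)
for line in device_constants:
    print(line)
```

For the real model, the same helper could be called on `jit_model.inlined_graph`, so that constants inside submodules are included as well.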