How to improve inference time
My code is:
import numpy as np
import torch

def extract_global_vector(backbone, img_tensor, device):
    # Add a batch dimension and move the image to the target device.
    x = img_tensor.unsqueeze(0).to(device)
    with torch.inference_mode():
        feats = backbone.get_intermediate_layers(x, n=range(12), reshape=True, norm=False)
    last = feats[-1].squeeze(0)
    if last.ndim == 3:
        # (C, H, W) feature map: global-average-pool over the spatial grid.
        vec = last.view(last.shape[0], -1).mean(dim=1)
    elif last.ndim == 2:
        # (tokens, C) token sequence: average over tokens.
        vec = last.mean(dim=0)
    else:
        raise RuntimeError(f"Unexpected feature shape: {last.shape}")
    vec = vec.detach().cpu().numpy().astype(np.float32)
    # L2-normalize with a small epsilon to avoid division by zero.
    norm = np.linalg.norm(vec) + 1e-8
    vec = vec / norm
    return vec
I want to reduce the feature-extraction time. The main cost is in this line:
feats = backbone.get_intermediate_layers(x, n=range(12), reshape=True, norm=False)
Is there any way to improve it?
Maybe try to torch.compile your backbone? (backbone = torch.compile(backbone))
Also, since you only use last = feats[-1], would it not be better to just use forward_features or get_intermediate_layers(x, n=1)? (This is a question; I'm not sure they do the same thing as what you are doing here.)
I tried feats = backbone.forward_features(x) and feats = backbone.get_intermediate_layers(x, n=[11], reshape=True, norm=False), and they are not really faster. But backbone = torch.compile(backbone) did work, thanks a lot! I would like to know about more ways to improve.
If you're using a GPU, you should also autocast your forward call here to fp16!
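A sketch of what that could look like, again with a placeholder model rather than the real backbone. On CUDA, autocast runs convolutions and matmuls in fp16; on CPU it falls back to bfloat16 (fp16 autocast is a CUDA feature). The result should be cast back to fp32 before the NumPy normalization step:

```python
import torch

# Hypothetical stand-in for the real backbone.
backbone = torch.nn.Sequential(
    torch.nn.Conv2d(3, 8, 3, padding=1),
    torch.nn.AdaptiveAvgPool2d(1),
    torch.nn.Flatten(),
)

device = "cuda" if torch.cuda.is_available() else "cpu"
backbone = backbone.to(device).eval()
x = torch.randn(1, 3, 32, 32, device=device)

# fp16 on GPU, bf16 on CPU (fp16 autocast is not supported on CPU).
dtype = torch.float16 if device == "cuda" else torch.bfloat16
with torch.inference_mode(), torch.autocast(device_type=device, dtype=dtype):
    feats = backbone(x)

# Cast back to fp32 before the .numpy()/normalization step.
vec = feats.squeeze(0).float()
print(vec.dtype)  # torch.float32
```

The half-precision forward mainly helps on GPU, where it roughly halves memory traffic; keep the final normalization in fp32 so the L2 norm stays accurate.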