PaddleClas
TensorRT inference: no speed gain
Hi,
We are trying to use TensorRT to speed up inference. Specifically, we are using a ViT_base_patch16_384 model and installed a build of paddlepaddle_gpu compiled with TensorRT. However, inference speed stays about the same with use_tensorrt=True and use_fp16=True. This contrasts with the PaddleOCR repo, where we do observe a large speedup, especially with FP16.
Any idea what could be the reason? Any help is appreciated, thanks. Relevant code:
import numpy as np
from paddleclas import PaddleClas

clas = PaddleClas(
    inference_model_dir=f'{PaddleClsConfig.model_path}/export',
    resize_short=384,
    crop_size=384,
    class_id_map_file=PaddleClsConfig.labels_path,
    use_tensorrt=True,
    use_fp16=True,
    enable_mkldnn=True,  # note: MKL-DNN only affects CPU inference, not the GPU/TensorRT path
)
result = clas.predict(img, print_pred=True)  # predict() returns a generator
scores = next(result)
print(scores)
print(np.argmax(scores), f'{np.max(scores):.4f}')
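To make the comparison concrete, here is a minimal, hypothetical timing harness (not part of PaddleClas) we use to compare latency with TensorRT/FP16 on versus off. It runs a few warmup calls first, since the first TensorRT invocations include engine building and are much slower than steady state:

```python
import time

def benchmark(predict_fn, n_warmup=10, n_runs=50):
    """Return average latency of predict_fn in milliseconds.

    Warmup calls are excluded so one-time costs (e.g. TensorRT
    engine construction on first run) do not skew the average.
    """
    for _ in range(n_warmup):
        predict_fn()
    start = time.perf_counter()
    for _ in range(n_runs):
        predict_fn()
    return (time.perf_counter() - start) / n_runs * 1000.0
```

For example, `benchmark(lambda: next(clas.predict(img)))` with two differently configured `clas` instances gives a fair before/after comparison.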
Hi, the problem may be caused by TensorRT's currently poor optimization of ViT-series models. You could try other models, such as ResNet.
Thanks for the answer. We have many classes, and found that ViT performs better than ResNet for our task.