
TensorRT inference: no speed gain

mokadevcloud opened this issue

Hi, we are trying to use TensorRT to speed up inference. In particular, we are using a ViT_base_patch16_384 model and installed a version of paddlepaddle_gpu compiled with TensorRT support. However, the inference speed remains about the same with `use_tensorrt=True` and `use_fp16=True`. This is in contrast with the PaddleOCR repo, where we do observe a large speedup, especially with fp16.

Any idea what could be the reason? Any help is appreciated, thanks. Relevant code:

    import numpy as np
    from paddleclas import PaddleClas

    # Build the classifier from the exported inference model.
    clas = PaddleClas(
        inference_model_dir=f'{PaddleClsConfig.model_path}/export',
        resize_short=384,    # match the model's 384x384 input
        crop_size=384,
        class_id_map_file=PaddleClsConfig.labels_path,
        use_tensorrt=True,   # GPU path: TensorRT subgraph engine
        use_fp16=True,       # half-precision TensorRT kernels
        enable_mkldnn=True,  # CPU-only optimization; has no effect on the GPU path
    )

    result = clas.predict(img, print_pred=True)
    scores = next(result)
    print(scores)
    print(np.argmax(scores), f'{np.max(scores):.4f}')
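
One way to narrow this down is to time the predictor directly, with and without `use_tensorrt=True`, rather than relying on overall wall-clock feel. The helper below is a minimal sketch; the `predict_fn` callable and the warm-up/run counts are assumptions, not part of the original report (with PaddleClas you would pass something like `lambda: next(clas.predict(img))`):

```python
import time

def benchmark(predict_fn, n_warmup=5, n_runs=20):
    """Return the average latency of predict_fn in milliseconds.

    Runs n_warmup untimed calls first, so one-time costs such as
    TensorRT engine building do not pollute the measurement.
    """
    for _ in range(n_warmup):
        predict_fn()
    start = time.perf_counter()
    for _ in range(n_runs):
        predict_fn()
    return (time.perf_counter() - start) / n_runs * 1000.0

# Stand-in workload for illustration; substitute the real predict call.
latency_ms = benchmark(lambda: sum(i * i for i in range(10000)))
print(f'average latency: {latency_ms:.2f} ms')
```

If the averaged latency is identical across both configurations, TensorRT is likely not being applied to the model's subgraphs at all, which is worth checking before tuning fp16 settings.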

mokadevcloud avatar Mar 02 '22 17:03 mokadevcloud

Hi, the problem may be caused by TensorRT's currently poor optimization of ViT series models. You can try other models, such as ResNet.

TingquanGao avatar Mar 03 '22 12:03 TingquanGao

Thanks for the answer. We have a lot of classes, and found that ViT performs better than ResNet.

mokadevcloud avatar Mar 04 '22 12:03 mokadevcloud