
TensorRT inference: no speed gain

mokadevcloud opened this issue

Hi, we are trying to use TensorRT to speed up inference. In particular, we are using a ViT_base_patch16_384 model and installed a version of paddlepaddle_gpu compiled with TensorRT support. However, the inference speed remains about the same with `use_tensorrt=True` and `use_fp16=True`. This is in contrast with the PaddleOCR repo, where we do observe a large speedup, especially with fp16.

Any idea what could be the reason? Any help is appreciated, thanks. Relevant code:

    import numpy as np
    from paddleclas import PaddleClas

    # Build the classifier from the exported inference model.
    clas = PaddleClas(
        inference_model_dir=f'{PaddleClsConfig.model_path}/export',
        resize_short=384,    # match the model's 384x384 input
        crop_size=384,
        class_id_map_file=PaddleClsConfig.labels_path,
        use_tensorrt=True,   # GPU path: TensorRT subgraph engine
        use_fp16=True,       # half-precision TensorRT kernels
        enable_mkldnn=True,  # CPU-only optimization; has no effect on the GPU path
    )

    result = clas.predict(img, print_pred=True)
    scores = next(result)
    print(scores)
    print(np.argmax(scores), f'{np.max(scores):.4f}')
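
One way to narrow this down is to time the predictor directly, with and without `use_tensorrt=True`, rather than relying on overall wall-clock feel. The helper below is a minimal sketch; the `predict_fn` callable and the warm-up/run counts are assumptions, not part of the original report (with PaddleClas you would pass something like `lambda: next(clas.predict(img))`):

```python
import time

def benchmark(predict_fn, n_warmup=5, n_runs=20):
    """Return the average latency of predict_fn in milliseconds.

    Runs n_warmup untimed calls first, so one-time costs such as
    TensorRT engine building do not pollute the measurement.
    """
    for _ in range(n_warmup):
        predict_fn()
    start = time.perf_counter()
    for _ in range(n_runs):
        predict_fn()
    return (time.perf_counter() - start) / n_runs * 1000.0

# Stand-in workload for illustration; substitute the real predict call.
latency_ms = benchmark(lambda: sum(i * i for i in range(10000)))
print(f'average latency: {latency_ms:.2f} ms')
```

If the averaged latency is identical across both configurations, TensorRT is likely not being applied to the model's subgraphs at all, which is worth checking before tuning fp16 settings.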

mokadevcloud avatar Mar 02 '22 17:03 mokadevcloud

Hi, the problem may be caused by TensorRT's currently poor optimization of ViT series models. You can try other models, such as ResNet.

TingquanGao avatar Mar 03 '22 12:03 TingquanGao

Thanks for the answer. We have a lot of classes, and found that ViT performs better than ResNet.

mokadevcloud avatar Mar 04 '22 12:03 mokadevcloud