Kernel selection question when running the ViT model on TensorRT

Open lixiaolx opened this issue 3 years ago • 1 comments

Description

While testing the ViT model with the FasterTransformer TensorRT (TRT) plugin, I captured an nsys profile and found that the kernels being launched do not match the SM version of the hardware.

On an A10, I would expect the sm86 kernels to be selected, but the profile shows that sm80 kernels are used. Will this affect ViT performance on the A10?

image

Reproduced Steps

I followed the setup section of the ViT guide and specified sm86 in the build:
https://github.com/NVIDIA/FasterTransformer/blob/main/docs/vit_guide.md#setup
cmake -DSM=86 -DCMAKE_BUILD_TYPE=Release -DBUILD_PYT=ON -DBUILD_TRT=ON ..
Environment: A10, CUDA 11.6, TensorRT 8.4

lixiaolx avatar Aug 30 '22 07:08 lixiaolx

That's because sm86 reuses the sm80 kernels. This is expected behavior.
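To illustrate the kind of fallback described above, here is a minimal Python sketch. It is not FasterTransformer's or TensorRT's actual dispatch code; the function name `select_kernel_arch` and the list of compiled architectures are hypothetical. It only relies on the general CUDA rule that a binary built for compute capability X.y can run on a device of capability X.z when z >= y, which is why an sm80 kernel can run on an sm86 device such as the A10.

```python
from typing import List, Optional


def select_kernel_arch(device_sm: int, compiled_sms: List[int]) -> Optional[int]:
    """Pick the newest precompiled kernel arch that can run on the device.

    Hypothetical helper for illustration only. SM versions are written as
    two-digit integers, e.g. 86 for compute capability 8.6.
    """
    dev_major, dev_minor = divmod(device_sm, 10)
    # A kernel binary is usable if it has the same major version as the
    # device and a minor version that is not newer than the device's.
    candidates = [
        sm for sm in compiled_sms
        if sm // 10 == dev_major and sm % 10 <= dev_minor
    ]
    return max(candidates, default=None)


if __name__ == "__main__":
    # Assumed example: kernels exist for sm70, sm72, sm75 and sm80 only,
    # and the device is an A10 (sm86).
    print(select_kernel_arch(86, [70, 72, 75, 80]))
```

Under these assumptions, an sm86 device ends up with the sm80 kernel whenever no dedicated sm86 variant has been compiled, which matches what the nsys profile in the question shows.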

byshiue avatar Aug 30 '22 07:08 byshiue

Closing this issue because it is inactive. Feel free to re-open it if you still have any problems.

byshiue avatar Dec 02 '22 14:12 byshiue