FasterTransformer
Kernel selection question when running the ViT model on TensorRT
Description
When running the TensorRT test of the ViT model with FasterTransformer, an nsys profile shows that the kernels in use do not match the hardware SM version.
On the A10, kernels built for sm86 should be selected, but the profile shows that sm80 kernels are used. Will this affect ViT performance on the A10?

Reproduced Steps
The build follows the setup guide and specifies sm86 explicitly:
https://github.com/NVIDIA/FasterTransformer/blob/main/docs/vit_guide.md#setup
cmake -DSM=86 -DCMAKE_BUILD_TYPE=Release -DBUILD_PYT=ON -DBUILD_TRT=ON ..
Environment: A10, CUDA 11.6, TensorRT 8.4
That's because sm86 reuses the sm80 kernels. This is expected behavior.
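The reuse described above is consistent with CUDA's binary compatibility rule: a kernel binary built for compute capability X.y can run on a device of compute capability X.z when z >= y. The following is only a minimal sketch of that rule, not FasterTransformer's or TensorRT's actual dispatch logic; the function name `pick_kernel_arch` and the list of available targets are hypothetical.

```python
def pick_kernel_arch(device_sm, available_sms):
    """Return the highest available SM target that can run on device_sm, or None.

    Illustrative only: models the rule that a binary for X.y runs on X.z if z >= y.
    SM versions are given as integers, e.g. 86 for compute capability 8.6.
    """
    dev_major, dev_minor = divmod(device_sm, 10)
    compatible = [
        sm for sm in available_sms
        # Same major version, and a minor version no newer than the device's.
        if sm // 10 == dev_major and sm % 10 <= dev_minor
    ]
    return max(compatible, default=None)


# Hypothetical case: an A10 (sm86) with kernels shipped only for sm70, sm75, sm80.
print(pick_kernel_arch(86, [70, 75, 80]))  # 80
```

Under this rule, a library that ships no sm86-specific kernels would fall back to its sm80 ones on an A10, which matches what the nsys profile shows.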
Closing this bug because it is inactive. Feel free to re-open it if you still have any problems.