
Missing kernels for sm_87 (Jetson Orin AGX)

Open hidoba opened this issue 10 months ago • 1 comment

Jetson Orin AGX, using TensorRT-LLM version 0.10 installed from pip.
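For anyone reproducing this, the sketch below is one way to confirm which SM architecture the device reports. It assumes PyTorch is installed and uses `torch.cuda.get_device_capability()`. The helper names `sm_name` and `current_sm` are hypothetical and not part of TensorRT-LLM. On a Jetson Orin AGX the compute capability is 8.7, which corresponds to sm_87.

```python
def sm_name(major, minor):
    """Format a CUDA compute capability pair as an SM architecture name."""
    return f"sm_{major}{minor}"


def current_sm():
    """Return the SM name of the current CUDA device, or None if unavailable.

    Assumes PyTorch is the way the capability is queried; returns None when
    PyTorch is not installed or no CUDA device is visible.
    """
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    major, minor = torch.cuda.get_device_capability()
    return sm_name(major, minor)


print(current_sm())
```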

No response

[X] An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
[ ] My own task or dataset (give details below)

Run the official Llama examples.

Use optimized fused MHA kernels.

[TensorRT-LLM][WARNING] Fall back to unfused MHA because of unsupported head size 128 in sm_87.
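To check whether a run hit this fallback, the log can be scanned for the warning. This is a minimal sketch: the regular expression is modeled only on the single warning line quoted above, so other TensorRT-LLM versions may word the message differently, and `find_mha_fallbacks` is a hypothetical helper name.

```python
import re

# Pattern modeled on the one warning line quoted in this issue (an assumption:
# the wording may differ in other TensorRT-LLM versions).
FALLBACK_RE = re.compile(
    r"\[TensorRT-LLM\]\[WARNING\] Fall back to unfused MHA because of "
    r"unsupported head size (?P<head_size>\d+) in (?P<arch>sm_\d+)"
)


def find_mha_fallbacks(log_text):
    """Return a (head_size, arch) pair for every fallback warning in log_text."""
    return [
        (int(m.group("head_size")), m.group("arch"))
        for m in FALLBACK_RE.finditer(log_text)
    ]


sample = (
    "[TensorRT-LLM][WARNING] Fall back to unfused MHA because of "
    "unsupported head size 128 in sm_87."
)
print(find_mha_fallbacks(sample))  # [(128, 'sm_87')]
```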

Version 0.10 is available on pip for Jetson Orin AGX. However, it is not very useful there because the compiled kernels for sm_87 are missing.

Apr 28 '24 17:04 hidoba

TensorRT-LLM does not currently include fused MHA kernels for sm_87. If you are interested, we can change this issue to a feature request.

Apr 30 '24 03:04 byshiue

Closing this for now; you may reopen it as a feature request.

Jun 04 '24 02:06 nv-guomingz