
Missing kernels for sm_87 (Jetson Orin AGX)

Open hidoba opened this issue 10 months ago • 1 comments

System Info

Jetson Orin AGX, using TensorRT-LLM version 0.10 installed from pip

Who can help?

No response

Information

  • [X] The official example scripts
  • [ ] My own modified scripts

Tasks

  • [X] An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • [ ] My own task or dataset (give details below)

Reproduction

Run official examples for Llama.

Expected behavior

Use optimized fused MHA kernels.

Actual behavior

[TensorRT-LLM][WARNING] Fall back to unfused MHA because of unsupported head size 128 in sm_87.
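To check whether a given run hit this fallback, the log can be scanned for the warning. This is only a minimal sketch: it assumes the message wording matches the line quoted above, and `find_mha_fallbacks` is a hypothetical helper name, not part of TensorRT-LLM.

```python
import re

# Pattern modeled on the warning quoted in this issue.
# Assumption: other versions may word the message differently.
FALLBACK_RE = re.compile(
    r"Fall back to unfused MHA because of unsupported head size (\d+) in sm_(\d+)"
)


def find_mha_fallbacks(log_text):
    """Return (head_size, sm_version) pairs, one per fallback warning found."""
    return [(int(head), int(sm)) for head, sm in FALLBACK_RE.findall(log_text)]


sample = (
    "[TensorRT-LLM][WARNING] Fall back to unfused MHA because of "
    "unsupported head size 128 in sm_87."
)
print(find_mha_fallbacks(sample))  # [(128, 87)]
```

An empty result means no fallback warning of this form appeared in the text that was scanned; it does not by itself prove that fused kernels were used.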

Additional notes

Version 0.10 is available from pip for Jetson Orin AGX. However, it is of limited use because the compiled fused MHA kernels for sm_87 are missing.

hidoba, Apr 28 '24 17:04

TensorRT-LLM does not currently include fused MHA kernels for sm_87. If you are interested, we can change this issue to a feature request.

byshiue, Apr 30 '24 03:04

Closing this for now; you may reopen it as a feature request.

nv-guomingz, Jun 04 '24 02:06