Support for Falcon models

Open • ankit201 opened this issue 1 year ago • 1 comment

Falcon uses multi-query attention, and FT's conversion doesn't support multi-query attention models. Is support for this planned?
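For context, here is a rough sketch of why a converter written for standard multi-head attention can't be reused as-is for a multi-query model: the fused QKV projection no longer splits into three equal parts, because all query heads share a single key/value head. The helper name `qkv_split_sizes` is hypothetical (it is not part of FT or TensorRT-LLM), and the sizes used are the publicly listed Falcon-7B values (hidden size 4544, 71 heads), given here only for illustration.

```python
from typing import Tuple


def qkv_split_sizes(hidden_size: int, num_heads: int, num_kv_heads: int) -> Tuple[int, int, int]:
    """Return the (Q, K, V) row counts of a fused QKV projection weight.

    Hypothetical helper for illustration only, not an FT or TensorRT-LLM API.
    num_kv_heads == num_heads -> standard multi-head attention
    num_kv_heads == 1         -> multi-query attention
    """
    if hidden_size % num_heads != 0:
        raise ValueError("hidden_size must be divisible by num_heads")
    if num_heads % num_kv_heads != 0:
        raise ValueError("num_heads must be divisible by num_kv_heads")
    head_dim = hidden_size // num_heads
    q_rows = num_heads * head_dim      # one slice per query head
    kv_rows = num_kv_heads * head_dim  # K and V only have num_kv_heads slices
    return q_rows, kv_rows, kv_rows


# Assumed Falcon-7B-like sizes: hidden_size=4544, 71 query heads.
mha = qkv_split_sizes(4544, 71, 71)  # multi-head: three equal slices
mqa = qkv_split_sizes(4544, 71, 1)   # multi-query: one shared K/V head

print(mha, sum(mha))
print(mqa, sum(mqa))
```

A converter that assumes the fused weight has `3 * hidden_size` rows would mis-slice the multi-query layout, so the split has to be driven by the number of key/value heads.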

ankit201 • Jun 14 '23 10:06

FasterTransformer development has transitioned to TensorRT-LLM.

Falcon is supported in TensorRT-LLM; please refer to this example.

byshiue • Oct 20 '23 07:10