When will FP8 be available for Mixtral?

Open • Pernekhan opened this issue 11 months ago • 11 comments

Could you guys share a rough timeline for FP8 quantization support for the Mixtral (MoE) model?

cc: @Tracin

Pernekhan, Mar 04 '24 22:03