TensorRT-LLM
When will FP8 be available for Mixtral?
Could you share a rough timeline for FP8 quantization support for the Mixtral (MoE) model?
cc: @Tracin