Why is the fp8_e4m3 min_scaling_factor divided by 512?

Open · suxi1314 opened this issue 7 months ago · 1 comment

The line at https://github.com/NVIDIA/TensorRT-LLM/blob/main/cpp/tensorrt_llm/common/cudaFp8Utils.cu#L219 reads:

`constexpr float min_scaling_factor = 1.0f / (FP8_E4M3_MAX * 512.f);`

Why is the factor 512?
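For anyone reading along, here is a minimal host-side sketch of the arithmetic behind that constant. It assumes `FP8_E4M3_MAX` is 448 (the largest finite E4M3 value) and that the constant is used as a lower bound on a scale computed as `amax / FP8_E4M3_MAX`. The helper `clampedScale` is hypothetical and is not a TensorRT-LLM function. One observation, not a confirmed rationale from the maintainers: 1/512 = 2^-9 is the smallest positive subnormal value in E4M3, so under these assumptions the clamp takes effect exactly when `amax` drops below 2^-9, and it also keeps the scale away from zero.

```cpp
#include <algorithm>

// Assumption: largest finite value representable in FP8 E4M3.
constexpr float FP8_E4M3_MAX = 448.0f;

// The constant quoted in the question: 1 / (448 * 512) = 1 / 229376, about 4.36e-6.
constexpr float min_scaling_factor = 1.0f / (FP8_E4M3_MAX * 512.f);

// Hypothetical helper (illustration only): derive a per-tensor scale from the
// absolute maximum `amax`, clamped from below so it can never reach zero.
inline float clampedScale(float amax)
{
    return std::max(amax / FP8_E4M3_MAX, min_scaling_factor);
}
```

With this sketch, `clampedScale(0.0f)` returns `min_scaling_factor` rather than 0, so a later division by the scale stays finite, while `clampedScale(448.0f)` returns 1.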

suxi1314 · Jul 18 '24 08:07