mklachko

1 issue by mklachko

The [README](https://github.com/NVIDIA/TensorRT-LLM/tree/main/examples/quantization) says:

> nvfp4: Weights are quantized to NVFP4 block-wise with size 16. Activation global scales are calibrated.
> fp8: Weights are quantized to FP8 tensor-wise. Activation ranges...

A schematic contrast of the two scaling granularities is sketched after the issue labels below.

Labels: triaged, Low Precision, Investigating
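As a rough illustration of the difference the README describes, the sketch below computes per-block scales for NVFP4 (one scale per 16 contiguous weights) versus a single per-tensor scale for FP8. This is not TensorRT-LLM or ModelOpt code; the helper names (`nvfp4_blockwise_scales`, `fp8_tensorwise_scale`), the use of NumPy, and the max-magnitude constants (6.0 for FP4 E2M1, 448 for FP8 E4M3) are assumptions made for illustration only.

```python
# Schematic contrast of block-wise vs tensor-wise weight scaling.
# Not TensorRT-LLM code; names and constants are illustrative assumptions.
import numpy as np

FP4_MAX = 6.0    # largest representable E2M1 magnitude (assumed)
FP8_MAX = 448.0  # largest representable E4M3 magnitude (assumed)

def nvfp4_blockwise_scales(weights: np.ndarray, block_size: int = 16) -> np.ndarray:
    """One scale per contiguous block of `block_size` weights."""
    blocks = weights.reshape(-1, block_size)
    amax = np.abs(blocks).max(axis=1)   # per-block absolute max
    return amax / FP4_MAX               # scale mapping each block into FP4 range

def fp8_tensorwise_scale(weights: np.ndarray) -> float:
    """A single scale for the whole weight tensor."""
    return float(np.abs(weights).max() / FP8_MAX)

w = np.random.randn(4, 64).astype(np.float32)
print(nvfp4_blockwise_scales(w).shape)  # (16,): one scale per 16-element block
print(fp8_tensorwise_scale(w))          # one scalar for the entire tensor
```

In both schemes the activation side still needs calibration (a global scale for nvfp4, activation ranges for fp8), which the quoted README sentences refer to.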