mklachko
The [README](https://github.com/NVIDIA/TensorRT-LLM/tree/main/examples/quantization) says:

> nvfp4: Weights are quantized to NVFP4 block-wise with size 16. Activation global scale are calibrated.
> fp8: Weights are quantized to FP8 tensor wise. Activation ranges...
Labels: triaged, Low Precision, Investigating
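
For context, here is a minimal NumPy sketch contrasting the two schemes quoted above: block-wise quantization with block size 16 (the nvfp4-style description) versus a single per-tensor scale (the fp8-style description). The scale computation, clipping ranges, and constant values below are illustrative assumptions, not TensorRT-LLM's actual quantization kernels.

```python
# Sketch of block-wise (size 16) vs. per-tensor weight scaling.
# Assumptions: max-abs scaling, no rounding to the FP4/FP8 grids,
# and range constants taken as the usual E2M1 / E4M3 maxima.
import numpy as np

FP8_E4M3_MAX = 448.0   # max representable magnitude in FP8 E4M3
NVFP4_MAX = 6.0        # max representable magnitude in FP4 E2M1

def quantize_per_tensor(w: np.ndarray, qmax: float = FP8_E4M3_MAX):
    """One scale for the whole tensor (fp8 'tensor wise' style)."""
    scale = np.abs(w).max() / qmax
    q = np.clip(w / scale, -qmax, qmax)        # real kernels also round to the FP8 grid
    return q, scale

def quantize_blockwise(w: np.ndarray, block: int = 16, qmax: float = NVFP4_MAX):
    """One scale per block of 16 consecutive weights (nvfp4 'block-wise' style)."""
    w2 = w.reshape(-1, block)                  # assumes the weight count is divisible by 16
    scales = np.abs(w2).max(axis=1, keepdims=True) / qmax
    q = np.clip(w2 / scales, -qmax, qmax)      # real kernels also round to the FP4 grid
    return q.reshape(w.shape), scales

w = np.random.randn(4, 64).astype(np.float32)
q_t, s_t = quantize_per_tensor(w)
q_b, s_b = quantize_blockwise(w)
print("per-tensor scale:", s_t, "| number of block scales:", s_b.size)
```

The practical difference is that the block-wise scheme carries one scale per 16 weights, so each small block keeps its own dynamic range, while the per-tensor scheme stretches a single scale over the entire weight tensor.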