
[Q] Is nearbyint necessary for FP8 quantizelinear?

Open umangyadav opened this issue 1 year ago • 0 comments

For integer quantization, the quantizelinear operation is nearbyint(x / scale) + zeropoint.
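
For reference, a minimal sketch of the integer formula above in plain Python. The function name `quantize_linear_int8`, the int8 target range, and the final clamp are assumptions made for illustration and are not taken from the MIGraphX source. Python's built-in `round()` stands in for nearbyint here because both round halfway cases to even (nearbyint under the default rounding mode).

```python
def quantize_linear_int8(x, scale, zero_point=0):
    """Hypothetical helper: nearbyint(x / scale) + zero_point, clamped to int8."""
    # round() on a float rounds halfway cases to the nearest even integer,
    # which matches nearbyint under the default floating-point rounding mode.
    q = round(x / scale) + zero_point
    # Clamping to [-128, 127] is an assumption; the formula in the question
    # does not mention saturation.
    return max(-128, min(127, q))


values = [-1.0, 0.25, 0.75, 1.5, 100.0]
quantized = [quantize_linear_int8(v, scale=0.5) for v in values]
print(quantized)  # [-2, 0, 2, 3, 127]
```

With scale 0.5, the inputs 0.25 and 0.75 land exactly on 0.5 and 1.5 before rounding, so they show the round-half-to-even behaviour (0 and 2), and 100.0 shows the assumed clamp.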

FP8 is already a floating-point format. Is nearbyint necessary in that case?

umangyadav, Jan 19 '24 14:01