Set the default scale_dtype to FP16

Open • wenhuach21 opened this issue 9 months ago • 0 comments

Packing with the AutoGPTQ Triton backend does not require FP32 scales, so we can make FP16 the default scale_dtype instead. Before switching the default, we still need to validate accuracy on some models.
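As a rough illustration of the kind of check this implies, here is a minimal, standard-library-only sketch that quantizes one weight group to 4 bits and compares dequantization with an FP32 scale against an FP16 scale. It is not AutoRound or AutoGPTQ code: the helpers `to_fp16`, `to_fp32`, `quantize_group`, and `dequantize`, the group size of 128, and the synthetic Gaussian weights are all assumptions made for the example. Real validation would compare model-level accuracy, not a single group.

```python
import random
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_fp32(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def quantize_group(weights, bits=4):
    """Asymmetric min/max quantization of one group (hypothetical helper).

    Returns the integer codes, the scale, and the integer zero point.
    """
    qmax = 2**bits - 1
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / qmax
    if scale == 0.0:  # constant group: avoid division by zero
        scale = 1.0
    zero_point = round(-lo / scale)
    codes = [min(qmax, max(0, round(w / scale) + zero_point)) for w in weights]
    return codes, scale, zero_point


def dequantize(codes, scale, zero_point):
    """Map integer codes back to floating-point weights."""
    return [(q - zero_point) * scale for q in codes]


random.seed(0)
# Synthetic weights standing in for one quantization group (assumption).
group = [random.gauss(0.0, 0.02) for _ in range(128)]

codes, scale, zero_point = quantize_group(group)
deq_fp32 = dequantize(codes, to_fp32(scale), zero_point)
deq_fp16 = dequantize(codes, to_fp16(scale), zero_point)

# Extra error introduced only by storing the scale in FP16 instead of FP32.
scale_gap = max(abs(a - b) for a, b in zip(deq_fp32, deq_fp16))
# Error already present from 4-bit rounding, measured with the FP32 scale.
quant_err = max(abs(w - d) for w, d in zip(group, deq_fp32))

print(f"max gap from FP16 scale: {scale_gap:.3e}")
print(f"max 4-bit rounding error: {quant_err:.3e}")
```

In this toy setup the gap caused by the FP16 scale stays well below the 4-bit rounding error itself, because FP16 keeps about 11 bits of significand for values in its normal range. Scales that are very small (near or below the FP16 minimum normal value of about 6.1e-5) or very large (above 65504) would not behave this way, which is one reason per-model validation is still needed.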

wenhuach21 • May 06 '24 13:05

aligned

wenhuach21 • May 16 '24 09:05