torchchat Explore expanding dynamic quantization kernels (broaden a8w4dq support)

Open. mikekgfb opened this issue 9 months ago • 1 comment

Add support for:

1. Asymmetric a8w4dq. This basically requires subtracting the zero point from each weight value before multiplying, so it should only add a single multiply. This will help accelerate and better handle GGUF files on ExecuTorch, and read Q4_0 for export to mobile.
2. a8w4dq on desktop.
3. The asymmetric variant from (1) on desktop.
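To make the "single multiply" point concrete, here is a minimal pure-Python sketch of one quantization group. It is not torchchat or ExecuTorch code: every function name is hypothetical, and the choices of symmetric per-tensor int8 activations and one unsigned 4-bit weight group with a single scale and zero point are assumptions made for illustration. It shows that subtracting the zero point from every weight gives the same result as running the plain integer dot product and then applying one correction of `zero * sum(activations)` per group.

```python
# Illustrative sketch only; all names are hypothetical, not a real library API.

def quantize_activations_int8(x):
    """Dynamic symmetric int8 quantization: x ~= scale * q, q in [-128, 127]."""
    max_abs = max(abs(v) for v in x) or 1.0  # avoid a zero scale for all-zero input
    scale = max_abs / 127.0
    q = [max(-128, min(127, round(v / scale))) for v in x]
    return q, scale


def quantize_weights_uint4_asym(w):
    """Asymmetric 4-bit quantization of one group: w ~= scale * (q - zero), q in [0, 15]."""
    lo = min(min(w), 0.0)  # widen the range so 0.0 stays exactly representable
    hi = max(max(w), 0.0)
    scale = (hi - lo) / 15.0 or 1.0
    zero = round(-lo / scale)
    q = [max(0, min(15, round(v / scale) + zero)) for v in w]
    return q, scale, zero


def dot_naive(qa, sa, qw, sw, zero):
    """Subtract the zero point from every weight before multiplying."""
    acc = sum(a * (q - zero) for a, q in zip(qa, qw))
    return sa * sw * acc


def dot_factored(qa, sa, qw, sw, zero):
    """Plain integer dot product, then one zero-point correction for the group."""
    acc = sum(a * q for a, q in zip(qa, qw))
    correction = zero * sum(qa)  # the single extra multiply per group
    return sa * sw * (acc - correction)


if __name__ == "__main__":
    x = [0.5, -1.25, 2.0, 0.0]
    w = [0.1, 0.4, -0.2, 0.7]
    qa, sa = quantize_activations_int8(x)
    qw, sw, zero = quantize_weights_uint4_asym(w)
    print(dot_naive(qa, sa, qw, sw, zero))
    print(dot_factored(qa, sa, qw, sw, zero))
    print(sum(a * b for a, b in zip(x, w)))  # float reference
```

The factored form still needs the sum of the quantized activations for each group, so in a real kernel the extra cost would be that running sum plus one multiply per group, not one operation per weight.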

May 03 '24 13:05 mikekgfb
