llama.cpp

Published 20 hours ago

cuda: Add Q5_1, Q5_0, Q4_1 and Q4_0 to F32 conversion support. (#10976)

Open gcp opened this issue 1 day ago • 1 comment

This is implemented using templates and reuses the existing dequant_qX_Y functions.
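To illustrate the general idea, here is a minimal host-side C++ sketch of a conversion routine templated on a per-format dequantizer. It is not the code from this change: the names `BlockQ4_0`, `BlockQ4_1`, `dequant_q4_0`, `dequant_q4_1` and `convert_to_f32` are hypothetical, the scale is stored as `float` rather than a 16-bit half for simplicity, and it runs on the CPU instead of in a CUDA kernel. Only the Q4_0 and Q4_1 decoding rules (32 weights per block, two 4-bit values per byte) are assumed to match ggml.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Simplified stand-ins for the quantized block layouts (assumption: 32 weights
// per block, two 4-bit quants per byte; float scale instead of half).
constexpr int QK4 = 32;

struct BlockQ4_0 { float d; uint8_t qs[QK4 / 2]; };           // scale only
struct BlockQ4_1 { float d; float m; uint8_t qs[QK4 / 2]; };  // scale + minimum

// Per-format dequantizers: decode the two values packed in byte `iqs`.
inline void dequant_q4_0(const BlockQ4_0 & b, int iqs, float & lo, float & hi) {
    lo = (static_cast<int>(b.qs[iqs] & 0x0F) - 8) * b.d;
    hi = (static_cast<int>(b.qs[iqs] >> 4) - 8) * b.d;
}

inline void dequant_q4_1(const BlockQ4_1 & b, int iqs, float & lo, float & hi) {
    lo = (b.qs[iqs] & 0x0F) * b.d + b.m;
    hi = (b.qs[iqs] >> 4) * b.d + b.m;
}

// One conversion loop shared by all formats: the block type and its
// dequantizer are template parameters, so no per-format loop is duplicated.
template <typename Block, void (*dequant)(const Block &, int, float &, float &)>
std::vector<float> convert_to_f32(const std::vector<Block> & blocks) {
    std::vector<float> out(blocks.size() * QK4);
    for (std::size_t ib = 0; ib < blocks.size(); ++ib) {
        for (int iqs = 0; iqs < QK4 / 2; ++iqs) {
            float lo, hi;
            dequant(blocks[ib], iqs, lo, hi);
            out[ib * QK4 + iqs]           = lo;  // low nibbles fill the first half
            out[ib * QK4 + iqs + QK4 / 2] = hi;  // high nibbles fill the second half
        }
    }
    return out;
}
```

A call such as `convert_to_f32<BlockQ4_1, dequant_q4_1>(blocks)` selects the format at compile time. Supporting another format in this sketch would only need a new block struct and dequantizer, which is the appeal of the template approach.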

Feb 21 '25 09:02 gcp