cuda: Add Q5_1, Q5_0, Q4_1 and Q4_0 to F32 conversion support. (#10976)

Open • gcp opened this issue 1 day ago • 1 comment

Implemented using templates, reusing the existing dequant_qX_Y functions.
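
To illustrate the general idea (one templated conversion routine that delegates to a per-format dequantization helper), here is a minimal host-side C++ sketch. It is not the code from this change: the names `BlockQ4_0`, `BlockQ4_1`, `dequant_pair` and `convert_to_f32` are hypothetical, the scale fields are stored as `float` rather than half precision for simplicity, and the actual CUDA kernels and dequant_qX_Y signatures may differ. The nibble layout assumed here is the usual one for 32-element 4-bit blocks: the low nibble of byte j holds element j, the high nibble holds element j + 16.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumption: 32 values per quantized block, two 4-bit values per byte.
constexpr int QK = 32;

// Hypothetical simplified block layouts (scales as float, not fp16).
struct BlockQ4_0 { float d;          uint8_t qs[QK / 2]; };  // value = (q - 8) * d
struct BlockQ4_1 { float d; float m; uint8_t qs[QK / 2]; };  // value = q * d + m

// Per-format helpers: decode the two values packed in byte j of a block.
inline void dequant_pair(const BlockQ4_0 & b, int j, float & lo, float & hi) {
    lo = ((b.qs[j] & 0x0F) - 8) * b.d;
    hi = ((b.qs[j] >> 4)   - 8) * b.d;
}

inline void dequant_pair(const BlockQ4_1 & b, int j, float & lo, float & hi) {
    lo = (b.qs[j] & 0x0F) * b.d + b.m;
    hi = (b.qs[j] >> 4)   * b.d + b.m;
}

// One conversion routine for every block type: the loop structure is shared,
// and only the dequant helper selected by the Block type changes.
template <typename Block>
std::vector<float> convert_to_f32(const Block * blocks, std::size_t n_blocks) {
    std::vector<float> out(n_blocks * QK);
    for (std::size_t ib = 0; ib < n_blocks; ++ib) {
        for (int j = 0; j < QK / 2; ++j) {
            dequant_pair(blocks[ib], j,
                         out[ib * QK + j],            // low nibble  -> element j
                         out[ib * QK + j + QK / 2]);  // high nibble -> element j + 16
        }
    }
    return out;
}
```

Supporting another format in this sketch only requires a new block struct and a matching `dequant_pair` overload; the conversion loop itself stays unchanged, which is the benefit of the template approach.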

gcp, Feb 21 '25 09:02