
feat: add GGMLFileQuantizationType and apply to test

Open snowyu opened this issue 1 year ago • 1 comments

@mishig25 that's it for #794

snowyu avatar Jul 17 '24 00:07 snowyu

cc @ngxson too

julien-c avatar Jul 17 '24 07:07 julien-c

FYI, I added the MOSTLY_ prefix in the last commit, to better reflect the type name from ggml (see here)

The reason is that many operations in ggml only support F32 for 1D tensors. So in practice a GGUF file is never "purely" quantized; it is a mix of the quantized type and F32.
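To make the "mostly" idea concrete, here is a minimal sketch in TypeScript. The `TensorInfo` shape, the `countDtypes` helper, and the tensor list are all made up for illustration; they are not the library's actual types or real file contents.

```typescript
// Hypothetical, simplified tensor description (not the library's real type).
interface TensorInfo {
  name: string;
  shape: number[];
  dtype: "F32" | "F16" | "Q4_0" | "Q8_0";
}

// Count how many tensors use each dtype.
function countDtypes(tensors: TensorInfo[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const t of tensors) {
    counts[t.dtype] = (counts[t.dtype] || 0) + 1;
  }
  return counts;
}

// Made-up tensor list for a file that would be labelled "mostly Q4_0":
// the 2D weight matrices are quantized, the 1D norm weights stay in F32.
const tensors: TensorInfo[] = [
  { name: "blk.0.attn_q.weight", shape: [4096, 4096], dtype: "Q4_0" },
  { name: "blk.0.ffn_up.weight", shape: [4096, 11008], dtype: "Q4_0" },
  { name: "blk.0.attn_norm.weight", shape: [4096], dtype: "F32" },
  { name: "output_norm.weight", shape: [4096], dtype: "F32" },
];

const counts = countDtypes(tensors);
console.log(counts);
```

Under these assumptions, the file-level type describes the dominant quantization, while the per-tensor dtypes show the mix.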

ngxson avatar Aug 16 '24 10:08 ngxson

BTW, I also propose displaying the enum's key name in a tooltip inside the GGUF file viewer, like this:

[image]

(internal PR)
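A rough sketch of how a viewer could turn the numeric file type into the enum's key name, relying on the reverse mapping that TypeScript generates for numeric enums. The enum below is an illustrative subset only: the member names and values are assumed to mirror `llama_ftype` in llama.h and should be checked against the actual source, and `fileTypeLabel` is a hypothetical helper, not part of the library.

```typescript
// Illustrative subset; names and values are assumptions, not the full enum.
enum GGMLFileQuantizationType {
  MOSTLY_F16 = 1,
  MOSTLY_Q4_0 = 2,
  MOSTLY_Q8_0 = 7,
}

// Hypothetical helper: map a numeric file type to the text shown in a tooltip.
function fileTypeLabel(value: number): string {
  // Numeric enums compile to an object that also maps value -> key name.
  const key: string | undefined = GGMLFileQuantizationType[value];
  return key !== undefined ? key : `UNKNOWN (${value})`;
}

console.log(fileTypeLabel(2));
console.log(fileTypeLabel(999));
```

Falling back to an explicit "unknown" label keeps the tooltip usable if a file declares a type the enum does not list yet.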

julien-c avatar Aug 16 '24 16:08 julien-c

i'll let you merge @ngxson!

julien-c avatar Aug 16 '24 17:08 julien-c

@ngxson be careful: the constant is defined in llama.h, not in ggml.h.

snowyu avatar Aug 17 '24 08:08 snowyu

Yeah, I linked to the wrong file, but the content is unchanged because I only added the MOSTLY_ prefix on top of your commit. (So everything is still correct.)

ngxson avatar Aug 17 '24 08:08 ngxson