huggingface.js feat: add GGMLFileQuantizationType and apply to test

@mishig25 that's it for #794

Jul 17 '24 00:07 snowyu

cc @ngxson too

Jul 17 '24 07:07 julien-c

FYI, I added the MOSTLY_ prefix in the last commit, to better reflect the type name from ggml (see here)

The reason is that many operations in ggml only support F32 for 1D tensors. So a GGUF file is never "purely" quantized; it is always a mix of the quantized type and F32.
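To illustrate the naming, here is a minimal sketch. `FileQuantSketch`, `tensorTypes`, and `countByType` are hypothetical names, not the enum added in this PR. The numeric values are assumed to follow the `llama_ftype` constants and should be checked against the upstream header. The per-tensor list is made-up example data.

```typescript
// Illustrative subset only; not the full enum from this PR.
// Numeric values are an assumption based on llama_ftype; verify upstream.
enum FileQuantSketch {
	ALL_F32 = 0,
	MOSTLY_F16 = 1,
	MOSTLY_Q4_0 = 2,
	MOSTLY_Q4_1 = 3,
	MOSTLY_Q8_0 = 7,
}

// Made-up per-tensor types for a file whose file type is MOSTLY_Q4_0.
// The 1D tensors (e.g. norm weights) stay in F32.
const tensorTypes: string[] = ["Q4_0", "Q4_0", "F32", "Q4_0", "F32"];

// Count how many tensors use each type.
function countByType(types: string[]): Record<string, number> {
	const counts: Record<string, number> = {};
	for (const t of types) {
		counts[t] = (counts[t] ?? 0) + 1;
	}
	return counts;
}

console.log(FileQuantSketch[FileQuantSketch.MOSTLY_Q4_0], countByType(tensorTypes));
```

The file-level type describes the majority of tensors, which is why the `MOSTLY_` prefix is the more accurate label.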

Aug 16 '24 10:08 ngxson

BTW, I also propose displaying the enum's key name in a tooltip inside the GGUF file viewer, like this:

(internal PR)
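For reference, a rough sketch of how a key name could be derived from the stored numeric value, relying on the reverse mapping that TypeScript generates for numeric enums. `QuantKeySketch` and `tooltipLabel` are hypothetical names, not the viewer's actual code, and the enum values are assumed.

```typescript
// Hypothetical two-member enum; values are assumptions for illustration.
enum QuantKeySketch {
	MOSTLY_F16 = 1,
	MOSTLY_Q4_0 = 2,
}

// Numeric enums carry a reverse mapping (value -> key name),
// so indexing the enum with a number returns the key, or undefined if unknown.
function tooltipLabel(fileType: number): string {
	const key: string | undefined = QuantKeySketch[fileType];
	return key ?? `UNKNOWN (${fileType})`;
}

console.log(tooltipLabel(2));
console.log(tooltipLabel(99));
```

Falling back to a placeholder keeps the tooltip usable for file types the enum does not know about yet.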

Aug 16 '24 16:08 julien-c

i'll let you merge @ngxson!

Aug 16 '24 17:08 julien-c

@ngxson be careful, the const is not in ggml.h, it's in llama.h.

Aug 17 '24 08:08 snowyu

Yeah, I linked to the wrong file, but the content is unchanged anyway because I only added MOSTLY_ on top of your commit. (So everything is still correct.)

Aug 17 '24 08:08 ngxson