gguf-tools
GGUF implementation in C as a library and a tools CLI program
This PR adds a GitHub Action that automatically builds `gguf-tools` with Cosmopolitan Libc (cc @jart) when creating a tag+release. The action creates a zip file containing `gguf-tools.com`, a fat binary...
[These](https://github.com/ggerganov/llama.cpp/blob/7dcbe39d36b76389f6c5cd3b151928472b7e22ff/ggml.h#L354-L355) were added in https://github.com/ggerganov/llama.cpp/pull/4773 . It's annoying that I8 used to be 16 and it's now 18. I16 and I32 also changed. [Dequantization code is very cryptic](https://github.com/ggerganov/llama.cpp/blob/9ecdd12e95aee20d6dfaf5f5a0f0ce5ac1fb2747/ggml-quants.c#L3457-L3508). I would love...
After closely analyzing Google Brain codebases, we decided that flushing to zero was the wrong thing to do. Intel and AMD probably designed their microprocessors to always flush to zero...
Sometimes it's useful to get an overview of how tensors change when using different quantization formats. For example: diff -u
Currently gguf-tools (as used in e.g. MLX - a popular choice on Apple platforms) cannot open `*.gguf` files on iOS, if those gguf files are in the app bundle. The...