Juarez Bochi

5 issues by Juarez Bochi

According to Docker's best practices, `COPY` is [preferred](https://docs.docker.com/develop/develop-images/dockerfile_best-practices/#add-or-copy) over `ADD`. [Dockle](https://github.com/goodwithtech/dockle) also reports the use of `ADD` as a potential vulnerability: ``` FATAL - CIS-DI-0009: Use COPY instead of ADD in Dockerfile * Use COPY...

OCA Required

## Proposed changes

This adds [GGUF](https://github.com/ggerganov/ggml/blob/master/docs/gguf.md) support using the excellent [gguflib](https://github.com/antirez/gguf-tools/blob/main/gguflib.h) from @antirez. Would there be interest in this? GGUF is currently very popular for local inference, and there are...

This loads all weights, config, and vocab directly from a GGUF file using https://github.com/ml-explore/mlx/pull/350. Example run: ```bash $ python llama.py models/tiny_llama/model.gguf [INFO] Loading model from models/tiny_llama/model.gguf. Press enter to start...

### System Info

```shell
Reproduced on Mac, Python 3.11 and Google Colab / Python 3.10
optimum==1.14.0
```

### Who can help?

@ michaelbenayoun

### Information

- [ ] The official...

bug

[These](https://github.com/ggerganov/llama.cpp/blob/7dcbe39d36b76389f6c5cd3b151928472b7e22ff/ggml.h#L354-L355) were added in https://github.com/ggerganov/llama.cpp/pull/4773. It's annoying that I8 used to be 16 and it's now 18. I16 and I32 also changed. [Dequantization code is very cryptic](https://github.com/ggerganov/llama.cpp/blob/9ecdd12e95aee20d6dfaf5f5a0f0ce5ac1fb2747/ggml-quants.c#L3457-L3508). I would love...