
gguf reader for layer and size estimates

Open · earonesty opened this issue on Sep 14, 2023 · 2 comments

I've found that without some sort of layer and size estimate it's very hard to choose the right number of layers to offload.

todo:

  • get a size estimate based on needed context size! (see the rough sketch below)
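
As a rough illustration of the kind of estimate I mean (just a sketch, not code from this repo): the KV cache for llama-style models grows as 2 × layers × context length × embedding width × bytes per element. The metadata keys named in the comments are the usual GGUF keys for llama-arch models; the fp16 cache element size is an assumption.

```python
# Back-of-envelope KV-cache estimate: 2 tensors (K and V) per layer,
# each n_ctx x n_embd_kv elements.  The numbers in the example are
# illustrative for a 7B llama-style model; real values would come from
# GGUF metadata (llama.block_count, llama.embedding_length,
# llama.attention.head_count_kv for GQA models).

def kv_cache_bytes(n_ctx: int, n_layer: int, n_embd_kv: int,
                   bytes_per_elem: int = 2) -> int:
    """Approximate K+V cache size in bytes (fp16 cache assumed by default)."""
    return 2 * n_layer * n_ctx * n_embd_kv * bytes_per_elem

# 32 layers, 4096-wide embeddings, no GQA, 4k context -> 2048 MiB
print(kv_cache_bytes(n_ctx=4096, n_layer=32, n_embd_kv=4096) / 2**20, "MiB")
```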

If you think this should be its own repo, I'm cool with that.

earonesty avatar Sep 14 '23 14:09 earonesty

Hey @earonesty this makes sense and I do want to integrate gguf more closely into llama-cpp-python. Is it possible to use the pip published gguf package to reduce the amount of maintenance required when that's updated?

abetlen avatar Sep 30 '23 06:09 abetlen

Unfortunately that package has no reader support. I used its source to reverse engineer the format and write the reader! Happy to put it in its own repo, but I don't think the llama-cpp team has plans to maintain the reader.
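
For context, the reader basically just walks the header and the metadata key/value section. Here is a minimal sketch based on the GGUF v2 spec in the llama.cpp repo (type codes and layout taken from that spec; this is only an illustration, not the actual reader mentioned above):

```python
import struct

# Scalar metadata value-type codes from the GGUF v2+ spec: code -> (struct fmt, size)
_SCALARS = {
    0: ("<B", 1), 1: ("<b", 1), 2: ("<H", 2), 3: ("<h", 2),
    4: ("<I", 4), 5: ("<i", 4), 6: ("<f", 4), 7: ("<?", 1),
    10: ("<Q", 8), 11: ("<q", 8), 12: ("<d", 8),
}

def _read_string(f):
    (length,) = struct.unpack("<Q", f.read(8))
    return f.read(length).decode("utf-8")

def _read_value(f, vtype):
    if vtype in _SCALARS:
        fmt, size = _SCALARS[vtype]
        return struct.unpack(fmt, f.read(size))[0]
    if vtype == 8:  # string
        return _read_string(f)
    if vtype == 9:  # array: element type, element count, then elements
        (etype,) = struct.unpack("<I", f.read(4))
        (count,) = struct.unpack("<Q", f.read(8))
        return [_read_value(f, etype) for _ in range(count)]
    raise ValueError(f"unknown GGUF metadata value type {vtype}")

def read_gguf_metadata(path):
    """Return (tensor_count, metadata dict) from a GGUF v2+ file header."""
    with open(path, "rb") as f:
        if f.read(4) != b"GGUF":
            raise ValueError("not a GGUF file")
        (version,) = struct.unpack("<I", f.read(4))
        tensor_count, kv_count = struct.unpack("<QQ", f.read(16))
        meta = {}
        for _ in range(kv_count):
            key = _read_string(f)
            (vtype,) = struct.unpack("<I", f.read(4))
            meta[key] = _read_value(f, vtype)
        return tensor_count, meta

# e.g. meta.get("general.architecture") tells you which "<arch>." prefix the
# other keys use, and meta.get("llama.block_count") gives the layer count to
# split between GPU offload and CPU.
```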

I can try to submit a PR and see if they like it?

earonesty avatar Sep 30 '23 17:09 earonesty