gguf reader for layer and size estimates
i've found that without some sort of layer and size estimate it's very hard to choose the right number of layers to offload
todo:
- get a size estimate based on needed context size!
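
for the context-size item, a rough sketch of how the estimate could work (not what this PR implements). it assumes an unquantized f16 KV cache, that the KV cache for a layer is offloaded along with that layer, and that the per-layer weight size has already been summed from the tensor infos. `kv_cache_bytes` and `layers_that_fit` are hypothetical names, not part of llama-cpp-python or gguf.

```python
def kv_cache_bytes(n_layer, n_ctx, n_embd, n_head, n_head_kv, bytes_per_elem=2):
    """Approximate KV cache size in bytes for a given context length.

    Assumes one K and one V vector per token per layer, stored unquantized
    (bytes_per_elem=2 corresponds to f16).
    """
    head_dim = n_embd // n_head
    # With grouped-query attention, only n_head_kv heads are cached.
    n_embd_kv = head_dim * n_head_kv
    return 2 * n_layer * n_ctx * n_embd_kv * bytes_per_elem


def layers_that_fit(budget_bytes, weight_bytes_per_layer, kv_bytes_per_layer, n_layer):
    """How many layers fit in a memory budget, ignoring scratch buffers."""
    cost = weight_bytes_per_layer + kv_bytes_per_layer
    if cost <= 0:
        return n_layer
    return max(0, min(n_layer, budget_bytes // cost))
```

for example, with 32 layers, `n_embd=4096`, 32 heads, no GQA, and `n_ctx=4096`, the cache estimate comes out to 2 GiB. this ignores compute/scratch buffers and the non-repeating tensors (embeddings, output), so treat the result as an upper bound on what will actually fit.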
if you think this should be its own repo, i'm cool with that
Hey @earonesty this makes sense and I do want to integrate gguf more closely into llama-cpp-python. Is it possible to use the pip published gguf package to reduce the amount of maintenance required when that's updated?
unfortunately that package has no reader support. i used the source for that to reverse engineer the format and write the reader! happy to put it in its own repo, but i don't think the llama-cpp team has plans to maintain the reader.
i can try to submit a PR and see if they like it?
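
for reference, a minimal sketch of what reading the fixed-size GGUF header looks like (a simplified illustration, not the reader in this PR). it assumes a little-endian file at GGUF version 2 or later, where the tensor and metadata counts are 64-bit; `read_gguf_header` is a hypothetical name.

```python
import struct

GGUF_MAGIC = b"GGUF"


def read_gguf_header(f):
    """Read the fixed-size GGUF header from a binary file object.

    Layout assumed: 4-byte magic, uint32 version, uint64 tensor count,
    uint64 metadata key/value count, all little-endian.
    """
    magic = f.read(4)
    if magic != GGUF_MAGIC:
        raise ValueError(f"not a GGUF file (magic={magic!r})")
    (version,) = struct.unpack("<I", f.read(4))
    if version < 2:
        # Version 1 stored the counts as 32-bit values; not handled here.
        raise ValueError(f"unsupported GGUF version {version}")
    tensor_count, kv_count = struct.unpack("<QQ", f.read(16))
    return {
        "version": version,
        "tensor_count": tensor_count,
        "metadata_kv_count": kv_count,
    }
```

the metadata key/value pairs and tensor infos follow the header; summing the tensor sizes per block index is what would give the per-layer size estimate.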