llama2.mojo
TODO: Support for gguf models
Hey team, incredible work being done here.
Wondering if you only support .bin models, or whether it would also work with quantized GGUF models.
If not, consider this a feature request: most people work with GGUF models nowadays, since they are easier to run on consumer-grade hardware.
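For context on how the two formats differ at load time: a GGUF file can be identified up front by its four-byte ASCII magic `GGUF` at the start of the file, whereas the llama2.c-style `.bin` checkpoints have no magic and begin directly with the config header. A minimal sketch of such a format check (the function name `is_gguf` is just an illustration, not part of this project):

```python
def is_gguf(path: str) -> bool:
    """Return True if the file starts with the GGUF magic bytes.

    GGUF files begin with the ASCII magic "GGUF"; llama2.c-style
    .bin checkpoints have no magic, so this check distinguishes them.
    """
    with open(path, "rb") as f:
        magic = f.read(4)
    return magic == b"GGUF"
```

A loader could branch on this check to pick the right parsing path, though actually supporting GGUF also means handling its key-value metadata and quantized tensor blocks, which is the bulk of the work.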
thanks.