Feature Request: Support MobileLLMP1ForCausalLM
Prerequisites
- [x] I am running the latest code. Mention the version if possible as well.
- [x] I carefully followed the README.md.
- [x] I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
- [x] I reviewed the Discussions, and have a new and useful enhancement to share.
Feature Description
Add support for https://huggingface.co/spaces/akhaliq/MobileLLM-Pro
Currently the conversion script errors out:
python3 convert_hf_to_gguf.py ./MobileLLM-Pro/
INFO:hf-to-gguf:Loading model: MobileLLM-Pro
INFO:hf-to-gguf:Model architecture: MobileLLMP1ForCausalLM
ERROR:hf-to-gguf:Model MobileLLMP1ForCausalLM is not supported
Motivation
llama.cpp already supports variants of MobileLLM. It would be a great addition to also support the updated version.
Possible Implementation
- llama.cpp already supports MobileLLM and Llama 4, so I expect we can build on top of those architectures (a rough sketch of what the converter-side registration could look like follows after this list).
- I started adding support on this branch and managed to generate a GGUF.
- The current version does not produce consistent results when used with llama-cli. Any help on how to debug this is appreciated.
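As a starting point, here is a minimal sketch of what registering the architecture in convert_hf_to_gguf.py could look like, assuming the conversion logic can largely be reused from the existing Llama model class. The class name MobileLLMProModel is made up for illustration, and the decorator and base-class names (ModelBase.register, LlamaModel, gguf.MODEL_ARCH.LLAMA) follow the registration pattern used by other models in that file but may differ between llama.cpp versions.

```python
# Hypothetical addition to convert_hf_to_gguf.py (names follow the existing
# registration pattern in that file; verify against the current source).

@ModelBase.register("MobileLLMP1ForCausalLM")
class MobileLLMProModel(LlamaModel):
    # Reuse the existing llama compute graph as a first approximation; a
    # dedicated gguf.MODEL_ARCH entry may be needed if the architecture
    # deviates from plain Llama.
    model_arch = gguf.MODEL_ARCH.LLAMA

    def set_gguf_parameters(self):
        super().set_gguf_parameters()
        # Any MobileLLM-Pro-specific hyperparameters would be written here
        # via self.gguf_writer calls.
```

If the weights or attention layout deviate from plain Llama, a dedicated architecture entry on both the Python and C++ sides would likely be needed as well, which could also explain the inconsistent llama-cli output.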