[Feature Request]: Need LoRA model in `.gguf` format

Open bioinformatist opened this issue 1 year ago • 2 comments

Feature request

Similar to #231, but useful

Hey my dear bros, we're building an RAG application (especially for one of our products) using MiniCPM3. Below is our stack:

| Type | Component |
| --- | --- |
| LLM | MiniCPM3 |
| Web server | Shuttle \| Axum |
| OpenAI-compatible API server | llama.cpp |
| Vector database | qdrant |

It's almost done.

As MiniCPM3 comes with an RAG suite, we'd like to use the LoRA adapter for better performance, just like:

# Assuming the MiniCPM3-4B-GGUF and MiniCPM3-RAG-LoRA-GGUF models are already downloaded to the current directory
docker run --rm -it -p 8080:8080 -v $PWD/MiniCPM3-4B-GGUF:/models -v $PWD/MiniCPM3-RAG-LoRA-GGUF:/lora --gpus all ghcr.io/ggerganov/llama.cpp:server-cuda -m models/minicpm3-4b-q4_k_m.gguf --host 0.0.0.0 --port 8080 --n-gpu-layers 99 -v -ub 1024 -b 4096 --lora lora/lora-adapter-fp16.gguf

However, the LoRA adapter cannot currently be converted to `.gguf` format, because https://github.com/ggerganov/llama.cpp/pull/9396 hasn't been merged yet:

# Same assumption as above: MiniCPM3-4B and MiniCPM3-RAG-LoRA are in the current directory
docker run -it --rm --entrypoint /app/convert_lora_to_gguf.py -v $PWD/MiniCPM3-4B:/models -v $PWD/MiniCPM3-RAG-LoRA:/lora ghcr.io/ggerganov/llama.cpp:full --outtype q8_0 --base /models /lora

It said:

The repository for /models contains custom code which must be executed to correctly load the model. You can inspect the repository content at https://hf.co//models.
You can avoid this prompt in future by passing the argument `trust_remote_code=True`.

Or could you give us some tips for converting? Thanks a lot!

MiniCPM3 is, de facto, an ideal edge-side LLM for small companies.

bioinformatist, Sep 20 '24 15:09

Hello, I think the best solution at present is to merge the LoRA weights into the original MiniCPM3 weights, and then run your process on the merged model.
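
For readers following along, the merge suggested here might look roughly like the sketch below. It is not taken from the thread: it assumes the LoRA adapter is stored in PEFT format, that `torch`, `transformers` and `peft` are installed, and the function name `merge_lora` and all directory names are hypothetical.

```python
def merge_lora(base_dir: str, lora_dir: str, out_dir: str) -> None:
    """Fold a LoRA adapter into its base model and save the result (sketch)."""
    # Heavy dependencies are imported inside the function so that merely
    # defining it does not require them to be installed.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # trust_remote_code=True is passed because the base model repository
    # ships custom modeling code (see the prompt quoted in the issue).
    base = AutoModelForCausalLM.from_pretrained(
        base_dir, torch_dtype=torch.float16, trust_remote_code=True
    )

    # Assumption: lora_dir holds a PEFT-format adapter.
    model = PeftModel.from_pretrained(base, lora_dir)

    # Fold the adapter deltas into the base weights and drop the LoRA wrappers.
    merged = model.merge_and_unload()

    # Save the merged weights plus the tokenizer as a standalone model directory.
    merged.save_pretrained(out_dir)
    tokenizer = AutoTokenizer.from_pretrained(base_dir, trust_remote_code=True)
    tokenizer.save_pretrained(out_dir)
```

Calling `merge_lora("MiniCPM3-4B", "MiniCPM3-RAG-LoRA", "MiniCPM3-4B-RAG-merged")` (the output directory name is made up) would write a standalone model. That directory could then go through llama.cpp's regular `convert_hf_to_gguf.py` and be served without `--lora`, provided that script supports MiniCPM3 in the llama.cpp version being used.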

LDLINGLINGLING, Sep 22 '24 13:09

Got it. Let me give it a try.

bioinformatist, Sep 23 '24 03:09