
[question] how do I save a loaded model?

Open 0wwafa opened this issue 1 year ago • 4 comments

Using the llama.cpp library I do: struct llama_model * model = llama_load_model_from_file(input_model_path, params);

How do I save it back to disk in gguf format?

I'm asking because I wrote a program to modify model weights: I load a GGUF, modify the weights, and then need to save the result back.

0wwafa avatar Jul 19 '24 21:07 0wwafa

How do I save it back to disk in gguf format?

This is currently not implemented.

ggerganov avatar Jul 20 '24 14:07 ggerganov

@ggerganov that would be very useful.

0wwafa avatar Jul 20 '24 17:07 0wwafa

The llama_model interface does not allow modifying tensors. It's a read-only representation of the loaded model.

If you want to modify tensors, either use the gguf_* functions provided by ggml, or use gguf-py to modify them in Python (note: gguf-py does not support reading Q-type quants).

You can read examples/gguf to see how it works.

ngxson avatar Jul 21 '24 11:07 ngxson

The llama_model interface does not allow modifying tensors. It's a read-only representation of the loaded model.

If you want to modify tensors, either use the gguf_* functions provided by ggml, or use gguf-py to modify them in Python (note: gguf-py does not support reading Q-type quants).

You can read examples/gguf to see how it works.

Nevermind. I modified the quantize program and now I can modify tensors of any model at any quantization level. Too bad llama.cpp does not support this.

0wwafa avatar Jul 22 '24 10:07 0wwafa

This issue was closed because it has been inactive for 14 days since being marked as stale.

github-actions[bot] avatar Sep 05 '24 01:09 github-actions[bot]