[question] how do I save a loaded model?
using llama.cpp library I do:
struct llama_model* model = llama_load_model_from_file(input_model_path, params);
How do I save it back to disk in gguf format?
I'm asking because I wrote a program to modify model weights: I load a GGUF, modify the weights, and then need to save the result back to disk.
This is currently not implemented
@ggerganov that would be very useful.
The llama_model interface does not allow modifying tensors. It's a read-only representation of the loaded model.
If you want to modify tensors, either use the gguf_* functions provided by ggml, or use gguf-py to modify them in Python (note: gguf-py does not support reading Q-type quants).
You can read examples/gguf to see how it works.
Never mind. I modified the quantize program, and now I can modify tensors of any model at any quantization level. Too bad llama.cpp does not support this directly.
This issue was closed because it has been inactive for 14 days since being marked as stale.