
How to deploy my own model using ggml framework

Open Francis235 opened this issue 11 months ago • 3 comments

How should I convert my model (e.g. one in .onnx format) to .gguf format and run inference with the ggml inference framework? How should I implement this step by step?

Francis235 avatar Mar 12 '24 09:03 Francis235

It would be easier to start from a TensorFlow or PyTorch model than from ONNX; ONNX operations are lower level than most ggml operations.
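
To illustrate the kind of correspondence this implies, here is a toy PyTorch module (purely illustrative, not from this issue) with comments naming the ggml C functions that each high-level operation roughly maps to when the compute graph is written by hand:

```python
# Toy PyTorch reference model; the comments name the ggml ops a hand-written
# C compute graph would typically use for each step.
import torch

class TinyMLP(torch.nn.Module):
    def __init__(self, n_in: int = 16, n_hidden: int = 32, n_out: int = 4):
        super().__init__()
        self.fc1 = torch.nn.Linear(n_in, n_hidden)
        self.fc2 = torch.nn.Linear(n_hidden, n_out)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # nn.Linear -> ggml_mul_mat(ctx, w, x) followed by ggml_add(ctx, cur, b)
        x = self.fc1(x)
        # torch.relu -> ggml_relu(ctx, cur)
        x = torch.relu(x)
        # the second nn.Linear is another ggml_mul_mat + ggml_add
        return self.fc2(x)
```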

slaren avatar Mar 12 '24 11:03 slaren

So how do I convert my PyTorch model to .gguf format and run inference under the ggml inference framework? Is there a tutorial that can guide me through this step by step? I don't know where to start.


Francis235 avatar Mar 13 '24 03:03 Francis235

There isn't a step-by-step guide. You would have to write a program that converts the weights to a format ggml can understand (ideally GGUF), and then look at the Python inference code and translate it into ggml operations. The examples show how to do this, but they are not explained step by step; you would have to fill in the blanks.
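
As a minimal sketch of the conversion half, the gguf Python package (published from the llama.cpp repository) can write the file; the file names, architecture string, and metadata key below are placeholders, not an official convention:

```python
# Minimal conversion sketch: dump a PyTorch checkpoint's tensors into a GGUF
# file using the `gguf` Python package (pip install gguf). Paths and names
# are hypothetical; adapt them to your model.
import gguf
import torch

# assumed: the checkpoint holds a plain state_dict
state_dict = torch.load("my_model.pt", map_location="cpu")

writer = gguf.GGUFWriter("my_model.gguf", arch="my-arch")
writer.add_uint32("my-arch.n_hidden", 32)  # hyperparameters go in as key/value metadata

for name, tensor in state_dict.items():
    # the C loader will look tensors up by these names, so keep them stable
    writer.add_tensor(name, tensor.detach().cpu().to(torch.float32).numpy())

writer.write_header_to_file()
writer.write_kv_data_to_file()
writer.write_tensors_to_file()
writer.close()
```

The inference half then has to be written in C/C++ against ggml: open the file with gguf_init_from_file, look up each tensor by name, and rebuild the forward pass from ggml operations, roughly as the gpt-2 and mnist examples in this repository do.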

slaren avatar Mar 13 '24 12:03 slaren