ggml
ggml vs onnxruntime on an SoC
Hello, has anyone compared ggml and onnxruntime inference on an SoC in terms of latency, memory usage, CPU utilization, and other metrics, for example across the CPU and GPU backends?
I am interested in knowing this too.