Li Ming

Results 4 comments of Li Ming

Hi, I ran into the same problem as you. Have you solved it?

> > hi, I met the same problem as yours, have you solved this problem?
>
> try using llama-cpp-python instead of llama.cpp to start the server

Thanks for your reply, I'...

> > What model here is being evaluated? I will try to look into this soon.
>
> Qwen1.8B

I discovered that the original script gguf.py, when matched with llama-cpp-python...

I want to evaluate the inference latency, throughput, and parameter count of a custom LLM.