Li Ming
Hi, I ran into the same problem. Have you solved it?
> > Hi, I ran into the same problem. Have you solved it?
>
> Try using llama-cpp-python instead of llama.cpp to start the server.

Thanks for your reply, I'...
> > What model is being evaluated here? I will try to look into this soon.
>
> Qwen1.8B

I discovered that the original script gguf.py, when matched with llama-cpp-python...
I want to evaluate the inference latency, throughput, and parameter count of a custom LLM.
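In case it helps, here is a minimal sketch of how those three numbers could be measured. This is not tied to any particular server setup: `generate` is a hypothetical stand-in for whatever call produces tokens (e.g. a llama-cpp-python `Llama` instance), and `tensor_shapes` is assumed to be a mapping of weight names to shapes that you can obtain from your model's checkpoint.

```python
import time

def count_parameters(tensor_shapes):
    """Total parameter count from a mapping of tensor name -> shape tuple."""
    total = 0
    for shape in tensor_shapes.values():
        n = 1
        for dim in shape:
            n *= dim
        total += n
    return total

def benchmark(generate, prompt, n_runs=3):
    """Average per-request latency (seconds) and overall throughput
    (tokens/second) for `generate`, which must return the generated tokens."""
    latencies = []
    total_tokens = 0
    for _ in range(n_runs):
        start = time.perf_counter()
        tokens = generate(prompt)
        latencies.append(time.perf_counter() - start)
        total_tokens += len(tokens)
    avg_latency = sum(latencies) / n_runs
    throughput = total_tokens / sum(latencies)
    return avg_latency, throughput
```

For a real run you would replace the `generate` callable with your model's generation call and feed `count_parameters` the shapes read from the GGUF file; the timing logic itself stays the same.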