
What other libraries does optimum-benchmark support besides transformers?

Open L1-M1ng opened this issue 1 year ago • 3 comments

Can I use Optimum-benchmark to evaluate the performance of qwen.cpp or llama.cpp?

L1-M1ng avatar Feb 01 '24 07:02 L1-M1ng

I want to evaluate the inference latency, throughput, and parameter count of a custom LLM.
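For reference, latency and throughput can also be measured by hand, independently of optimum-benchmark. The sketch below is a minimal, generic harness: `generate_fn` and `count_tokens` are placeholders you would supply for your own model (e.g. a llama.cpp binding or a transformers pipeline); nothing here is part of optimum-benchmark's API.

```python
# Minimal hand-rolled benchmark harness (NOT optimum-benchmark's API).
# It times an arbitrary text-generation callable over a list of prompts
# and derives mean latency and token throughput from the measurements.
import time
from statistics import mean
from typing import Callable, Dict, List


def benchmark_generation(
    generate_fn: Callable[[str], str],
    prompts: List[str],
    # Naive whitespace tokenizer as a stand-in; swap in your model's tokenizer.
    count_tokens: Callable[[str], int] = lambda text: len(text.split()),
) -> Dict[str, float]:
    latencies: List[float] = []
    token_counts: List[int] = []
    for prompt in prompts:
        start = time.perf_counter()
        output = generate_fn(prompt)
        latencies.append(time.perf_counter() - start)
        token_counts.append(count_tokens(output))
    total_time = sum(latencies)
    return {
        "mean_latency_s": mean(latencies),
        "throughput_tok_per_s": sum(token_counts) / total_time if total_time else 0.0,
    }
```

In practice you would also add warmup iterations and repeat the run several times, which is part of what a dedicated tool like optimum-benchmark handles for you.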

L1-M1ng avatar Feb 01 '24 07:02 L1-M1ng

optimum-benchmark supports transformers, timm, and diffusers as part of the PyTorch backend. Then there's optimum and its subpackages like optimum-intel, optimum-nvidia, etc. There's also a text-generation-server backend which uses docker-py to benchmark the server end-to-end (i.e. including communication overhead). Benchmarking llama.cpp is currently not supported; how would you suggest implementing it? It might be possible with a server/client API (like TGI), but I'm not sure.

IlyasMoutawwakil avatar Feb 02 '24 11:02 IlyasMoutawwakil

@L1-M1ng I would love to review a PR with llama.cpp support, https://github.com/abetlen/llama-cpp-python seems to be the most starred python bindings
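To make the suggestion concrete, here is a hedged sketch of how a backend could wrap llama-cpp-python (the bindings linked above). The class and method names are illustrative only, not optimum-benchmark's real `Backend` interface; `llama_factory` is an injected dependency so the wrapper can be exercised without downloading a GGUF model.

```python
# Illustrative sketch of a llama.cpp backend built on llama-cpp-python.
# This is NOT optimum-benchmark's actual Backend class; it only shows the
# shape such a wrapper could take.
from typing import Any, Callable, Optional


class LlamaCppBackendSketch:
    def __init__(
        self,
        model_path: str,
        llama_factory: Optional[Callable[..., Any]] = None,
    ) -> None:
        if llama_factory is None:
            # Real usage: lazily import the bindings and load a GGUF model.
            # Assumes llama-cpp-python is installed.
            from llama_cpp import Llama
            llama_factory = Llama
        self.llm = llama_factory(model_path=model_path)

    def generate(self, prompt: str, max_tokens: int = 32) -> str:
        # llama-cpp-python's Llama.__call__ returns an OpenAI-style
        # completion dict with a "choices" list.
        completion = self.llm(prompt, max_tokens=max_tokens)
        return completion["choices"][0]["text"]
```

A PR for the real integration would additionally need to plug into optimum-benchmark's config system and expose latency/throughput trackers, which this sketch omits.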

IlyasMoutawwakil avatar Feb 20 '24 08:02 IlyasMoutawwakil

llama.cpp support was added in #231 🚀

IlyasMoutawwakil avatar Jul 30 '24 10:07 IlyasMoutawwakil

@IlyasMoutawwakil, I am trying to run optimum-benchmark --config-dir examples/ --config-name llama_cpp_text_generation but I get this error:

Error getting class at optimum_benchmark.backends.llama_cpp.backend.LlamaCppBackend: Error loading 'optimum_benchmark.backends.llama_cpp.backend.LlamaCppBackend'

zaidalyafeai avatar Aug 09 '24 19:08 zaidalyafeai