optimum-benchmark
What other libraries does optimum-benchmark support besides transformers?
Can I use optimum-benchmark to evaluate the performance of qwen.cpp or llama.cpp?
I want to measure the inference latency, throughput, and parameter count of a custom LLM.
optimum-benchmark supports transformers, timm, and diffusers as part of the pytorch backend. And then there's optimum and its subpackages like optimum-intel, optimum-nvidia, etc. There's also a text-generation-server backend which uses docker-py to benchmark the server end-to-end (i.e. with communication overhead).
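For a custom LLM on the pytorch backend, something like the following should cover latency and memory. This is a minimal sketch based on the Python API shown in the README; class names such as `ProcessConfig` and `InferenceConfig` may differ between versions, and `"gpt2"` is just a placeholder model:

```python
from optimum_benchmark import (
    Benchmark,
    BenchmarkConfig,
    InferenceConfig,
    ProcessConfig,
    PyTorchConfig,
)
from optimum_benchmark.logging_utils import setup_logging

setup_logging(level="INFO")

if __name__ == "__main__":
    # Run the benchmark in an isolated subprocess
    launcher_config = ProcessConfig()
    # Inference scenario: track latency and memory
    scenario_config = InferenceConfig(latency=True, memory=True)
    # The pytorch backend wraps transformers models; "gpt2" is only an example
    backend_config = PyTorchConfig(model="gpt2", device="cpu", no_weights=True)
    benchmark_config = BenchmarkConfig(
        name="pytorch_gpt2",
        launcher=launcher_config,
        scenario=scenario_config,
        backend=backend_config,
    )
    benchmark_report = Benchmark.launch(benchmark_config)
```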
Currently, benchmarking llama.cpp is not supported. How would you suggest implementing it? It might be possible with a server-client API (like TGI), but I'm not sure.
@L1-M1ng I would love to review a PR with llama.cpp support; https://github.com/abetlen/llama-cpp-python seems to be the most starred Python bindings.
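For what it's worth, llama-cpp-python runs in-process, so a backend could load a GGUF model directly instead of going through a server. A rough sketch of the kind of measurement involved (the model path is a placeholder, not something shipped with this repo):

```python
import time

from llama_cpp import Llama

# Load a quantized GGUF model (placeholder path)
llm = Llama(model_path="./models/llama-2-7b.Q4_K_M.gguf")

# Time a single generation; repeating and aggregating calls like this
# is the core of a latency/throughput benchmark
start = time.perf_counter()
output = llm("The capital of France is", max_tokens=16)
latency = time.perf_counter() - start

n_tokens = output["usage"]["completion_tokens"]
print(f"latency: {latency:.3f}s, throughput: {n_tokens / latency:.1f} tokens/s")
```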
llama.cpp support added in #231 🚀
@IlyasMoutawwakil, I am trying to run `optimum-benchmark --config-dir examples/ --config-name llama_cpp_text_generation`, but I got this error:

`Error getting class at optimum_benchmark.backends.llama_cpp.backend.LlamaCppBackend: Error loading 'optimum_benchmark.backends.llama_cpp.backend.LlamaCppBackend'`
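Hard to say without the full traceback, but this error usually means the backend class failed to import, and my guess is that the llama-cpp-python dependency is missing from the environment; `pip install llama-cpp-python` might fix it.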