
Add Llama.cpp benchmark experiment?

Open zlwu92 opened this issue 11 months ago • 2 comments

Hi,

I am a beginner with LLMs and new to structured generation with XGrammar. I see that you report benchmark results for llama.cpp in the blog post and paper. However, I cannot find that benchmark experiment in the open-source XGrammar repo; I expected it in examples/benchmark/bench_grammar_compile_mask_gen.py. Could you please add the test code snippet for benchmarking llama.cpp and show how to integrate it with XGrammar? Thanks.

Another question: when I run python bench_grammar_compile_mask_gen.py --backend lmformatenforcer, I get the following error

[Image] with the same dataset in the file downloaded from Hugging Face. [Image] [Image]

What might be the problem?

zlwu92 • Feb 11 '25 09:02

Hi @zlwu92, thanks for your questions about getting started with XGrammar and about the llama.cpp benchmark.

For beginners, I would suggest following our tutorial, which describes how to use XGrammar with Hugging Face Transformers to guide the generation process. It is easy to learn and covers a very useful application scenario.

Regarding the benchmark: the benchmark of llama.cpp and its internal grammar engine was done on our own fork, because we needed to measure the speed of grammar initialization and mask generation separately.
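To illustrate what timing those two phases separately can look like, here is a minimal sketch in plain Python. It is not the code from our fork: `compile_grammar` and `generate_mask` are hypothetical stand-ins for a grammar engine's initialization and per-step mask generation, and the tiny vocabulary is made up for the example.

```python
import time
from statistics import mean


def time_call(fn, *args, repeats=5):
    """Call fn(*args) `repeats` times; return (last result, mean seconds per call)."""
    durations = []
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = fn(*args)
        durations.append(time.perf_counter() - start)
    return result, mean(durations)


# Hypothetical stand-in for grammar initialization (assumption, not a real API):
# here the "compiled grammar" is just the set of tokens allowed at the next step.
def compile_grammar(allowed_tokens):
    return frozenset(allowed_tokens)


# Hypothetical stand-in for mask generation (assumption, not a real API):
# one boolean per vocabulary entry, True if the token is allowed.
def generate_mask(compiled, vocab):
    return [token in compiled for token in vocab]


vocab = ["{", "}", '"', ":", "a", "b"]

# Time the two phases independently so that one does not hide the other.
compiled, init_seconds = time_call(compile_grammar, ["{", '"'])
mask, mask_seconds = time_call(generate_mask, compiled, vocab)

print(f"grammar init: {init_seconds * 1e6:.1f} us per call")
print(f"mask generation: {mask_seconds * 1e6:.1f} us per call")
```

In a real measurement the two stand-ins would be replaced by the engine's own initialization and mask-generation entry points, which is why this needed changes inside a fork rather than a wrapper script.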

> show how to integrate it with XGrammar

We do plan to integrate XGrammar into llama.cpp, since we already have a C++ API with complete features. That will come later.

Other baselines have changed a bit since we did our benchmark. We will update the script accordingly to make it work.

Ubospica • Feb 12 '25 01:02

Thank you.

Does the open-source XGrammar repo currently include scripts for the two benchmarking experiments in the paper: (1) the speed of masking logits and (2) the end-to-end evaluation of LLM inference engine efficiency in serving scenarios?

zlwu92 • Feb 12 '25 13:02