openvino.genai Failed to Run Benchmark for llama-3-8b-instruct and llama-3.1-8b-instruct Models.

Open tim102187S opened this issue 1 year ago • 2 comments

I attempted to run benchmarks for the llama-3-8b-instruct and llama-3.1-8b-instruct models on both CPU and GPU, but the process failed on both devices. (The llama2-7b-chatbot model ran successfully.)

I followed the instructions in openvino_notebooks/llm-chatbot.ipynb to download the models and ensured that all necessary files (including the required tokenizer.model) were included. I am using the latest version of OpenVINO (2024.3.0) and have also upgraded the transformers library.

The command I executed is:

    python benchmark.py -m {path}/openvino_notebooks/notebooks/llm-chatbot/llama-3-8b-instruct/INT4_compressed_weights -n 2 -d CPU -p "What is large language model (LLM)?"

I received the following error output (attached as an image: "Screenshot from 2024-09-04 16-56-13").

Sep 04 '24 08:09 tim102187S

Adding the -ic 512 option should work around this issue. We have a PR that fixes it, but it introduces a performance regression; the analysis is still in progress.
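
For anyone scripting the workaround, here is a minimal sketch that assembles the original command with -ic 512 appended. The helper name build_benchmark_cmd and its parameter names are hypothetical, introduced only for illustration; the flags and values are the ones quoted in this thread, and {path} is left as the placeholder from the original report. My understanding is that -ic caps the number of generated tokens, but treat that as an assumption and check benchmark.py's help output.

```python
import shlex


def build_benchmark_cmd(model_dir, prompt, device="CPU", n=2, infer_count=None):
    """Build the benchmark.py argument list (hypothetical helper, not part of the repo).

    Only flags that appear in this thread are used: -m, -n, -d, -p and -ic.
    """
    cmd = [
        "python", "benchmark.py",
        "-m", model_dir,
        "-n", str(n),        # same -n value as in the original command
        "-d", device,
        "-p", prompt,
    ]
    if infer_count is not None:
        # Workaround suggested in this thread: append -ic <value>
        cmd += ["-ic", str(infer_count)]
    return cmd


# {path} is a placeholder from the original report; substitute your own location.
MODEL_DIR = "{path}/openvino_notebooks/notebooks/llm-chatbot/llama-3-8b-instruct/INT4_compressed_weights"
PROMPT = "What is large language model (LLM)?"

cmd = build_benchmark_cmd(MODEL_DIR, PROMPT, infer_count=512)
# Print a shell-quoted command line that can be pasted into a terminal.
print(shlex.join(cmd))
```

To actually launch the benchmark, the list can be passed to subprocess.run(cmd, check=True) from the directory that contains benchmark.py.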

Sep 05 '24 08:09 peterchen-intel

Thank you for your response. Adding the -ic 512 option indeed resolves the issue.

When can we expect the full solution to be available?

Sep 05 '24 09:09 tim102187S

@tim102187S It has been fixed on the master branch; please try again with the latest commit on master.

Oct 28 '24 09:10 peterchen-intel

Fixed by https://github.com/openvinotoolkit/openvino.genai/pull/801. A further update to the default configuration, for out-of-the-box (OOB) performance, is in https://github.com/openvinotoolkit/openvino.genai/pull/1049

Nov 06 '24 00:11 peterchen-intel
