Kermit Griffeth

I solved the first problem with `pip install flashinfer-python==0.2.2` and `--enforce-eager`, but the performance of version 0.8.3 is still not as good as that of version 0.6.3.
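For reference, the workaround described above boils down to commands like the following; the exact serve invocation and the model placeholder are assumptions for illustration, not taken from the original report:

```shell
# Pin flashinfer to the version that resolved the first problem.
pip install flashinfer-python==0.2.2

# Serve in eager mode (CUDA graphs disabled); <model> is a placeholder.
vllm serve <model> --enforce-eager
```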
Since I only have an RTX 4090 24G device at hand, I have reproduced the previous issue (version 0.8.5 performing worse than 0.6.3) on this device, and I am...
> Is the command the same in both versions? yes
```
python3 build_visual_engine.py --model_path tmp/hf_models/${MODEL_NAME} --model_type vila --vila_path ${VILA_PATH} # for VILA
/usr/local/lib/python3.10/dist-packages/transformers/utils/hub.py:128: FutureWarning: Using `TRANSFORMERS_CACHE` is deprecated and will be removed in v5 of Transformers. Use `HF_HOME` instead.
  warnings.warn(
[TensorRT-LLM]...
```
> > You can add a chat_template entry to the LLM's tokenizer_config.json file. The old models don't have that entry and I think the HuggingFace library was appending a default...
> > > > You can add a chat_template entry to the LLM's tokenizer_config.json file. The old models don't have that entry and I think the HuggingFace library was appending...
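A minimal sketch of the fix suggested in the quoted comments: add a `chat_template` entry to the LLM's `tokenizer_config.json`. The file contents and the template string here are illustrative assumptions, not the model's actual configuration:

```python
import json
from pathlib import Path

# Stand-in for the model's tokenizer_config.json (assumed path and contents).
config_path = Path("tokenizer_config.json")
config_path.write_text(json.dumps({"tokenizer_class": "LlamaTokenizer"}))

config = json.loads(config_path.read_text())

# Old models may lack a chat_template entry; add one only if it is missing.
# This Jinja template is a placeholder, not the model's official template.
config.setdefault(
    "chat_template",
    "{% for message in messages %}"
    "{{ message['role'] }}: {{ message['content'] }}\n"
    "{% endfor %}",
)

config_path.write_text(json.dumps(config, indent=2))
```

With the entry in place, the HuggingFace tokenizer's `apply_chat_template` should use it instead of falling back to a library default.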
[inference_model2.tar.gz](https://github.com/user-attachments/files/17499493/inference_model2.tar.gz) Sorry, the problem description above was a bit off. The model with 100% CPU usage is this uploaded YOLO model; please help check whether many of its operators are running on the CPU. The MobileNetV1 one sits at around 30%-40% CPU usage.
Is there a tool that lets you check for yourself whether an nb model's operators execute on the CPU or on the GPU? That way, for well-performing models whose operators all run on the GPU, a developer could make a preliminary judgment about whether the model can run on the GPU via OpenCL and still achieve good overall performance.
```
Got bad cuda status: out of memory at line: 27 /ai/zhiyi/w/multimodal/openbmb/Nanoflow/pipeline/src/vortexData.cu
```
I get the same error on a 4090 24G.