[Bug] Llava-v1.6-34B template is not updated.
Referring to https://github.com/haotian-liu/LLaVA/blob/7440ec9ee37b0374c6b5548818e89878e38f3353/llava/serve/gradio_web_server.py#L176, the chat template used by llava-v1.6-34b is 'chatml_direct', which is not implemented in the current SGLang. The 'chatml' template is implemented, but it is completely different from 'chatml_direct'.
This bug leads to different outputs between the Gradio demo and sgl.function with the SGL runtime.
Also, the template structure and notation differ substantially between the two projects, so I am not sure I can translate the chat template from LLaVA into a ChatTemplate correctly.
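To illustrate the mismatch, here is a minimal sketch of how LLaVA's 'chatml_direct' conversation assembles a prompt, based on the linked gradio_web_server.py and LLaVA's conversation definitions. The exact system prompt ("Answer the questions.") and separator placement are assumptions; verify against LLaVA's conversation.py before relying on them.

```python
# Sketch of 'chatml_direct' prompt construction (assumptions noted above).
def build_chatml_direct_prompt(messages, system="Answer the questions."):
    """messages: list of (role, text) pairs with role in {"user", "assistant"}."""
    sep = "<|im_end|>"
    parts = [f"<|im_start|>system\n{system}{sep}"]
    for role, text in messages:
        parts.append(f"<|im_start|>{role}\n{text}{sep}")
    # Open the assistant turn so the model generates the reply.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = build_chatml_direct_prompt([("user", "What is in the image?")])
```

Comparing this string against what SGLang's 'chatml' template produces for the same messages makes the divergence easy to spot.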
@tzjtatata Hi, this is the chat template for llava-v1.6-34b https://github.com/sgl-project/sglang/blob/ad1dd74673a2e918a39d869865c1830fb634d150/python/sglang/lang/chat_template.py#L224-L225
https://github.com/sgl-project/sglang/blob/ad1dd74673a2e918a39d869865c1830fb634d150/python/sglang/lang/chat_template.py#L120-L133
As for the bug leading to different outputs, I believe the cause lies elsewhere. Could you please provide more details?
Thank you for your valuable response. I think the bug can be fixed by updating SGLang. However, I cannot find the chat templates for llava-v1.6-vicuna-7b and llava-v1.6-vicuna-13b. Neither the LLaVA repository nor lmms-eval contains them, and inference fails with both the 'vicuna_v1' and 'llava-v1' chat templates. By the way, the tokenizers for these two models in SGLang are also confusing (llava-v1.5-7b-hf and llava-v1.5-13b-hf). (https://github.com/haotian-liu/LLaVA#:~:text=Tokenizers%20(temporary)%3A%20llava%2Dhf/llava%2D1.5%2D7b%2Dhf%2C%20llava%2Dhf/llava%2D1.5%2D13b%2Dhf%2C%20liuhaotian/llava%2Dv1.6%2D34b%2Dtokenizer.) Thank you for your support!
@tzjtatata
- The chat template can be determined from this: https://github.com/haotian-liu/LLaVA/blob/7440ec9ee37b0374c6b5548818e89878e38f3353/llava/serve/gradio_web_server.py#L166-L193 Can you try registering these chat templates and a matching function, like this? https://github.com/sgl-project/sglang/blob/ad1dd74673a2e918a39d869865c1830fb634d150/python/sglang/lang/chat_template.py
- Yes, these tokenizers are confusing. You can refer to LLaVA's README and use `--tokenizer-path` to specify the tokenizer in our sglang.
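A minimal sketch of the kind of model-path-to-template matching function suggested above, mirroring the if/elif chain in the linked gradio_web_server.py. The branch names other than the 34B case are assumptions taken from the upstream demo; in SGLang this would be hooked up via `register_chat_template_matching_function` in the linked chat_template.py, not called directly.

```python
# Sketch of a model-path -> template-name matching function (hypothetical
# helper; branch conditions other than the 34B case are assumptions).
def match_llava_template(model_path: str) -> str:
    name = model_path.lower()
    if "llava-v1.6-34b" in name:
        return "chatml_direct"      # 34B checkpoint uses the ChatML-style template
    if "mistral" in name:
        return "mistral_instruct"   # assumption: mirrors the upstream demo
    if "llava" in name:
        return "llava_v1"           # fallback used by the upstream demo
    return "vicuna_v1"
```

For example, `match_llava_template("liuhaotian/llava-v1.6-34b")` would select 'chatml_direct', which is the template the Gradio demo uses for that checkpoint.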
This issue has been automatically closed due to inactivity. Please feel free to reopen it if needed.