[Bug] Llava-v1.6-34B template is not updated.
Referring to https://github.com/haotian-liu/LLaVA/blob/7440ec9ee37b0374c6b5548818e89878e38f3353/llava/serve/gradio_web_server.py#L176, the chat template used by llava-v1.6-34b is 'chatml_direct', which is not implemented in the current SGLang. The 'chatml' template is implemented, but it is completely different from 'chatml_direct'.
This bug leads to different outputs between the Gradio demo and sgl.function with the SGL runtime.
Also, the template structure and notation differ substantially between the two projects, so I am not sure I can translate the chat template from LLaVA into a ChatTemplate correctly.
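To illustrate the mismatch, here is a minimal sketch of how LLaVA's 'chatml_direct' conversation assembles a prompt, based on the linked gradio_web_server.py and LLaVA's conversation definitions. The exact system prompt ("Answer the questions.") and separator placement are assumptions; verify against LLaVA's conversation.py before relying on them.

```python
# Sketch of 'chatml_direct' prompt construction (assumptions noted above).
def build_chatml_direct_prompt(messages, system="Answer the questions."):
    """messages: list of (role, text) pairs with role in {"user", "assistant"}."""
    sep = "<|im_end|>"
    parts = [f"<|im_start|>system\n{system}{sep}"]
    for role, text in messages:
        parts.append(f"<|im_start|>{role}\n{text}{sep}")
    # Open the assistant turn so the model generates the reply.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = build_chatml_direct_prompt([("user", "What is in the image?")])
```

Comparing this string against what SGLang's 'chatml' template produces for the same messages makes the divergence easy to spot.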
@tzjtatata Hi, this is the chat template for llava-v1.6-34b https://github.com/sgl-project/sglang/blob/ad1dd74673a2e918a39d869865c1830fb634d150/python/sglang/lang/chat_template.py#L224-L225
https://github.com/sgl-project/sglang/blob/ad1dd74673a2e918a39d869865c1830fb634d150/python/sglang/lang/chat_template.py#L120-L133
As for the bug leading to different outputs, I believe the cause lies elsewhere. Could you please provide more details?
Thank you for your valuable response. I think the bug can be fixed by updating SGLang. However, I cannot find the chat templates for llava-v1.6-vicuna-7b and llava-v1.6-vicuna-13b. Neither the LLaVA repository nor lmms-eval contains them, and inference fails with both the 'vicuna_v1' and 'llava-v1' chat templates. By the way, the tokenizers for these two models in SGLang are also confusing (llava-v1.5-7b-hf and llava-v1.5-13b-hf). (https://github.com/haotian-liu/LLaVA#:~:text=Tokenizers%20(temporary)%3A%20llava%2Dhf/llava%2D1.5%2D7b%2Dhf%2C%20llava%2Dhf/llava%2D1.5%2D13b%2Dhf%2C%20liuhaotian/llava%2Dv1.6%2D34b%2Dtokenizer.) Thank you for your support!
@tzjtatata
- The chat template can be determined from this: https://github.com/haotian-liu/LLaVA/blob/7440ec9ee37b0374c6b5548818e89878e38f3353/llava/serve/gradio_web_server.py#L166-L193 Can you try registering these chat templates and a matching function, like this? https://github.com/sgl-project/sglang/blob/ad1dd74673a2e918a39d869865c1830fb634d150/python/sglang/lang/chat_template.py
- Yes, these tokenizers are confusing. You can refer to LLaVA's README and use `--tokenizer-path` to specify the tokenizer in our sglang.
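A minimal sketch of the kind of model-path-to-template matching function suggested above, mirroring the if/elif chain in the linked gradio_web_server.py. The branch names other than the 34B case are assumptions taken from the upstream demo; in SGLang this would be hooked up via `register_chat_template_matching_function` in the linked chat_template.py, not called directly.

```python
# Sketch of a model-path -> template-name matching function (hypothetical
# helper; branch conditions other than the 34B case are assumptions).
def match_llava_template(model_path: str) -> str:
    name = model_path.lower()
    if "llava-v1.6-34b" in name:
        return "chatml_direct"      # 34B checkpoint uses the ChatML-style template
    if "mistral" in name:
        return "mistral_instruct"   # assumption: mirrors the upstream demo
    if "llava" in name:
        return "llava_v1"           # fallback used by the upstream demo
    return "vicuna_v1"
```

For example, `match_llava_template("liuhaotian/llava-v1.6-34b")` would select 'chatml_direct', which is the template the Gradio demo uses for that checkpoint.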
This issue has been automatically closed due to inactivity. Please feel free to reopen it if needed.