inference 如何集成SeaLLMs/SeaLLM-7B-v2.5 Chat模式

目前xinference内置的SeaLLM-7B-v2.5是generation model，想要集成chat model，而SeaLLMs/SeaLLM-7B-v2.5的chat templates比较特殊，详见 https://huggingface.co/SeaLLMs/SeaLLM-7B-v2.5 ，请问应该如何配置？

Apr 24 '24 04:04 LivingXu

我看了下他的文档，应该用的 ChatML 的模板，感觉可以参考

https://github.com/xorbitsai/inference/blob/2ba72b0ed55c2dbff12491485ffacee7996d3490/xinference/model/llm/llm_family.json#L3480-L3501

intra_message_sep 应该是 <eos>， stop tokens 可能要去https://huggingface.co/SeaLLMs/SeaLLM-7B-v2.5/blob/main/tokenizer_config.json 查下对应关系。

欢迎你提供 PR 来增加这个模型的支持。

Apr 24 '24 04:04 qinxuye

我看了下他的文档，应该用的 ChatML 的模板，感觉可以参考

https://github.com/xorbitsai/inference/blob/2ba72b0ed55c2dbff12491485ffacee7996d3490/xinference/model/llm/llm_family.json#L3480-L3501

intra_message_sep 应该是 <eos>， stop tokens 可能要去https://huggingface.co/SeaLLMs/SeaLLM-7B-v2.5/blob/main/tokenizer_config.json 查下对应关系。

欢迎你提供 PR 来增加这个模型的支持。

感谢回复，我正在尝试进行集成。遇到了一个新问题，对于不同的模型，默认的repetition_penalty值是如何确定的？是否可以针对特定模型进行修改？

Apr 25 '24 02:04 LivingXu

这个问题解决如何？

Apr 28 '24 03:04 qinxuye

This issue is stale because it has been open for 7 days with no activity.

Aug 06 '24 19:08 github-actions[bot]

This issue was closed because it has been inactive for 5 days since being marked as stale.

Aug 12 '24 03:08 github-actions[bot]