Yixuan Zhu


Qwen1.5 removed `chat` and `chat_stream`; see https://qwen.readthedocs.io/en/latest/inference/chat.html. You only need to modify `llm_stream_generator` in `bridge_qwen_local.py`:

```
device = get_conf('LOCAL_MODEL_DEVICE')
system_prompt = get_conf('INIT_SYS_PROMPT')
```

```
def llm_stream_generator(self, **kwargs):
    def adaptor(kwargs):
        query = kwargs['query']
        max_length = kwargs['max_length']
        top_p = kwargs['top_p']
        temperature = kwargs['temperature']
        history = ...
```
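For reference, a sketch of what the adapted generator might look like under the new chat-template API. The `adaptor` keys follow the truncated snippet above; everything else (passing `model`/`tokenizer` as arguments, the system prompt text, sampling defaults, lifting `adaptor` to module level) is an assumption for illustration, not the plugin's actual code:

```python
from threading import Thread

def adaptor(kwargs):
    # Unpack the kwargs passed by the caller (keys taken from the snippet above).
    query = kwargs['query']
    max_length = kwargs['max_length']
    top_p = kwargs['top_p']
    temperature = kwargs['temperature']
    history = kwargs['history']
    return query, max_length, top_p, temperature, history

def llm_stream_generator(model, tokenizer, **kwargs):
    # Replacement for the removed model.chat_stream(): build a chat-template
    # prompt and stream tokens via transformers' TextIteratorStreamer.
    from transformers import TextIteratorStreamer
    query, max_length, top_p, temperature, history = adaptor(kwargs)

    messages = [{"role": "system", "content": "You are a helpful assistant."}]
    for q, a in history:  # history assumed to be (question, answer) pairs
        messages.append({"role": "user", "content": q})
        messages.append({"role": "assistant", "content": a})
    messages.append({"role": "user", "content": query})

    text = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True)
    inputs = tokenizer([text], return_tensors="pt").to(model.device)

    streamer = TextIteratorStreamer(
        tokenizer, skip_prompt=True, skip_special_tokens=True)
    gen_kwargs = dict(**inputs, streamer=streamer, max_new_tokens=max_length,
                      top_p=top_p, temperature=temperature, do_sample=True)
    # generate() blocks, so run it in a thread and consume the streamer here.
    Thread(target=model.generate, kwargs=gen_kwargs).start()

    response = ""
    for piece in streamer:
        response += piece
        yield response  # yield the accumulated reply, as chat_stream did
```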

Tested Qwen1.5 at 14B, 32B, and 72B: inference with the official transformers library is very slow for all of them. Deploying with vLLM or llama.cpp is recommended instead.
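As a concrete starting point, Qwen1.5 chat models can be served behind vLLM's OpenAI-compatible server; the model name and parallelism below are examples to adjust for your hardware:

```shell
# Serve a Qwen1.5 chat model via vLLM's OpenAI-compatible API server.
# --tensor-parallel-size shards the model across GPUs (72B needs more GPUs).
python -m vllm.entrypoints.openai.api_server \
    --model Qwen/Qwen1.5-14B-Chat \
    --tensor-parallel-size 2
```

Clients can then point any OpenAI-style SDK at the server's `/v1` endpoint instead of calling transformers directly.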