minyichen

Results: 7 comments by minyichen

@yexing @zhudy Excuse me, I'm facing the same problem. I cloned vLLM into my project and added `nvcc_cuda_version = get_nvcc_cuda_version(CUDA_HOME)` to setup.py at line 268, but I still have the same...
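For anyone hitting the same error: below is a from-memory sketch of the helper that line relies on in older vLLM setup.py files. The helper name and `CUDA_HOME` come from vLLM's own setup.py and torch; treat the body as an approximation, not the exact upstream code.

```
import subprocess
from packaging.version import Version, parse
from torch.utils.cpp_extension import CUDA_HOME

def get_nvcc_cuda_version(cuda_dir: str) -> Version:
    # Run `nvcc -V` from the given CUDA toolkit and parse the
    # "release X.Y" token out of its version banner.
    nvcc_output = subprocess.check_output(
        [cuda_dir + "/bin/nvcc", "-V"], universal_newlines=True
    )
    output = nvcc_output.split()
    release_idx = output.index("release") + 1
    return parse(output[release_idx].split(",")[0])

# The line added around setup.py:268 records the toolkit version so
# the later build checks have something to compare against.
nvcc_cuda_version = get_nvcc_cuda_version(CUDA_HOME)
```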

@abidlabs Hi, I'm facing the same problem. Do you have any suggestions? Thanks in advance!

@ZhuJD-China @JianxinMa My results with the 14B model are also not great: it will pick a tool but never actually use it. Were 7B and 72B given extra training on the ReAct prompt, and can they even support multi-turn ReAct?

```
from langchain.chat_models import ChatOpenAI
from langchain.agents import load_tools, initialize_agent, AgentType
from langchain import SerpAPIWrapper

llm = ChatOpenAI(
    temperature=0,
    # max_tokens=90,
    streaming=...
```

Tested with Qwen-7B: it looks as if the tool is being used, but SERP_API is never actually called.

```
from langchain.chat_models import ChatOpenAI
from langchain.agents import load_tools, initialize_agent, AgentType
from langchain import SerpAPIWrapper

llm = ChatOpenAI(
    temperature=0,
    # max_tokens=90,
    streaming=True,
    openai_api_key="EMPTY",
    openai_api_base="http://localhost:8000/v1",
    model_name="/usr/src/app/model/Qwen-7B-Chat-AWQ",
)
...
```
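For reference, a fully spelled-out version of the snippet above might look like the following. The tool wiring and the final query are my own illustration (SerpAPI needs `SERPAPI_API_KEY` in the environment); `verbose=True` prints the Thought/Action/Observation trace, which is how I checked whether the tool is really invoked.

```
from langchain.chat_models import ChatOpenAI
from langchain.agents import initialize_agent, AgentType, Tool
from langchain import SerpAPIWrapper

# Point LangChain's OpenAI client at the vLLM OpenAI-compatible server.
llm = ChatOpenAI(
    temperature=0,
    streaming=True,
    openai_api_key="EMPTY",
    openai_api_base="http://localhost:8000/v1",
    model_name="/usr/src/app/model/Qwen-7B-Chat-AWQ",
)

# Wrap SerpAPI as a single explicit tool.
search = SerpAPIWrapper()
tools = [
    Tool(
        name="Search",
        func=search.run,
        description="Useful for answering questions about current events.",
    )
]

# Zero-shot ReAct agent; the verbose trace shows whether an Action
# line is ever followed by a real tool call.
agent = initialize_agent(
    tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True
)
agent.run("What is the weather in Taipei today?")
```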

@chuangzhidan In my example the backend is an OpenAI-compatible server set up with vLLM. `openai_api_base` is the endpoint of the port the backend API is served on, and `model_name` is the model name configured at launch time; for OpenAI itself that would be gpt-4, gpt-3.5-turbo, and so on. It is written as '/usr/src/app/model/Qwen-7B-Chat-AWQ' here because, when no model name is configured at launch, vLLM uses the model path as the name by default. If anything is unclear, feel free to ask!
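To make that concrete, here is a minimal sketch of the pairing. The path and served name are placeholders from my setup; `--served-model-name` is the vLLM server option for overriding the default path-based name.

```
# Launch the OpenAI-compatible server with an explicit model name:
#
#   python -m vllm.entrypoints.openai.api_server \
#       --model /usr/src/app/model/Qwen-7B-Chat-AWQ \
#       --served-model-name qwen-7b-chat-awq

from langchain.chat_models import ChatOpenAI

llm = ChatOpenAI(
    openai_api_key="EMPTY",                      # vLLM does not check the key
    openai_api_base="http://localhost:8000/v1",  # the server's endpoint
    model_name="qwen-7b-chat-awq",               # must match --served-model-name
)
```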

@Facico May I ask what the differences are between the three files finetune_chat, finetune, and finetune_deepspeed?

@ninehills Is there any update on this? Or could you tell me in which version of vLLM this issue was resolved? With vLLM 0.4.3, my tests show that the quantized...