FastChat
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
## Why are these changes needed? According to https://huggingface.co/deepseek-ai/DeepSeek-V2-Chat, an optional system message is supported. ``` User: {user_message_1} Assistant: {assistant_message_1}User: {user_message_2} Assistant: ``` ``` {system_message} User: {user_message_1} Assistant: {assistant_message_1}User: {user_message_2}...
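The template above can be sketched as a small prompt builder. This is an illustrative assumption, not FastChat's actual conversation code: the function name `build_deepseek_v2_prompt` and the exact whitespace between turns are hypothetical, following only the excerpt shown.

```python
def build_deepseek_v2_prompt(messages, system_message=""):
    """Assemble a DeepSeek-V2-style prompt string (sketch; the real
    template in FastChat/the model card may differ in whitespace).

    messages: list of (role, content) pairs, role in {"user", "assistant"}.
    """
    # The system message, when present, is prepended before the first turn.
    prompt = f"{system_message}\n\n" if system_message else ""
    for role, content in messages:
        if role == "user":
            prompt += f"User: {content}\n\n"
        else:
            # Assistant turns are followed directly by the next "User:" turn.
            prompt += f"Assistant: {content}"
    # End with the assistant cue so the model continues from here.
    prompt += "Assistant:"
    return prompt
```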
## Why are these changes needed? Adding Llama-3 Tenyx Chat model and API flows ## Related issue number (if applicable) ## Checks - [X] I've run `format.sh` to lint the...
## Why are these changes needed? - The vllm_worker code does not catch exceptions: checking vLLM's `SamplingParams` function, it performs parameter validation and raises `ValueError`...
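The fix being proposed amounts to wrapping parameter construction in a try/except so a bad request returns an error payload instead of crashing the worker. A minimal sketch, assuming a stand-in validator in place of `vllm.SamplingParams` (the wrapper name and error-payload shape here are hypothetical):

```python
def safe_sampling_params(make_params, **kwargs):
    """Call a SamplingParams-style constructor and convert validation
    failures into an error payload instead of an unhandled exception."""
    try:
        return {"params": make_params(**kwargs), "error_code": 0}
    except ValueError as e:
        # Surface the validation message to the client rather than crashing.
        return {"params": None, "error_code": 1, "text": str(e)}

def fake_sampling_params(temperature=1.0):
    """Stand-in for vLLM's validation: rejects out-of-range values."""
    if temperature < 0:
        raise ValueError("temperature must be non-negative")
    return {"temperature": temperature}
```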
When I used `vllm_worker` to deploy the Vicuna model, the `--limit-worker-concurrency` setting was `3`. After running for a while, I found that the model no longer worked. From the log,...
Add eos_token_id from the generation config file so that Llama3 can perform inference correctly. ## Why are these changes needed? The addition of eos_token_id from the generation config file to...
## Why are these changes needed? The controller creates a heartbeat thread to remove stale workers on expiration. Currently, the thread is not flagged as a daemon. In usual usage...
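The daemon flag can be sketched as below. This is a minimal stand-in, not the controller's actual code: with `daemon=True` the process can exit while the heartbeat loop is still sleeping, which is the behavior the change asks for.

```python
import threading
import time

def start_heartbeat(interval_s, beat, daemon=True):
    """Run `beat()` every `interval_s` seconds in a background thread.

    daemon=True means the thread will not block interpreter shutdown;
    without it, the infinite loop keeps the process alive forever.
    """
    def loop():
        while True:
            beat()
            time.sleep(interval_s)

    t = threading.Thread(target=loop, daemon=daemon)
    t.start()
    return t
```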
According to the OpenAI docs: https://platform.openai.com/docs/api-reference/chat/create#chat-create-stream_options, the API provides a `stream_options` parameter that returns token usage info for streaming requests. Please support this option for better rate-limit control.
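For reference, a request using this option looks roughly like the body below. This is a sketch of the request payload only (the helper name and model name are illustrative); with `stream_options.include_usage` set, the final streamed chunk carries a `usage` object with prompt and completion token counts.

```python
def chat_stream_request(model, messages, include_usage=True):
    """Build an OpenAI-compatible Chat Completions request body that
    asks for token usage in the final chunk of a streamed response."""
    return {
        "model": model,
        "messages": messages,
        "stream": True,
        # Per the OpenAI API reference: when include_usage is true, the
        # last chunk before [DONE] contains a usage object.
        "stream_options": {"include_usage": include_usage},
    }
```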
Hello FastChat Team, First and foremost, thank you for your exceptional work on FastChat and MT-Bench. The open-source contributions have been invaluable to the community. I noticed in your [blog](https://lmsys.org/blog/2023-06-22-leaderboard/#next-steps)...
It's not clear from the repo README how I can use the FastChat UI to compare multiple LLMs on my local machine. I have these models served via FastAPI and...