FastChat
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
## Why are these changes needed?

Adding code tagger for the arena dataset.

## Checks

- [x] I've run `format.sh` to lint the changes in this PR.
- [x] I've...
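For context, a code tagger here presumably means a heuristic that flags arena conversations containing code. A minimal sketch of such a heuristic; the function name and fence/keyword patterns are assumptions, not the PR's actual implementation:

```python
import re

# Hypothetical heuristic: treat a conversation as code-related if any
# message contains a fenced code block or common code-like keywords.
CODE_FENCE = re.compile(r"```")
CODE_HINTS = re.compile(r"\b(def|class|import|return)\b|#include")

def contains_code(conversation: list[dict]) -> bool:
    """Return True if any message in the conversation looks like code."""
    return any(
        CODE_FENCE.search(msg.get("content", ""))
        or CODE_HINTS.search(msg.get("content", ""))
        for msg in conversation
    )
```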
## Why are these changes needed?

CodeLlama 70B Instruct uses [a different format](https://huggingface.co/codellama/CodeLlama-70b-Instruct-hf#chat-prompt) for the chat prompt than previous Llama 2 or CodeLlama models. In this branch:

```
$ python3...
```
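Per the linked model card, the 70B Instruct prompt switches from `[INST]`-style tags to `Source:`/`Destination:` headers separated by `<step>` tokens. A rough sketch of building that prompt (the helper name is hypothetical, and the exact whitespace should be checked against the model card):

```python
def build_codellama70b_prompt(system_prompt: str, turns: list[tuple[str, str | None]]) -> str:
    """Assemble a CodeLlama-70b-Instruct chat prompt, following the
    format described in the model card linked above."""
    prompt = f"<s>Source: system\n\n {system_prompt.strip()} <step> "
    for user_msg, model_answer in turns:
        prompt += f"Source: user\n\n {user_msg.strip()} <step> "
        if model_answer is not None:
            prompt += f"Source: assistant\n\n {model_answer.strip()} <step> "
    # The prompt ends by addressing the assistant's reply to the user.
    prompt += "Source: assistant\nDestination: user\n\n "
    return prompt
```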
## Why are these changes needed?

This pull request introduces support for the new `cosmosage_v2` model into our system. The `cosmosage_v2` model is a specialized language model designed for providing...
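New models in FastChat generally need a conversation template registered in `fastchat/conversation.py` (plus a matching model adapter). A minimal sketch of what that registration might look like; the roles, separator, and system message below are placeholders, not the actual `cosmosage_v2` settings:

```python
from fastchat.conversation import (
    Conversation,
    SeparatorStyle,
    register_conv_template,
)

# Placeholder values: the real cosmosage_v2 roles, separators, and
# system message would come from the model's own chat format.
register_conv_template(
    Conversation(
        name="cosmosage_v2",
        system_message="You are cosmosage, a helpful assistant.",
        roles=("USER", "ASSISTANT"),
        sep_style=SeparatorStyle.ADD_COLON_SINGLE,
        sep="\n",
    )
)
```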
Hello, is there a way to print the seed value of the generated response to the command prompt? I already noticed that you print important parameters like temperature, repetition penalty, and...
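A sketch of the kind of logging being asked for, assuming the seed is available alongside the other sampling parameters (the `gen_params` dict and its keys here are placeholders, not FastChat's actual internals):

```python
import logging

logger = logging.getLogger(__name__)

def log_generation_params(gen_params: dict) -> None:
    """Print sampling parameters, including the seed, for each response."""
    logger.info(
        "temperature=%s repetition_penalty=%s seed=%s",
        gen_params.get("temperature"),
        gen_params.get("repetition_penalty"),
        gen_params.get("seed"),  # the value this issue asks to surface
    )
```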
{"object":"error","message":"[{'type': 'json_invalid', 'loc': ('body', 2), 'msg': 'JSON decode error', 'input': {}, 'ctx': {'error': 'Expecting property name enclosed in double quotes'}}]","code":40001} Why do I have json parsing issues when I use...
Hi! Thank you for this wonderful repo. While trying to load the Vicuna model with limited VRAM across multiple GPUs, I discovered that your `max_memory` logic would cause the...
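For context, a sketch of how a per-GPU `max_memory` budget is typically passed to Hugging Face `from_pretrained` (the 85% headroom factor and model path are assumptions, not FastChat's exact values):

```python
import torch
from transformers import AutoModelForCausalLM

num_gpus = torch.cuda.device_count()

# Budget each GPU at ~85% of its total memory, leaving headroom for
# activations; accelerate then splits the weights across devices.
max_memory = {
    i: f"{int(torch.cuda.get_device_properties(i).total_memory / 1024**3 * 0.85)}GiB"
    for i in range(num_gpus)
}

model = AutoModelForCausalLM.from_pretrained(
    "lmsys/vicuna-7b-v1.5",  # assumed model path
    device_map="auto",
    max_memory=max_memory,
    torch_dtype=torch.float16,
)
```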
As described in https://huggingface.co/blog/codellama#conversational-instructions, conversations are formatted as follows:

```
[INST] {{ system_prompt }}
{{ user_msg_1 }} [/INST] {{ model_answer_1 }}
[INST] {{ user_msg_2 }} [/INST]...
```
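A minimal sketch of assembling that prompt string (the helper name is hypothetical, and the full blog template also wraps the system prompt in `<<SYS>>` tags, omitted here to match the simplified format above):

```python
def build_llama2_prompt(system_prompt: str, turns: list[tuple[str, str | None]]) -> str:
    """Build an [INST]-style prompt from (user_msg, model_answer) turns;
    the final turn's answer may be None when awaiting a response."""
    prompt = f"[INST] {system_prompt}\n"
    for i, (user_msg, model_answer) in enumerate(turns):
        if i > 0:
            prompt += "[INST] "
        prompt += f"{user_msg} [/INST]"
        if model_answer is not None:
            prompt += f" {model_answer} "
    return prompt
```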
In short, vLLM depends on pydantic >= 2,

```
pydantic >= 2.0  # Required for OpenAI server.
```

On the other hand, `fastchat/serve/openai_api_server.py` depends on v1.x:

```python
try:
    from pydantic.v1...
```
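The usual way out is a compatibility shim: pydantic 2 re-exports the legacy API under the `pydantic.v1` namespace, so the import can fall back to plain `pydantic` on 1.x installs. A generic sketch (not necessarily the exact symbols FastChat imports):

```python
# Works under both pydantic 1.x and 2.x: pydantic 2 ships the legacy
# API under pydantic.v1, which does not exist in pydantic 1.
try:
    from pydantic.v1 import BaseModel, Field  # pydantic >= 2
except ImportError:
    from pydantic import BaseModel, Field  # pydantic 1.x
```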
Hi there, I've come to the conclusion that the `repetition_penalty` field, which can be found [here](https://github.com/lm-sys/FastChat/blob/06acba1ea585b5cb77182a2dc1746f5228466d5c/fastchat/protocol/api_protocol.py#L62), is not being used. However, this field is supported by the vllm module. When...
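For reference, a sketch of forwarding the field to vLLM, assuming a vLLM version whose `SamplingParams` accepts `repetition_penalty` (the request-mapping helper below is a placeholder, not FastChat's actual code):

```python
from vllm import SamplingParams

def build_sampling_params(req: dict) -> SamplingParams:
    """Map an API request body onto vLLM sampling parameters,
    including the repetition_penalty field this issue is about."""
    return SamplingParams(
        temperature=req.get("temperature", 1.0),
        top_p=req.get("top_p", 1.0),
        max_tokens=req.get("max_tokens", 256),
        repetition_penalty=req.get("repetition_penalty", 1.0),
    )
```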
Almost all of the battles are single-response "chats": the responses of the two models are often so different that there is no meaningful way to continue the conversation with...