FastChat
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
## Why are these changes needed?
This PR implements support for Bailing reasoning. The Bailing LLM provides an HTTP endpoint for reasoning and can be accessed by HTTP...
Hello, let me ask one question. If using FastChat for supervised fine-tuning, how do I implement penalizing the distance between the starting and current weights? This was shown to be effective in...
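This question likely refers to an L2-SP-style regularizer. FastChat does not expose such an option, so one would add the penalty to the fine-tuning loss manually. A minimal sketch, assuming a standard PyTorch training loop (the `l2_sp_penalty` helper and the weight value are illustrative, not FastChat API):

```python
import torch

def l2_sp_penalty(model: torch.nn.Module, start_params: dict, weight: float = 0.01):
    """Penalize the squared L2 distance between current and starting weights."""
    penalty = torch.zeros((), device=next(model.parameters()).device)
    for name, p in model.named_parameters():
        if p.requires_grad:
            penalty = penalty + (p - start_params[name]).pow(2).sum()
    return weight * penalty

# Snapshot the starting weights once, before training (detached copies):
# start_params = {n: p.detach().clone() for n, p in model.named_parameters()}
# Then inside the training loop:
# loss = task_loss + l2_sp_penalty(model, start_params)
```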
In gradio_web_server.py, there is a hardcoded host and port for a supposed monitor daemon, but no such daemon exists: https://github.com/lm-sys/FastChat/blob/a04072e35d0893e64169c8e3ea153312eb0fe9ac/fastchat/serve/gradio_web_server.py#L391 This, in turn, breaks the usage of OpenAI endpoints with...
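A hypothetical workaround sketch, not FastChat's actual code: make the monitor address configurable and guard the request so a missing daemon degrades gracefully instead of breaking the server. The env var name, default URL, and helper are all assumptions.

```python
import os
import requests

# Assumed env var and default; the real code hardcodes host and port.
MONITOR_URL = os.environ.get("FASTCHAT_MONITOR_URL", "http://localhost:9090")

def query_monitor(path: str, timeout: float = 5.0):
    """Return the monitor's JSON response, or None when no daemon is running."""
    try:
        ret = requests.get(MONITOR_URL + path, timeout=timeout)
        ret.raise_for_status()
        return ret.json()
    except requests.RequestException:
        return None  # no monitor daemon: callers must tolerate None
```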
vLLM needs top_k to be int, not float

## Why are these changes needed?
For vLLM to work again.

## Related issue number (if applicable)
Closes #3501

## Checks
- ...
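A minimal sketch of the kind of cast this PR describes, assuming the worker builds vLLM's `SamplingParams` from a JSON payload (the `build_sampling_params` helper and the `params` dict are illustrative):

```python
from vllm import SamplingParams

def build_sampling_params(params: dict) -> SamplingParams:
    # vLLM validates top_k as an int (-1 means "disabled"); a float value
    # coming from a JSON payload raises an error, so cast it explicitly.
    top_k = params.get("top_k", -1)
    return SamplingParams(
        temperature=float(params.get("temperature", 1.0)),
        top_p=float(params.get("top_p", 1.0)),
        top_k=int(top_k) if top_k is not None else -1,
    )
```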
For models that disclose their size, it would be interesting to have a "size" category or filter. Additionally, I would suggest always showing the overall top 3 (for comparison)...
```bash
root@c68c31f45482:/workspace/zt/code/FastChat# python3 -m fastchat.model.apply_delta --base-model-path ../../model/Llama-2-7b-hf --target-model-path ../Sequence-Scheduling/ckpts/vicuna-7b --delta-path lmsys/vicuna-7b-delta-v1.1
Loading the delta weights from lmsys/vicuna-7b-delta-v1.1
You are using the default legacy behaviour of the . This is expected, and...
```
## Why are these changes needed?

## Related issue number (if applicable)

## Checks
- [x] I've run `format.sh` to lint the changes in this PR.
- [x] I've included...
Hi, I found that we can easily break the fairness of the judging by asking: "Who are you?" The model then reveals its name, and we can vote for it with bias...
## usage

### start

```bash
export VLLM_WORKER_MULTIPROC_METHOD=spawn
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python3 -m fastchat.serve.vllm_worker \
    --model-path /data/models/Qwen/Qwen2-72B-Instruct \
    --tokenizer /data/models/Qwen/Qwen2-72B-Instruct \
    --enable-lora \
    --lora-modules m1=/data/modules/lora/adapter/m1 m2=/data/modules/lora/adapter/m2 m3=/data/modules/lora/adapter/m3 \
    --model-names qwen2-72b-instruct,m1,m2,m3 \
    --controller http://localhost:21001 \
    ...
```
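Once the worker is registered, each LoRA adapter is addressable by its registered model name. A usage sketch, assuming FastChat's OpenAI-compatible API server is also running on localhost:8000 (started with `python3 -m fastchat.serve.openai_api_server`); the port and adapter name are taken from the command above:

```python
import openai  # openai<1.0 client style, as used in the FastChat docs

openai.api_key = "EMPTY"  # FastChat's server does not check the key
openai.api_base = "http://localhost:8000/v1"

resp = openai.ChatCompletion.create(
    model="m1",  # one of the names registered via --model-names
    messages=[{"role": "user", "content": "Hello"}],
)
print(resp.choices[0].message.content)
```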