FastChat
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
## Why are these changes needed?
This PR implements support for Bailing reasoning. The Bailing LLM provides an HTTP endpoint for reasoning and can be accessed by HTTP...
Hello, let me ask one question. If using FastChat for supervised fine-tuning, how do I implement penalizing the distance between the starting and current weights? This was shown to be effective in...
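This question likely refers to an L2-SP-style regularizer. FastChat does not expose such an option, so one would add the penalty to the fine-tuning loss manually. A minimal sketch, assuming a standard PyTorch training loop (the `l2_sp_penalty` helper and the weight value are illustrative, not FastChat API):

```python
import torch

def l2_sp_penalty(model: torch.nn.Module, start_params: dict, weight: float = 0.01):
    """Penalize the squared L2 distance between current and starting weights."""
    penalty = torch.zeros((), device=next(model.parameters()).device)
    for name, p in model.named_parameters():
        if p.requires_grad:
            penalty = penalty + (p - start_params[name]).pow(2).sum()
    return weight * penalty

# Snapshot the starting weights once, before training (detached copies):
# start_params = {n: p.detach().clone() for n, p in model.named_parameters()}
# Then inside the training loop:
# loss = task_loss + l2_sp_penalty(model, start_params)
```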
In gradio_web_server.py, there is a hardcoded host and port for a supposed monitor daemon, but no such daemon exists: https://github.com/lm-sys/FastChat/blob/a04072e35d0893e64169c8e3ea153312eb0fe9ac/fastchat/serve/gradio_web_server.py#L391 This, in turn, breaks the usage of OpenAI endpoints with...
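A hypothetical workaround sketch, not FastChat's actual code: make the monitor address configurable and guard the request so a missing daemon degrades gracefully instead of breaking the server. The env var name, default URL, and helper are all assumptions.

```python
import os
import requests

# Assumed env var and default; the real code hardcodes host and port.
MONITOR_URL = os.environ.get("FASTCHAT_MONITOR_URL", "http://localhost:9090")

def query_monitor(path: str, timeout: float = 5.0):
    """Return the monitor's JSON response, or None when no daemon is running."""
    try:
        ret = requests.get(MONITOR_URL + path, timeout=timeout)
        ret.raise_for_status()
        return ret.json()
    except requests.RequestException:
        return None  # no monitor daemon: callers must tolerate None
```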
vLLM needs top_k to be int, not float

## Why are these changes needed?
For vLLM to work again.

## Related issue number (if applicable)
Closes #3501

## Checks
- ...
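A minimal sketch of the kind of cast this PR describes, assuming the worker builds vLLM's `SamplingParams` from a JSON payload (the `build_sampling_params` helper and the `params` dict are illustrative):

```python
from vllm import SamplingParams

def build_sampling_params(params: dict) -> SamplingParams:
    # vLLM validates top_k as an int (-1 means "disabled"); a float value
    # coming from a JSON payload raises an error, so cast it explicitly.
    top_k = params.get("top_k", -1)
    return SamplingParams(
        temperature=float(params.get("temperature", 1.0)),
        top_p=float(params.get("top_p", 1.0)),
        top_k=int(top_k) if top_k is not None else -1,
    )
```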
For models that disclose their size, it would be interesting to have a "size" category or filter. Additionally, I would suggest always showing the overall top 3 (for comparison)...
```bash
root@c68c31f45482:/workspace/zt/code/FastChat# python3 -m fastchat.model.apply_delta --base-model-path ../../model/Llama-2-7b-hf --target-model-path ../Sequence-Scheduling/ckpts/vicuna-7b --delta-path lmsys/vicuna-7b-delta-v1.1
Loading the delta weights from lmsys/vicuna-7b-delta-v1.1
You are using the default legacy behaviour of the . This is expected, and...
```
## Why are these changes needed?

## Related issue number (if applicable)

## Checks
- [x] I've run `format.sh` to lint the changes in this PR.
- [x] I've included...
Hi, I found that we can easily break the fairness of the judging by asking: "Who are you?" The model then reveals its name, and we can vote for it with bias...
## usage

### start

```bash
export VLLM_WORKER_MULTIPROC_METHOD=spawn
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python3 -m fastchat.serve.vllm_worker \
    --model-path /data/models/Qwen/Qwen2-72B-Instruct \
    --tokenizer /data/models/Qwen/Qwen2-72B-Instruct \
    --enable-lora \
    --lora-modules m1=/data/modules/lora/adapter/m1 m2=/data/modules/lora/adapter/m2 m3=/data/modules/lora/adapter/m3 \
    --model-names qwen2-72b-instruct,m1,m2,m3 \
    --controller http://localhost:21001 \
    ...
```
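Once the worker is registered, each LoRA adapter is addressable by its registered model name. A usage sketch, assuming FastChat's OpenAI-compatible API server is also running on localhost:8000 (started with `python3 -m fastchat.serve.openai_api_server`); the port and adapter name are taken from the command above:

```python
import openai  # openai<1.0 client style, as used in the FastChat docs

openai.api_key = "EMPTY"  # FastChat's server does not check the key
openai.api_base = "http://localhost:8000/v1"

resp = openai.ChatCompletion.create(
    model="m1",  # one of the names registered via --model-names
    messages=[{"role": "user", "content": "Hello"}],
)
print(resp.choices[0].message.content)
```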