FastChat issues

Training data?

Where is the training data that the model was trained on? Is it in a Git repo or somewhere on Hugging Face?

Which version of GPT-4 is used to generate the MT-Bench scores on the LMSYS leaderboard?

Which version of GPT-4 is used to generate the MT-Bench scores on the LMSYS leaderboard (https://chat.lmsys.org/?leaderboard)? Is it gpt-4-0613 or gpt-4-0314?

YizeMinimax

Will the KV cache become invalid?

In a multi-threaded situation, if the GPU server's resources are insufficient, will KV cache preemption occur? For example, there are two conversations at the same time, both of which are...

oslijunw

[BUG] RuntimeError: NPU out of memory. Tried to allocate 268.00 MiB

1

```
python3 -m fastchat.serve.cli --model-path /home/models/Qwen1.5-32B-Chat --device npu --gpus 0,1,2,3
(fast_chat) [root@localhost ~]# python3 -m fastchat.serve.cli --model-path /home/models/Qwen1.5-32B-Chat --device npu --gpus 0,1,2,3
/root/miniconda3/envs/fast_chat/lib/python3.8/site-packages/torch_npu/dynamo/__init__.py:18: UserWarning: Register eager implementation for the 'npu'...
```

WangxuP

How can I use multiple NPUs?

1

For example: `python3 -m fastchat.serve.cli --model-path lmsys/vicuna-7b-v1.5 --num-gpus 2`

Maybe the following command needs to be supported? `python3 -m fastchat.serve.cli --model-path lmsys/vicuna-7b-v1.5 --device npu --num-npus 2`

![image](https://github.com/lm-sys/FastChat/assets/163284515/3bd23cfc-02d6-4c2c-84b4-8e30583beefc)
![image](https://github.com/lm-sys/FastChat/assets/163284515/4476c8ba-c4ad-4e3f-bd2b-eb60b690bc57)

QuentinWang1

Support NousResearch/Hermes-2-Pro-Mistral-7B

Could you add support for the NousResearch/Hermes-2-Pro-Mistral-7B model?

mikutsky

Error when specifying GPUs at model startup

1

Command: `python -m fastchat.serve.cli --model-path ~/data/model/chatglm3-6b --gpus 2`

Error message:

Traceback (most recent call last):
  File "/root/miniconda3/envs/ragllm/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/root/miniconda3/envs/ragllm/lib/python3.10/runpy.py", line 86, in _run_code...

ilovecomet

Inaccessible Leaderboard with Screenreader

Hi, I'm a blind user, and I can't access leaderboard.lmsys.org with a screen reader. I tried accessing the website with many screen readers including Jaws, NVDA, Narrator on Windows, and...

chigkim

fix: garbled Chinese in the conv.json

4

## Why are these changes needed?

I need to save the battle logs in `conv.json`, and I then found garbled Chinese in the file. The reason is `json.dumps` the...

DSYZayn
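
The truncated description above points at `json.dumps`. The following is a minimal sketch of one common cause of this symptom, offered as an assumption rather than as the PR's actual change: by default `json.dumps` escapes non-ASCII characters as `\uXXXX` sequences, and passing `ensure_ascii=False` keeps them readable. The record fields and the temporary `conv.json` path are illustrative only, not FastChat's real schema or log location.

```python
import json
import os
import tempfile

# Hypothetical battle-log record; field names are illustrative, not FastChat's schema.
record = {"model": "example-model", "text": "你好"}

# Default behavior: non-ASCII characters are written as \uXXXX escapes,
# which can look garbled when the file is opened in a text editor.
escaped = json.dumps(record)

# ensure_ascii=False keeps the original characters in the output string.
readable = json.dumps(record, ensure_ascii=False)

# Write with an explicit UTF-8 encoding so the result does not depend on
# the platform's default encoding. A temporary directory stands in for the log dir.
log_path = os.path.join(tempfile.mkdtemp(), "conv.json")
with open(log_path, "a", encoding="utf-8") as f:
    f.write(readable + "\n")

print(escaped)
print(readable)
```

Both strings decode to the same object with `json.loads`, so existing readers of the log are unaffected; only the on-disk readability differs.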

add tensorRT model worker

1

## Why are these changes needed?

[TensorRT-LLM](https://github.com/NVIDIA/TensorRT-LLM) can greatly improve LLM inference speed. It would be helpful to support TensorRT-LLM in FastChat. This commit simply implements how to...

WHDY
