FastChat
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
I am running this on CPU, and I see that when I provide a lot of input, processing it with all available CPU cores takes longer than...
Thank you so much for your fantastic work. I have run into a small problem and really hope you can help me. After setting up the OpenAI API, I tried to send 'logprobs=1'...
To accelerate evaluation, I want to generate with multiple prompts rather than only one prompt, but I got the following CUDA error. Can someone help with this? **error**: 131 ../aten/src/ATen/native/cuda/Indexing.cu:1093: indexSelectSmallIndex: block:...
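One common cause of index errors like the one above when moving from single-prompt to batched generation is prompts of different lengths being stacked without padding (or a pad id outside the vocabulary). This is a framework-agnostic sketch, not FastChat's own batching code; `pad_batch` and its arguments are hypothetical names:

```python
def pad_batch(token_id_lists, pad_id):
    """Left-pad variable-length prompt token lists to a common length.

    Returns (padded_ids, attention_mask). Hypothetical helper: the point
    is only that every row must have equal length and every id must be a
    valid vocabulary index before batched generation.
    """
    max_len = max(len(ids) for ids in token_id_lists)
    padded = [[pad_id] * (max_len - len(ids)) + ids for ids in token_id_lists]
    # 0 marks padding, 1 marks real tokens.
    mask = [[0] * (max_len - len(ids)) + [1] * len(ids) for ids in token_id_lists]
    return padded, mask


padded, mask = pad_batch([[1, 2, 3], [4]], pad_id=0)
print(padded)  # [[1, 2, 3], [0, 0, 4]]
print(mask)    # [[1, 1, 1], [0, 0, 1]]
```

Left-padding (rather than right-padding) is the usual choice for decoder-only generation so that the newest tokens line up at the end of every row.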
Hello, I tried to install FastChat with the command `pip3 install fschat`, but I didn't succeed, because when I execute my Python script ``` #!/usr/bin/python3.10 import fschat model = fschat.load_model("lmsys/fastchat-t5-3b-v1.0")...

When I use the api_server, I find that it runs very slowly. With the same hardware and environment, the web_server is 3-5 times faster. What is the reason for this, and...
Hi! I am using a single A100 GPU (40 GB). ``` export NCCL_IB_DISABLE=1; export NCCL_P2P_DISABLE=1; export NCCL_DEBUG=INFO; export NCCL_SOCKET_IFNAME=en,eth,em,bond; export CXX=g++; deepspeed --num_gpus 1 --num_nodes 1 \ fastchat/train/train_mem.py \ --model_name_or_path ../hf-llama-7B \ --data_path...
Hello everyone, can we perform LoRA or other fine-tuning on the Vicuna model? How much GPU memory is required?
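As a rough back-of-the-envelope answer to the memory question: with LoRA the frozen base weights dominate, since optimizer states are only kept for the small adapter matrices. The helper below is hypothetical (not from FastChat) and deliberately excludes activations and gradients, which depend on batch size and sequence length:

```python
def lora_memory_estimate_gb(n_params_b=7.0, bytes_per_param=2, lora_params_m=4.0):
    """Rough GPU-memory estimate for LoRA fine-tuning, in GiB.

    Assumptions (hypothetical, for illustration only):
    - base model weights frozen in fp16 (2 bytes/param),
    - Adam optimizer states (fp32 copy + 2 moments, 12 bytes/param)
      only for the LoRA adapter parameters,
    - activations/gradients excluded (workload-dependent).
    """
    base_gb = n_params_b * 1e9 * bytes_per_param / 2**30
    adapter_gb = lora_params_m * 1e6 * (bytes_per_param + 3 * 4) / 2**30
    return base_gb + adapter_gb


# A 7B model in fp16 alone is ~13 GiB before activations,
# which is why 24-40 GB cards are commonly cited for LoRA on 7B.
print(round(lora_memory_estimate_gb(), 1))
```

The takeaway is that the adapter overhead is negligible next to the base weights; the real budget question is activation memory at your chosen batch size.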
I have a Kubernetes deployment. The model worker runs on a separate node, and the controller and API server run in a different pod, but the model worker keeps re-registering, and in...
I start the FastChat controller with the default configuration: `python3 -m fastchat.serve.controller` However, when I registered the model_worker, it failed at the assertion that checks whether the status_code equals...
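Registration failures like the two issues above usually come down to the worker's HTTP call to the controller not returning success (wrong address, pod networking, controller not yet up). A common mitigation is to retry registration with backoff instead of asserting on the first response. This is a generic sketch, not FastChat's implementation; `register_with_retry` and its parameters are hypothetical:

```python
import time


def register_with_retry(register_fn, max_attempts=5, base_delay=0.5):
    """Retry a registration attempt with exponential backoff.

    register_fn is a caller-supplied callable that performs one
    registration attempt (e.g. an HTTP POST to the controller's
    register endpoint) and returns a truthy value on success.
    Returns True once registration succeeds, False after exhausting
    max_attempts.
    """
    for attempt in range(max_attempts):
        if register_fn():
            return True
        # Back off: 0.5s, 1s, 2s, ... so a slow-starting controller
        # or transient network blip does not kill the worker.
        time.sleep(base_delay * (2 ** attempt))
    return False
```

In a Kubernetes setup, verifying that the worker advertises an address the controller can actually reach back (and vice versa) is usually the first thing to check before adding retries.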