FastChat

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

766 FastChat issues

I am fine-tuning Vicuna on four A100-80G GPUs. I ran into a problem after training finished: ``` {'loss': 1.3641, 'learning_rate': 4.815273327803183e-08, 'epoch': 0.97} {'loss': 1.35, 'learning_rate': 2.7095433213097933e-08, 'epoch': 0.97} {'loss':...

HFValidationError: Repo id must use alphanumeric chars or '-', '_', '.', '--' and '..' are forbidden, '-' and '.' cannot start or end the name, max length is 96: 'vicuna-7B/'....
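
The rules quoted in this error message can be checked locally before a name is passed on. The following is only a minimal sketch of those quoted rules, not the library's own validator: `check_repo_name` is a hypothetical helper, and it deliberately ignores the `namespace/name` form that Hub repo ids may also take. It shows why a trailing slash, as in `'vicuna-7B/'`, is rejected.

```python
import re

# Characters permitted by the rule quoted in the error message:
# alphanumerics plus '-', '_' and '.'.
_ALLOWED = re.compile(r"[A-Za-z0-9._-]+")


def check_repo_name(name):
    """Return a list of violations of the quoted rules (hypothetical helper).

    Assumption: only a bare repo name is checked; the 'namespace/name'
    form is not handled here, so any '/' counts as an invalid character.
    """
    problems = []
    if not name:
        return ["empty name"]
    if len(name) > 96:
        problems.append("longer than 96 characters")
    if not _ALLOWED.fullmatch(name):
        problems.append("invalid characters")
    if "--" in name or ".." in name:
        problems.append("contains '--' or '..'")
    if name[0] in "-." or name[-1] in "-.":
        problems.append("starts or ends with '-' or '.'")
    return problems


if __name__ == "__main__":
    for candidate in ("vicuna-7B/", "vicuna-7B"):
        print(repr(candidate), check_repo_name(candidate))
```

If the string was meant to be a local directory rather than a Hub repo id, it is worth confirming that the directory actually exists at that path before changing the name.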

Thanks a lot for the great contribution! Here are the training logs: ...... ...... {'loss': 0.6356, 'learning_rate': 1.9382459105399634e-05, 'epoch': 0.42} {'loss': 0.6391, 'learning_rate': 1.938015964960626e-05, 'epoch': 0.42} {'loss': 0.5389, 'learning_rate': 1.9377856057588756e-05,...

Fine-tune with LoRA: CUDA_VISIBLE_DEVICES="2,3,4,5,6,7" torchrun --nnodes=1 --nproc_per_node=6 \ fastchat/train/train_lora.py \ --model_name_or_path vicuna/vicuna-7b \ --data_path vicuna/data/data.json \ --fp16 \ --report_to none \ --output_dir ./checkpoints \ --num_train_epochs 3 \ --per_device_train_batch_size 1 \...

- [ ] Support CLI inference of Flan-T5
- [ ] Support web UI serving of Flan-T5
- [ ] Support fine-tuning of Flan-T5

good first issue

Hello, Thank you for sharing your awesome work! I'm trying to train Vicuna on my own dataset. I walked through the installation process from source. I had to install `pytorch`...

Hi there, I am trying to fine-tune vicuna-7b with 2 RTX 3090 cards. ```bash torchrun --nnodes=1 --nproc_per_node=2 \ fastchat/train/train_mem.py \ --model_name_or_path vicuna-7b \ --data_path playground/data/alpaca-data-conversation.json \ --bf16 True \...

The UI does not filter input/output appropriately

good first issue
help wanted

When using CUDA, there appears to be a memory leak on Windows systems with either the CLI or UI. Any messages sent to the model will cause the GPU memory...

I tried to fine-tune Vicuna on my own data with DeepSpeed, but I hit the following error: ![error](https://user-images.githubusercontent.com/128484317/230714825-8a47be2b-ecaa-4c03-a802-24bf7cfc6c69.PNG) I tried to solve this error by changing torch and deepspeed version,...