FastChat
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
I am fine-tuning Vicuna on four A100-80G GPUs. I ran into a problem after training finished:
```
{'loss': 1.3641, 'learning_rate': 4.815273327803183e-08, 'epoch': 0.97}
{'loss': 1.35, 'learning_rate': 2.7095433213097933e-08, 'epoch': 0.97}
{'loss':...
```
HFValidationError: Repo id must use alphanumeric chars or '-', '_', '.', '--' and '..' are forbidden, '-' and '.' cannot start or end the name, max length is 96: 'vicuna-7B/'....
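The naming rules quoted in this error message can be checked locally before a model name is passed on. The sketch below is written from the message text alone and is only an approximation, not the `huggingface_hub` validator; `name_problems` is a hypothetical helper, and it looks at a single name component only.

```python
import re
from typing import List

# Approximation of the rules quoted in the error message above.
# This is NOT the huggingface_hub implementation; name_problems is hypothetical.
_ALLOWED = re.compile(r"[A-Za-z0-9_.-]+")
MAX_LEN = 96


def name_problems(name: str) -> List[str]:
    """Return the rules from the error message that `name` breaks."""
    problems = []
    if not _ALLOWED.fullmatch(name):
        problems.append("contains characters other than alphanumerics, '-', '_', '.'")
    if "--" in name or ".." in name:
        problems.append("contains '--' or '..'")
    if name[:1] in ("-", ".") or name[-1:] in ("-", "."):
        problems.append("starts or ends with '-' or '.'")
    if len(name) > MAX_LEN:
        problems.append("longer than 96 characters")
    return problems


if __name__ == "__main__":
    for candidate in ("vicuna-7B", "vicuna-7B/"):
        print(repr(candidate), name_problems(candidate))
```

Under these assumptions, the trailing slash in `'vicuna-7B/'` is what trips the character rule, while `'vicuna-7B'` passes every check.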
Thanks a lot for the great contribution! Here are the training logs:
    ......
    ......
    {'loss': 0.6356, 'learning_rate': 1.9382459105399634e-05, 'epoch': 0.42}
    {'loss': 0.6391, 'learning_rate': 1.938015964960626e-05, 'epoch': 0.42}
    {'loss': 0.5389, 'learning_rate': 1.9377856057588756e-05,...
Fine-tune with LoRA:
    CUDA_VISIBLE_DEVICES="2,3,4,5,6,7" torchrun --nnodes=1 --nproc_per_node=6 \
        fastchat/train/train_lora.py \
        --model_name_or_path vicuna/vicuna-7b \
        --data_path vicuna/data/data.json \
        --fp16 \
        --report_to none \
        --output_dir ./checkpoints \
        --num_train_epochs 3 \
        --per_device_train_batch_size 1 \
    ...
- [ ] Support CLI inference of Flan-T5
- [ ] Support web UI serving of Flan-T5
- [ ] Support fine-tuning of Flan-T5
Hello, thank you for sharing your awesome work! I'm trying to train Vicuna on my own dataset. I followed the instructions for installing from source. I had to install `pytorch`...
Hi there, I am trying to fine-tune vicuna-7b with 2 RTX 3090 cards.
```bash
torchrun --nnodes=1 --nproc_per_node=2 \
    fastchat/train/train_mem.py \
    --model_name_or_path vicuna-7b \
    --data_path playground/data/alpaca-data-conversation.json \
    --bf16 True \
...
```
The UI is not filtering input/output appropriately
When using CUDA, there appears to be a memory leak on Windows systems with either the CLI or UI. Any messages sent to the model will cause the GPU memory...
I tried to fine-tune Vicuna on my own data with DeepSpeed, but I ran into an error. I tried to solve it by changing the torch and deepspeed versions,...