FastChat
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
Not an issue exactly, but I couldn't figure this out. In the training script, I believe it replaces all the multi-head attention with flash attention, but I'm not sure what happens during inference. Any...
After installing everything, I get garbage text in response to any question that I send to the chat.
Can I use this code to conduct self-supervised pre-training based on llama-7b? (Not the dialog format given in the example; my dataset is pieces of text.) If I can, which...
Which languages are present in the ShareGPT 70,000-conversation dataset that was used to fine-tune FastChat-T5? The README file points to [`data_cleaning.md`](https://github.com/lm-sys/FastChat/blob/main/docs/commands/data_cleaning.md) which was used to get data...
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:1 and cuda:0!
Hi :wave: I was wondering whether there are any ongoing initiatives for training a new Vicuna model based on the fully open-source [OpenLLaMA](https://github.com/openlm-research/open_llama)? This would ultimately remove the requirement...
fastchat==0.2.9 transformers is also the newest (commit a2789addd) vicuna-13b-delta-v1.1 
Is "Fine-tuning Vicuna-7B with Local GPUs" full-parameter fine-tuning? Does it use LoRA?
Fine-tuning a 13B model on a V100 with batch size 1 in int8 runs out of memory (OOM). I have already enabled DeepSpeed.
Hello LMSYS Team, I am reaching out to inquire about the functionality of the Vicuna model. I would like to know if there is an available API for Vicuna that...