FastChat issues

Does vicuna-13b/vicuna-7b provide a flash-attention implementation during inference in this repo? If yes, where is the implementation?

Not exactly an issue, but I couldn't figure this out. In the training script, I believe it replaces all the multi-head attention with flash attention, but I'm not sure what happens during inference. Any...

Tacacs-1101

Text garbage

7

After installing everything, I get garbage text in response to any question I send to the chat. ![vicuna](https://github.com/lm-sys/FastChat/assets/45124181/d39d0a5c-7062-4034-a277-3f0e7a82fbd1)

Edan3blov

Can I use this code to conduct self-supervised pre-training based on llama-7b?

Can I use this code to conduct self-supervised pre-training based on llama-7b? (Not the dialog format given in the example; my dataset consists of pieces of text.) If I can, which...

nicosouth

Language distribution of ShareGPT 70K conversation dataset for FastChat T5

2

Which languages are present in the ShareGPT 70,000-conversation dataset that was used to fine-tune FastChat-T5? The README points to [`data_cleaning.md`](https://github.com/lm-sys/FastChat/blob/main/docs/commands/data_cleaning.md) which was used to get data...

Mihir2

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:1 and cuda:0!

lucasjinreal

Training new Vicuna based on fully open-source OpenLLaMA

5

Hi :wave: I was wondering if there are any ongoing incentives for training a new Vicuna model based on the fully open-source [OpenLLaMA](https://github.com/openlm-research/open_llama)? This would ultimately remove the requirement...

wilhelmagren

RuntimeError: The size of tensor a (32000) must match the size of tensor b (32001) at non-singleton dimension 0

8

fastchat==0.2.9; transformers is also the newest (commit a2789addd); vicuna-13b-delta-v1.1. ![image](https://github.com/lm-sys/FastChat/assets/59643844/e30c1608-245c-432d-9be9-dec5761f56ef)

LetsGoFir


Is "Fine-tuning Vicuna-7B with Local GPUs" full-parameter fine-tuning? Does it use LoRA?

Fine-tuning a 13B model on a V100 with batch size 1 in int8 runs out of memory (OOM)?

Query Regarding API for Vicuna Model
