
Training new Vicuna based on fully open-source OpenLLaMA

Open wilhelmagren opened this issue 1 year ago • 5 comments

Hi :wave:

I was wondering whether there are any ongoing initiatives to train a new Vicuna model based on the fully open-source OpenLLaMA. This would ultimately free the Vicuna license from being a derivative of Meta's bespoke non-commercial LLaMA license.

Regards,

wilhelmagren avatar May 10 '23 09:05 wilhelmagren

That would be absolutely awesome 😎

ogunoz avatar May 12 '23 18:05 ogunoz

That would be great! Vicuna is the only self-hosted model I've tried that could execute complex prompts within LangChain with very few changes. If we had an OpenVicuna, there would be an end-to-end open-source stack, from model to application.

lslslslslslslslslslsls avatar May 15 '23 06:05 lslslslslslslslslslsls

Hello! Isn't this the official public release for Vicuna on HuggingFace?

NOTE: This "delta model" cannot be used directly.
Users have to apply it on top of the original LLaMA weights to get actual Vicuna weights.

Does the statement above mean that I can't just load it from the HF hub and fine-tune it directly? I tried to, and the training loss is constant across all batches (10.375). I thought maybe this is the reason. Any help is most appreciated!

sarrahbbh avatar May 17 '23 09:05 sarrahbbh
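
On the delta-weights question above: the quoted note says the delta has to be applied on top of the original LLaMA weights before the result is a usable Vicuna model, so fine-tuning the delta checkpoint directly would start from weights that are only differences. That may explain the flat loss; for reference, 10.375 is close to ln(32000) ≈ 10.37, roughly what a uniform guess over a 32,000-token vocabulary would score. Below is a minimal toy sketch of the idea, assuming the delta is a simple per-parameter difference (target = base + delta). The function `apply_delta` and the parameter names here are hypothetical and use plain Python lists, not real checkpoints; this is not FastChat's own conversion script.

```python
def apply_delta(base, delta):
    """Toy illustration: recover target weights as base + delta, parameter by parameter.

    `base` and `delta` are dicts mapping a parameter name to a flat list of floats.
    This assumes the delta was produced as (target - base); real checkpoints are
    tensors and may need extra handling (e.g. resized embeddings).
    """
    if base.keys() != delta.keys():
        raise ValueError("base and delta must have the same parameter names")
    target = {}
    for name, base_values in base.items():
        delta_values = delta[name]
        if len(base_values) != len(delta_values):
            raise ValueError(f"size mismatch for parameter {name!r}")
        # Element-wise sum restores the fine-tuned value for each weight.
        target[name] = [b + d for b, d in zip(base_values, delta_values)]
    return target


# Hypothetical two-parameter "model" to show the arithmetic.
base = {"layer.weight": [0.5, -1.0], "layer.bias": [0.25]}
delta = {"layer.weight": [0.25, 0.5], "layer.bias": [-0.25]}
target = apply_delta(base, delta)
print(target)
```

Under this assumption, only `target` is meaningful to load and fine-tune; `delta` on its own is not a trained model.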

What do you mean? OpenLLaMA is an open-source model trained on the RedPajama dataset. Do you mean fine-tuning it using the same method by which Vicuna was fine-tuned from LLaMA?

tytung2020 avatar Jun 02 '23 15:06 tytung2020

What do you mean? OpenLLaMA is an open-source model trained on the RedPajama dataset. Do you mean fine-tuning it using the same method by which Vicuna was fine-tuned from LLaMA?

Yeah, that's it. It would get rid of LLaMA's non-commercial license.

lslslslslslslslslslsls avatar Jun 05 '23 01:06 lslslslslslslslslslsls

Is there any plan, now that OpenLLaMA-7B and OpenLLaMA-13B have been fully trained and released?

PenutChen avatar Jun 20 '23 03:06 PenutChen

However, the Vicuna team has not released the training data.

John-Ge avatar Jul 12 '23 08:07 John-Ge

An OpenLLaMA-based Vicuna would be so awesome to have.

jasontian6666 avatar Jul 12 '23 21:07 jasontian6666