FastChat
Training new Vicuna based on fully open-source OpenLLaMA
Hi :wave:
I was wondering if there are any ongoing initiatives for training a new Vicuna model based on the fully open-source OpenLLaMA? This would ultimately remove the requirement for Vicuna to be a derivative of Meta's bespoke non-commercial LLaMA license.
Regards,
That would be absolutely awesome 😎
That would be great!
Vicuna is the only self-hosted model I've tried that can execute complex prompts within LangChain with very few changes. If we had an OpenVicuna, there would be an end-to-end open-source stack, from model to application.
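As a data point for the end-to-end story: FastChat already exposes an OpenAI-compatible REST server, so LangChain can talk to a self-hosted Vicuna by overriding the API base URL. A rough sketch, where the host, port, and model name are assumptions about a local setup:

```python
# Rough sketch, assuming a local FastChat OpenAI-compatible server is running
# (python3 -m fastchat.serve.openai_api_server) on port 8000 and serving a
# model registered as "vicuna-7b-v1.1". All values here are placeholders.
from langchain.llms import OpenAI

llm = OpenAI(
    model_name="vicuna-7b-v1.1",
    openai_api_base="http://localhost:8000/v1",
    openai_api_key="EMPTY",  # the local server does not check the key
)
print(llm("Summarize why an open-source Vicuna would matter."))
```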
Hello! Isn't this the official public release of Vicuna on Hugging Face?
NOTE: This "delta model" cannot be used directly.
Users have to apply it on top of the original LLaMA weights to get actual Vicuna weights.
Does the statement above mean that I can't just load it from the HF hub and fine-tune it directly? I tried to, and the training loss is constant across all batches (10.375), so I thought maybe this is the reason. Any help is most appreciated!
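For what it's worth, the note seems to mean exactly that: the published checkpoint contains weight deltas, not usable weights, so fine-tuning it directly trains on essentially random parameters. A constant loss of about 10.375 is consistent with this, since ln(32000) ≈ 10.37, the cross-entropy of a uniform guess over LLaMA's 32,000-token vocabulary. Below is a minimal sketch of the merge step; the paths are placeholders, and FastChat ships a proper helper (fastchat.model.apply_delta) that also handles tokenizer details.

```python
# Minimal sketch of applying a Vicuna "delta" on top of base LLaMA weights.
# Assumes both checkpoints have identical parameter names and shapes; the
# real FastChat helper (python3 -m fastchat.model.apply_delta) is the
# supported path. Local paths below are placeholders.
import torch
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained(
    "/path/to/llama-13b", torch_dtype=torch.float16
)
delta = AutoModelForCausalLM.from_pretrained(
    "lmsys/vicuna-13b-delta-v1.1", torch_dtype=torch.float16
)

# Vicuna weight = LLaMA weight + delta, parameter by parameter.
delta_state = delta.state_dict()
for name, param in base.state_dict().items():
    param.add_(delta_state[name])

base.save_pretrained("/path/to/vicuna-13b")  # now actual Vicuna weights
```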
What do you mean? OpenLlaMA is an open source model trained on the RedPajama dataset. You mean fine tune it using the same method Vicuna was fine tuned from LlaMA?
Yeah, that's it. It would get rid of LLaMA's non-commercial license.
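Since OpenLLaMA reuses the LLaMA architecture, it should in principle load with the stock LLaMA classes and slot into FastChat's existing training scripts in place of the original checkpoint. A hedged sketch, using the checkpoint published by openlm-research (whether the fine-tuning recipe transfers cleanly is an assumption):

```python
# Hedged sketch: load OpenLLaMA with the standard LLaMA classes, as a drop-in
# base model for the Vicuna fine-tuning recipe. The slow (non-fast) tokenizer
# is used because the OpenLLaMA authors recommend avoiding the fast one.
from transformers import LlamaForCausalLM, LlamaTokenizer

tokenizer = LlamaTokenizer.from_pretrained("openlm-research/open_llama_7b")
model = LlamaForCausalLM.from_pretrained("openlm-research/open_llama_7b")
```

From there, FastChat's training entry points (e.g. fastchat/train/train_mem.py) should work largely unchanged, modulo tokenizer quirks.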
Is there any plan, now that OpenLLaMA-7B and OpenLLaMA-13B have been fully trained and released?
However, the Vicuna team has not released the training data.
OpenLlama-based Vicuna would be so awesome to have.