Hao Zhang
@zxzhijia Is your chatbot behind the GFW?
Stale issue. Closing.
Please use the v1.1 weights, and in the meantime use the latest versions of FastChat and transformers.
Yes, the issue was caused by HF's refactoring of the llama tokenizer, which we have since fixed. Please make sure to use the latest version of FastChat and vicuna-v1.1...
@laidybug why don't you submit a PR and let us take a look?
Duplicate of #170. Please monitor that thread for updates.
@ShoubhikBanerjee Please follow the instructions **step-by-step** to get llama weights, then vicuna weights, and then run apply_delta.
Refer to this page for instructions on getting the llama weights: https://huggingface.co/docs/transformers/main/model_doc/llama
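Conceptually, the apply_delta step just adds the released delta to the matching base llama parameter to recover the Vicuna weights. A minimal sketch of that idea (toy scalar "weights" stand in for real tensors; this is not the actual FastChat implementation, which operates on full HF checkpoints):

```python
def apply_delta(base_weights, delta_weights):
    """Recover target weights: target = base + delta, per parameter."""
    assert base_weights.keys() == delta_weights.keys()
    return {name: base_weights[name] + delta_weights[name]
            for name in base_weights}

# Toy stand-ins for model tensors (exact binary fractions, so the
# sums below are exact in floating point).
base = {"layer0.weight": 0.25, "layer0.bias": -0.125}
delta = {"layer0.weight": 0.5, "layer0.bias": 0.375}

vicuna = apply_delta(base, delta)
print(vicuna["layer0.weight"])  # 0.75
```

The real apply_delta script does the same addition tensor-by-tensor over the downloaded llama and delta checkpoints, which is why both sets of weights must be obtained first.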
Refer to this reply: https://github.com/lm-sys/FastChat/issues/543#issuecomment-1520909606. It is unlikely you can fine-tune any version of Vicuna with only 96 GB of total VRAM.
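A rough back-of-envelope shows why 96 GB falls short for full fine-tuning. With Adam in mixed precision, a common rule of thumb is roughly 16 bytes per parameter (fp16 weights and gradients plus fp32 master weights and two optimizer moments), before counting activations; the 16-byte figure here is an assumption for illustration, not a measurement:

```python
def training_vram_gb(n_params, bytes_per_param=16):
    """Weights + gradients + Adam state, activations excluded."""
    return n_params * bytes_per_param / 1e9

for billions in (7, 13):
    gb = training_vram_gb(billions * 1e9)
    print(f"{billions}B params -> ~{gb:.0f} GB before activations")
```

Even the 7B model needs roughly 112 GB by this estimate, and 13B about 208 GB, so 96 GB cannot hold the optimizer state alone, let alone activations.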
For now, I think you can try a GPTQ-quantized Vicuna in other ecosystems like GPT4All.
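For context, the appeal of GPTQ is storing weights at 4 bits instead of 16, cutting memory roughly 4x. GPTQ itself compensates quantization error layer by layer; the hypothetical sketch below shows only the simpler round-to-nearest storage idea, not the real algorithm:

```python
def quantize_4bit(weights):
    """Round-to-nearest symmetric int4: codes in [-7, 7] plus one scale."""
    scale = max(abs(w) for w in weights) / 7
    return [round(w / scale) for w in weights], scale

def dequantize(codes, scale):
    return [c * scale for c in codes]

w = [0.7, -0.35, 0.14, 0.0]
codes, scale = quantize_4bit(w)
w_hat = dequantize(codes, scale)
# Each reconstructed value is within half a quantization step
# (scale / 2) of the original weight.
```

Real GPTQ backends pack these 4-bit codes tightly and keep per-group scales, which is what lets a quantized model fit on much smaller GPUs.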