qlora
Can it load and tune Falcon-40B?
TIIUAE has released a promising model named Falcon: https://huggingface.co/tiiuae . Can QLoRA load and fine-tune it? It is completely free and can be used for commercial purposes.
I loaded falcon-40b-instruct into 4 GPUs, 12.3GiB each.
It runs extremely slowly.
I have seen others reporting loading it into a single GPU with 48GiB VRAM and it was still extremely slow.
Yes. I tested it on 8x A100. It occupies about 12G on each GPU and is still not fast, especially once the input length exceeds 500.
If falcon-40b produces high-quality output, I would consider using it as a source of training data for a smaller model.
I don't know how to run a model across multiple GPU nodes for inference. Can someone give me the code for it?
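Not a definitive answer, but here is a minimal sketch of one common way to do this with Hugging Face `transformers`, `accelerate` and `bitsandbytes`. It only covers several GPUs inside a single machine, not several machines. The helper names (`build_max_memory`, `load_sharded`, `generate`), the 12 GiB per-GPU cap and the 4-bit NF4 settings are my own assumptions for illustration, not something confirmed in this thread. Also note that `device_map="auto"` places layers on the GPUs one after another, so only one GPU is busy at a time. That probably contributes to the slowness reported above.

```python
def build_max_memory(num_gpus: int, per_gpu_gib: int) -> dict:
    """Per-GPU memory caps in the form Accelerate accepts, e.g. {0: "12GiB"}."""
    return {i: f"{per_gpu_gib}GiB" for i in range(num_gpus)}


def load_sharded(model_id="tiiuae/falcon-40b-instruct", num_gpus=4, per_gpu_gib=12):
    """Load a 4-bit quantized model split across the GPUs of one machine."""
    # Heavy imports are deferred so build_max_memory works without a GPU stack.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    # Assumed quantization settings (4-bit NF4 with double quantization).
    quant = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_use_double_quant=True,
        bnb_4bit_compute_dtype=torch.bfloat16,
    )
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=quant,
        device_map="auto",  # let Accelerate spread layers over visible GPUs
        max_memory=build_max_memory(num_gpus, per_gpu_gib),
        trust_remote_code=True,  # assumption: may be unnecessary on newer transformers
    )
    return tokenizer, model


def generate(tokenizer, model, prompt, max_new_tokens=64):
    """Generate a completion for one prompt."""
    import torch

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        # Pass only the tensors generate() needs, not extra tokenizer outputs.
        out = model.generate(
            input_ids=inputs["input_ids"],
            attention_mask=inputs["attention_mask"],
            max_new_tokens=max_new_tokens,
        )
    return tokenizer.decode(out[0], skip_special_tokens=True)
```

Usage would be something like `tokenizer, model = load_sharded(num_gpus=4)` followed by `print(generate(tokenizer, model, "Hello"))`. Adjust `per_gpu_gib` to leave headroom for activations, since long prompts need more memory.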