Can it load and fine-tune Falcon-40B?

Open znsoftm opened this issue 2 years ago • 4 comments

TIIUAE has released a promising model named Falcon: https://huggingface.co/tiiuae . Can QLoRA load and fine-tune it? It is totally free and can be used for commercial purposes.

znsoftm avatar Jun 02 '23 17:06 znsoftm

I loaded falcon-40b-instruct into 4 GPUs, 12.3GiB each.

It runs extremely slowly.

I have seen others reporting loading it into a single GPU with 48GiB VRAM and it was still extremely slow.

phalexo avatar Jun 02 '23 21:06 phalexo

Yes. I tested it on 8 × A100; it occupies 12G on each GPU and is still not fast, especially once the input length exceeds 500.

znsoftm avatar Jun 03 '23 10:06 znsoftm
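
For a rough sense of scale next to the per-GPU figures reported above, here is a minimal sketch of the weights-only memory arithmetic. The 40e9 parameter count is an assumption taken from the model's name, and the helper names `weight_gib` and `per_gpu_gib` are hypothetical. Activations, the KV cache, and framework overhead are not counted, so real usage will be higher.

```python
GIB = 2**30  # bytes per GiB


def weight_gib(n_params, bits_per_param):
    """Approximate memory for the weights alone, in GiB."""
    return n_params * bits_per_param / 8 / GIB


def per_gpu_gib(n_params, bits_per_param, n_gpus):
    """Weights-only share per GPU, assuming an even split across GPUs."""
    return weight_gib(n_params, bits_per_param) / n_gpus


N_PARAMS = 40e9  # assumption: roughly 40 billion parameters

for bits in (16, 8, 4):
    total = weight_gib(N_PARAMS, bits)
    print(f"{bits:>2}-bit: {total:5.1f} GiB total, "
          f"{per_gpu_gib(N_PARAMS, bits, 4):4.1f} GiB on each of 4 GPUs, "
          f"{per_gpu_gib(N_PARAMS, bits, 8):4.1f} GiB on each of 8 GPUs")
```

The numbers only bound the weights from below; they do not say which precision either report in this thread actually used.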

If falcon-40b produces high quality output, I would consider using it as a source of training data for a smaller model.

phalexo avatar Jun 03 '23 10:06 phalexo

I don't know how to run a model across multiple GPU nodes for inference. Can someone give me the code for it?

mirdulagarwal1201 avatar Jul 22 '23 12:07 mirdulagarwal1201
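
One possible starting point for the question above, as a sketch rather than a tested recipe: it assumes "multiple GPU nodes" means several GPUs in one machine, and that `transformers`, `accelerate`, and `bitsandbytes` are installed. The helpers `build_max_memory`, `load_sharded`, and `generate` are hypothetical names. `device_map="auto"`, `max_memory`, and `BitsAndBytesConfig` are existing `transformers` options, but the values shown are illustrative only.

```python
def build_max_memory(n_gpus, per_gpu_gib, cpu_gib=None):
    """Build the per-device memory budget dict accepted by `max_memory`."""
    budget = {i: f"{per_gpu_gib}GiB" for i in range(n_gpus)}
    if cpu_gib is not None:
        budget["cpu"] = f"{cpu_gib}GiB"  # optional CPU offload budget
    return budget


def load_sharded(model_id, n_gpus, per_gpu_gib):
    """Load a causal LM in 4-bit, with its layers spread over the local GPUs."""
    # Heavy imports are kept inside the function so the helper above
    # can be used without a GPU stack installed.
    import torch
    from transformers import (AutoModelForCausalLM, AutoTokenizer,
                              BitsAndBytesConfig)

    quant = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    )
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # Older transformers releases may additionally need
    # trust_remote_code=True for Falcon checkpoints.
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=quant,
        device_map="auto",  # let accelerate place layers on the GPUs
        max_memory=build_max_memory(n_gpus, per_gpu_gib),
    )
    return tokenizer, model


def generate(tokenizer, model, prompt, max_new_tokens=64):
    """Run one greedy generation and return the decoded text."""
    import torch

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        # Pass only the tensors generate() needs; some tokenizers
        # also return token_type_ids, which the model may reject.
        out = model.generate(
            input_ids=inputs["input_ids"],
            attention_mask=inputs["attention_mask"],
            max_new_tokens=max_new_tokens,
        )
    return tokenizer.decode(out[0], skip_special_tokens=True)
```

A call might look like `tokenizer, model = load_sharded("tiiuae/falcon-40b-instruct", n_gpus=4, per_gpu_gib=12)` followed by `generate(tokenizer, model, "Hello")`. With `device_map="auto"` the layers run one GPU after another rather than in parallel, so adding GPUs mainly adds memory, not speed, which may be part of the slowness reported earlier in this thread.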