Fine-tuning: Can train_mem.py run on CPU?

Open tonyaw opened this issue 2 years ago • 2 comments

I want to fine-tune Vicuna with train_mem.py. It requires the flash_attn module, and flash_attn requires nvcc. Based on that, I assume train_mem.py can only run on a GPU. Is my understanding right?

Is there any way to fine-tune Vicuna on a CPU? I don't care about the training time, as I don't have a GPU. :-)

Apr 21 '23 09:04 tonyaw

You can replace flash_attn with the normal attention in PyTorch. Things will still work, although training will be very slow.
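Roughly, the idea could be sketched like this. It is a minimal, untested-against-the-repo sketch, not the project's actual code: it assumes the flash-attention patch lives in a module named `fastchat.train.llama_flash_attn_monkey_patch` and exposes `replace_llama_attn_with_flash_attn`, and the helper `try_enable_flash_attn` is a hypothetical name made up for illustration. If the patch cannot be imported (for example on a CPU-only machine without flash_attn), the model simply keeps its stock attention.

```python
import importlib


def try_enable_flash_attn(
    # Both defaults are assumptions about the repository layout.
    module_name: str = "fastchat.train.llama_flash_attn_monkey_patch",
    func_name: str = "replace_llama_attn_with_flash_attn",
) -> bool:
    """Apply the flash-attention monkey patch only if it can be imported.

    Returns True when the patch was applied, and False when we fall back
    to the regular attention that ships with the model implementation.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # flash_attn (or the patch module itself) is not installed,
        # which is the expected case on a machine without a GPU/nvcc.
        return False
    # Call the patch function so later model imports pick up flash attention.
    getattr(module, func_name)()
    return True
```

You would call `try_enable_flash_attn()` before importing the model code and then start the regular training entry point. When it returns `False`, nothing is patched and training runs with the normal attention, just much more slowly.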

May 08 '23 09:05 zhisbug

> You can replace flash_attn with the normal attention in PyTorch. Things will still work, although training will be very slow.

Can you give detailed steps? Thank you.

Jun 13 '23 01:06 chamuyaye

@zhisbug can you please suggest the code changes we are supposed to make?

Jun 23 '23 08:06 saurabhmahra91