
model.float() OOM, same as #3510

Open · lk137095576 opened this issue 9 months ago · 1 comment

Reminder

  • [X] I have read the README and searched the existing issues.

Reproduction

https://github.com/hiyouga/LLaMA-Factory/issues/3510 See the figure there: the backward pass runs in fp16, and the cast to fp32 happens only after the computation. Microsoft's blog post introducing DeepSpeed also shows that fp16 is used during the backward pass (see the sketch below): https://www.microsoft.com/en-us/research/blog/zero-deepspeed-new-system-optimizations-enable-training-models-with-over-100-billion-parameters/
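
A minimal PyTorch sketch of the pattern described above (fp16 weights and backward pass, fp32 master copy held by the optimizer). The tiny Linear model and the manual cast loop are illustrative placeholders, not LLaMA-Factory or DeepSpeed code, and loss scaling is omitted for brevity:

```python
import torch

# Sketch: fp16 forward/backward with an fp32 master copy of the weights.
# Real fp16 training also needs loss scaling (omitted here for brevity).
model = torch.nn.Linear(1024, 1024).cuda().half()          # fp16 weights
master_params = [p.detach().clone().float() for p in model.parameters()]
optimizer = torch.optim.AdamW(master_params, lr=1e-4)

x = torch.randn(8, 1024, device="cuda", dtype=torch.float16)
loss = model(x).float().pow(2).mean()
loss.backward()                                            # grads computed in fp16

# Only after the backward pass are gradients cast to fp32 and applied
# to the fp32 master weights.
for p, mp in zip(model.parameters(), master_params):
    mp.grad = p.grad.detach().float()
optimizer.step()
optimizer.zero_grad(set_to_none=True)
for p, mp in zip(model.parameters(), master_params):
    p.data.copy_(mp.data)                                  # copy_ casts fp32 -> fp16
```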

I raised this in that issue, but I don't know how to reopen it, so I opened a new one.

Expected behavior

No response

System Info

No response

Others

No response

lk137095576 commented May 07 '24 07:05

I looked into this again today. I was testing a 14B model; with a full fine-tune of a 72B model, even an 80GB GPU cannot get past model.float().
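
A back-of-the-envelope calculation (parameter count taken at face value, all other overheads ignored) showing why the fp32 upcast alone exceeds a single 80GB card:

```python
# Rough weight-memory arithmetic for a 72B-parameter model (weights only;
# gradients, optimizer states, and activations would add much more).
params = 72e9
print(f"fp16 weights: {params * 2 / 1024**3:.0f} GiB")                     # ~134 GiB
print(f"fp32 weights after model.float(): {params * 4 / 1024**3:.0f} GiB") # ~268 GiB
```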

lk137095576 commented May 13 '24 03:05

fixed

hiyouga commented May 15 '24 15:05