LLaMA-Factory
model.float() causes OOM, as in #3510
Reminder
- [X] I have read the README and searched the existing issues.
Reproduction
https://github.com/hiyouga/LLaMA-Factory/issues/3510 See the figure in the issue above: the backward pass runs in fp16, and the result is only cast to fp32 after the computation. Microsoft's blog post introducing DeepSpeed also shows that the backward pass uses 16-bit precision: https://www.microsoft.com/en-us/research/blog/zero-deepspeed-new-system-optimizations-enable-training-models-with-over-100-billion-parameters/
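For illustration, here is a minimal sketch using PyTorch's `torch.cuda.amp` (my own example, not LLaMA-Factory's actual code path) of the behavior described above: the master weights stay in fp32, the forward and backward matmuls run in fp16, and gradients only land on the fp32 master weights after the computation, so a blanket `model.float()` on the whole model is not needed for gradient accuracy.

```python
# Minimal mixed-precision sketch (illustrative; not LLaMA-Factory's code).
import torch
from torch.cuda.amp import autocast, GradScaler

model = torch.nn.Linear(1024, 1024).cuda()   # weights stored as fp32 master copies
optimizer = torch.optim.AdamW(model.parameters())
scaler = GradScaler()

x = torch.randn(8, 1024, device="cuda")
with autocast(dtype=torch.float16):          # forward matmuls run in fp16
    loss = model(x).square().mean()

scaler.scale(loss).backward()                # backward matmuls also run in fp16
scaler.step(optimizer)                       # grads unscaled and applied in fp32 here
scaler.update()
```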
I mentioned this in that issue, but I don't know how to reopen an issue, so I opened a new one.
Expected behavior
No response
System Info
No response
Others
No response
I looked into this again today. My tests were with a 14B model; for full fine-tuning of a 72B model, even an 80 GB GPU cannot get past model.float().
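For scale, a rough back-of-the-envelope calculation (mine, not from the repo) of the weight memory alone makes the OOM unsurprising:

```python
# Weight-memory arithmetic only; gradients, optimizer states, and
# activations make the real training footprint several times larger.
params = 72e9                                      # 72B parameters
print(f"fp32 weights: {params * 4 / 1e9:.0f} GB")  # ~288 GB, far above one 80 GB GPU
print(f"fp16 weights: {params * 2 / 1e9:.0f} GB")  # ~144 GB, still needs sharding
```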
fixed