
Details about backward hooks in stage3, why detach outputs?

Open: feifeibear opened this issue 4 years ago • 1 comment

Dear authors,

Thank you for the awesome work. I am trying to learn some of the implementation details and have come across a small question: I am unsure what the two lines linked below are for. I believe the behavior would be the same if these two lines were removed, and removing them might save some temporary memory. https://github.com/microsoft/DeepSpeed/blob/master/deepspeed/runtime/zero/stage3.py#L503 https://github.com/microsoft/DeepSpeed/blob/master/deepspeed/runtime/zero/stage3.py#L528

feifeibear, Apr 07 '21 07:04

+1. @tjruwase, I have also wondered why we call detach in the backward hook. Doesn't it break the computational graph? Yet DeepSpeed ZeRO-3 still runs fine.

szhengac, Jan 31 '23 18:01
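
On the question of whether a detach breaks the computational graph: a minimal sketch, assuming the detach happens inside the `forward` of a custom `torch.autograd.Function` (which is how the linked lines appear to be used). `HookedIdentity`, `calls`, and the lambda callback are hypothetical names made up for illustration and are not DeepSpeed code. The sketch only shows that gradients can still flow in this pattern, because autograd attaches the Function's own backward node to whatever tensor `forward` returns. It does not show whether the detach is required or whether removing it would save memory.

```python
import torch

calls = []  # records when the backward callback fires (illustrative only)


class HookedIdentity(torch.autograd.Function):
    """Hypothetical pass-through Function that runs a callback during backward."""

    @staticmethod
    def forward(ctx, callback, tensor):
        ctx.callback = callback
        # detach() returns a new tensor that shares storage with the input.
        # Autograd still attaches this Function's backward node to the
        # returned tensor, so the graph is not cut at this point.
        return tensor.detach()

    @staticmethod
    def backward(ctx, grad_output):
        ctx.callback()
        # One return value per forward input: None for the callback,
        # and the incoming gradient passed through unchanged for the tensor.
        return None, grad_output


x = torch.ones(3, requires_grad=True)
y = HookedIdentity.apply(lambda: calls.append("pre_backward"), x * 2.0)
(y * 3.0).sum().backward()

print(y.grad_fn is not None)  # the output is still connected to the graph
print(x.grad)                 # gradient reached the leaf tensor
print(calls)                  # the callback ran during backward
```

In this sketch `x.grad` is populated even though `forward` returned a detached tensor, which is consistent with the observation that ZeRO-3 still trains correctly. Calling `detach()` on a tensor in ordinary code outside a custom Function would cut the graph at that point.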