accelerate Accumulate the gradients of 2 models

Open lzqsd opened this issue 1 year ago • 3 comments

Thanks for the great work! I ran into a case where I need to accumulate the gradients of 2 networks. How can I handle this with the accelerator.accumulate() function? Looking forward to hearing from you!

May 11 '23 19:05 lzqsd

cc @muellerzr

May 11 '23 19:05 sgugger

This is a limitation in PyTorch, and we have a mildly working hack. Once we've tested it a bit more independently and verified with PyTorch that it's "expected" behavior, we'll bring it into Accelerate directly.

May 11 '23 20:05 muellerzr

Thanks for the quick response! Looking forward to it! I will close this issue once it is included.

May 11 '23 20:05 lzqsd
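
For readers landing on this thread with the same question, here is a minimal sketch of accumulating gradients for two networks by hand in plain PyTorch, in a single process. It is not the hack mentioned above and not Accelerate's implementation. The model names (`encoder`, `head`), the random data, the learning rate, and `accumulation_steps = 4` are all assumptions made for illustration. It also does nothing about gradient synchronization across processes, which is the part accelerator.accumulate() is meant to manage in distributed runs.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Two hypothetical networks trained jointly (names are illustrative only).
encoder = nn.Linear(4, 8)
head = nn.Linear(8, 1)

# A single optimizer over both parameter sets, so one step()/zero_grad()
# call covers both models.
optimizer = torch.optim.SGD(
    list(encoder.parameters()) + list(head.parameters()), lr=0.1
)

accumulation_steps = 4  # assumed value; tune for your memory budget
# Random micro-batches standing in for a real DataLoader.
data = [(torch.randn(2, 4), torch.randn(2, 1)) for _ in range(8)]

initial_weight = encoder.weight.detach().clone()
optimizer_steps = 0

optimizer.zero_grad()
for i, (x, y) in enumerate(data, start=1):
    loss = nn.functional.mse_loss(head(encoder(x)), y)
    # Scale the loss so the accumulated gradient is the mean over
    # the micro-batches rather than their sum.
    (loss / accumulation_steps).backward()
    # Update both models only once every `accumulation_steps` micro-batches.
    if i % accumulation_steps == 0:
        optimizer.step()
        optimizer.zero_grad()
        optimizer_steps += 1
```

With 8 micro-batches and 4 accumulation steps, the optimizer updates both models twice. If the two networks need separate optimizers, the same pattern applies: call step() and zero_grad() on each optimizer inside the same `if` branch.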
