accelerate
Accumulate the gradients of 2 models
Thanks for the great work! I ran into an issue: I want to accumulate the gradients of 2 networks. How can I handle this with the accelerator.accumulate() function? Looking forward to hearing from you!
cc @muellerzr
This is a limitation of PyTorch, and we have a mildly working hack. Once we've tested it a bit more independently and verified with PyTorch that it's "expected" behavior, we'll bring it into Accelerate directly.
Thanks for the quick response! Looking forward to it! I will close this issue once it is included.
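For anyone landing here in the meantime, a possible stopgap is to accumulate by hand in plain PyTorch rather than through accelerator.accumulate(). The sketch below is only an illustration under assumptions, not the hack mentioned above: the two networks (`encoder`, `head`), the `accumulation_steps` value, and the random batches are all hypothetical stand-ins. It runs in a single process and does not try to skip gradient synchronization between processes on the non-stepping micro-batches.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Two hypothetical networks trained jointly (stand-ins for the 2 models in the question).
encoder = nn.Linear(4, 8)
head = nn.Linear(8, 1)

# A single optimizer over the parameters of both networks.
optimizer = torch.optim.SGD(
    list(encoder.parameters()) + list(head.parameters()), lr=0.1
)

accumulation_steps = 4  # assumed value, chosen only for illustration
# Random micro-batches standing in for a real dataloader.
batches = [(torch.randn(2, 4), torch.randn(2, 1)) for _ in range(8)]

optimizer_steps = 0
optimizer.zero_grad()
for step, (inputs, targets) in enumerate(batches, start=1):
    loss = nn.functional.mse_loss(head(encoder(inputs)), targets)
    # Scale the loss so the accumulated gradient is the average over the micro-batches.
    (loss / accumulation_steps).backward()
    # Gradients of both networks keep accumulating until every N-th micro-batch.
    if step % accumulation_steps == 0:
        optimizer.step()
        optimizer.zero_grad()
        optimizer_steps += 1
```

Because both networks sit in one autograd graph and share one optimizer, a single `backward()` call fills the `.grad` buffers of both, and a single `step()` updates both. If the two networks need separate optimizers, the same pattern applies with both optimizers stepped and zeroed together.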