TransformerEngine
Why is the requires_grad attribute of the offloaded weight set to False?
https://github.com/NVIDIA/TransformerEngine/blob/e3bb24e5a347c58353e62307bc84cca856f9e9be/transformer_engine/pytorch/module/linear.py#L405-L407
If weight.requires_grad is set to False, when are the weight gradients calculated and accumulated?