torchrec

Turn the dummy_tensor's grad off

Open · Microve opened this pull request 1 year ago · 1 comment

Summary: bdhirsh introduced a change in D51418076 that makes intermediate leaf tensors that require grad cause a graph break.

This leads to graph breaks when training our APS model. Example: P1156881935

The graph breaks happen on the dummy_tensor, which requires grad. However, this is not necessary: in the original diff D38469224, Ying ran an experiment showing that its grad is not populated at all.

Therefore, this diff turns grad off for the dummy_tensor to avoid graph breaks during APS model training.
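
For illustration only, here is a minimal PyTorch sketch of what "turning grad off" on a leaf tensor looks like. It is not the code from this diff: `make_dummy_tensor` is a hypothetical helper, and the shape and creation call are assumptions chosen to keep the example small.

```python
import torch


def make_dummy_tensor(requires_grad: bool) -> torch.Tensor:
    # Hypothetical stand-in for the dummy_tensor discussed above;
    # the real tensor's shape, dtype, and device may differ.
    return torch.empty(1, requires_grad=requires_grad)


# Before: a leaf tensor that requires grad.
before = make_dummy_tensor(requires_grad=True)

# After: the same kind of tensor, created with grad turned off.
after = make_dummy_tensor(requires_grad=False)

# An existing leaf tensor can also be switched off in place.
toggled = make_dummy_tensor(requires_grad=True).requires_grad_(False)

for name, t in [("before", before), ("after", after), ("toggled", toggled)]:
    print(name, "requires_grad:", t.requires_grad, "is_leaf:", t.is_leaf)
```

All three tensors are leaves; only `before` still requires grad, which is the property the summary associates with the graph break.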

Differential Revision: D53449759

Microve · Feb 05 '24 23:02

This pull request was exported from Phabricator. Differential Revision: D53449759

facebook-github-bot · Feb 05 '24 23:02