torchrec

Wrap DP tables with DDP after reset parameters

Open henrylhtsang opened this issue 10 months ago • 3 comments

Summary:

Problem

Once we wrap DP (data-parallel) tables with DDP, their parameters are identical across ranks at that moment, but nothing keeps them in sync afterwards. So if we later reset the parameters in a table without using the same manual seed on every rank, the DP table parameters can end up different from one rank to the next.

In other words, DP tables would be initialized differently on each rank.

Fix

There are a few ways to fix this. This is the one we believe is the least invasive and best follows the spirit of the API.

What we do:

  1. Delay the DDP wrapping until reset_parameters is called
  2. Re-wrap every time we call reset_parameters
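
To illustrate why the ordering matters, here is a toy, pure-Python sketch. It is not TorchRec or DDP code: `ToyTable`, `toy_ddp_wrap`, and `in_sync` are hypothetical names, and the "wrap" step only mimics the one-time broadcast of rank 0's parameters that happens when a module is wrapped. Each simulated rank has its own unsynchronized RNG, standing in for ranks that do not share a manual seed.

```python
import random


class ToyTable:
    """Hypothetical stand-in for one rank's copy of a DP embedding table."""

    def __init__(self, rank, size=4):
        # Each rank seeds its own RNG differently (no shared manual seed).
        self.rng = random.Random(rank)
        self.size = size
        self.weights = []
        self.reset_parameters()

    def reset_parameters(self):
        # Re-draw the weights from this rank's private RNG.
        self.weights = [self.rng.uniform(-1.0, 1.0) for _ in range(self.size)]


def toy_ddp_wrap(replicas):
    """Mimic the one-time parameter broadcast from rank 0 at wrap time."""
    for replica in replicas[1:]:
        replica.weights = list(replicas[0].weights)


def in_sync(replicas):
    """True if every replica holds the same weights as rank 0."""
    return all(r.weights == replicas[0].weights for r in replicas)


# Old order: wrap first, then reset. The reset undoes the sync.
old = [ToyTable(rank) for rank in range(2)]
toy_ddp_wrap(old)
synced_after_wrap = in_sync(old)
for table in old:
    table.reset_parameters()
synced_after_reset = in_sync(old)

# Order proposed here: reset first, then (re-)wrap.
new = [ToyTable(rank) for rank in range(2)]
for table in new:
    table.reset_parameters()
toy_ddp_wrap(new)
synced_after_rewrap = in_sync(new)

print(synced_after_wrap, synced_after_reset, synced_after_rewrap)
```

In this toy setup the replicas match right after wrapping, drift apart once each rank resets from its own RNG, and match again when the wrap is applied after the reset, which is the behavior the two steps above aim for.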

Differential Revision: D55227979

henrylhtsang, Apr 10 '24 18:04

This pull request was exported from Phabricator. Differential Revision: D55227979

facebook-github-bot, Apr 10 '24 18:04

cc @YLGH

henrylhtsang, Apr 10 '24 21:04

my bad

YLGH, Apr 10 '24 21:04