[GraphBolt] items are not shuffled across the whole set if `num_workers>0`

Open Rhett-Ying opened this issue 1 year ago • 1 comments

🔨Work Item

IMPORTANT:

  • This template is only for dev team to track project progress. For feature request or bug report, please use the corresponding issue templates.
  • DO NOT create a new work item if the purpose is to fix an existing issue or feature request. We will directly use the issue in the project tracker.

Project tracker: https://github.com/orgs/dmlc/projects/2

Description

Splitting items among workers before shuffling results in a significant accuracy drop. We should shuffle across the whole set first, then split the items among workers. It's worth checking whether `torch.utils.data.DataLoader` shuffles this way.
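The difference between the two orderings can be sketched in plain Python. This is only an illustration, not GraphBolt's implementation: the function names, the contiguous chunking scheme, and the per-worker seed offset are all assumptions made for the example.

```python
import random


def _chunk_bounds(num_items, num_workers, worker_id):
    # Hypothetical contiguous split: each worker gets ceil(n / workers) items.
    chunk = (num_items + num_workers - 1) // num_workers
    start = min(worker_id * chunk, num_items)
    end = min(start + chunk, num_items)
    return start, end


def split_then_shuffle(num_items, num_workers, worker_id, seed):
    """Problematic order: a worker only ever shuffles its own fixed chunk."""
    start, end = _chunk_bounds(num_items, num_workers, worker_id)
    items = list(range(start, end))
    random.Random(seed + worker_id).shuffle(items)
    return items


def shuffle_then_split(num_items, num_workers, worker_id, seed):
    """Proposed order: one permutation of the whole set, then split.

    Every worker must use the same seed so that all workers agree on the
    permutation and their slices stay disjoint.
    """
    perm = list(range(num_items))
    random.Random(seed).shuffle(perm)
    start, end = _chunk_bounds(num_items, num_workers, worker_id)
    return perm[start:end]
```

With `split_then_shuffle`, worker 0 of 3 over 10 items always draws from items 0 to 3, so the composition of its batches never changes between epochs. With `shuffle_then_split`, changing the shared seed each epoch lets any item land on any worker.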

The `buffer_size` of `ItemShufflerAndBatcher` could always be `len(item_set)`, which would simplify the code logic.
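To show why the buffer size matters, here is a minimal sketch of a generic buffered shuffle. It is not the actual `ItemShufflerAndBatcher` code; `buffered_shuffle` and its parameters are hypothetical names used only for illustration.

```python
import random


def buffered_shuffle(items, buffer_size, seed):
    """Shuffle items one buffer at a time (illustrative only).

    Items never move outside their own buffer, so a small buffer_size gives
    only a local shuffle. With buffer_size == len(items) there is a single
    buffer and the result is a shuffle of the whole set.
    """
    rng = random.Random(seed)
    for start in range(0, len(items), buffer_size):
        buffer = list(items[start:start + buffer_size])
        rng.shuffle(buffer)
        yield from buffer
```

Setting the buffer to the full length removes the partial-buffer case entirely, which is the simplification suggested above, at the cost of holding the full index range in memory.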

Depending work items or issues

Rhett-Ying avatar Jan 14 '24 13:01 Rhett-Ying

For `ItemSampler`, this is fixed in https://github.com/dmlc/dgl/pull/6982. One side effect of the fix is that all workers use the same seed generator for `DistributedItemSampler`'s shuffle.

Rhett-Ying avatar Jan 23 '24 09:01 Rhett-Ying