Investigate graphs load time cost and improve

Open peizhou001 opened this issue 2 years ago • 1 comments

🔨Work Item

IMPORTANT:

  • This template is only for dev team to track project progress. For feature request or bug report, please use the corresponding issue templates.
  • DO NOT create a new work item if the purpose is to fix an existing issue or feature request. We will directly use the issue in the project tracker.

Project tracker: https://github.com/orgs/dmlc/projects/2

Description

Loading graphs in DGL runs at roughly a quarter of the speed of PYG across datasets of different sizes, and the absolute gap grows as the number of graphs increases. We should investigate why this happens and try to improve it.

DGL

| Number of graphs | Size on disk (MB) | Load time (seconds) |
|---|---|---|
| 100K | 105 | 42 |
| 1000K | 1047 | 422 |
| 5000K | 5235 | 2067 |

PYG

| Number of graphs | Size on disk (MB) | Load time (seconds) |
|---|---|---|
| 100K | 105 | 9.72 |
| 1000K | 1057 | 95.53 |
| 5000K | 5246 | 473.26 |
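
To make the size of the gap concrete, here is a small sketch that recomputes the slowdown from the load times reported above. The numbers are copied from the two tables. The helper name `slowdown_report` is made up for illustration and is not part of DGL or PYG.

```python
# Load times in seconds, copied from the DGL and PYG tables above,
# keyed by the number of graphs in each dataset.
dgl_seconds = {"100K": 42.0, "1000K": 422.0, "5000K": 2067.0}
pyg_seconds = {"100K": 9.72, "1000K": 95.53, "5000K": 473.26}


def slowdown_report(dgl_times, pyg_times):
    """Return {graph_count: (ratio, absolute_gap_seconds)}.

    ratio is DGL load time divided by PYG load time;
    absolute_gap_seconds is DGL load time minus PYG load time.
    Hypothetical helper, for illustration only.
    """
    return {
        count: (dgl_times[count] / pyg_times[count],
                dgl_times[count] - pyg_times[count])
        for count in dgl_times
    }


report = slowdown_report(dgl_seconds, pyg_seconds)
for count, (ratio, gap) in report.items():
    print(f"{count}: DGL takes {ratio:.2f}x as long as PYG, gap {gap:.2f} s")
```

On these numbers the ratio stays between about 4.3x and 4.4x at every dataset size, while the absolute gap grows from about 32 seconds to about 1594 seconds. This suggests the overhead scales roughly linearly with the number of graphs rather than being a fixed startup cost.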

Depending work items or issues

peizhou001, Sep 26 '22 03:09

I remember it's more efficient to save/load a batched graph rather than a list of graphs, though I do not have detailed numbers here.

mufeili, Sep 26 '22 05:09

@peizhou001, please clarify the problem: is this about loading many graphs or a single graph? Let's decide the plan later.

frozenbugs, Mar 15 '23 02:03