rand_bipartite gets stuck when generating a random bipartite graph with 2B edges

Open Rhett-Ying opened this issue 2 years ago • 2 comments

🐛 Bug

I used the code below to generate a random bipartite graph. It succeeds with 1B edges (taking about 800 seconds) but gets stuck with 2B edges (still running after more than a few hours). No error or exception is thrown. Digging in, I found that the insert linked below always fails once selected.size() > 1073500*1000. That threshold is approximate and not deterministic, but the failures always start around it. It seems RandInt() keeps returning duplicate values from that point on. https://github.com/dmlc/dgl/blob/6e1be69a84ba3e17e8e4db3c3768448f3620ecf4/src/random/cpu/choice.cc#L95-L99

To Reproduce

Steps to reproduce the behavior:

machine: x2idn.16xlarge, 1T CPU RAM, 64 CPUs.

import time

import dgl

num_nodes = 5 * 1000 * 1000
num_edges = 2 * 1000 * 1000 * 1000  # 2B edges hangs; 1B edges finishes in about 800 s
num_src_nodes = num_nodes // 2
num_dst_nodes = num_nodes - num_src_nodes
tic = time.time()
t_tic = tic
g = dgl.rand_bipartite('node1', 'edge', 'node2',
                       num_src_nodes, num_dst_nodes, num_edges)

Expected behavior

Generation time should scale roughly linearly from the 1B-edge case, or an error/exception should be thrown.

Environment

  • DGL Version (e.g., 1.0): master, 0.9.x
  • Backend Library & Version (e.g., PyTorch 0.4.1, MXNet/Gluon 1.3):
  • OS (e.g., Linux): Linux
  • How you installed DGL (conda, pip, source):
  • Build command you used (if compiling from source):
  • Python version:
  • CUDA/cuDNN version (if applicable):
  • GPU models and configuration (e.g. V100):
  • Any other relevant information:

Additional context

Rhett-Ying avatar Jul 25 '22 01:07 Rhett-Ying

Could you figure out what the population size and the number of samples are? When sampling without replacement we use rejection sampling, so if the number of samples is very large relative to the population, this will indeed take a long time.
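For readers unfamiliar with the technique, here is a minimal Python sketch of rejection sampling without replacement. It is not DGL's C++ implementation; the function name `sample_without_replacement` and its parameters are hypothetical and only illustrate the idea: draw uniformly, keep a value only if it has not been selected yet, and repeat until enough unique values are collected.

```python
import random


def sample_without_replacement(population: int, num_samples: int, seed: int = 0):
    """Rejection sampling: draw uniformly and keep only values not seen before.

    Illustrative sketch only (hypothetical helper, not DGL's API).
    Returns the set of selected values and the total number of draws made.
    """
    if num_samples > population:
        raise ValueError("cannot draw more unique samples than the population size")
    rng = random.Random(seed)
    selected = set()
    draws = 0
    while len(selected) < num_samples:
        draws += 1
        # Adding a value already in the set is a no-op, i.e. the draw is rejected.
        selected.add(rng.randrange(population))
    return selected, draws


# Sampling 90% of a small population forces many rejected draws.
selected, draws = sample_without_replacement(population=1000, num_samples=900)
print(len(selected), draws)
```

The loop only terminates if the generator keeps producing new values. If it stops doing so, the loop spins forever without raising, which matches the symptom reported in this issue.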

BarclayII avatar Jul 25 '22 05:07 BarclayII

population: 2500K * 2500K = 6250B; sample: 2B
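As a rough check of these numbers, the sketch below estimates how many uniform draws ideal rejection sampling would need. It assumes a perfectly uniform generator over the whole population and uses the continuous approximation N * ln(N / (N - k)) for the expected draw count. The helper name `expected_draws` is hypothetical.

```python
import math


def expected_draws(population: int, num_samples: int) -> float:
    """Approximate expected number of uniform draws needed to collect
    num_samples distinct values out of population values.

    Uses N * ln(N / (N - k)), the integral approximation of the sum
    N/N + N/(N-1) + ... + N/(N-k+1). Assumes an ideal uniform generator.
    """
    return -population * math.log1p(-num_samples / population)


population = 2_500_000 * 2_500_000  # 6250B candidate (src, dst) pairs
num_samples = 2_000_000_000         # 2B edges

overhead = expected_draws(population, num_samples) / num_samples
print(f"expected draws per accepted sample: {overhead:.6f}")

# Ratio of the reported stall point to 2**30, for comparison only.
stall_point = 1_073_500 * 1_000
print(f"stall point / 2**30: {stall_point / 2**30:.6f}")
```

Under these assumptions the sample is about 0.03% of the population, so duplicates should be rare and the 2B case should cost roughly twice the 1B case. The reported stall point is also close to 2**30 (1,073,741,824). Whether that is related to the hang is not established in this thread.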

Rhett-Ying avatar Jul 25 '22 06:07 Rhett-Ying

This issue has been automatically marked as stale due to lack of activity. It will be closed if no further activity occurs. Thank you

github-actions[bot] avatar Aug 26 '22 01:08 github-actions[bot]