
[DistGB] `dispatch_data.py` with `--use-graphbolt` crashed

Open Rhett-Ying opened this issue 4 months ago • 0 comments

🐛 Bug

To Reproduce

Steps to reproduce the behavior: run `dispatch_data.py` with `--use-graphbolt`. The run fails with the following output:

Warning: Permanently added '[127.0.0.1]:2222' (ED25519) to the list of known hosts.
[5ea156f179a8 INFO 2024-10-03 23:29:47,836 PID:2471] [Rank: 0] Done with process group initialization...
[5ea156f179a8 INFO 2024-10-03 23:29:47,837 PID:2471] [Rank: 0] Starting distributed data processing pipeline...
[5ea156f179a8 INFO 2024-10-03 23:29:47,841 PID:2471] [Rank: 0] Initialized metis partitions and node_types map...
[5ea156f179a8 INFO 2024-10-03 23:29:47,862 PID:2471] [Rank: 0] Done reading dataset /data/ml-100k
[5ea156f179a8 INFO 2024-10-03 23:29:47,862 PID:2471] [Rank: 0] Done augmenting file input data with auxilary columns
[5ea156f179a8 INFO 2024-10-03 23:29:47,895 PID:2471] [Rank: 0] Total time for feature exchange: 0:00:00.032553
[5ea156f179a8 INFO 2024-10-03 23:29:47,895 PID:2471] [Rank: 0] Total time for feature exchange: 0:00:00.000001
[5ea156f179a8 INFO 2024-10-03 23:29:47,931 PID:2471] [Rank: 0] Time to send/rcv edge data: 0:00:00.032726
[5ea156f179a8 INFO 2024-10-03 23:29:48,497 PID:2471] There are 48071 edges in partition 0
[Rank: 0 Edge data is already sorted !!!
[rank0]: Traceback (most recent call last):
[rank0]:   File "/root/dgl/tools/distpartitioning/data_proc_pipeline.py", line 134, in <module>
[rank0]:     multi_machine_run(params)
[rank0]:   File "/root/dgl/tools/distpartitioning/data_shuffle.py", line 1499, in multi_machine_run
[rank0]:     gen_dist_partitions(rank, params.world_size, params)
[rank0]:   File "/root/dgl/tools/distpartitioning/data_shuffle.py", line 1327, in gen_dist_partitions
[rank0]:     ) = create_graph_object(
[rank0]:   File "/root/dgl/tools/distpartitioning/convert_partition.py", line 714, in create_graph_object
[rank0]:     indptr, indices, csc_edge_ids = _process_partition_gb(
[rank0]:   File "/root/dgl/tools/distpartitioning/convert_partition.py", line 355, in _process_partition_gb
[rank0]:     return indptr, indices[sorted_idx], edge_ids[sorted_idx]
[rank0]: UnboundLocalError: local variable 'sorted_idx' referenced before assignment
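
The log prints "Edge data is already sorted" just before the crash, which suggests (as an assumption, not confirmed against the source of `_process_partition_gb`) that `sorted_idx` is only assigned on the code path that actually sorts the edges. The following is a minimal, self-contained sketch of that failure pattern and one possible guard. The function names `reorder_buggy` and `reorder_fixed` are hypothetical and plain Python lists stand in for the real tensors; this is not the DGL implementation.

```python
def is_sorted(values):
    """Return True if values are in non-decreasing order."""
    return all(a <= b for a, b in zip(values, values[1:]))


def reorder_buggy(dst_ids, edge_ids):
    # Hypothetical stand-in for the suspected pattern:
    # sorted_idx is assigned only when the input needs sorting.
    if not is_sorted(dst_ids):
        sorted_idx = sorted(range(len(dst_ids)), key=dst_ids.__getitem__)
    # When dst_ids is already sorted, sorted_idx was never bound,
    # so this line raises UnboundLocalError.
    return [edge_ids[i] for i in sorted_idx]


def reorder_fixed(dst_ids, edge_ids):
    # One possible guard: fall back to the identity permutation
    # when the input is already sorted.
    if is_sorted(dst_ids):
        sorted_idx = list(range(len(dst_ids)))
    else:
        sorted_idx = sorted(range(len(dst_ids)), key=dst_ids.__getitem__)
    return [edge_ids[i] for i in sorted_idx]
```

With already-sorted input such as `reorder_buggy([0, 1, 2], [10, 11, 12])`, the buggy variant raises `UnboundLocalError`, while `reorder_fixed` returns the edge IDs unchanged. An alternative guard would be to return the arrays without indexing at all on the already-sorted path, which avoids an unnecessary copy.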

Expected behavior

Environment

  • DGL Version (e.g., 1.0): latest nightly build
  • Backend Library & Version (e.g., PyTorch 0.4.1, MXNet/Gluon 1.3):
  • OS (e.g., Linux):
  • How you installed DGL (conda, pip, source):
  • Build command you used (if compiling from source):
  • Python version:
  • CUDA/cuDNN version (if applicable):
  • GPU models and configuration (e.g. V100):
  • Any other relevant information:

Additional context

Rhett-Ying, Oct 05 '24 00:10