ImportError: Cannot load Graphbolt C++ library

Open yfismine opened this issue 1 year ago • 8 comments

🐛 Bug

To Reproduce

Steps to reproduce the behavior:

conda install pytorch=2.3.0 cpuonly torchmetrics=1.4.0 -c pytorch -c conda-forge -y
conda install dgl=2.2.1 -c dglteam/label/th23_cpu -y
python -c "import dgl"

It works normally on x86, but fails with the following error on ARM.

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/root/miniconda3/lib/python3.12/site-packages/dgl/__init__.py", line 16, in <module>
    from . import (
  File "/root/miniconda3/lib/python3.12/site-packages/dgl/dataloading/__init__.py", line 13, in <module>
    from .dataloader import *
  File "/root/miniconda3/lib/python3.12/site-packages/dgl/dataloading/dataloader.py", line 27, in <module>
    from ..distributed import DistGraph
  File "/root/miniconda3/lib/python3.12/site-packages/dgl/distributed/__init__.py", line 5, in <module>
    from .dist_graph import DistGraph, DistGraphServer, edge_split, node_split
  File "/root/miniconda3/lib/python3.12/site-packages/dgl/distributed/dist_graph.py", line 11, in <module>
    from .. import backend as F, graphbolt as gb, heterograph_index
  File "/root/miniconda3/lib/python3.12/site-packages/dgl/graphbolt/__init__.py", line 36, in <module>
    load_graphbolt()
  File "/root/miniconda3/lib/python3.12/site-packages/dgl/graphbolt/__init__.py", line 33, in load_graphbolt
    raise ImportError("Cannot load Graphbolt C++ library")
ImportError: Cannot load Graphbolt C++ library

Environment

  • DGL Version (e.g., 1.0): 2.2.1 cpu
  • Backend Library & Version (e.g., PyTorch 0.4.1, MXNet/Gluon 1.3): 2.3.0
  • OS (e.g., Linux): arm centos8
  • How you installed DGL (conda, pip, source): conda
  • Python version: 3.12.3
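
Since the same steps work on x86 and fail on ARM, one thing worth ruling out is an architecture mismatch in the installed binaries. The sketch below is only a diagnostic idea, not something confirmed in this thread: it reads the ELF header of every .so file in the installed dgl package and reports the CPU architecture each was built for. The helper names elf_machine and scan_shared_libraries are made up for illustration, and the assumption that the Graphbolt library ships as a .so inside the package directory is mine.

```python
import importlib.util
import platform
import struct
from pathlib import Path

# A few e_machine codes from the ELF specification.
ELF_MACHINES = {3: "x86", 40: "arm", 62: "x86_64", 183: "aarch64"}


def elf_machine(path):
    """Return the architecture named in an ELF header, or None if not an ELF file."""
    with open(path, "rb") as f:
        header = f.read(20)
    if len(header) < 20 or header[:4] != b"\x7fELF":
        return None
    # Byte 5 (EI_DATA) gives the byte order: 1 = little endian, 2 = big endian.
    byte_order = "<" if header[5] == 1 else ">"
    # e_machine is a 16-bit field at offset 18.
    (machine,) = struct.unpack_from(byte_order + "H", header, 18)
    return ELF_MACHINES.get(machine, f"unknown({machine})")


def scan_shared_libraries(package_dir):
    """Map every .so file under package_dir to the architecture it targets."""
    return {str(p): elf_machine(p) for p in sorted(Path(package_dir).rglob("*.so"))}


if __name__ == "__main__":
    # find_spec locates the package on disk without importing it,
    # so this runs even when "import dgl" fails.
    spec = importlib.util.find_spec("dgl")
    if spec is None or not spec.submodule_search_locations:
        print("dgl is not installed in this environment")
    else:
        print("host architecture:", platform.machine())
        for location in spec.submodule_search_locations:
            for lib, arch in scan_shared_libraries(location).items():
                print(f"{arch}\t{lib}")
```

If a library reports a different architecture than the host, the package was not built for this machine, and building from source is the likely way forward.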

yfismine avatar Jul 31 '24 08:07 yfismine

I'm facing the same issue. It looks like a circular dependency: I get the error above with torch == 2.3.0 and dgl >= 2.1.0, or ModuleNotFoundError: No module named 'torch.utils._import_utils' with torch < 2.3.0 and dgl == 2.1.0.

MarkTraquair avatar Jul 31 '24 20:07 MarkTraquair

@Rhett-Ying Why don't we import GraphBolt inside the functions where it is used, so that normal DGL users can still import DGL? Alternatively, the distributed code should not be imported by a general DGL import.
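
A minimal sketch of the deferred-import pattern suggested here, not DGL's actual code. The helper lazy_module and the function distributed_feature are hypothetical names used only for illustration. The idea is that nothing is imported when the module is loaded; the optional dependency is imported only when a function that needs it is first called.

```python
import importlib
from functools import lru_cache


@lru_cache(maxsize=None)
def lazy_module(name):
    """Import `name` on first use and cache the module object.

    Failed imports are not cached, so a later call retries the import.
    """
    try:
        return importlib.import_module(name)
    except ImportError as exc:
        raise ImportError(
            f"Optional dependency {name!r} could not be loaded; "
            "features that rely on it are unavailable."
        ) from exc


def distributed_feature():
    """Hypothetical function that needs GraphBolt (illustration only)."""
    # The import happens here, at call time, not when this module is imported,
    # so users who never call this function never trigger the import.
    gb = lazy_module("dgl.graphbolt")
    return gb
```

With this layout, a broken or missing GraphBolt library would surface only when distributed_feature() is called, instead of making every import of the top-level package fail.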

mfbalin avatar Jul 31 '24 20:07 mfbalin

I see torchdata was updated 7 hrs ago, and that seems to be one of the root causes of the ModuleNotFoundError: No module named 'torch.utils._import_utils' issue, if I am understanding things correctly.

    from matgl.graph.converters import GraphConverter
/opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/matgl/graph/converters.py:7: in <module>
    import dgl
/opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/dgl/__init__.py:16: in <module>
    from . import (
/opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/dgl/dataloading/__init__.py:13: in <module>
    from .dataloader import *
/opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/dgl/dataloading/dataloader.py:27: in <module>
    from ..distributed import DistGraph
/opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/dgl/distributed/__init__.py:5: in <module>
    from .dist_graph import DistGraph, DistGraphServer, edge_split, node_split
/opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/dgl/distributed/dist_graph.py:11: in <module>
    from .. import backend as F, graphbolt as gb, heterograph_index
/opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/dgl/graphbolt/__init__.py:8: in <module>
    from .base import *
/opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/dgl/graphbolt/base.py:8: in <module>
    from torchdata.datapipes.iter import IterDataPipe
/opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/torchdata/datapipes/__init__.py:11: in <module>
    from . import iter, map, utils
/opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/torchdata/datapipes/iter/__init__.py:79: in <module>
    from torchdata.datapipes.iter.util.cacheholder import (
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

    # Copyright (c) Meta Platforms, Inc. and affiliates.
    # All rights reserved.
    #
    # This source code is licensed under the BSD-style license found in the
    # LICENSE file in the root directory of this source tree.
    
    import hashlib
    import inspect
    import os.path
    import sys
    import time
    import uuid
    import warnings
    
    from collections import deque
    from functools import partial
    from typing import Any, Callable, Deque, Dict, Iterator, List, Optional, Tuple, TypeVar
    
    try:
        import portalocker
    except ImportError:
        portalocker = None
    
>   from torch.utils._import_utils import dill_available
E   ModuleNotFoundError: No module named 'torch.utils._import_utils'

Andrew-S-Rosen avatar Jul 31 '24 22:07 Andrew-S-Rosen

I see torchdata was updated 7 hrs ago, and that seems to be one of the root causes of the ModuleNotFoundError: No module named 'torch.utils._import_utils' issue, if I am understanding things correctly.

Can you try using a torchdata version <0.8.0? If anyone verifies that resolves the issue, we can merge #7604.
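
To see whether an environment is affected before changing anything, a check along these lines may help. It is a sketch under the assumption suggested by the traceback above: torchdata 0.8.0 and later import torch.utils._import_utils, which older torch builds do not provide. The function names are illustrative, and the version parsing is deliberately simple.

```python
import importlib.util
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def has_module(name):
    """Report whether `name` can be located (parent packages get imported)."""
    try:
        return importlib.util.find_spec(name) is not None
    except ImportError:
        # Raised when a parent package is missing or fails to import.
        return False


def version_tuple(version):
    """Turn '0.8.0' or '0.7.1+cpu' into a tuple of leading integer parts."""
    parts = []
    for piece in version.split("+")[0].split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def torchdata_needs_pin(torchdata_version, torch_has_import_utils):
    """True when torchdata >= 0.8 meets a torch without torch.utils._import_utils."""
    return version_tuple(torchdata_version) >= (0, 8) and not torch_has_import_utils


if __name__ == "__main__":
    td = installed_version("torchdata")
    if td is None:
        print("torchdata is not installed")
    elif torchdata_needs_pin(td, has_module("torch.utils._import_utils")):
        print(f"torchdata {td} likely needs a pin below 0.8.0 for this torch")
    else:
        print(f"torchdata {td} looks compatible with the installed torch")
```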

mfbalin avatar Jul 31 '24 22:07 mfbalin

@Rhett-Ying Why don't we import GraphBolt inside the functions where it is used, so that normal DGL users can still import DGL? Alternatively, the distributed code should not be imported by a general DGL import.

Can you reproduce my issue? The problem occurs on ARM.

yfismine avatar Aug 01 '24 02:08 yfismine

@yfismine We've stopped providing prebuilt packages for ARM. Could you try building from source yourself to see whether this issue still exists?

Rhett-Ying avatar Aug 01 '24 02:08 Rhett-Ying

Although I ran into many strange compatibility problems, the version I compiled from source works normally on ARM. Thank you very much. @Rhett-Ying

yfismine avatar Aug 04 '24 10:08 yfismine

This issue has been automatically marked as stale due to lack of activity. It will be closed if no further activity occurs. Thank you

github-actions[bot] avatar Sep 04 '24 01:09 github-actions[bot]