update dgl to cuda 12.4 pytorch 2.4.x got error "FileNotFoundError: Cannot find DGL C++ sparse library at /opt/conda/envs/torch124/lib/python3.11/site-packages/dgl/dgl_sparse/libdgl_sparse_pytorch_2.4.0.post301.so"

Open NicksonCheng opened this issue 1 year ago • 9 comments

🐛 Bug

To Reproduce

Steps to reproduce the behavior:

Expected behavior

Environment

  • DGL Version (e.g., 1.0):
  • Backend Library & Version (e.g., PyTorch 0.4.1, MXNet/Gluon 1.3):
  • OS (e.g., Linux):
  • How you installed DGL (conda, pip, source):
  • Build command you used (if compiling from source):
  • Python version:
  • CUDA/cuDNN version (if applicable):
  • GPU models and configuration (e.g. V100):
  • Any other relevant information:

Additional context

After updating DGL for CUDA 12.4 and PyTorch 2.4.x, I get the error "FileNotFoundError: Cannot find DGL C++ sparse library at /opt/conda/envs/torch124/lib/python3.11/site-packages/dgl/dgl_sparse/libdgl_sparse_pytorch_2.4.0.post301.so"

NicksonCheng avatar Sep 09 '24 05:09 NicksonCheng

hey! i have the same problem. i'm working on a windows machine in colab in python 3.10.

i have relatively no issues installing it:

Installing collected packages: torchdata, dgl
Successfully installed dgl-2.1.0+cu121 torchdata-0.8.0

but when i execute the code i get this:

FileNotFoundError                         Traceback (most recent call last)
in <cell line: 1>()
----> 1 import dgl
      2 import networkx as nx
      3 import matplotlib.pyplot as plt
      4
      5 dgl.backend = 'pytorch'

6 frames
/usr/local/lib/python3.10/dist-packages/dgl/graphbolt/__init__.py in load_graphbolt()
     43     path = os.path.join(dirname, "graphbolt", basename)
     44     if not os.path.exists(path):
---> 45         raise FileNotFoundError(
     46             f"Cannot find DGL C++ graphbolt library at {path}"
     47         )

FileNotFoundError: Cannot find DGL C++ graphbolt library at /usr/local/lib/python3.10/dist-packages/dgl/graphbolt/libgraphbolt_pytorch_2.4.0.so

jansole avatar Sep 17 '24 10:09 jansole

I think something is wrong with pip wheels provided for this project. I am getting the same error, where the libgraphbolt_pytorch_2.4.1.so is not being found and indeed, when listing that directory, the library is missing. The documentation is misleading because it suggests that you can install this without any issues using pip and that is not the case.

I only got the Conda installation to work but this is quite frustrating when you're working in a Docker container, to not be able to install this via vanilla pip.

kjczarne avatar Sep 18 '24 21:09 kjczarne

In the Jenkinsfile the libraries that seem to be copied at build are defined as:

dgl_linux_libs = 'build/libdgl.so, build/runUnitTests, python/dgl/_ffi/_cy3/core.cpython-*-x86_64-linux-gnu.so, build/tensoradapter/pytorch/*.so, build/dgl_sparse/*.so, build/graphbolt/*.so'

I suspect then something is broken with graphbolt libs, perhaps only those for older versions of PyTorch are built when the workflow is triggered. Here are those that I have found in the installed dgl package:

libgraphbolt_pytorch_2.0.0.so
libgraphbolt_pytorch_2.0.1.so
libgraphbolt_pytorch_2.1.0.so
libgraphbolt_pytorch_2.1.1.so
libgraphbolt_pytorch_2.1.2.so
libgraphbolt_pytorch_2.2.0.so
libgraphbolt_pytorch_2.2.1.so
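
A quick way to confirm the mismatch, as a minimal sketch: the helper names below (`shipped_torch_versions`, `is_supported`) are hypothetical and not part of DGL. The sketch only assumes that the PyTorch version is encoded in the file name exactly as in the list above, and that a local build tag such as `+cu121` is not part of the name.

```python
import re

# File names as listed above (found in the installed dgl package).
SHIPPED = [
    "libgraphbolt_pytorch_2.0.0.so",
    "libgraphbolt_pytorch_2.0.1.so",
    "libgraphbolt_pytorch_2.1.0.so",
    "libgraphbolt_pytorch_2.1.1.so",
    "libgraphbolt_pytorch_2.1.2.so",
    "libgraphbolt_pytorch_2.2.0.so",
    "libgraphbolt_pytorch_2.2.1.so",
]

# Assumed naming scheme, inferred from the file names above.
_PATTERN = re.compile(r"^libgraphbolt_pytorch_(?P<version>.+)\.so$")


def shipped_torch_versions(filenames):
    """Return the PyTorch versions encoded in graphbolt library file names."""
    versions = []
    for name in filenames:
        match = _PATTERN.match(name)
        if match:
            versions.append(match.group("version"))
    return versions


def is_supported(torch_version, filenames):
    """Check whether a library built for this PyTorch version is present."""
    # Drop a local build tag such as "+cu121" before comparing (assumption).
    base = torch_version.split("+", 1)[0]
    return base in shipped_torch_versions(filenames)


print(shipped_torch_versions(SHIPPED))
print(is_supported("2.4.1", SHIPPED))
print(is_supported("2.2.1+cu121", SHIPPED))
```

On a real install, `SHIPPED` could be replaced with `os.listdir(...)` on the `dgl/graphbolt` directory named in the error message, and the first argument with `torch.__version__`.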

kjczarne avatar Sep 18 '24 21:09 kjczarne

The issue comes from the .post301 suffix, which is not expected. Please try re-installing the latest torch version; torch.__version__ should be 2.4.1 without any suffix.
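
To see why the suffix matters, here is a rough sketch of how the expected file name appears to be built. This is inferred from the error messages in this thread, not copied from DGL's source, and the function names (`expected_library_name`, `has_unexpected_suffix`) are hypothetical. It assumes the name is `lib<component>_pytorch_<version>.so`, with anything after a `+` in the version dropped.

```python
import re


def expected_library_name(torch_version, component="dgl_sparse"):
    """Build the library file name that would be looked up (assumed scheme)."""
    # Assumption: a local build tag such as "+cu124" is not part of the name.
    base = torch_version.split("+", 1)[0]
    return f"lib{component}_pytorch_{base}.so"


def has_unexpected_suffix(torch_version):
    """Return True if the version is not a plain MAJOR.MINOR.PATCH release."""
    base = torch_version.split("+", 1)[0]
    return re.fullmatch(r"\d+\.\d+\.\d+", base) is None


# The version reported in this issue carries a ".post301" suffix,
# so the name that gets looked up contains the suffix too.
print(expected_library_name("2.4.0.post301"))
print(has_unexpected_suffix("2.4.0.post301"))

# A plain release version yields a name without any suffix.
print(expected_library_name("2.4.1+cu124", component="graphbolt"))
print(has_unexpected_suffix("2.4.1+cu124"))
```

In a live environment, passing `torch.__version__` to `has_unexpected_suffix` would show whether the installed torch build is one of the suffixed ones.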

Rhett-Ying avatar Sep 25 '24 03:09 Rhett-Ying

How did you solve this? I ran into the same issue.

gaowayne avatar Oct 20 '24 13:10 gaowayne

This issue has been automatically marked as stale due to lack of activity. It will be closed if no further activity occurs. Thank you

github-actions[bot] avatar Nov 20 '24 01:11 github-actions[bot]

I am getting a similar error: "FileNotFoundError: Cannot find DGL C++ graphbolt library at /mxg-hpc/users/dpa13/miniforge3/envs/modulus/lib/python3.10/site-packages/dgl/graphbolt/libgraphbolt_pytorch_2.4.1.so". Are there any solutions to this? I tried the torch 2.4.0+cu124 version as well.

deepakpokkalla avatar Jan 09 '25 20:01 deepakpokkalla

I'm also getting the error. I have installed pytorch==2.2.1 and dgl=2.1.0. Does anyone have tips?

EdgarDevNat avatar Jan 23 '25 17:01 EdgarDevNat

In my experience, you can avoid this problem by picking a compatible combination of torch, CUDA, and DGL versions on this page and installing it according to the instructions there: https://www.dgl.ai/pages/start.html

whatismynamehe avatar Feb 04 '25 09:02 whatismynamehe