
DGL for CUDA 12.0

Open sumone-compbio opened this issue 1 year ago • 14 comments

Hi, the remote server of my institute has CUDA 12.0. Is it possible to use DGL on this server since there's no official release for cu120 yet?

sumone-compbio avatar Apr 26 '23 13:04 sumone-compbio

You can install CUDA in a conda environment and then install DGL. Refer to this for CUDA installation: https://anaconda.org/nvidia/cuda

k-styles avatar Apr 26 '23 18:04 k-styles

Hi @sumenties,

Currently, DGL does not officially support CUDA 12.0 because PyTorch does not support it. We will add support as soon as PyTorch does.

czkkkkkk avatar Apr 27 '23 04:04 czkkkkkk

@czkkkkkk Suppose I don't wish to use GPU (even CPU would be sufficient for me) would DGL still require CUDA? I'm using DeepChem and some of its functions require DGL.

sumone-compbio avatar May 03 '23 13:05 sumone-compbio

@k-styles I tried but there's some error while importing deepchem in a conda environment. I need DGL for some of the tasks that can only be done from deepchem.

sumone-compbio avatar May 03 '23 13:05 sumone-compbio

> Suppose I don't wish to use GPU (even CPU would be sufficient for me) would DGL still require CUDA? I'm using DeepChem and some of its functions require DGL.

Sure. You can use the CPU version of DGL in that case.

czkkkkkk avatar May 03 '23 14:05 czkkkkkk

@czkkkkkk I tried, this is the error I get:

DGLError: [15:53:54] /opt/dgl/src/runtime/c_runtime_api.cc:82: Check failed: allow_missing: Device API cuda is not enabled. Please install the cuda version of dgl.
Stack trace:
  [bt] (0) /home/sumit/.local/lib/python3.10/site-packages/dgl/libdgl.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x4f) [0x7f9cccf1570f]
  [bt] (1) /home/sumit/.local/lib/python3.10/site-packages/dgl/libdgl.so(dgl::runtime::DeviceAPIManager::GetAPI(std::string, bool)+0x37c) [0x7f9ccd1cc28c]
  [bt] (2) /home/sumit/.local/lib/python3.10/site-packages/dgl/libdgl.so(dgl::runtime::DeviceAPI::Get(DGLContext, bool)+0x1e3) [0x7f9ccd1c6863]
  [bt] (3) /home/sumit/.local/lib/python3.10/site-packages/dgl/libdgl.so(dgl::runtime::NDArray::Empty(std::vector<long, std::allocator<long> >, DGLDataType, DGLContext)+0x15b) [0x7f9ccd1e8dfb]
  [bt] (4) /home/sumit/.local/lib/python3.10/site-packages/dgl/libdgl.so(dgl::runtime::NDArray::CopyTo(DGLContext const&) const+0xc0) [0x7f9ccd225e80]
  [bt] (5) /home/sumit/.local/lib/python3.10/site-packages/dgl/libdgl.so(dgl::aten::COOMatrix::CopyTo(DGLContext const&) const+0x7d) [0x7f9ccd34a07d]
  [bt] (6) /home/sumit/.local/lib/python3.10/site-packages/dgl/libdgl.so(dgl::UnitGraph::CopyTo(std::shared_ptr<dgl::BaseHeteroGraph>, DGLContext const&)+0x2aa) [0x7f9ccd33942a]
  [bt] (7) /home/sumit/.local/lib/python3.10/site-packages/dgl/libdgl.so(dgl::HeteroGraph::CopyTo(std::shared_ptr<dgl::BaseHeteroGraph>, DGLContext const&)+0xf5) [0x7f9ccd238025]
  [bt] (8) /home/sumit/.local/lib/python3.10/site-packages/dgl/libdgl.so(+0x446f3b) [0x7f9ccd246f3b]

sumone-compbio avatar May 04 '23 10:05 sumone-compbio

How are you calling copy_to? With the CPU version of DGL, a DGLGraph cannot be copied to the GPU.

czkkkkkk avatar May 11 '23 03:05 czkkkkkk

@czkkkkkk Sorry, I did not understand what you meant by that. I installed the CPU version of DGL and DGLGO using the commands:

pip install dgl -f https://data.dgl.ai/wheels/repo.html
pip install dglgo -f https://data.dgl.ai/wheels-test/repo.html

Later, I installed dgllife using: pip install dgllife

Other dependencies, e.g. PyTorch, Apache MXNet, and TensorFlow, are already installed on my system. I didn't get any warnings while importing any of these libraries, but when I run my GCN model through deepchem:

model = GCNModel(mode='classification', n_tasks=1, batch_size=16, learning_rate=0.001)
loss = model.fit(dataset, nb_epoch=100)

It throws the error I showed you above.

sumone-compbio avatar May 11 '23 08:05 sumone-compbio

The error may occur because you are calling DGL functions on GPU PyTorch tensors. Could you check whether all the tensors you use are on the CPU?
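
One way to run that check is to look at the `.device` attribute, which both PyTorch tensors and DGL graphs expose. A minimal sketch follows; `non_cpu_entries` is a hypothetical helper written for this thread, not part of DGL or DeepChem, and the stand-in objects are there only so the snippet runs without PyTorch installed.

```python
from types import SimpleNamespace


def non_cpu_entries(named_objects):
    """Return the names of objects whose `.device.type` is not 'cpu'.

    Hypothetical helper: it only relies on the `.device.type` attribute,
    so objects without a `.device` are skipped rather than reported.
    """
    offenders = []
    for name, obj in named_objects.items():
        device = getattr(obj, "device", None)
        device_type = getattr(device, "type", None)
        if device_type is not None and device_type != "cpu":
            offenders.append(name)
    return offenders


# Stand-ins that mimic the `.device.type` attribute of real tensors.
stand_ins = {
    "weights": SimpleNamespace(device=SimpleNamespace(type="cpu")),
    "bias": SimpleNamespace(device=SimpleNamespace(type="cuda")),
}
print(non_cpu_entries(stand_ins))  # prints ['bias']
```

With real objects, you could pass something like `dict(module.named_parameters())` for a `torch.nn.Module`, or `{"graph": g}` for a DGLGraph `g`.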

czkkkkkk avatar May 11 '23 09:05 czkkkkkk

No, they are not. How do I set them all to CPU?

sumone-compbio avatar May 11 '23 09:05 sumone-compbio

For example, you can avoid calling .to('cuda') on any tensor.
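
If the move to the GPU happens inside a library rather than in your own code, another option is to hide the GPU from the process altogether. The sketch below uses the standard `CUDA_VISIBLE_DEVICES` environment variable. It assumes the library only selects a GPU when PyTorch reports one, which is an assumption here and not something confirmed for DeepChem.

```python
import os

# Hide all GPUs from CUDA-aware libraries. Set this before importing torch
# (or anything that imports it, such as deepchem) so it takes effect.
os.environ["CUDA_VISIBLE_DEVICES"] = ""

try:
    import torch
except ImportError:
    torch = None  # PyTorch is not installed in this environment

if torch is not None:
    # With no visible devices, PyTorch should report that CUDA is unavailable.
    print("CUDA available to PyTorch:", torch.cuda.is_available())
```

The same effect can be had from the shell by prefixing the command that launches your script with `CUDA_VISIBLE_DEVICES=""`.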

czkkkkkk avatar May 18 '23 02:05 czkkkkkk

This issue has been automatically marked as stale due to lack of activity. It will be closed if no further activity occurs. Thank you

github-actions[bot] avatar Jun 18 '23 01:06 github-actions[bot]

Waiting for a DGL release that supports CUDA 12.0!

Claudia-Hello avatar Feb 04 '24 07:02 Claudia-Hello

@Claudia-Hello Please refer to https://www.dgl.ai/pages/start.html for DGL builds that support CUDA 12.1.

Rhett-Ying avatar Feb 04 '24 07:02 Rhett-Ying