
Setting a large tensor as UnifiedTensor raises an error

Open zqj2333 opened this issue 2 years ago • 2 comments

Bug

When I try to wrap a large tensor in a UnifiedTensor, an error is raised; a small tensor works without error.

code:

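import torch
import dgl

a = torch.randn(4847571, 256)        # large tensor in host (CPU) memory
a = a.share_memory_()                # move its storage to shared memory
a = dgl.contrib.UnifiedTensor(a, 0)  # wrap for device 0; this call raises the error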
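For scale, the arithmetic behind the tensor above can be sketched as follows. This assumes the default float32 dtype of torch.randn (4 bytes per element); the helper name tensor_nbytes is hypothetical and not part of DGL or PyTorch. Under that assumption, the pin_memory_ call shown in the traceback has to pin roughly 4.6 GiB of host memory.

```python
# Size of the tensor from the report, assuming float32 elements (4 bytes each).
ROWS, COLS = 4847571, 256
BYTES_PER_FLOAT32 = 4


def tensor_nbytes(rows: int, cols: int, itemsize: int = BYTES_PER_FLOAT32) -> int:
    """Return the number of bytes occupied by a dense rows x cols tensor."""
    return rows * cols * itemsize


nbytes = tensor_nbytes(ROWS, COLS)
print(f"{nbytes} bytes = {nbytes / 2**30:.2f} GiB")
```

Whether the size alone explains the failure is not established in this thread; the figure is only meant to make "large" concrete.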

error:

Traceback (most recent call last):
  File "unifiedtensor.py", line 6, in <module>
    a=dgl.contrib.UnifiedTensor(a,0)
  File "/root/anaconda3/envs/dgl/lib/python3.8/site-packages/dgl/contrib/unified_tensor.py", line 78, in __init__
    self._array.pin_memory_()
  File "/root/anaconda3/envs/dgl/lib/python3.8/site-packages/dgl/_ffi/ndarray.py", line 322, in pin_memory_
    check_call(_LIB.DGLArrayPinData(self.handle))
  File "/root/anaconda3/envs/dgl/lib/python3.8/site-packages/dgl/_ffi/base.py", line 65, in check_call
    raise DGLError(py_str(_LIB.DGLGetLastError()))
dgl._ffi.base.DGLError: [11:59:20] /opt/dgl/src/runtime/cuda/cuda_device_api.cc:183: Check failed: e == cudaSuccess || e == cudaErrorCudartUnloading: CUDA: OS call failed or operation not supported on this OS
Stack trace:
  [bt] (0) /root/anaconda3/envs/dgl/lib/python3.8/site-packages/dgl/libdgl.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x4f) [0x7f46a73d0eaf]
  [bt] (1) /root/anaconda3/envs/dgl/lib/python3.8/site-packages/dgl/libdgl.so(dgl::runtime::CUDADeviceAPI::PinData(void*, unsigned long)+0xb4) [0x7f46a78a6814]
  [bt] (2) /root/anaconda3/envs/dgl/lib/python3.8/site-packages/dgl/libdgl.so(dgl::runtime::NDArray::PinData(DLTensor*)+0x16f) [0x7f46a771b58f]
  [bt] (3) /root/anaconda3/envs/dgl/lib/python3.8/site-packages/dgl/libdgl.so(DGLArrayPinData+0x6) [0x7f46a771b606]
  [bt] (4) /root/anaconda3/envs/dgl/lib/python3.8/lib-dynload/../../libffi.so.7(+0x69dd) [0x7f48671ba9dd]
  [bt] (5) /root/anaconda3/envs/dgl/lib/python3.8/lib-dynload/../../libffi.so.7(+0x6067) [0x7f48671ba067]
  [bt] (6) /root/anaconda3/envs/dgl/lib/python3.8/lib-dynload/_ctypes.cpython-38-x86_64-linux-gnu.so(_ctypes_callproc+0x319) [0x7f48662bf1e9]
  [bt] (7) /root/anaconda3/envs/dgl/lib/python3.8/lib-dynload/_ctypes.cpython-38-x86_64-linux-gnu.so(+0x13c95) [0x7f48662bfc95]
  [bt] (8) python(_PyObject_MakeTpCall+0x3bf) [0x563c08dfb13f]

Environment

  • DGL Version : 0.8.2+cu111
  • Backend Library & Version : PyTorch 1.10.1+cu111
  • OS (e.g., Linux):
  • How you installed DGL : pip
  • Python version: 3.8
  • CUDA/cuDNN version : 11.2
  • GPU models and configuration : A100

zqj2333 · Jul 09 '22 12:07

Cannot reproduce. Can you share more env information, e.g., OS, RAM, etc.?
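A minimal sketch for collecting the host details requested here, using only the Python standard library. The helper name host_info is hypothetical, and the os.sysconf lookups are an assumption that holds on Linux and most POSIX systems but not everywhere, so the RAM figure falls back to None when they are unavailable.

```python
import os
import platform


def host_info() -> dict:
    """Collect basic host details: OS name, kernel release, Python version, total RAM."""
    info = {
        "os": platform.system(),
        "release": platform.release(),
        "python": platform.python_version(),
    }
    try:
        # Total physical memory = page size * number of physical pages (POSIX only).
        total_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
        info["ram_gib"] = round(total_bytes / 2**30, 2)
    except (AttributeError, ValueError, OSError):
        info["ram_gib"] = None  # sysconf names not supported on this platform
    return info


for key, value in host_info().items():
    print(f"{key}: {value}")
```

The printed lines can be pasted into the Environment section of the report alongside the GPU and driver details.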

yaox12 · Jul 11 '22 01:07

This issue has been automatically marked as stale due to lack of activity. It will be closed if no further activity occurs. Thank you.

github-actions[bot] · Aug 11 '22 01:08

This issue is closed due to lack of activity. Feel free to reopen it if you still have questions.

github-actions[bot] · Aug 18 '22 01:08