
libcudart.so.9.0: cannot open shared object file: No such file or directory

Open mmxmb opened this issue 7 years ago • 5 comments

I am trying to recreate the example and get the following error when running from blocksparse.matmul import BlocksparseMatMul:

---------------------------------------------------------------------------
NotFoundError                             Traceback (most recent call last)
<ipython-input-1-1dea216b89b0> in <module>()
----> 1 from blocksparse.matmul import BlocksparseMatMul
      2 import tensorflow as tf
      3 import numpy as np

~/anaconda/lib/python3.6/site-packages/blocksparse/matmul.py in <module>()
     11 from tensorflow.python.framework import ops
     12 from tensorflow.python.ops.init_ops import Initializer
---> 13 import blocksparse.ewops as ew
     14 
     15 data_files_path = tf.resource_loader.get_data_files_path()

~/anaconda/lib/python3.6/site-packages/blocksparse/ewops.py in <module>()
     15 
     16 data_files_path = tf.resource_loader.get_data_files_path()
---> 17 _op_module = tf.load_op_library(os.path.join(data_files_path, 'blocksparse_ops.so'))
     18 # for x in dir(_op_module):
     19 #     print(x)

~/anaconda/lib/python3.6/site-packages/tensorflow/python/framework/load_library.py in load_op_library(library_filename):
     54   """
     55   with errors_impl.raise_exception_on_not_ok_status() as status:
---> 56     lib_handle = py_tf.TF_LoadLibrary(library_filename, status)
     57 
     58   op_list_str = py_tf.TF_GetOpList(lib_handle)

~/anaconda/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py in __exit__(self, type_arg, value_arg, traceback_arg)
    471             None, None,
    472             compat.as_text(c_api.TF_Message(self.status.status)),
--> 473             c_api.TF_GetCode(self.status.status))
    474     # Delete the underlying status object from memory otherwise it stays alive
    475     # as there is a reference to status from this from the traceback due to

NotFoundError: libcudart.so.9.0: cannot open shared object file: No such file or directory

I believe I have all the prerequisites:

  • Python 3.6.2
  • CUDA Version 8.0.61 (from /usr/local/cuda/version.txt)
  • tensorflow-gpu (1.4.1)
  • Ubuntu 16.04

I am running an AWS p2.xlarge instance; it uses a single Kepler GPU (K80).

Edit:

I tried this again on another instance that uses the Maxwell architecture, since it is recommended (GPU+ at paperspace.com).

Apart from the different GPU, the only other difference on that instance is Python 3.6.3.

I still get the same error.

mmxmb, Dec 16 '17 20:12
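
The last frame of the traceback shows that the prebuilt blocksparse_ops.so asks the dynamic loader for libcudart.so.9.0, while the environment listed above has CUDA 8.0.61 installed. A minimal sketch for checking which CUDA runtime versions the loader can actually find is below. The helper names can_load and probe_cudart are hypothetical and not part of blocksparse or TensorFlow; the sketch only relies on ctypes.CDLL raising OSError when a shared library cannot be opened.

```python
import ctypes


def can_load(soname):
    """Return True if the dynamic loader can open the shared library `soname`."""
    try:
        ctypes.CDLL(soname)
        return True
    except OSError:
        # Raised when the library is missing or cannot be loaded.
        return False


def probe_cudart(versions=("8.0", "9.0")):
    """Map each CUDA runtime version to whether libcudart.so.<version> loads.

    The default versions are assumptions taken from this thread:
    8.0 is what the reporter has installed, 9.0 is what the error asks for.
    """
    return {v: can_load("libcudart.so." + v) for v in versions}
```

Calling probe_cudart() on a machine that only has CUDA 8.0 on the loader path would be expected to report 9.0 as not loadable, which would be consistent with the NotFoundError in the report.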

Same issue. TensorFlow on its own works. Running:

  • Ubuntu 16.04
  • Python 3.5.4
  • CUDA 8.0.44
  • cudnn 6.0
  • tf 1.4.1
  • GPU: 1080ti

simicvm, Dec 21 '17 08:12

Next time I should read before asking. I solved the issue by following the instructions in the Development section.

simicvm, Dec 21 '17 08:12

Works for me as well after following the instructions in the Development section. The issue can be closed.

dchichkov, Jan 23 '18 23:01

@simama @dchichkov I am following the instructions from the Installation section. Building from source is usually required only when you want to modify the source, which is not what I want to do. I understand that I can use the library by following the Development section, but in that case the Installation section in the README should be fixed.

mmxmb, Jan 24 '18 00:01

Same issue. CUDA 8.0 doesn't match GCC 5.4. You can try installing GCC 4.

yiwusuorao, Nov 08 '18 13:11
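
The later comments point at building from source, where the host compiler and the CUDA toolkit have to agree with each other. A small sketch for recording which gcc and nvcc are on PATH before attempting a build is below. The helper name tool_version is hypothetical, and the sketch assumes both tools accept a --version flag; it does not decide whether a given pair of versions is compatible.

```python
import shutil
import subprocess


def tool_version(tool):
    """Return the `--version` output of `tool`, or None if it is not on PATH."""
    path = shutil.which(tool)
    if path is None:
        return None
    result = subprocess.run(
        [path, "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,  # decode output as text (works on Python 3.5+)
    )
    return result.stdout.strip()


def build_tool_report(tools=("gcc", "nvcc")):
    """Collect version strings for the compilers involved in a source build."""
    return {tool: tool_version(tool) for tool in tools}
```

Printing build_tool_report() alongside the contents of /usr/local/cuda/version.txt gives the version details that the posters in this thread listed by hand.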