tensorflow.python.framework.errors_impl.NotFoundError: libtensorflow_framework.so: cannot open shared object file: No such file or directory

Open yhCyan opened this issue 5 years ago • 4 comments

I ran the sparse attention code on a P100, but it fails on import.

System information

  • OS Platform and Distribution: ubuntu 18.04
  • TensorFlow installed from: pip install
  • TensorFlow version: 1.14.0(gpu)
  • Python version: 3.6.5
  • GCC: 7.4.0
  • CUDA version: 10.0
  • GPU model and memory: P100

Here's the traceback

Traceback (most recent call last):
  File "/home/zyh/sparse_attention-master/attention.py", line 4, in <module>
    from blocksparse import BlocksparseTransformer
  File "/root/anaconda3/lib/python3.6/site-packages/blocksparse/__init__.py", line 3, in <module>
    from blocksparse.utils import (
  File "/root/anaconda3/lib/python3.6/site-packages/blocksparse/utils.py", line 16, in <module>
    _op_module = tf.load_op_library(os.path.join(data_files_path, 'blocksparse_ops.so'))
  File "/root/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/load_library.py", line 61, in load_op_library
    lib_handle = py_tf.TF_LoadLibrary(library_filename)
tensorflow.python.framework.errors_impl.NotFoundError: libtensorflow_framework.so: cannot open shared object file: No such file or directory

I searched for libtensorflow_framework.so with find . -name libtensorflow_framework.so, but it does not exist. I did find libtensorflow_framework.so.1 in /root/anaconda3/lib/python3.6/site-packages/tensorflow/, so I copied it to libtensorflow_framework.so and appended its directory to LD_LIBRARY_PATH with export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:"path/to/your/libtensorflow". It still does not work.
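For anyone reproducing this, a minimal Python sketch of the same check: it asks the installed TensorFlow where its shared libraries live and lists whatever libtensorflow_framework* files are present there. The helper name find_framework_libs is made up for illustration, and the sketch assumes TensorFlow is importable in the environment being debugged.

```python
import glob
import os


def find_framework_libs(lib_dir):
    """Return sorted names of libtensorflow_framework* files in lib_dir.

    Hypothetical helper for illustration; not part of TensorFlow or blocksparse.
    """
    pattern = os.path.join(lib_dir, "libtensorflow_framework*")
    return sorted(os.path.basename(p) for p in glob.glob(pattern))


if __name__ == "__main__":
    try:
        import tensorflow as tf
    except ImportError:
        print("TensorFlow is not importable in this environment")
    else:
        # tf.sysconfig.get_lib() returns the directory holding TensorFlow's shared libraries.
        lib_dir = tf.sysconfig.get_lib()
        print("TensorFlow", tf.__version__, "library dir:", lib_dir)
        print("framework libraries found:", find_framework_libs(lib_dir))
        # Link flags that a custom op built against this install would use.
        print("link flags:", tf.sysconfig.get_link_flags())
```

Comparing the listed file names with the name in the error message shows whether the installed TensorFlow ships the exact file the prebuilt op is asking for.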

Please help. Thanks in advance

yhCyan avatar Aug 15 '19 14:08 yhCyan

I get the same problem with a very similar environment, except that my TensorFlow was installed from Conda.

andrewjungdg avatar Aug 21 '19 20:08 andrewjungdg

I'm fairly certain that you are using a tensorflow version that is too new. See here: https://github.com/tensorflow/tensorflow/issues/30175

Tensorflow changed the name of its shared object file in version 1.14. Maybe 1.13 will work for you, though. You can also install blocksparse from source to avoid this problem. Only Scott or someone from openai can comment on what version of tensorflow the wheel installed by pip was linked against.

My guess is that the SONAME of that file changed as well, which is why your copying may not be working. I am not sure, though, and won't debug this myself.
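One way to test that guess, sketched here in Python under the assumption that ldd is available on the machine: run ldd on the prebuilt blocksparse_ops.so and collect the dependencies it reports as "not found". The function names and the path in the usage comment are placeholders for illustration, not something taken from blocksparse.

```python
import subprocess


def missing_dependencies(ldd_output):
    """Return the library names that ldd reported as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        if "=> not found" in line:
            # A line looks like: "<tab>libfoo.so => not found"
            missing.append(line.split("=>")[0].strip())
    return missing


def check_library(path):
    """Run ldd on a shared object and return its unresolved dependencies."""
    result = subprocess.run(
        ["ldd", path],
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    return missing_dependencies(result.stdout)


# Usage (placeholder path; substitute the real location of the op library):
# print(check_library("/path/to/site-packages/blocksparse/blocksparse_ops.so"))
```

If the list contains libtensorflow_framework.so, the op was linked against that exact name, and the dynamic loader has to be able to resolve that name on its search path in the same shell that launches Python.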

galv avatar Aug 21 '19 20:08 galv

I installed tensorflow 1.13.1 and ran the code, but it fails again.

wasdfghjklr avatar Nov 09 '19 12:11 wasdfghjklr

I ran into the same problem. It turned out to be caused by using horovod 0.16 with tensorflow 1.14. After force-reinstalling horovod 0.15, the problem disappeared.

serser avatar Sep 11 '21 03:09 serser