libiomp5.so already initialized
I am getting this error when I try to run ODL with a TensorFlow model:
OMP: Error #15: Initializing libiomp5.so, but found libiomp5.so already initialized.
OMP: Hint This means that multiple copies of the OpenMP runtime have been linked into the program. That is dangerous, since it can degrade performance or cause incorrect results. The best thing to do is to ensure that only a single OpenMP runtime is linked into the process, e.g. by avoiding static linking of the OpenMP runtime in any library. As an unsafe, unsupported, undocumented workaround you can set the environment variable KMP_DUPLICATE_LIB_OK=TRUE to allow the program to continue to execute, but that may cause crashes or silently produce incorrect results. For more information, please see http://www.intel.com/software/products/support/.
This occurs when I am using TensorFlow with Intel MKL-DNN (which is the default on the Anaconda repository).
I've spoken to the Intel TensorFlow team and they think that ODL might be trying to access a second libiomp5.so in the environment that isn't being used by TensorFlow MKL-DNN or is somehow blocking it.
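For anyone who only wants to confirm that the duplicate-runtime check is what aborts the script, the workaround named in the error message can be sketched as below. This is a diagnostic aid under the assumption that the variable is read when the OpenMP runtime is first loaded, so it has to be set before the imports; Intel's own message describes it as unsafe and unsupported, so it is not a fix.

```python
import os

# Assumption: the OpenMP runtime reads this variable when it is first loaded,
# so it must be set before importing anything that links against libiomp5.so.
# The error message itself calls this unsafe, unsupported and undocumented.
os.environ["KMP_DUPLICATE_LIB_OK"] = "TRUE"

# The imports that may pull in an OpenMP runtime would follow here, e.g.:
# import tensorflow as tf
# import odl
```

If the script then runs to completion, that would suggest two copies of the runtime really are being loaded; the results should still not be trusted.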
Would there be anyone willing to work with the Intel team to resolve the conflict? I can help with the introduction.
Thanks! Best, -Tony
Hello!
If you are using ASTRA, the conflict is most likely with ASTRA rather than with ODL itself, since ASTRA calls OpenMP under the hood. Could you try calling ASTRA without ODL?
Otherwise we'd need a more extensive example. Of course we want to solve this.
Thanks so much. Yes, I tried running without ASTRA as well and got the same result.
I've got a very simple example:
import tensorflow as tf
import odl
import odl.contrib.tensorflow
sess = tf.Session()
size_x = 1024 # It will work for smaller numbers like 128
size_y = 1024 # It will work for smaller numbers like 128
upsampling = [1, 1]
space = odl.uniform_discr([-int(size_x/2), -int(size_y/2)], [int(size_x/2), int(size_y/2)], [size_x, size_y], dtype='float32')
angle_partition = odl.uniform_partition(0, 3.1415, 90)
detector_partition = odl.uniform_partition(-int(size_x/2), int(size_y/2), size_x)
geometry = odl.tomo.Parallel2dGeometry(angle_partition, detector_partition)
print("Ok here.")
operator = odl.tomo.RayTransform(space, geometry)
print("Fails on next line")
pseudoinverse = odl.tomo.fbp_op(operator)
print("Won't get to this line.")
For the conda environment:
conda create -n bug -y -c anaconda pip python=3.6 tensorflow scikit-image
conda activate bug
conda install -c odlgroup odl
Hi @tonyreina! I ran your minimal example and unfortunately I can't reproduce your error. On my machine (Linux), everything runs without problems.
On which platform are you working?
Several of our (optional) dependencies likely use OpenMP, but ODL does not load it explicitly, so there is not much we can do about this issue on our side.
Is the error you observe raised by TensorFlow? The error message says that "multiple copies of the OpenMP runtime have been linked into the program". To me that looks more like an issue with the compiled package (TensorFlow or whichever package raised the error).
I can't reproduce this, either. My output:
2019-01-29 20:38:15.153319: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2019-01-29 20:38:15.162335: I tensorflow/core/common_runtime/process_util.cc:69] Creating new thread pool with default inter op setting: 2. Tune using inter_op_parallelism_threads for best performance.
Ok here.
/home/banert/miniconda3/envs/bug/lib/python3.6/site-packages/odl/tomo/operators/ray_trafo.py:144: RuntimeWarning: The best available backend ('skimage') may be too slow for volumes of this size. Consider using ASTRA. This warning can be disabled by explicitly setting `impl='skimage'`.
RuntimeWarning)
Fails on next line
Won't get to this line.
Thanks. Yes. I'm using the pre-compiled TensorFlow package from Anaconda. So to install TensorFlow I am doing:
conda install -c anaconda tensorflow
I think the Intel MKL-DNN library used by that version of TensorFlow also links to libiomp5.so.
Best. -Tony
If I use pip install tensorflow instead, I get the non-MKL-DNN version of TensorFlow and everything works. However, the MKL-DNN build is significantly faster than the non-MKL-DNN one (for regular TensorFlow models).
Running conda install -c anaconda tensorflow gives the result
Solving environment: done
# All requested packages already installed.
I still can't reproduce the error, working with the following packages:
# packages in environment at /home/banert/miniconda3/envs/bug:
#
# Name Version Build Channel
_tflow_select 2.3.0 mkl anaconda
absl-py 0.7.0 py36_0 anaconda
astor 0.7.1 py36_0 anaconda
blas 1.0 mkl anaconda
c-ares 1.15.0 h7b6447c_1 anaconda
ca-certificates 2018.12.5 0 anaconda
certifi 2018.11.29 py36_0 anaconda
cloudpickle 0.6.1 py36_0 anaconda
cycler 0.10.0 py36_0 anaconda
cytoolz 0.9.0.1 py36h14c3975_1 anaconda
dask-core 1.0.0 py36_0 anaconda
dbus 1.13.6 h746ee38_0 anaconda
decorator 4.3.0 py36_0 anaconda
expat 2.2.6 he6710b0_0 anaconda
fontconfig 2.13.0 h9420a91_0 anaconda
freetype 2.9.1 h8a8886c_1 anaconda
future 0.17.1 py36_0
gast 0.2.2 py36_0 anaconda
glib 2.56.2 hd408876_0 anaconda
grpcio 1.16.1 py36hf8bcb03_1 anaconda
gst-plugins-base 1.14.0 hbbd80ab_1 anaconda
gstreamer 1.14.0 hb453b48_1 anaconda
h5py 2.9.0 py36h7918eee_0 anaconda
hdf5 1.10.4 hb1b8bf9_0 anaconda
icu 58.2 h211956c_0 anaconda
imageio 2.4.1 py36_0 anaconda
intel-openmp 2019.1 144 anaconda
jpeg 9b habf39ab_1 anaconda
keras-applications 1.0.6 py36_0 anaconda
keras-preprocessing 1.0.5 py36_0 anaconda
kiwisolver 1.0.1 py36hf484d3e_0 anaconda
libedit 3.1.20181209 hc058e9b_0 anaconda
libffi 3.2.1 h4deb6c0_3 anaconda
libgcc-ng 8.2.0 hdf63c60_1 anaconda
libgfortran-ng 7.3.0 hdf63c60_0 anaconda
libpng 1.6.36 hbc83047_0 anaconda
libprotobuf 3.6.1 hd408876_0 anaconda
libstdcxx-ng 8.2.0 hdf63c60_1 anaconda
libtiff 4.0.10 h2733197_1001 anaconda
libuuid 1.0.3 h1bed415_2 anaconda
libxcb 1.13 h1bed415_1 anaconda
libxml2 2.9.9 he19cac6_0 anaconda
markdown 3.0.1 py36_0 anaconda
matplotlib 3.0.2 py36h5429711_0 anaconda
mkl 2019.1 144 anaconda
mkl_fft 1.0.10 py36ha843d7b_0 anaconda
mkl_random 1.0.2 py36hd81dba3_0 anaconda
ncurses 6.1 he6710b0_1 anaconda
networkx 2.2 py36_1 anaconda
numpy 1.15.4 py36h7e9f1db_0 anaconda
numpy-base 1.15.4 py36hde5b4d6_0 anaconda
odl 0.7.0 py36_0 odlgroup
olefile 0.46 py36_0 anaconda
openssl 1.1.1 h7b6447c_0 anaconda
packaging 18.0 py36_0
pcre 8.42 h439df22_0 anaconda
pillow 5.4.1 py36h34e0f95_0 anaconda
pip 18.1 py36_0 anaconda
protobuf 3.6.1 py36he6710b0_0 anaconda
pyparsing 2.3.1 py36_0 anaconda
pyqt 5.9.2 py36h22d08a2_1 anaconda
python 3.6.8 h0371630_0 anaconda
python-dateutil 2.7.5 py36_0 anaconda
pytz 2018.9 py36_0 anaconda
pywavelets 1.0.1 py36hdd07704_0 anaconda
qt 5.9.7 h5867ecd_1 anaconda
readline 7.0 h7b6447c_5 anaconda
scikit-image 0.14.1 py36he6710b0_0 anaconda
scipy 1.2.0 py36h7c811a0_0 anaconda
setuptools 40.6.3 py36_0 anaconda
sip 4.19.13 py36he6710b0_0 anaconda
six 1.12.0 py36_0 anaconda
sqlite 3.26.0 h7b6447c_0 anaconda
tensorboard 1.12.2 py36he6710b0_0 anaconda
tensorflow 1.12.0 mkl_py36h69b6ba0_0 anaconda
tensorflow-base 1.12.0 mkl_py36h3c3e929_0 anaconda
termcolor 1.1.0 py36_1 anaconda
tk 8.6.8 hbc83047_0 anaconda
toolz 0.9.0 py36_0 anaconda
tornado 5.1.1 py36h7b6447c_0 anaconda
werkzeug 0.14.1 py36_0 anaconda
wheel 0.32.3 py36_0 anaconda
xz 5.2.4 h14c3975_4 anaconda
zlib 1.2.11 h7b6447c_3 anaconda
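A listing this long is easier to compare between machines once it is filtered down to the rows that can bring in an OpenMP runtime. The following is a minimal sketch that parses the text printed by conda list; the function name and the set of package names to watch are assumptions chosen for illustration, not an exhaustive list.

```python
# Hypothetical watch list: package names that commonly ship or depend on an
# OpenMP runtime. This set is an assumption; extend it as needed.
OPENMP_RELATED = {"intel-openmp", "llvm-openmp", "libgomp", "mkl"}


def openmp_related_packages(conda_list_text):
    """Return (name, version, build, channel) rows worth a second look."""
    rows = []
    for line in conda_list_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the comment header
        fields = line.split()
        if len(fields) < 3:
            continue  # not a package row
        name, version, build = fields[:3]
        channel = fields[3] if len(fields) > 3 else ""  # channel may be absent
        # Flag watched names and any build string marked as an MKL build.
        if name in OPENMP_RELATED or build.startswith("mkl"):
            rows.append((name, version, build, channel))
    return rows
```

Running this on the output from both machines and diffing the results would show whether the two environments differ in intel-openmp, mkl, or the TensorFlow build.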
@tonyreina Thanks for clarifying. I used exactly the instructions you posted to reproduce the error, and I couldn't observe it on my system.
Maybe you have a stray OpenMP library on your LD_LIBRARY_PATH that gets discovered first? If you run ldconfig -v | grep -B 5 libiomp5.so, do you get multiple hits?
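The same question can also be asked from inside the Python process, which shows what actually got loaded rather than what the linker cache could find. The sketch below is Linux only and assumes that scanning /proc/self/maps for library names starting with libiomp, libgomp, or libomp is a good enough filter; the function names are made up for this example.

```python
import os

# Assumed name prefixes of common OpenMP runtimes (Intel, GNU, LLVM).
OPENMP_PREFIXES = ("libiomp", "libgomp", "libomp")


def openmp_libs_in_maps(maps_text):
    """Return the sorted, distinct OpenMP library paths named in maps text."""
    found = set()
    for line in maps_text.splitlines():
        # Fields: address, perms, offset, dev, inode, then an optional path.
        parts = line.split(None, 5)
        if len(parts) == 6:
            path = parts[5].strip()
            if os.path.basename(path).startswith(OPENMP_PREFIXES):
                found.add(path)
    return sorted(found)


def loaded_openmp_libs():
    """Linux only: list OpenMP runtimes mapped into the current process."""
    with open("/proc/self/maps") as f:
        return openmp_libs_in_maps(f.read())
```

Calling loaded_openmp_libs() right after import tensorflow and import odl, but before the failing fbp_op call, and printing the result would show whether more than one runtime, or the same runtime from two different paths, is mapped into the process.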
Anything left to do here?