
tensorflow.python.framework.errors_impl.NotFoundError: /home/research/data/hdrnet/hdrnet/lib/hdrnet_ops.so: undefined symbol: _ZN10tensorflow7strings6StrCatB5cxx11ERKNS0_8AlphaNumE Backend TkAgg is interactive backend. Turning interactive mode on.

Open qinghua2016 opened this issue 7 years ago • 17 comments

When I run the command python train.py, the following error occurs: tensorflow.python.framework.errors_impl.NotFoundError: /home/research/data/hdrnet/hdrnet/lib/hdrnet_ops.so: undefined symbol: _ZN10tensorflow7strings6StrCatB5cxx11ERKNS0_8AlphaNumE Backend TkAgg is interactive backend. Turning interactive mode on. My TensorFlow version is 1.1.0. Do you know why?

qinghua2016 avatar Aug 07 '17 12:08 qinghua2016

@mgharbi I got a similar error here. It happens at

https://github.com/mgharbi/hdrnet/blob/master/hdrnet/hdrnet_ops.py#L27

Error Info:

tensorflow.python.framework.errors_impl.NotFoundError: hdrnet_ops.so: undefined symbol: _ZN10tensorflow7strings6StrCatB5cxx11ERKNS0_8AlphaNumE

Environment:

Ubuntu 14.04, TensorFlow 1.1.0, CUDA 8.0


I solved it by setting CFLAGS = -fPIC -I$(TF_INC) -O2 -D_GLIBCXX_USE_CXX11_ABI=0 in the Makefile.

shun1024 avatar Aug 07 '17 22:08 shun1024
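The mangled name in the error message already hints at the cause. With GCC's name mangling, symbols that depend on the newer libstdc++ string ABI carry a cxx11 marker (the abi tag B5cxx11, or the St7__cxx11 namespace). The snippet below is a minimal sketch of that check; uses_cxx11_abi is a hypothetical helper written for illustration and is not part of hdrnet or TensorFlow.

```python
def uses_cxx11_abi(mangled_symbol):
    """Return True if a GCC-mangled symbol references the libstdc++ cxx11 ABI.

    Heuristic only: looks for the abi tag "B5cxx11" or the inline
    namespace "7__cxx11" that libstdc++ uses for its newer std::string.
    """
    return "B5cxx11" in mangled_symbol or "7__cxx11" in mangled_symbol


# Symbol copied from the error message reported in this thread.
symbol = "_ZN10tensorflow7strings6StrCatB5cxx11ERKNS0_8AlphaNumE"

if uses_cxx11_abi(symbol):
    print("op library expects the cxx11 ABI; try -D_GLIBCXX_USE_CXX11_ABI=0")
else:
    print("symbol does not reference the cxx11 ABI")
```

If the marker is present, the op library was probably compiled with the new ABI while the installed TensorFlow does not export a matching symbol, which would be consistent with the -D_GLIBCXX_USE_CXX11_ABI=0 fix reported above.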

I unfortunately could not reproduce this error on gcc-5.0. Does adding the -D_GLIBCXX_USE_CXX11_ABI=0 flag help? I'll add it to the Makefile.

mgharbi avatar Aug 21 '17 21:08 mgharbi

I encountered a similar error when using gcc-5.0 on Ubuntu 16.04. Adding the -D_GLIBCXX_USE_CXX11_ABI=0 flag did fix it for me. (I adapted this commit from @tcassou's fork.)

dxue2012 avatar Aug 21 '17 21:08 dxue2012

Hi all,

  • As pointed out by @dxue2012, I had the same issue with Ubuntu 16.04 and a version of gcc > 5.0.0, and adding the flag -D _GLIBCXX_USE_CXX11_ABI=0 solves it.
  • On OSX (tested with version 10.12.6), you have to replace this flag by -undefined dynamic_lookup.
  • I could run everything without trouble on CentOS 7 (except that CUDA was installed under a different path). Hope it helps!

tcassou avatar Aug 22 '17 17:08 tcassou

Merci Thomas, Feel free to pull-request your updates, otherwise I'll add those changes to the Makefile and close the issue.

mgharbi avatar Aug 22 '17 17:08 mgharbi

Salut Michael, I committed a few other small changes to my forked version of your repo (great work by the way!), so it's a bit less convenient to send a PR at this point... Related to the Makefile, I ended up inserting some comments, since I'm always switching between different machines, and did not want to push it much further (OS dependent Makefile):

# Use flag -D _GLIBCXX_USE_CXX11_ABI=0 for gcc > 5
# Use flag -undefined dynamic_lookup for OSX
CFLAGS = -fPIC -I$(TF_INC) -O2
LDFLAGS = -L$(CUDA_HOME)/lib64 -lcudart
# Use flag -D _GLIBCXX_USE_CXX11_ABI=0 for gcc > 5
NVFLAGS = -DGOOGLE_CUDA=1 -x cu -Xcompiler -fPIC -I $(TF_INC) \
					-expt-relaxed-constexpr -Wno-deprecated-gpu-targets -ftz=true

Thomas

tcassou avatar Aug 22 '17 18:08 tcassou
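The per-platform flags from the comments above could also be selected programmatically instead of by editing Makefile comments on each machine. The following is only a sketch: extra_build_flags is a hypothetical helper, and treating "gcc > 5" as "major version 5 or later" is an assumption about what the thread means.

```python
import platform


def extra_build_flags(system, gcc_major=None):
    """Pick the extra flags suggested in this thread (hypothetical helper)."""
    if system == "Darwin":
        # OSX: resolve TensorFlow symbols when the op library is loaded.
        return ["-undefined", "dynamic_lookup"]
    if system == "Linux" and gcc_major is not None and gcc_major >= 5:
        # Assumption: gcc 5 and later default to the cxx11 ABI.
        return ["-D_GLIBCXX_USE_CXX11_ABI=0"]
    return []


# gcc_major=5 is a placeholder; substitute your compiler's major version.
print(" ".join(extra_build_flags(platform.system(), gcc_major=5)))
```

The printed string could then be appended to CFLAGS when invoking make, for example through an environment variable, though that wiring is not part of the original Makefile.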

Hi all, I have the same problem, but adding the flag to the Makefile doesn't help and I still see the error. Does anyone know what the issue could be?

Ubuntu 16.04; Tensorflow 1.3 built with Bazel; CUDA 8.0

Rachnog avatar Aug 28 '17 13:08 Rachnog

@Rachnog I installed tensorflow-gpu v1.1 directly with pip, and did not build it myself, that could explain the difference.

tcassou avatar Aug 29 '17 09:08 tcassou

I installed tensorflow-gpu v1.0.1 and v1.3 with pip, and your solution does not work for me.

tisawe avatar Aug 30 '17 21:08 tisawe

Hi, any update on this issue? I am getting the same error after building with the Makefile from the hdrnet directory. I am on Ubuntu 16.04 with gcc and g++ 4.8, CUDA 8.0, TensorFlow 1.3. Setting -D_GLIBCXX_USE_CXX11_ABI=0 did not help either. Can anyone help me with this?

22avinash avatar Nov 21 '17 09:11 22avinash

@22avinash @tisawe Did you eventually solve the problem?

cchen156 avatar May 24 '18 04:05 cchen156

Hello, I am on Linux 9.4, TensorFlow 1.1, CUDA 8.0. Setting -D_GLIBCXX_USE_CXX11_ABI=0 did not help me either. Did you eventually solve the problem? @mgharbi @22avinash @tisawe @cchen156

xxAna avatar Oct 25 '18 12:10 xxAna

Fixed it by setting -D _GLIBCXX_USE_CXX11_ABI=1, replacing CC = c++ with CC = g++, and switching the system's default gcc version to 4.8.

dongrongliang avatar Jan 18 '19 06:01 dongrongliang

Facing similar issue when we try to freeze the pretrained models ...

/hdrnet/lib/hdrnet_ops.so: undefined symbol: _Z37BilateralSliceApplyGradKernelLauncherRKN5Eigen9GpuDeviceEPKfPKxS4_S6_S4_S6_S4_bPfS7_S7_

Using gcc 4.8, Python 2.7, Ubuntu 16.04, TF 1.12.0, CUDA 9.0. I also tried setting the flag to both -D _GLIBCXX_USE_CXX11_ABI=1 and -D _GLIBCXX_USE_CXX11_ABI=0.

Here is the Makefile:

TF_INC ?= `python -c 'import tensorflow as tf; print(tf.sysconfig.get_include())'`
TF_LIB ?= `python -c 'import tensorflow as tf; print(tf.sysconfig.get_lib())'`
# TF_INC ?= /usr/local/lib/python2.7/dist-packages/tensorflow/include
CUDA_HOME ?= /usr/local/cuda

SRC_DIR = ops

BUILD_DIR = build
LIB_DIR = lib

CC = g++ -std=c++11
NVCC = nvcc -std c++11
CFLAGS = -D_GLIBCXX_USE_CXX11_ABI=1 -I$(TF_INC)/external/nsync/public -L$(TF_LIB) -ltensorflow_framework -fPIC -I$(TF_INC)  
LDFLAGS = -L$(CUDA_HOME)/lib64 -lcudart
NVFLAGS = -x cu -Xcompiler -fPIC -I$(TF_INC) -I$(SRC_DIR)\
					-gencode=arch=compute_30,code=\"sm_30,compute_30\" \
					-expt-relaxed-constexpr -Wno-deprecated-gpu-targets -ftz=true --ptxas-options=-v -lineinfo


SRC = bilateral_slice.cc
CUDA_SRC = bilateral_slice.cu.cc
CUDA_OBJ = $(addprefix $(BUILD_DIR)/,$(CUDA_SRC:.cc=.o))
SRCS = $(addprefix $(SRC_DIR)/, $(SRC))

all: $(LIB_DIR)/hdrnet_ops.so

# Main library
$(LIB_DIR)/hdrnet_ops.so: $(CUDA_OBJ) $(LIB_DIR) $(SRCS)
	$(CC) -shared -o $@ $(SRCS) $(CUDA_OBJ) $(CFLAGS) $(LDFLAGS) 

# Cuda kernels
$(BUILD_DIR)/%.o: $(SRC_DIR)/%.cc $(BUILD_DIR)
	$(NVCC) -c  $< -o $@ $(NVFLAGS)

$(BUILD_DIR):
	mkdir -p $@


$(LIB_DIR):
	mkdir -p $@

clean:
	rm -rf $(BUILD_DIR) $(LIB_DIR)


anilsathyan7 avatar Jun 11 '19 13:06 anilsathyan7
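This error differs from the earlier ones: the missing symbol is the op's own GPU kernel launcher rather than a TensorFlow library symbol. One way to investigate is to list which symbols the built library leaves undefined. The sketch below parses the text output of nm -D; the function names are hypothetical, and the sample text is illustrative rather than real output from hdrnet_ops.so.

```python
import subprocess


def undefined_symbols(nm_output):
    """Extract names marked 'U' (undefined) from `nm` text output."""
    names = []
    for line in nm_output.splitlines():
        parts = line.split()
        # Undefined entries have no address column: "U <name>".
        if len(parts) == 2 and parts[0] == "U":
            names.append(parts[1])
    return names


def list_undefined(library_path):
    """Run `nm -D` on a shared library (requires binutils on PATH)."""
    output = subprocess.check_output(["nm", "-D", library_path])
    return undefined_symbols(output.decode("utf-8", "replace"))


# Illustrative input only, not real output from hdrnet_ops.so.
sample = (
    "                 U _ZN10tensorflow7strings6StrCatERKNS0_8AlphaNumE\n"
    "0000000000004a10 T _Z11ExampleFuncv\n"
)
print(undefined_symbols(sample))
```

Calling list_undefined("lib/hdrnet_ops.so") on the built library would show whether the launcher is listed as undefined. If it is, the CUDA object file may not define it, so the NVFLAGS used to compile bilateral_slice.cu.cc would be a reasonable place to look next.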

I encountered a similar problem when trying to import hdrnet_ops. The error message is: NotFoundError: /home/wangxinrui/Downloads/hdr_models/hdrnet-master/hdrnet/lib/hdrnet_ops.so: undefined symbol: _ZN10tensorflow12OpDefBuilder4AttrENSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE. I am working on Ubuntu 16.04 with Python 3.6.8, CUDA 9.0 with cuDNN 7.3.1, and TensorFlow 1.12.0. I tried almost every method discussed above (and in other related issues) but still cannot build the op correctly. Any suggestions or help would be appreciated.

SystemErrorWang avatar Jun 12 '19 02:06 SystemErrorWang

@anilsathyan7 Hi, did you solve your problem? I ran into the same issue.

ColdCodeCool avatar Jul 23 '19 03:07 ColdCodeCool

Custom ops are registered by linking against libtensorflow_framework.so in TensorFlow 1.4 and above.

So refactor the Makefile as follows:

TF_CFLAGS ?= `python -c 'import tensorflow as tf; print(" ".join(tf.sysconfig.get_compile_flags()))'`
TF_LFLAGS ?= `python -c 'import tensorflow as tf; print(" ".join(tf.sysconfig.get_link_flags()))'`

# TF_INC ?= /usr/local/lib/python2.7/dist-packages/tensorflow/include
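# Replace with your CUDA home. Keep this comment on its own line: a trailing
# comment would leave a trailing space in the variable's value.
CUDA_HOME ?= /usr/local/cuda-9.2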

SRC_DIR = ops

BUILD_DIR = build
LIB_DIR = lib

CC = c++ -std=c++11
NVCC = nvcc -std c++11
CFLAGS = -fPIC -O2 $(TF_CFLAGS)
LDFLAGS = -L$(CUDA_HOME)/lib64 -lcudart $(TF_LFLAGS)
NVFLAGS = -DGOOGLE_CUDA=1 -x cu -Xcompiler -fPIC $(TF_CFLAGS) \
					-expt-relaxed-constexpr -Wno-deprecated-gpu-targets -ftz=true


SRC = bilateral_slice.cc
CUDA_SRC = bilateral_slice.cu.cc
CUDA_OBJ = $(addprefix $(BUILD_DIR)/,$(CUDA_SRC:.cc=.o))
SRCS = $(addprefix $(SRC_DIR)/, $(SRC))

all: $(LIB_DIR)/hdrnet_ops.so

# Main library
$(LIB_DIR)/hdrnet_ops.so: $(CUDA_OBJ) $(LIB_DIR) $(SRCS)
	$(CC) -shared -o $@ $(SRCS) $(CUDA_OBJ) $(CFLAGS) $(LDFLAGS) 

# Cuda kernels
$(BUILD_DIR)/%.o: $(SRC_DIR)/%.cc $(BUILD_DIR)
	$(NVCC) -c  $< -o $@ $(NVFLAGS)

$(BUILD_DIR):
	mkdir -p $@


$(LIB_DIR):
	mkdir -p $@

clean:
	rm -rf $(BUILD_DIR) $(LIB_DIR)

stefanielinear avatar Mar 14 '20 08:03 stefanielinear
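With a Makefile like the one above, the ABI setting comes from TensorFlow itself rather than being chosen by hand. To see which value your installation reports, the compile flags can be inspected directly. This is a minimal sketch: cxx11_abi_from_flags is a hypothetical helper, and it assumes your TensorFlow version provides tf.sysconfig.get_compile_flags(), which older releases such as 1.1 may not.

```python
def cxx11_abi_from_flags(flags):
    """Return the value of -D_GLIBCXX_USE_CXX11_ABI in a flag list, or None."""
    prefix = "-D_GLIBCXX_USE_CXX11_ABI="
    for flag in flags:
        if flag.startswith(prefix):
            return flag[len(prefix):]
    return None


if __name__ == "__main__":
    try:
        import tensorflow as tf
        flags = tf.sysconfig.get_compile_flags()
    except (ImportError, AttributeError):
        # TensorFlow is missing or too old to provide get_compile_flags().
        flags = []
    print("compile flags: %s" % " ".join(flags))
    print("cxx11 abi: %s" % cxx11_abi_from_flags(flags))
```

If the reported value differs from the flag hard-coded in an older Makefile, that mismatch is a likely candidate for the undefined-symbol errors discussed in this thread.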