cuda_hook
BUG: `dlopen("/usr/local/cuda/targets/x86_64-linux/lib/libcublas.so", RTLD_NOW | RTLD_LOCAL)` failed, symbol free_gemm_select version libcublasLt.so.11 not defined in file libcublasLt.so.11 with link time reference
When running a TensorFlow MNIST training task, the error 'Check failed: cublas_handle' occurs.
It is caused by a failing dlopen call. The complete call is dlopen("/usr/local/cuda/targets/x86_64-linux/lib/libcublas.so", RTLD_NOW | RTLD_LOCAL).
The error reported by dlopen is 'Failed to open /usr/local/cuda/targets/x86_64-linux/lib/libcublas.so: /usr/local/cuda/targets/x86_64-linux/lib/libcublas.so: symbol free_gemm_select version libcublasLt.so.11 not defined in file libcublasLt.so.11 with link time reference'.
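To reproduce the failing load outside of TensorFlow, a minimal sketch like the following may help. It assumes Linux and uses Python's standard `ctypes` module to issue the same dlopen with the same flags. The helper name `try_dlopen` is hypothetical and not part of cuda_hook or TensorFlow.

```python
import ctypes
import os
from typing import Optional

# Path taken from the report above; adjust it for your CUDA install.
CUBLAS_PATH = "/usr/local/cuda/targets/x86_64-linux/lib/libcublas.so"


def try_dlopen(path: str) -> Optional[str]:
    """Return None if the library loads, otherwise the loader's error text.

    Hypothetical helper for illustration only.
    """
    try:
        # Same flags as the failing call: RTLD_NOW | RTLD_LOCAL.
        ctypes.CDLL(path, mode=os.RTLD_NOW | os.RTLD_LOCAL)
    except OSError as exc:
        # ctypes surfaces the dlerror() message as the OSError text.
        return str(exc)
    return None


if __name__ == "__main__":
    error = try_dlopen(CUBLAS_PATH)
    print("loaded OK" if error is None else f"dlopen failed: {error}")
```

Running this once with the hook library on the loader path and once without it should show whether the failure depends on cuda_hook alone, independent of TensorFlow.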
According to ldd and nm, libcublas.so depends on libcublasLt.so.11, which resolves to '/home/chenqian/Code/cuda_hook/output/lib64/libcublasLt.so.11'. The symbol free_gemm_select is not present
in either '/home/chenqian/Code/cuda_hook/output/lib64/libcublasLt.so.11' or '/usr/local/cuda/targets/x86_64-linux/lib/libcublasLt.so.11'.
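The nm check described above can be scripted so it is easy to repeat for both libraries. This is a sketch, assuming GNU `nm` from binutils is on the PATH. The function names `parse_nm_symbols` and `exports_symbol` are hypothetical helpers, not part of any existing tool.

```python
import subprocess
from typing import Set


def parse_nm_symbols(nm_output: str) -> Set[str]:
    """Extract bare symbol names from `nm -D` style output.

    Any @VERSION or @@VERSION suffix is stripped from each name.
    """
    names = set()
    for line in nm_output.splitlines():
        fields = line.split()
        if not fields:
            continue
        # The symbol name is the last whitespace-separated field.
        names.add(fields[-1].split("@", 1)[0])
    return names


def exports_symbol(library: str, symbol: str) -> bool:
    """Report whether `library` defines `symbol` in its dynamic symbol table."""
    result = subprocess.run(
        ["nm", "-D", "--defined-only", library],
        capture_output=True,
        text=True,
        check=True,  # raise if nm fails, e.g. when the file is missing
    )
    return symbol in parse_nm_symbols(result.stdout)
```

For example, calling `exports_symbol(path, "free_gemm_select")` for each of the two libcublasLt.so.11 paths named above would confirm whether either file defines the symbol. Note that `--defined-only` hides undefined references, so this answers only "is it exported here", not "who needs it".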
Moreover, without cuda_hook the training task completes successfully.