"multiple definition" linking error when compiling netCDF v4.9.0 with GCC 11.3.0

Open boegel opened this issue 2 years ago • 5 comments

When trying to compile netCDF v4.9.0 with GCC 11.3.0 + binutils 2.38 + CMake 3.23.1 on RHEL 8.4, we're hitting the following linker error:

/software/OpenMPI/4.1.4-GCC-11.3.0/bin/mpicc -fPIC -O2 -ftree-vectorize -march=native -fno-math-errno -fPIC -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 -L/software/zstd/1.5.2-GCCcore-11.3.0/lib64 -L/software/zstd/1.5.2-GCCcore-11.3.0/lib -L/software/Szip/2.1.1-GCCcore-11.3.0/lib64 -L/software/Szip/2.1.1-GCCcore-11.3.0/lib -L/software/cURL/7.83.0-GCCcore-11.3.0/lib64 -L/software/cURL/7.83.0-GCCcore-11.3.0/lib -L/software/HDF5/1.13.1-gompi-2022.05/lib64 -L/software/HDF5/1.13.1-gompi-2022.05/lib -L/software/GCCcore/11.3.0/lib64 -L/software/GCCcore/11.3.0/lib -shared  -o lib__nczhdf5filters.so CMakeFiles/nczhdf5filters.dir/NCZhdf5filters.c.o  /software/HDF5/1.13.1-gompi-2022.05/lib/libhdf5_hl.so /software/HDF5/1.13.1-gompi-2022.05/lib/libhdf5.so -lm /software/zlib/1.2.12-GCCcore-11.3.0/lib/libz.so /software/zstd/1.5.2-GCCcore-11.3.0/lib/libzstd.so /software/bzip2/1.0.8-GCCcore-11.3.0/lib/libbz2.so /software/Szip/2.1.1-GCCcore-11.3.0/lib/libsz.so /software/cURL/7.83.0-GCCcore-11.3.0/lib/libcurl.so /software/libxml2/2.9.13-GCCcore-11.3.0/lib/libxml2.so ../liblib/libnetcdf.a -ldl /software/HDF5/1.13.1-gompi-2022.05/lib/libhdf5_hl.so /software/HDF5/1.13.1-gompi-2022.05/lib/libhdf5.so -lm /software/zlib/1.2.12-GCCcore-11.3.0/lib/libz.so /software/zstd/1.5.2-GCCcore-11.3.0/lib/libzstd.so /software/bzip2/1.0.8-GCCcore-11.3.0/lib/libbz2.so /software/Szip/2.1.1-GCCcore-11.3.0/lib/libsz.so /software/cURL/7.83.0-GCCcore-11.3.0/lib/libcurl.so /software/libxml2/2.9.13-GCCcore-11.3.0/lib/libxml2.so
/software/binutils/2.38-GCCcore-11.3.0/bin/ld: error: ../liblib/libnetcdf.a(ncjson.c.o): multiple definition of 'NCJreclaim'
/software/binutils/2.38-GCCcore-11.3.0/bin/ld: CMakeFiles/nczhdf5filters.dir/NCZhdf5filters.c.o: previous definition here
/software/binutils/2.38-GCCcore-11.3.0/bin/ld: error: ../liblib/libnetcdf.a(ncjson.c.o): multiple definition of 'NCJnew'
/software/binutils/2.38-GCCcore-11.3.0/bin/ld: CMakeFiles/nczhdf5filters.dir/NCZhdf5filters.c.o: previous definition here
/software/binutils/2.38-GCCcore-11.3.0/bin/ld: error: ../liblib/libnetcdf.a(ncjson.c.o): multiple definition of 'NCJnewstringn'
/software/binutils/2.38-GCCcore-11.3.0/bin/ld: CMakeFiles/nczhdf5filters.dir/NCZhdf5filters.c.o: previous definition here
/software/binutils/2.38-GCCcore-11.3.0/bin/ld: error: ../liblib/libnetcdf.a(ncjson.c.o): multiple definition of 'NCJnewstring'
/software/binutils/2.38-GCCcore-11.3.0/bin/ld: CMakeFiles/nczhdf5filters.dir/NCZhdf5filters.c.o: previous definition here
/software/binutils/2.38-GCCcore-11.3.0/bin/ld: error: ../liblib/libnetcdf.a(ncjson.c.o): multiple definition of 'NCJparse'
/software/binutils/2.38-GCCcore-11.3.0/bin/ld: CMakeFiles/nczhdf5filters.dir/NCZhdf5filters.c.o: previous definition here
/software/binutils/2.38-GCCcore-11.3.0/bin/ld: error: ../liblib/libnetcdf.a(ncjson.c.o): multiple definition of 'NCJdictget'
/software/binutils/2.38-GCCcore-11.3.0/bin/ld: CMakeFiles/nczhdf5filters.dir/NCZhdf5filters.c.o: previous definition here
/software/binutils/2.38-GCCcore-11.3.0/bin/ld: error: ../liblib/libnetcdf.a(ncjson.c.o): multiple definition of 'NCJcvt'
/software/binutils/2.38-GCCcore-11.3.0/bin/ld: CMakeFiles/nczhdf5filters.dir/NCZhdf5filters.c.o: previous definition here
collect2: error: ld returned 1 exit status
make[2]: *** [plugins/CMakeFiles/nczhdf5filters.dir/build.make:119: plugins/lib__nczhdf5filters.so] Error 1

To reproduce:

cmake -DCMAKE_INSTALL_PREFIX=/software/netCDF/4.9.0-gompi-2022.05 -DCMAKE_BUILD_TYPE=Release \
  -DCMAKE_POSITION_INDEPENDENT_CODE=ON -DCMAKE_C_COMPILER='mpicc' \
  -DCMAKE_C_FLAGS='-O2 -ftree-vectorize -march=native -fno-math-errno -fPIC' -DCMAKE_CXX_COMPILER='mpicxx' \
  -DCMAKE_CXX_FLAGS='-O2 -ftree-vectorize -march=native -fno-math-errno -fPIC' \
  -DCMAKE_Fortran_COMPILER='mpifort' -DCMAKE_Fortran_FLAGS='-O2 -ftree-vectorize -march=native -fno-math-errno -fPIC' \
  -DCMAKE_VERBOSE_MAKEFILE=ON -DCMAKE_FIND_USE_PACKAGE_REGISTRY=FALSE -DBUILD_SHARED_LIBS=OFF \
  -DCURL_INCLUDE_DIR=/software/cURL/7.83.0-GCCcore-11.3.0/include   -DCURL_LIBRARY=/software/cURL/7.83.0-GCCcore-11.3.0/lib/libcurl.so \
  -DHDF5_INCLUDE_DIR=/software/HDF5/1.13.1-gompi-2022.05/include   -DUSE_HDF5=ON \
  -DHDF5_C_LIBRARY=/software/HDF5/1.13.1-gompi-2022.05/lib/libhdf5.so   -DHDF5_HL_LIBRARY=/software/HDF5/1.13.1-gompi-2022.05/lib/libhdf5_hl.so \
  -DSZIP_INCLUDE_DIR=/software/Szip/2.1.1-GCCcore-11.3.0/include   -DSZIP_LIBRARY=/software/Szip/2.1.1-GCCcore-11.3.0/lib/libsz.so \
  -DZLIB_INCLUDE_DIR=/software/zlib/1.2.12-GCCcore-11.3.0/include   -DZLIB_LIBRARY=/software/zlib/1.2.12-GCCcore-11.3.0/lib/libz.so \
  /tmp/easybuild_build/netCDF/4.9.0/gompi-2022.05/netcdf-c-4.9.0/

make

We are not seeing this problem with netCDF v4.8.1 using the exact same compiler toolchain (GCC 11.3.0 + binutils 2.38) and CMake (3.23.1) on RHEL 8.4...

boegel, Jun 22 '22 12:06

There seem to be two distinct implementations of the JSON code in the codebase: one split across the pair libdispatch/ncjson.c and include/ncjson.h, and one mechanically combined into a single header, include/netcdf_json.h, which holds the function definitions directly without marking any of them inline.

The purpose of netcdf_json.h seems to be to provide a decoupled implementation of the JSON functionality for use in plugins, while the separate core implementation goes into libdispatch, which ends up in the netcdf library.

Here the nczhdf5filters and nczstdfilters plugins include the standalone implementation header, but they also link against netcdf. Because of how static linking works, the symbols from the static netcdf library are treated with the same importance as those from any other object file in the plugin, so the two sets of definitions collide.

This linking problem is masked when linking against shared libraries, because the single-header implementation in the object files is preferred over the implementation in the shared library. Amusingly enough, depending on executable/library load order, the shared library may end up using a symbol defined in the executable or in a previously loaded library rather than its own symbol.

One workaround that seems to do the trick for my non-netcdf test case is to mark the function declarations and definitions in the single-file implementation netcdf_json.h as static, giving them internal linkage in each source file that includes the header.
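As a rough illustration of that workaround, here is a minimal sketch of a function defined directly in a header-style file and marked static. The name demo_json_strdup is hypothetical and is not taken from netcdf_json.h; it only stands in for the kind of helper that header defines.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a helper defined in a single-file header.
 * `static` gives the function internal linkage: every translation unit
 * that includes the header gets its own private copy, so the linker
 * never sees two external definitions of the same symbol. */
static char *demo_json_strdup(const char *src)
{
    assert(src != NULL);

    size_t len = strlen(src);
    char *copy = malloc(len + 1);   /* +1 for the terminating NUL */
    if (copy != NULL)
        memcpy(copy, src, len + 1); /* copies the NUL as well */
    return copy;                    /* caller frees; NULL on allocation failure */
}
```

The trade-off is that each object file carries its own copy of the code, which costs some size but removes the clash with an identically named external symbol in a static library.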

zao, Jun 22 '22 17:06

Your JSON analysis is correct. I wanted it to be possible to build the nczarr adjunct wrapper code so that it did not need access to libnetcdf. Also, I wanted to keep only one core JSON code base so I would not have to apply fixes twice. Not surprisingly, I tried to be too clever by half, and it does not work right. Using "static" works, but makes the construction of the netcdf_json.h file more difficult. I suppose I could just give up and require that nczarr wrappers must use libnetcdf. Let me think about this a bit.

DennisHeimbigner, Jun 22 '22 19:06

We puzzled together a patch that allows us to bypass this problem, by adding static in netcdf_json.h; see https://github.com/easybuilders/easybuild-easyconfigs/blob/develop/easybuild/easyconfigs/n/netCDF/netCDF-4.9.0_fix-linking-errors.patch

boegel, Jun 23 '22 15:06

Yes, I am following a similar path by using a macro that gets redefined depending on the context. Hope to have a PR up soon.
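For readers unfamiliar with the pattern, a macro-controlled linkage qualifier could look roughly like the sketch below. The names DEMO_JSON_STANDALONE, DEMO_JSON_LINKAGE and demo_json_count_char are hypothetical placeholders chosen for illustration; they are not claimed to be the names used in netcdf-c or in the eventual PR.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical switch: a plugin build would define this before including
 * the single-file header; a library build would leave it undefined. */
#define DEMO_JSON_STANDALONE 1

#ifdef DEMO_JSON_STANDALONE
#define DEMO_JSON_LINKAGE static   /* private copy per translation unit */
#else
#define DEMO_JSON_LINKAGE          /* ordinary external linkage */
#endif

/* The same source text serves both contexts; only the macro changes. */
DEMO_JSON_LINKAGE size_t demo_json_count_char(const char *text, char wanted)
{
    size_t count = 0;

    assert(text != NULL);
    for (; *text != '\0'; ++text) {
        if (*text == wanted)
            ++count;
    }
    return count;
}
```

With this arrangement a single code base can be compiled with external linkage for the library and with internal linkage for standalone consumers, which is the property both the static patch and the macro approach are after.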

DennisHeimbigner, Jun 23 '22 18:06

See PR https://github.com/Unidata/netcdf-c/pull/2448

DennisHeimbigner, Jul 06 '22 23:07