glusterfs
Build failure in armhf and others: undefined reference to `_uatomic_link_error'
Description of problem: Glusterfs 10.0 is failing to build in Ubuntu[1][2] on armhf, and on Debian it fails to build on armel[3], mipsel[4], and armhf[5].
The exact command to reproduce the issue: The logs[2] show:
libtool: link: gcc -Wall -I/usr/include/uuid -I/usr/include/tirpc -Wformat -Werror=format-security -Werror=implicit-function-declaration -flto -g -O2 -ffile-prefix-map=/<<PKGBUILDDIR>>=. -fstack-protector-strong -Wformat -Werror=format-security -rdynamic -flto -Wl,-Bsymbolic-functions -Wl,-z -Wl,relro -Wl,-z -Wl,now -o .libs/glusterfsd glusterfsd.o glusterfsd-mgmt.o -ltirpc ../../libglusterfs/src/.libs/libglusterfs.so ../../rpc/rpc-lib/src/.libs/libgfrpc.so ../../rpc/xdr/src/.libs/libgfxdr.so -lm -ldl -lrt -lpthread -lcrypto
/usr/bin/ld: ../../libglusterfs/src/.libs/libglusterfs.so: undefined reference to `_uatomic_link_error'
Expected results: Build should work, as it did in previous versions, on these architectures.
Mandatory info:
- The output of the gluster volume info command: not related
- The output of the gluster volume status command: not related
- The output of the gluster volume heal command: not related
- Provide logs present on following locations of client and server nodes: Build log is at [2].
- Is there any crash? Provide the backtrace and coredump: No crash
Additional info:
- The operating system / glusterfs version:
- Ubuntu Jammy 22.04 (development release)
- glusterfs 10.0-1.2, same as the debian package (it's a sync)
- https://bugs.launchpad.net/ubuntu/+source/glusterfs/+bug/1951408
- https://launchpadlibrarian.net/569426150/buildlog_ubuntu-jammy-armhf.glusterfs_10.0-1.2_BUILDING.txt.gz
- https://buildd.debian.org/status/fetch.php?pkg=glusterfs&arch=armel&ver=10.0-1.2&stamp=1637143838&raw=0
- https://buildd.debian.org/status/fetch.php?pkg=glusterfs&arch=mipsel&ver=10.0-1.2&stamp=1637144549&raw=0
- https://buildd.debian.org/status/fetch.php?pkg=glusterfs&arch=armhf&ver=10.0-1.2&stamp=1637143701&raw=0
Could be a liburcu issue? See https://www.mail-archive.com/[email protected]/msg1831153.html
tl;dr: add -DUATOMIC_NO_LINK_ERROR to CFLAGS on armv7hl.
more details at https://www.mail-archive.com/[email protected]/msg12950.html
Yeah, that works. I'll propose it to Debian as well.
I am also trying to compile 10.1 on ARM32, and am hitting the issue... how did you solve this in the end?
I edited the configure file and appended it to CFLAGS="-g -O2" in one of the lines. Is it necessary to append -DUATOMIC_NO_LINK_ERROR to every CFLAGS= line in 'configure', or is there another method of adding this switch?
In the case of debian packaging, it's solved in debian/rules:
# Fix build on these arches (LP: #1951408) (#1000215)
ifneq (,$(filter $(DEB_HOST_ARCH), armel armhf mipsel))
export DEB_CPPFLAGS_MAINT_APPEND = -DUATOMIC_NO_LINK_ERROR
endif
I think you can just export CFLAGS with that value before running ./configure
I see that you can do
./configure CFLAGS=-DUATOMIC_NO_LINK_ERROR
but that seems to overwrite any CFLAGS that are created in the Makefile (previously it was "-g -O2")
the issue was that it seemed to complete compiling ok, and then I went to run glusterd, and it core dumped... looking at the core dump it errored on:
Core was generated by `glusterd -N'.
Program terminated with signal SIGILL, Illegal instruction.
#0 _uatomic_link_error () at /usr/include/arm-linux-gnueabihf/urcu/uatomic/generic.h:51
it appeared to be related... but not 100% sure
without -DUATOMIC_NO_LINK_ERROR the compile fails with:
CC glusterfsd.o
CC glusterfsd-mgmt.o
CCLD glusterfsd
/usr/bin/ld: ../../libglusterfs/src/.libs/libglusterfs.so: undefined reference to `_uatomic_link_error'
collect2: error: ld returned 1 exit status
make[3]: *** [Makefile:549: glusterfsd] Error 1
make[3]: Leaving directory '/root/glusterfs/glusterfsd/src'
make[2]: *** [Makefile:463: all-recursive] Error 1
make[2]: Leaving directory '/root/glusterfs/glusterfsd'
make[1]: *** [Makefile:595: all-recursive] Error 1
make[1]: Leaving directory '/root/glusterfs'
make: *** [Makefile:490: all] Error 2
@panlinux - I have just noticed that in your configure command line you have excluded io_uring and tcmalloc from the build. Is there a reason for this? I am including them in my build for ARM32.
That's the Debian packaging, which we follow in Ubuntu. I debated whether to enable uring, but when I checked a bit of history, it seems it's new in glusterfs, and the next ubuntu release is an LTS, so we thought best to not enable it for this particular release.
Regarding tcmalloc, it's because of these bugs:
Disable tcmalloc due to problems with dlopen (LP: #1950777 #1951126 Debian: #999700 #999619)
ah, thank you for that @panlinux .... I have just tried to build in debian bullseye 10.1 and disabled tcmalloc and io_uring to see if I can at least get a working compile... if that works I may re-introduce io_uring to see if it compiles and works ok. To get it to compile at all I have had to add the -DUATOMIC_NO_LINK_ERROR to the CFLAGS, my latest './configure' looks like:
./configure --disable-linux-io_uring --enable-firewalld --without-tcmalloc CFLAGS="-g -O2 -DUATOMIC_NO_LINK_ERROR"
Update: it is still failing at the same point in the code, despite adding the '-DUATOMIC_NO_LINK_ERROR'
40 #if !defined __OPTIMIZE__ || defined UATOMIC_NO_LINK_ERROR
41 static inline __attribute__((always_inline))
42 void _uatomic_link_error(void)
43 {
44 #ifdef ILLEGAL_INSTR
45 /*
46  * generate an illegal instruction. Cannot catch this with
47  * linker tricks when optimizations are disabled.
48  */
49 __asm__ __volatile__(ILLEGAL_INSTR);
50 #else
51 __builtin_trap();
52 #endif
53 }
specifically line 51
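For anyone following along, here is a minimal C sketch of the size-dispatch idiom that the header excerpt above appears to guard. It is not liburcu's code: `sketch_add_return`, `sketch_link_error`, `unsupported_size_hit`, and `SKETCH_HAS_64BIT_ATOMICS` are hypothetical names, the GCC/Clang `__atomic_add_fetch` builtin stands in for the real per-architecture primitives, and the stub sets a flag instead of trapping so the behaviour can be observed.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for _uatomic_link_error(): records the failure
 * instead of executing a trap instruction. */
static int unsupported_size_hit = 0;

static void sketch_link_error(void)
{
    unsupported_size_hit = 1;
}

/* Assumption for this sketch: the target has no native 8-byte atomics,
 * as reported for the 32-bit architectures in this thread. */
#define SKETCH_HAS_64BIT_ATOMICS 0

/* Atomically add val to the object at addr and return the new value.
 * The operand size selects the implementation; sizes without a case
 * fall through to the error stub. */
static inline uint64_t sketch_add_return(void *addr, uint64_t val, size_t len)
{
    switch (len) {
    case 4:
        return __atomic_add_fetch((uint32_t *)addr, (uint32_t)val,
                                  __ATOMIC_SEQ_CST);
#if SKETCH_HAS_64BIT_ATOMICS
    case 8:
        return __atomic_add_fetch((uint64_t *)addr, val, __ATOMIC_SEQ_CST);
#endif
    }
    /* Unsupported size: the value is left untouched. */
    sketch_link_error();
    return 0;
}
```

With the 8-byte case compiled out, a call with `len == 8` reaches the error stub. Judging by the `#if` on line 40 of the excerpt, an optimized build without -DUATOMIC_NO_LINK_ERROR has no inline definition of the stub, so a surviving call would show up as the undefined reference at link time; with the define, the build links and the same call traps at run time instead.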
It's an environment variable, define it before the call, i.e., CFLAGS=... ./configure (or CPPFLAGS=... ./configure)
ah, ok... do I need to do it in CPPFLAGS as well, or was that just an example?
if I look at the Makefile, I can see that CFLAGS has the switch added:
root@gluster-05:~/glusterfs# grep CFLAGS Makefile
CFLAGS = -g -O2 -DUATOMIC_NO_LINK_ERROR
whereas CPPFLAGS does not:
root@gluster-05:~/glusterfs# grep CPPFLAGS Makefile
CPPFLAGS =
GF_CPPFLAGS = -D_FILE_OFFSET_BITS=64 -D_GNU_SOURCE -D$(GF_HOST_OS) -include $(top_builddir)/config.h -include $(top_builddir)/site.h -I$(top_srcdir)/libglusterfs/src -I$(top_builddir)/libglusterfs/src
EDIT: I have checked the other Makefiles in the subdirectories, and these too have the -DUATOMIC_NO_LINK_ERROR switch on CFLAGS in the Makefile
Just an example. See here for an armhf build log on ubuntu: https://launchpadlibrarian.net/582144161/buildlog_ubuntu-jammy-armhf.glusterfs_10.1-1_BUILDING.txt.gz
I think this is getting out of scope for this ticket now ;)
that example doesn't show the CFLAGS being set, but I do see that the -DUATOMIC_NO_LINK_ERROR option is used in the build process.
it's sort of relevant to the ticket, as I am having the same error, but I cannot get it to build a working executable simply appending the -DUATOMIC_NO_LINK_ERROR option to CFLAGS... unless I am fundamentally missing something.
I can try another build using the command line of:
CFLAGS="-g -O2 -DUATOMIC_NO_LINK_ERROR" ./configure --disable-linux-io_uring --enable-firewalld --without-tcmalloc
and see if I have any more success
EDIT: when running the make install, I can see that the -DUATOMIC_NO_LINK_ERROR is being used
libtool: install: (cd /root/glusterfs/xlators/cluster/dht/src; /bin/bash "/root/glusterfs/libtool" --silent --tag CC --mode=relink gcc -Wall -I/usr/include/uuid -I/usr/include/tirpc -Wformat -Werror=format-security -Werror=implicit-function-declaration -flto -g -O2 -DUATOMIC_NO_LINK_ERROR -module -avoid-version -export-symbols ../../../../xlators/xlator.sym -luuid -Wl,--no-undefined -ltirpc -o dht.la -rpath /usr/local/lib/glusterfs/10.1/xlator/cluster dht-layout.lo dht-helper.lo dht-linkfile.lo dht-rebalance.lo dht-selfheal.lo dht-rename.lo dht-hashfn.lo dht-diskusage.lo dht-common.lo dht-inode-write.lo dht-inode-read.lo dht-shared.lo dht-lock.lo libxlator.lo dht.lo ../../../../libglusterfs/src/libglusterfs.la -lrt -ldl -lpthread -lcrypto )
We have checked this in the slack channel and the problem is that the I/O framework introduced in release 10 uses real 64-bit atomic variables. On 32-bit architectures this doesn't work:
(gdb) bt
#0 _uatomic_link_error () at /usr/include/arm-linux-gnueabihf/urcu/uatomic/generic.h:51
#1 _uatomic_add_return (len=8, val=1, addr=0xb6f607b8 <gf_io+56>) at /usr/include/arm-linux-gnueabihf/urcu/uatomic/generic.h:201
#2 gf_io_reserve (nr=1) at ./glusterfs/gf-io.h:361
#3 gf_io_callback (data=0x0, cbk=<optimized out>) at ./glusterfs/gf-io.h:555
#4 gf_io_workers_stop () at gf-io.c:302
#5 gf_io_main (handlers=<optimized out>, data=<optimized out>, workers=<optimized out>) at gf-io.c:444
#6 gf_io_run (handlers=<optimized out>, data=<optimized out>, name=<optimized out>) at gf-io.c:521
#7 gf_io_run (name=<optimized out>, handlers=handlers@entry=0xbe8bf538, data=data@entry=0x0) at gf-io.c:493
#8 0x00494ae0 in main (argc=<optimized out>, argv=0xbe8c06d4) at glusterfsd.c:2790
gf_io_reserve() atomically updates a 64-bit counter.
@amarts @pranithk what could we do here ?
Using GF_ATOMIC macros would use a mutex in this case to simulate the atomic operation, but it will surely kill all the benefits of the I/O framework and create lock contention. Reducing the variable to 32 bits may cause it to overflow too fast in some cases.
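To illustrate the mutex fallback mentioned above, here is a minimal C sketch of a 64-bit add-and-return emulated with a lock. It is not the GF_ATOMIC implementation: `sketch_counter64_t`, `sketch_counter_init`, and `sketch_counter_add_return` are hypothetical names chosen for this example, and error handling of the pthread calls is omitted.

```c
#include <pthread.h>
#include <stdint.h>

/* Hypothetical 64-bit counter protected by a mutex, for targets
 * without native 64-bit atomic instructions. */
typedef struct {
    pthread_mutex_t lock;
    uint64_t value;
} sketch_counter64_t;

static void sketch_counter_init(sketch_counter64_t *c, uint64_t initial)
{
    pthread_mutex_init(&c->lock, NULL);
    c->value = initial;
}

/* Add delta and return the new value. Every caller serializes on the
 * same lock, which is the contention cost discussed above. */
static uint64_t sketch_counter_add_return(sketch_counter64_t *c,
                                          uint64_t delta)
{
    uint64_t result;

    pthread_mutex_lock(&c->lock);
    c->value += delta;
    result = c->value;
    pthread_mutex_unlock(&c->lock);

    return result;
}
```

The update stays correct across the 32-bit boundary, but every increment now takes and releases a shared lock, so a hot path that reserves an ID per request would contend on it.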
I thought we've decided not to support 32bit platforms already in previous releases. We don't have good CI for it and clearly it breaks.
We decided to not support 32 bits (i.e. no testing is done) but we do keep compiling it to make sure that it at least compiles and should work. The problem is that this issue is not detected during compilation.
In this case newer versions won't work on 32-bit platforms even if they compile. We need to decide if we still want to compile for 32 bits and then modify it in some way to make it work, or completely remove the support even for compilation (and remove the smoke tests). A third option is that someone volunteers to fix it on 32-bit platforms.
if there is anything that I can do to assist in this, I am happy to help test builds, etc
In this particular case it's necessary to change how things are implemented. It's not just a build configuration. Just changing the value to 32 bits is not enough because it could overflow too fast (it's used to uniquely identify ongoing requests).
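As a rough illustration of the overflow concern, the following C sketch estimates how long a monotonically increasing ID counter lasts before it wraps. The function name `sketch_seconds_to_wrap` and the rate used in the note below are assumptions made for this example; the thread does not state an actual request rate.

```c
#include <stdint.h>

/* Whole seconds until a counter of the given width wraps, assuming it
 * starts at zero and advances ids_per_second times per second.
 * Precondition: ids_per_second > 0 and bits >= 1. For 64 bits the span
 * is approximated by UINT64_MAX, since 2^64 does not fit in uint64_t. */
static uint64_t sketch_seconds_to_wrap(unsigned bits, uint64_t ids_per_second)
{
    uint64_t span = (bits >= 64) ? UINT64_MAX : (UINT64_C(1) << bits);

    return span / ids_per_second;
}
```

At an assumed one million IDs per second, `sketch_seconds_to_wrap(32, 1000000)` gives 4294 seconds, a little over an hour, while the 64-bit result corresponds to hundreds of thousands of years.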
yes, understood that ... if it is a case that it is not going to be supported any more, then it is reasonable for that reason... but if you do need anything testing, I am just saying that I am happy to compile/test
Thanks @kevinpawsey. I'll take it into account when something needs to be tested on 32-bit architectures.
Thank you for your contributions. Noticed that this issue is not having any activity in last ~6 months! We are marking this issue as stale because it has not had recent activity. It will be closed in 2 weeks if no one responds with a comment here.
Closing this issue as there was no update since my last update on issue. If this is an issue which is still valid, feel free to open it.
I have used the recommendations of this issue to compile GlusterFS 10.4 on Raspbian but I can't start glusterd. Is this a known issue? I have opened the following issue to report back: https://github.com/gluster/glusterfs/issues/4177