Build failure in armhf and others: undefined reference to `_uatomic_link_error'

Open panlinux opened this issue 3 years ago • 21 comments

Description of problem: GlusterFS 10.0 is failing to build in Ubuntu[1][2] on armhf, and on Debian it fails to build on armel[3], mipsel[4], and armhf[5].

The exact command to reproduce the issue: The logs[2] show:

libtool: link: gcc -Wall -I/usr/include/uuid -I/usr/include/tirpc -Wformat -Werror=format-security -Werror=implicit-function-declaration -flto -g -O2 -ffile-prefix-map=/<<PKGBUILDDIR>>=. -fstack-protector-strong -Wformat -Werror=format-security -rdynamic -flto -Wl,-Bsymbolic-functions -Wl,-z -Wl,relro -Wl,-z -Wl,now -o .libs/glusterfsd glusterfsd.o glusterfsd-mgmt.o -ltirpc ../../libglusterfs/src/.libs/libglusterfs.so ../../rpc/rpc-lib/src/.libs/libgfrpc.so ../../rpc/xdr/src/.libs/libgfxdr.so -lm -ldl -lrt -lpthread -lcrypto
/usr/bin/ld: ../../libglusterfs/src/.libs/libglusterfs.so: undefined reference to `_uatomic_link_error'

Expected results: Build should work, as it did in previous versions, on these architectures.

Mandatory info: - The output of the gluster volume info command: not related

- The output of the gluster volume status command: not related

- The output of the gluster volume heal command: not related

- Provide logs present on following locations of client and server nodes Build log is at [2].

- Is there any crash ? Provide the backtrace and coredump No crash

Additional info:

- The operating system / glusterfs version:

  • Ubuntu Jammy 22.04 (development release)
  • glusterfs 10.0-1.2, same as the debian package (it's a sync)
  1. https://bugs.launchpad.net/ubuntu/+source/glusterfs/+bug/1951408
  2. https://launchpadlibrarian.net/569426150/buildlog_ubuntu-jammy-armhf.glusterfs_10.0-1.2_BUILDING.txt.gz
  3. https://buildd.debian.org/status/fetch.php?pkg=glusterfs&arch=armel&ver=10.0-1.2&stamp=1637143838&raw=0
  4. https://buildd.debian.org/status/fetch.php?pkg=glusterfs&arch=mipsel&ver=10.0-1.2&stamp=1637144549&raw=0
  5. https://buildd.debian.org/status/fetch.php?pkg=glusterfs&arch=armhf&ver=10.0-1.2&stamp=1637143701&raw=0

panlinux avatar Nov 23 '21 14:11 panlinux

Could be a liburcu issue? See https://www.mail-archive.com/[email protected]/msg1831153.html

mykaul avatar Nov 23 '21 15:11 mykaul

tl;dr: add -DUATOMIC_NO_LINK_ERROR to CFLAGS on armv7hl.

more details at https://www.mail-archive.com/[email protected]/msg12950.html

kalebskeithley avatar Nov 23 '21 15:11 kalebskeithley

Yeah, that works. I'll propose it to Debian as well.

panlinux avatar Nov 26 '21 17:11 panlinux

I am also trying to compile 10.1 on ARM32, and am hitting the issue... how did you solve this in the end?

I edited the configure file and appended it to CFLAGS="-g -O2" on one of the lines. Is it necessary to append -DUATOMIC_NO_LINK_ERROR to every CFLAGS= line in 'configure', or is there another method of adding this switch?

kevinpawsey avatar Feb 08 '22 20:02 kevinpawsey

In the case of debian packaging, it's solved in debian/rules:

# Fix build on these arches (LP: #1951408) (#1000215)
ifneq (,$(filter $(DEB_HOST_ARCH), armel armhf mipsel))
export DEB_CPPFLAGS_MAINT_APPEND = -DUATOMIC_NO_LINK_ERROR
endif

I think you can just export CFLAGS with that value before running ./configure

panlinux avatar Feb 08 '22 20:02 panlinux

I see that you can do ./configure CFLAGS=-DUATOMIC_NO_LINK_ERROR, but that seems to overwrite any CFLAGS that are created in the Makefile (previously it was "-g -O2").

The issue was that it seemed to complete compiling OK, but when I went to run glusterd, it core dumped. Looking at the core dump, it errored on:

Core was generated by `glusterd -N'.
Program terminated with signal SIGILL, Illegal instruction.
#0  _uatomic_link_error () at /usr/include/arm-linux-gnueabihf/urcu/uatomic/generic.h:51

It appeared to be related, but I am not 100% sure.

Without -DUATOMIC_NO_LINK_ERROR the compile fails with:

  CC       glusterfsd.o
  CC       glusterfsd-mgmt.o
  CCLD     glusterfsd
/usr/bin/ld: ../../libglusterfs/src/.libs/libglusterfs.so: undefined reference to `_uatomic_link_error'
collect2: error: ld returned 1 exit status
make[3]: *** [Makefile:549: glusterfsd] Error 1
make[3]: Leaving directory '/root/glusterfs/glusterfsd/src'
make[2]: *** [Makefile:463: all-recursive] Error 1
make[2]: Leaving directory '/root/glusterfs/glusterfsd'
make[1]: *** [Makefile:595: all-recursive] Error 1
make[1]: Leaving directory '/root/glusterfs'
make: *** [Makefile:490: all] Error 2

kevinpawsey avatar Feb 08 '22 21:02 kevinpawsey

@panlinux - I have just noticed in your configure command line you have excluded io_uring and tcmalloc from the build.. is there a reason for this, as I am including them in my build for ARM32

kevinpawsey avatar Feb 09 '22 11:02 kevinpawsey

That's the Debian packaging, which we follow in Ubuntu. I debated whether to enable uring, but when I checked a bit of history, it seems it's new in glusterfs, and the next ubuntu release is an LTS, so we thought best to not enable it for this particular release.

Regarding tcmalloc, it's because of these bugs: Disable tcmalloc due to problems with dlopen (LP: #1950777 #1951126 Debian: #999700 #999619)

panlinux avatar Feb 09 '22 12:02 panlinux

Ah, thank you for that @panlinux. I have just tried to build 10.1 on Debian bullseye with tcmalloc and io_uring disabled, to see if I can at least get a working compile. If that works, I may re-introduce io_uring to see if it compiles and works OK. To get it to compile at all I have had to add -DUATOMIC_NO_LINK_ERROR to the CFLAGS. My latest './configure' looks like:

./configure --disable-linux-io_uring --enable-firewalld --without-tcmalloc CFLAGS="-g -O2 -DUATOMIC_NO_LINK_ERROR"

Update: it is still failing at the same point in the code, despite adding the '-DUATOMIC_NO_LINK_ERROR'

40  #if !defined __OPTIMIZE__  || defined UATOMIC_NO_LINK_ERROR
41  static inline __attribute__((always_inline))
42  void _uatomic_link_error(void)
43  {
44  #ifdef ILLEGAL_INSTR
45          /*
46           * generate an illegal instruction. Cannot catch this with
47           * linker tricks when optimizations are disabled.
48           */
49          __asm__ __volatile__(ILLEGAL_INSTR);
50  #else
51          __builtin_trap();
52  #endif
53  }

specifically line 51

kevinpawsey avatar Feb 09 '22 12:02 kevinpawsey

It's an environment variable, define it before the call, i.e., CFLAGS=... ./configure (or CPPFLAGS=... ./configure)

panlinux avatar Feb 09 '22 13:02 panlinux

ah, ok... do I need to do it in CPPFLAGS as well, or was that just an example?

if I look at the Makefile, I can see that CFLAGS has the switch added:

root@gluster-05:~/glusterfs# grep CFLAGS Makefile
CFLAGS = -g -O2 -DUATOMIC_NO_LINK_ERROR

whereas CPPFLAGS does not:

root@gluster-05:~/glusterfs# grep CPPFLAGS Makefile
CPPFLAGS = 
GF_CPPFLAGS =  -D_FILE_OFFSET_BITS=64 -D_GNU_SOURCE -D$(GF_HOST_OS) -include $(top_builddir)/config.h -include $(top_builddir)/site.h -I$(top_srcdir)/libglusterfs/src -I$(top_builddir)/libglusterfs/src

EDIT: I have checked the other Makefiles in the subdirectories, and these too have the -DUATOMIC_NO_LINK_ERROR switch on CFLAGS in the Makefile

kevinpawsey avatar Feb 09 '22 13:02 kevinpawsey

Just an example. See here for an armhf build log on ubuntu: https://launchpadlibrarian.net/582144161/buildlog_ubuntu-jammy-armhf.glusterfs_10.1-1_BUILDING.txt.gz

I think this is getting out of scope for this ticket now ;)

panlinux avatar Feb 09 '22 13:02 panlinux

that example doesn't show the CFLAGS being set, but I do see that the -DUATOMIC_NO_LINK_ERROR option is used in the build process.

it's sort of relevant to the ticket, as I am having the same error, but I cannot get it to build a working executable simply appending the -DUATOMIC_NO_LINK_ERROR option to CFLAGS... unless I am fundamentally missing something.

I can try another build using the command line of:

CFLAGS="-g -O2 -DUATOMIC_NO_LINK_ERROR" ./configure --disable-linux-io_uring --enable-firewalld --without-tcmalloc

and see if I have any more success

EDIT: when running the make install, I can see that the -DUATOMIC_NO_LINK_ERROR is being used

libtool: install: (cd /root/glusterfs/xlators/cluster/dht/src; /bin/bash "/root/glusterfs/libtool"  --silent --tag CC --mode=relink gcc -Wall -I/usr/include/uuid -I/usr/include/tirpc -Wformat -Werror=format-security -Werror=implicit-function-declaration -flto -g -O2 -DUATOMIC_NO_LINK_ERROR -module -avoid-version -export-symbols ../../../../xlators/xlator.sym -luuid -Wl,--no-undefined -ltirpc -o dht.la -rpath /usr/local/lib/glusterfs/10.1/xlator/cluster dht-layout.lo dht-helper.lo dht-linkfile.lo dht-rebalance.lo dht-selfheal.lo dht-rename.lo dht-hashfn.lo dht-diskusage.lo dht-common.lo dht-inode-write.lo dht-inode-read.lo dht-shared.lo dht-lock.lo libxlator.lo dht.lo ../../../../libglusterfs/src/libglusterfs.la -lrt -ldl -lpthread -lcrypto )

kevinpawsey avatar Feb 09 '22 13:02 kevinpawsey

We have checked this in the Slack channel, and the problem is that the I/O framework introduced in release 10 uses real 64-bit atomic variables. On 32-bit architectures this doesn't work:

(gdb) bt
#0  _uatomic_link_error () at /usr/include/arm-linux-gnueabihf/urcu/uatomic/generic.h:51
#1  _uatomic_add_return (len=8, val=1, addr=0xb6f607b8 <gf_io+56>) at /usr/include/arm-linux-gnueabihf/urcu/uatomic/generic.h:201
#2  gf_io_reserve (nr=1) at ./glusterfs/gf-io.h:361
#3  gf_io_callback (data=0x0, cbk=<optimized out>) at ./glusterfs/gf-io.h:555
#4  gf_io_workers_stop () at gf-io.c:302
#5  gf_io_main (handlers=<optimized out>, data=<optimized out>, workers=<optimized out>) at gf-io.c:444
#6  gf_io_run (handlers=<optimized out>, data=<optimized out>, name=<optimized out>) at gf-io.c:521
#7  gf_io_run (name=<optimized out>, handlers=handlers@entry=0xbe8bf538, data=data@entry=0x0) at gf-io.c:493
#8  0x00494ae0 in main (argc=<optimized out>, argv=0xbe8c06d4) at glusterfsd.c:2790

gf_io_reserve() atomically updates a 64-bit counter.
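For readers unfamiliar with the mechanism, here is a simplified sketch of the size-dispatch pattern visible in frame #1 of the backtrace (which is called with len=8). This is not liburcu's actual code: the `demo_` names are hypothetical, the GCC `__sync` builtin stands in for the per-architecture primitives, and the error function is modelled on the trap variant quoted earlier in the thread. The assumption illustrated is that 8-byte operations are only compiled in when `unsigned long` is 64 bits wide, so on a target such as armhf an 8-byte request falls through to the error function.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical stand-in for _uatomic_link_error() in its trap form:
 * an unsupported size aborts at run time instead of failing at link time. */
static inline void demo_link_error(void)
{
    __builtin_trap();
}

/* Atomically add val to the object at addr, dispatching on its size in
 * bytes. Returns the new value. */
static inline unsigned long demo_add_return(void *addr, unsigned long val, int len)
{
    switch (len) {
    case 4:
        return __sync_add_and_fetch((uint32_t *)addr, (uint32_t)val);
#if ULONG_MAX > 0xffffffffUL
    /* Only present when unsigned long is 64 bits wide. */
    case 8:
        return __sync_add_and_fetch((uint64_t *)addr, (uint64_t)val);
#endif
    }
    /* No case matched: the size is not supported on this target. */
    demo_link_error();
    return 0;
}
```

Calling `demo_add_return(&counter, 1, sizeof(counter))` on a `uint32_t` works on any target; the same call on a `uint64_t` would trap on a 32-bit build of this sketch, which matches the SIGILL reported above.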

@amarts @pranithk what could we do here ?

Using GF_ATOMIC macros would use a mutex in this case to simulate the atomic operation, but it will surely kill all the benefits of the I/O framework and create lock contention. Reducing the variable to 32 bits may cause it to overflow too fast in some cases.
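To make the trade-off concrete, this is a minimal sketch of a mutex-emulated 64-bit add-and-return. It is not the GF_ATOMIC implementation; `demo_counter64_t` and its helpers are hypothetical names used only for illustration. Every update takes the lock, which is where the contention mentioned above would come from.

```c
#include <pthread.h>
#include <stdint.h>

/* Hypothetical 64-bit counter protected by a mutex. */
typedef struct {
    pthread_mutex_t lock;
    uint64_t value;
} demo_counter64_t;

#define DEMO_COUNTER64_INIT { PTHREAD_MUTEX_INITIALIZER, 0 }

/* Add delta to the counter and return the new value. The update is
 * serialized by the mutex, so it is safe on targets without native
 * 64-bit atomics, at the cost of a lock/unlock pair per call. */
static inline uint64_t demo_counter64_add_return(demo_counter64_t *c, uint64_t delta)
{
    uint64_t result;

    pthread_mutex_lock(&c->lock);
    c->value += delta;
    result = c->value;
    pthread_mutex_unlock(&c->lock);

    return result;
}
```

A counter would be declared as `demo_counter64_t c = DEMO_COUNTER64_INIT;` and updated with `demo_counter64_add_return(&c, 1)`.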

xhernandez avatar Feb 10 '22 17:02 xhernandez

I thought we had already decided, in previous releases, not to support 32-bit platforms. We don't have good CI for it, and clearly it breaks.

mykaul avatar Feb 10 '22 18:02 mykaul

We decided not to support 32 bits (i.e. no testing is done), but we do keep compiling for it to make sure that it at least compiles and should work. The problem is that this issue is not detected during compilation.

In this case, newer versions won't work on 32-bit platforms even if they compile. We need to decide whether we still want to compile for 32 bits and then modify the code in some way to make it work, or completely remove the support even for compilation (and remove the smoke tests). A third option is that someone volunteers to fix it on 32-bit platforms.
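As an aside, one way such a mismatch could be surfaced at compile time is a static assertion on the object size. The sketch below is only an illustration, not something GlusterFS or liburcu provides: the `DEMO_` macros are hypothetical, and the check assumes that the atomic primitives handle nothing wider than `unsigned long`, as the backtrace above suggests for armhf.

```c
/* True if the object is no wider than unsigned long (assumed here to be
 * the widest size the atomic primitives support on the target). */
#define DEMO_UATOMIC_SIZE_OK(obj) (sizeof(obj) <= sizeof(unsigned long))

/* Fails the build, rather than the link or the run, for oversized objects. */
#define DEMO_UATOMIC_CHECK(obj) \
    _Static_assert(DEMO_UATOMIC_SIZE_OK(obj), \
                   "object too wide for atomic access on this target")
```

Placing `DEMO_UATOMIC_CHECK(counter);` next to the declaration of a 64-bit counter would stop a 32-bit build with a readable message, while leaving 64-bit builds unaffected.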

xhernandez avatar Feb 11 '22 10:02 xhernandez

if there is anything that I can do to assist in this, I am happy to help test builds, etc

kevinpawsey avatar Feb 14 '22 10:02 kevinpawsey

In this particular case it's necessary to change how things are implemented. It's not just a build configuration. Just changing the value to 32 bits is not enough because it could overflow too fast (it's used to uniquely identify ongoing requests).
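As a rough illustration of "too fast" (the request rate is an assumed figure, not a measurement from GlusterFS): at one million IDs per second, a 32-bit counter wraps after about 4295 seconds, or roughly 72 minutes, while a 64-bit counter would take hundreds of thousands of years. The hypothetical helper below does the arithmetic.

```c
/* Seconds until an unsigned counter of the given width wraps around when
 * it is incremented `rate` times per second. */
static double demo_seconds_until_wrap(unsigned bits, double rate)
{
    double values = 1.0;

    for (unsigned i = 0; i < bits; i++)
        values *= 2.0;          /* ends as 2^bits distinct values */

    return values / rate;
}
```

For example, `demo_seconds_until_wrap(32, 1e6)` gives the 32-bit figure quoted above; lowering the assumed rate stretches the interval proportionally.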

xhernandez avatar Feb 15 '22 11:02 xhernandez

Yes, understood. If 32-bit is not going to be supported any more, that is a reasonable decision for that reason. But if you do need anything tested, I am happy to compile and test.

kevinpawsey avatar Feb 15 '22 13:02 kevinpawsey

Thanks @kevinpawsey. I'll take it into account when something needs to be tested on 32-bit architectures.

xhernandez avatar Feb 16 '22 11:02 xhernandez

Thank you for your contributions. Noticed that this issue is not having any activity in last ~6 months! We are marking this issue as stale because it has not had recent activity. It will be closed in 2 weeks if no one responds with a comment here.

stale[bot] avatar Sep 21 '22 00:09 stale[bot]

Closing this issue as there was no update since my last update on issue. If this is an issue which is still valid, feel free to open it.

stale[bot] avatar Nov 01 '22 21:11 stale[bot]

I have used the recommendations of this issue to compile GlusterFS 10.4 on Raspbian but I can't start glusterd. Is this a known issue? I have opened the following issue to report back: https://github.com/gluster/glusterfs/issues/4177

hostingnuggets avatar Jun 27 '23 05:06 hostingnuggets