libtorrent
libtorrent.so 0.13.8 crash on Raspberry Pi 4 kernel due to unaligned access
libtorrent 0.13.8, distributed with rtorrent 0.9.8. Raspberry Pi 4 kernel version: 6.1.21-v8+ aarch64 GNU/Linux
rtorrent runs on a Raspberry Pi 4 with storage on an ext4 HDD. It was running fine for ages. I updated the Pi 4 to the latest release yesterday, and now rtorrent crashes after downloading some data from a torrent. Restarting the process makes it crash the same way once the file checksum has completed.
Caught SIGBUS, dumping stack:
rtorrent() [0x1fdf0]
/lib/arm-linux-gnueabihf/libc.so.6(__default_rt_sa_restorer+0) [0xf7493910]
/usr/lib/arm-linux-gnueabihf/libtorrent.so.21(+0xd3af4) [0xf792aaf4]
/usr/lib/arm-linux-gnueabihf/libtorrent.so.21(_ZN7torrent9PollEPoll7performEv+0xe0) [0xf7894984]
/usr/lib/arm-linux-gnueabihf/libtorrent.so.21(_ZN7torrent9PollEPoll7do_pollExi+0x78) [0xf7894adc]
/usr/lib/arm-linux-gnueabihf/libtorrent.so.21(_ZN7torrent11thread_base10event_loopEPS0_+0x174) [0xf78d189c]
rtorrent() [0x1e660]
/lib/arm-linux-gnueabihf/libc.so.6(__libc_start_main+0x114) [0xf747b740]
Error: Success
Signal code '1': Invalid address alignment.
Fault address: 0x110946d
The fault address is not part of any chunk.
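A quick way to read the fault address above: it is odd, so it cannot be a valid address for any load or store that requires 2-byte or 4-byte alignment. The sketch below checks this at compile time. `is_aligned` and `kFaultAddress` are names made up for this illustration, not part of libtorrent, and the snippet says nothing about which instruction actually faulted.

```cpp
#include <cstdint>

// Hypothetical helper (not from libtorrent): true when `address` is a
// multiple of `alignment`. `alignment` must be a power of two.
constexpr bool is_aligned(std::uintptr_t address, std::uintptr_t alignment) {
  return (address & (alignment - 1)) == 0;
}

// Fault address copied from the crash output above.
constexpr std::uintptr_t kFaultAddress = 0x110946d;

// The address is odd, so it is neither 2-byte nor 4-byte aligned.
static_assert(!is_aligned(kFaultAddress, 2), "not 2-byte aligned");
static_assert(!is_aligned(kFaultAddress, 4), "not 4-byte aligned");

// One byte lower would have been a valid 4-byte-aligned address.
static_assert(is_aligned(kFaultAddress - 1, 4), "0x110946c is 4-byte aligned");
```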
Running gdb gives the following, which shows that the crash is indeed in Thread 1, inside libtorrent.so:
(gdb) thread apply all backtrace
Thread 3 (Thread 0xf5dff200 (LWP 1536) "rtorrent scgi"):
#0 0xf7a7a1dc in epoll_wait (epfd=13, events=0x1f6750, maxevents=1024, timeout=600001) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
#1 0xf7dce898 in torrent::PollEPoll::poll(int) () from /usr/lib/arm-linux-gnueabihf/libtorrent.so.21
#2 0xf7dceac8 in torrent::PollEPoll::do_poll(long long, int) () from /usr/lib/arm-linux-gnueabihf/libtorrent.so.21
#3 0xf7e0b89c in torrent::thread_base::event_loop(torrent::thread_base*) () from /usr/lib/arm-linux-gnueabihf/libtorrent.so.21
#4 0xf7af8310 in start_thread (arg=0xf5dff200) at pthread_create.c:477
#5 0xf7a79da8 in ?? () at ../sysdeps/unix/sysv/linux/arm/clone.S:73 from /lib/arm-linux-gnueabihf/libc.so.6
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
Thread 2 (Thread 0xf6770200 (LWP 1535) "rtorrent disk"):
#0 0xf7a7a1dc in epoll_wait (epfd=10, events=0x1f01b8, maxevents=1024, timeout=10001) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
#1 0xf7dce898 in torrent::PollEPoll::poll(int) () from /usr/lib/arm-linux-gnueabihf/libtorrent.so.21
#2 0xf7dceac8 in torrent::PollEPoll::do_poll(long long, int) () from /usr/lib/arm-linux-gnueabihf/libtorrent.so.21
#3 0xf7e0b89c in torrent::thread_base::event_loop(torrent::thread_base*) () from /usr/lib/arm-linux-gnueabihf/libtorrent.so.21
#4 0xf7af8310 in start_thread (arg=0xf6770200) at pthread_create.c:477
#5 0xf7a79da8 in ?? () at ../sysdeps/unix/sysv/linux/arm/clone.S:73 from /lib/arm-linux-gnueabihf/libc.so.6
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
Thread 1 (Thread 0xf6c2b040 (LWP 1531) "rtorrent main"):
#0 0xf7e64af4 in ?? () from /usr/lib/arm-linux-gnueabihf/libtorrent.so.21
#1 0xf7dce984 in torrent::PollEPoll::perform() () from /usr/lib/arm-linux-gnueabihf/libtorrent.so.21
#2 0xf7dceadc in torrent::PollEPoll::do_poll(long long, int) () from /usr/lib/arm-linux-gnueabihf/libtorrent.so.21
#3 0xf7e0b89c in torrent::thread_base::event_loop(torrent::thread_base*) () from /usr/lib/arm-linux-gnueabihf/libtorrent.so.21
#4 0x0001e660 in ?? ()
#5 0xf79b5740 in __libc_start_main (main=0xfffef6e4, argc=-139534336, argv=0xf79b5740 <__libc_start_main+276>, init=<optimized out>, fini=0x170f98, rtld_fini=0xf7fcd510 <_dl_fini>, stack_end=0xfffef6e4) at libc-start.c:308
#6 0x0001f2c0 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
Awaiting instructions on how to provide further debug info. Thanks.
kernel alignment options
sudo modprobe configs
zgrep ALIGN /proc/config.gz
# CONFIG_COMPAT_ALIGNMENT_FIXUPS is not set
CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y
CONFIG_HAVE_ALIGNED_STRUCT_PAGE=y
CONFIG_CMA_ALIGNMENT=8
# CONFIG_DEBUG_FORCE_FUNCTION_ALIGN_64B is not set
some debug thoughts taken from https://github.com/epics-base/pvDataCPP/issues/84
I've found a workaround:
- build the source code of rakshasa/libtorrent, passing the option --enable-aligned to configure
- of course I also had to manually build the rtorrent binary against the library built in the previous step
- I also had to manually install: libssl-dev, libncurses5-dev, libncursesw5-dev, autoconf-archive, libcurl-dev.
I can revert to the old binary anytime if you still need me to help debug the unaligned access issue.
Thanks so much for reporting this information. I would like to add information here. This also happens on x86 - not just ARM. I can confirm that configuring rakshasa/libtorrent with --enable-aligned resolves the problem.
../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S: No such file or directory.
(gdb) bt
#0 __memmove_avx_unaligned () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:222
#1 0x00007ffff7dfd269 in torrent::Chunk::to_buffer(void*, unsigned int, unsigned int) () from /lib/libtorrent.so.21
#2 0x00007ffff7e2ad0c in torrent::PeerConnectionBase::up_chunk() () from /lib/libtorrent.so.21
#3 0x00007ffff7e2eb3a in torrent::PeerConnection<(torrent::Download::ConnectionType)1>::event_write() ()
from /lib/libtorrent.so.21
#4 0x00007ffff7dd0848 in torrent::PollEPoll::perform() () from /lib/libtorrent.so.21
#5 0x00007ffff7df3b62 in torrent::thread_base::event_loop(torrent::thread_base*) () from /lib/libtorrent.so.21
#6 0x000055555558e8c4 in main ()
I just saw this error:
../rak/socket_address.h:131:72: error: cast from 'sockaddr*' to 'rak::socket_address*' increases required alignment of target type [-Werror=cast-align]
131 | static socket_address* cast_from(sockaddr* sa) { return reinterpret_cast<socket_address*>(sa); }
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../rak/socket_address.h: In static member function 'static const rak::socket_address* rak::socket_address::cast_from(const sockaddr*)':
../rak/socket_address.h:132:72: error: cast from 'const sockaddr*' to 'const rak::socket_address*' increases required alignment of target type [-Werror=cast-align]
132 | static const socket_address* cast_from(const sockaddr* sa) { return reinterpret_cast<const socket_address*>(sa); }
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../rak/socket_address.h: In member function 'rak::socket_address rak::socket_address_inet6::normalize_address() const':
../rak/socket_address.h:530:28: error: cast from 'const uint8_t*' {aka 'const unsigned char*'} to 'const uint32_t*' {aka 'const unsigned int*'} increases required alignment of target type [-Werror=cast-align]
530 | const uint32_t *addr32 = reinterpret_cast<const uint32_t *>(m_sockaddr.sin6_addr.s6_addr);
could this be related?
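For context on what -Wcast-align is flagging in the errors above: casting a byte pointer to a wider type such as `const uint32_t*` and then dereferencing it assumes the address is suitably aligned. On targets or kernel configurations that do not fix up misaligned accesses, that assumption can end in the kind of SIGBUS reported at the top of this issue. Whether these particular casts are the ones behind the crash is not established in this thread. A minimal sketch of the usual alignment-safe alternative follows; `load_u32` is a hypothetical helper written for this illustration, not a libtorrent function.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper (not part of libtorrent): reads a 32-bit value from an
// arbitrary byte address without requiring 4-byte alignment.
inline std::uint32_t load_u32(const unsigned char* p) {
  std::uint32_t value;
  // memcpy has no alignment requirement on its source pointer, unlike
  // dereferencing reinterpret_cast<const std::uint32_t*>(p).
  std::memcpy(&value, p, sizeof(value));
  return value;
}
```

The value comes back in host byte order, exactly as a direct dereference would give on an aligned address, so any existing byte-order conversion around such a load would stay unchanged.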
Check whether this issue still appears with the latest master, which has been updated to C++17.
Since this is related to a build issue, I am updating the GitHub Action in my repo to pass the "--enable-aligned" flag.
Added to the package builder. Probably a good one to add to the build-checking GitHub Actions too?
The configure flag was added specifically for these issues, so it should be added for those archs.