libtorrent
libtorrent copied to clipboard
pthread_rwlock_wrlock() is called with a NULL pointer.
libtorrent version (or branch): 2.0.6 platform/architecture: amd64 compiler and compiler version: gcc 11.3.0 Debian bug report https://bugs.debian.org/1013931 From the bug report
The stacktrace below suggests that this is related to cleanup coderegistered using atexit(3): pthread_rwlock_wrlock() is called with a
NULL pointer.
,----
| $ gdb --args server/nbdkit plugins/torrent/.libs/nbdkit-torrent-plugin.so --help
| GNU gdb (Debian 12.1-2) 12.1
| Copyright (C) 2022 Free Software Foundation, Inc.
| [...]
| (gdb) run
| Starting program: /home/bengen/p/deb/nbdkit/server/nbdkit plugins/torrent/.libs/nbdkit-torrent-plugin.so --help
| [Thread debugging using libthread_db enabled]
| Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
| nbdkit [-4|--ipv4-only] [-6|--ipv6-only]
| [-D|--debug PLUGIN|FILTER|nbdkit.FLAG=N]
| [...]
| torrent=<TORRENT> (required) Torrent or magnet link.
| file=DISK.iso File to serve within torrent.
| cache=DIR Set directory to store partial downloads.
|
| Program received signal SIGSEGV, Segmentation fault.
| __GI___pthread_rwlock_wrlock (rwlock=0x0) at pthread_rwlock_wrlock.c:27
| 27 pthread_rwlock_wrlock.c: No such file or directory.
| (gdb) bt
| #0 __GI___pthread_rwlock_wrlock (rwlock=0x0) at pthread_rwlock_wrlock.c:27
| #1 0x00007ffff6c8a8e9 in CRYPTO_THREAD_write_lock (lock=<optimized out>) at ../crypto/threads_pthread.c:112
| #2 0x00007ffff6ba7a03 in conf_modules_finish_int () at ../crypto/conf/conf_mod.c:524
| #3 0x00007ffff6ba8132 in CONF_modules_unload (all=1) at ../crypto/conf/conf_mod.c:482
| #4 0x00007ffff726af24 in boost::asio::ssl::detail::openssl_init_base::do_init::~do_init (this=0x55556157dc50, __in_chrg=<optimized out>)
| at /usr/include/boost/asio/ssl/detail/impl/openssl_init.ipp:90
| #5 std::_Sp_counted_ptr<boost::asio::ssl::detail::openssl_init_base::do_init*, (__gnu_cxx::_Lock_policy)2>::_M_dispose (
| this=<optimized out>) at /usr/include/c++/11/bits/shared_ptr_base.h:348
| #6 0x00007ffff726b09a in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x55556157db90)
| at /usr/include/c++/11/bits/shared_ptr_base.h:168
| #7 std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=<optimized out>, __in_chrg=<optimized out>)
| at /usr/include/c++/11/bits/shared_ptr_base.h:705
| #8 std::__shared_ptr<boost::asio::ssl::detail::openssl_init_base::do_init, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (
| this=<optimized out>, __in_chrg=<optimized out>) at /usr/include/c++/11/bits/shared_ptr_base.h:1154
| #9 std::shared_ptr<boost::asio::ssl::detail::openssl_init_base::do_init>::~shared_ptr (this=<optimized out>, __in_chrg=<optimized out>)
| at /usr/include/c++/11/bits/shared_ptr.h:122
| #10 0x00007ffff7bb9f77 in __run_exit_handlers (status=status@entry=0, listp=0x7ffff7d4d738 <__exit_funcs>,
| run_list_atexit=run_list_atexit@entry=true, run_dtors=run_dtors@entry=true) at exit.c:108
| #11 0x00007ffff7bba11a in __GI_exit (status=status@entry=0) at exit.c:139
| #12 0x000055555555a944 in main (argc=<optimized out>, argv=<optimized out>) at ./server/main.c:678
`----
so, this happens when boost.asio is unloading libcrypto. I wonder if this is caused by libtorrent also calling CRYPTO_cleanup_all_ex_data()
.
I believe both asio's unloading and libtorrent's cleanup are deprecated in newer versions of libcrypto. libtorrent protects the calls with:
#if !defined(OPENSSL_API_COMPAT) || OPENSSL_API_COMPAT < 0x10100000L
So, I'm thinking that maybe it's time to just remove this code in libtorrent, and possibly "leak" libcrypto resources when an old version is used. or perhaps just bump the dependency and say libtorrent requires a OpenSSL 1.1 or later.
could you test to see if this makes a difference? https://github.com/arvidn/libtorrent/pull/6941
it looks like it's still happening: https://github.com/arvidn/libtorrent/runs/7090009663?check_suite_focus=true#step:5:16618
From PR #6941 arvidn wants to merge 1 commit into RC_1_2 from remove-openssl-cleanup
I'm forwarding the bug submitter comment
Perhaps I'm missing something … it seems to me that this patch is
to be appplied onto version 1.2, not 2.0 – and the verrsion nbdkit was
most recently built with is 2.0.6.
yeah, I made the patch against 1.2 because the same problem happens there, and I regularly merge 1.2 into 2.0 and master. Either way, it didn't seem to make a different. It's possible it's a new version of OpenSSL and boost.asio that are incompatible. I haven't been able to reproduce it, but maybe with the latest version of openssl I might.
Hi, I am the person who reported the bug within Debian. (We can continue the conversation without having to rely on @marillat to relay bits and pieces.)
My best uneducateed guess at the moment is that the OpenSSL structure that contains the pthread_rwlock_t
might have been zeroed out on destruction, leading to a NULL pointer being passed to pthread_rwlock_wrlock()
when the corresponding cleanup function is called a second time.
I may have found hints towards what is going on here. Apparently OpenSSL's CONF_modules_unload is run twice (why?) through atexit
handlers and then a boost::asio::ssl
which also wants to run some cleanup code.
Breakpoint 1, CONF_modules_unload (all=1) at ../crypto/conf/conf_mod.c:482
482 ../crypto/conf/conf_mod.c: No such file or directory.
(gdb) bt
#0 CONF_modules_unload (all=1) at ../crypto/conf/conf_mod.c:482
#1 0x00007ffff6ba827e in ossl_config_modules_free () at ../crypto/conf/conf_mod.c:571
#2 0x00007ffff6c7d0b9 in OPENSSL_cleanup () at ../crypto/init.c:414
#3 OPENSSL_cleanup () at ../crypto/init.c:344
#4 0x00007ffff7bb9f77 in __run_exit_handlers (status=status@entry=0, listp=0x7ffff7d4d738 <__exit_funcs>,
run_list_atexit=run_list_atexit@entry=true, run_dtors=run_dtors@entry=true) at exit.c:108
#5 0x00007ffff7bba11a in __GI_exit (status=status@entry=0) at exit.c:139
#6 0x000055555555a944 in main (argc=<optimized out>, argv=<optimized out>) at ./server/main.c:678
(gdb) c
Continuing.
Breakpoint 1, CONF_modules_unload (all=1) at ../crypto/conf/conf_mod.c:489
489 in ../crypto/conf/conf_mod.c
(gdb) bt
#0 CONF_modules_unload (all=1) at ../crypto/conf/conf_mod.c:489
#1 CONF_modules_unload (all=1) at ../crypto/conf/conf_mod.c:477
#2 0x00007ffff6ba827e in ossl_config_modules_free () at ../crypto/conf/conf_mod.c:571
#3 0x00007ffff6c7d0b9 in OPENSSL_cleanup () at ../crypto/init.c:414
#4 OPENSSL_cleanup () at ../crypto/init.c:344
#5 0x00007ffff7bb9f77 in __run_exit_handlers (status=status@entry=0, listp=0x7ffff7d4d738 <__exit_funcs>,
run_list_atexit=run_list_atexit@entry=true, run_dtors=run_dtors@entry=true) at exit.c:108
#6 0x00007ffff7bba11a in __GI_exit (status=status@entry=0) at exit.c:139
#7 0x000055555555a944 in main (argc=<optimized out>, argv=<optimized out>) at ./server/main.c:678
(gdb) c
Continuing.
Breakpoint 1, CONF_modules_unload (all=1) at ../crypto/conf/conf_mod.c:482
482 in ../crypto/conf/conf_mod.c
(gdb) bt
#0 CONF_modules_unload (all=1) at ../crypto/conf/conf_mod.c:482
#1 0x00007ffff726af24 in boost::asio::ssl::detail::openssl_init_base::do_init::~do_init (this=0x55556157dc50, __in_chrg=<optimized out>)
at /usr/include/boost/asio/ssl/detail/impl/openssl_init.ipp:90
#2 std::_Sp_counted_ptr<boost::asio::ssl::detail::openssl_init_base::do_init*, (__gnu_cxx::_Lock_policy)2>::_M_dispose (
this=<optimized out>) at /usr/include/c++/11/bits/shared_ptr_base.h:348
#3 0x00007ffff726b09a in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x55556157db90)
at /usr/include/c++/11/bits/shared_ptr_base.h:168
#4 std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=<optimized out>, __in_chrg=<optimized out>)
at /usr/include/c++/11/bits/shared_ptr_base.h:705
#5 std::__shared_ptr<boost::asio::ssl::detail::openssl_init_base::do_init, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (
this=<optimized out>, __in_chrg=<optimized out>) at /usr/include/c++/11/bits/shared_ptr_base.h:1154
#6 std::shared_ptr<boost::asio::ssl::detail::openssl_init_base::do_init>::~shared_ptr (this=<optimized out>, __in_chrg=<optimized out>)
at /usr/include/c++/11/bits/shared_ptr.h:122
#7 0x00007ffff7bb9f77 in __run_exit_handlers (status=status@entry=0, listp=0x7ffff7d4d738 <__exit_funcs>,
run_list_atexit=run_list_atexit@entry=true, run_dtors=run_dtors@entry=true) at exit.c:108
#8 0x00007ffff7bba11a in __GI_exit (status=status@entry=0) at exit.c:139
#9 0x000055555555a944 in main (argc=<optimized out>, argv=<optimized out>) at ./server/main.c:678
(gdb) c
Continuing.
Program received signal SIGSEGV, Segmentation fault.
__GI___pthread_rwlock_wrlock (rwlock=0x0) at pthread_rwlock_wrlock.c:27
27 pthread_rwlock_wrlock.c: No such file or directory.
my suspicion is that as openssl deprecated CONF_modules_unload()
they made it be called by their own global destructor. However, when they did, they should have made the publicly visible CONF_modules_unload()
be a no-op, but perhaps they didn't. Perhaps most users of OpenSSL doesn't bother calling this function on shutdown, but if you do, the new OpenSSL might end up calling it again.
Patching the preprocessor magic around ::CONF_modules_unload(1);
in boost::asio::ssl::detail::openssl_init_base::do_init::~do_init
and rebuilding Boost and libtorrent fixed the issue for me. And Boost seem to have fixed this problem for OpenSSL >= 3, too: https://github.com/boostorg/asio/blob/46f49024e13ed8867cd48cabf1498a0e11db593d/include/boost/asio/ssl/detail/impl/openssl_init.ipp#L89
I just managed to reproduce this on MacOS as well, after updating OpenSSL.
My understanding is:
-
OpenSSL simplified their API by no longer requiring calling
CONF_modules_unload()
, but it seems like while doing so they also made it an error to call this function (even though it was OK in previous versions). I think this is the root of the issue. The OpenSSL API changed slightly. -
boost.asio calls
CONF_modules_unload()
in a global destructor, which happens to be after the now-built-in call happens, which crashes.
I think the best fix would be to patch OpenSSL to make CONF_modules_unload()
a no-op, and have it's at-exit handler call an internal function that does the same thing (i.e. just rename it).
The second best fix would be to patch boost.asio, as @hillu described.
I don't think there's anything I can do in libtorrent however.
one MacOS, I can work around this by installing brew install [email protected]
and the build libtorrent with:
b2 openssl-lib=$(brew --prefix [email protected])/lib openssl-include=$(brew --prefix [email protected])/include
(this is becasue [email protected]
won't be installed in /opt/homebrew/include
and /opt/homebrew/lib
).
Apparently on github actions, it's sufficient to just downgrade. https://github.com/arvidn/libtorrent/pull/6945
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
This issue has been fixed.