bitcoin
bitcoin copied to clipboard
Unnecessary symbols being exported in `libbitcoinconsensus.so`
Initially it was reported in https://github.com/bitcoin/bitcoin/pull/25020#issuecomment-1142151675.
$ objdump -T ./bitcoin-24.0.1/lib/libbitcoinconsensus.so | grep bitcoinconsensus_
000000000002b1f0 g DF .text 0000000000000037 Base bitcoinconsensus_version
000000000002cc30 g DF .text 000000000000006d Base bitcoinconsensus_verify_script
000000000002cbe0 g DF .text 0000000000000049 Base bitcoinconsensus_verify_script_with_amount
$ objdump -T ./bitcoin-24.0.1/lib/libbitcoinconsensus.so | grep -v UND
./bitcoin-24.0.1/lib/libbitcoinconsensus.so: file format elf64-x86-64
DYNAMIC SYMBOL TABLE:
000000000002cd20 w DF .text 00000000000000ce Base _ZStplIcSt11char_traitsIcESaIcEENSt7__cxx1112basic_stringIT_T0_T1_EEOS8_PKS5_
000000000003a000 w DF .text 00000000000001e5 Base _ZNSt6vectorIS_IhSaIhEESaIS1_EE17_M_realloc_insertIJS1_EEEvN9__gnu_cxx17__normal_iteratorIPS1_S3_EEDpOT_
0000000000038590 w DF .text 0000000000000054 Base _ZNSt19bad_optional_accessD0Ev
0000000000022050 w DF .text 0000000000000034 Base _ZNKSt5ctypeIcE8do_widenEc
00000000001739a8 w DO .data.rel.ro 0000000000000018 Base _ZTISt19bad_optional_access
0000000000038540 w DF .text 0000000000000042 Base _ZNSt19bad_optional_accessD2Ev
00000000000285e0 w DF .text 0000000000000373 Base _ZNSt20__uninitialized_copyILb0EE13__uninit_copyIN9__gnu_cxx17__normal_iteratorIPK5CTxInSt6vectorIS4_SaIS4_EEEEPS4_EET0_T_SD_SC_
000000000002b1f0 g DF .text 0000000000000037 Base bitcoinconsensus_version
000000000000596f w DF .text 0000000000000048 Base _ZSt27__throw_bad_optional_accessv
000000000002cc30 g DF .text 000000000000006d Base bitcoinconsensus_verify_script
00000000000393c0 w DF .text 000000000000009e Base _ZNSt6vectorIS_IhSaIhEESaIS1_EED1Ev
0000000000044890 w DF .text 00000000000000d4 Base _ZNSt8__detail18__from_chars_digitIjEEbRPKcS2_RT_i
00000000000397f0 w DF .text 0000000000000120 Base _ZNSt6vectorIS_IhSaIhEESaIS1_EE9push_backERKS1_
0000000000039910 w DF .text 000000000000012d Base _ZNSt6vectorIS_IhSaIhEESaIS1_EE8_M_eraseEN9__gnu_cxx17__normal_iteratorIPS1_S3_EES7_
00000000001739f0 w DO .data.rel.ro 0000000000000028 Base _ZTVSt19bad_optional_access
000000000002cbe0 g DF .text 0000000000000049 Base bitcoinconsensus_verify_script_with_amount
0000000000038050 w DF .text 0000000000000039 Base _ZNKSt19bad_optional_access4whatEv
0000000000039460 w DF .text 00000000000000b6 Base _ZNSt6vectorIhSaIhEEC2ERKS1_
0000000000027ef0 w DF .text 0000000000000062 Base _ZNSt12_Vector_baseIhSaIhEED2Ev
0000000000023020 w DF .text 0000000000000073 Base _ZNSt7__cxx1115basic_stringbufIcSt11char_traitsIcESaIcEED1Ev
0000000000039460 w DF .text 00000000000000b6 Base _ZNSt6vectorIhSaIhEEC1ERKS1_
0000000000039520 w DF .text 00000000000002cd Base _ZNSt6vectorIS_IhSaIhEESaIS1_EE17_M_realloc_insertIJRKS1_EEEvN9__gnu_cxx17__normal_iteratorIPS1_S3_EEDpOT_
0000000000038690 w DF .text 0000000000000062 Base _ZNSt6vectorIhSaIhEED1Ev
0000000000056f10 w DO .rodata 0000000000000018 Base _ZTSSt19bad_optional_access
00000000000221e0 w DF .text 00000000000000d3 Base _ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEC1IS3_EEPKcRKS3_
0000000000044360 w DF .text 00000000000000dc Base _ZNSt8__detail18__from_chars_digitImEEbRPKcS2_RT_i
00000000000444e0 w DF .text 0000000000000178 Base _ZNSt6vectorISt4byteSaIS0_EE17_M_realloc_insertIJS0_EEEvN9__gnu_cxx17__normal_iteratorIPS0_S2_EEDpOT_
00000000000440f0 w DF .text 00000000000000e2 Base _ZNSt6vectorIhSaIhEE7reserveEm
0000000000038540 w DF .text 0000000000000042 Base _ZNSt19bad_optional_accessD1Ev
0000000000039e80 w DF .text 0000000000000178 Base _ZNSt6vectorIhSaIhEE17_M_realloc_insertIJhEEEvN9__gnu_cxx17__normal_iteratorIPhS1_EEDpOT_
00000000000221e0 w DF .text 00000000000000d3 Base _ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEC2IS3_EEPKcRKS3_
000000000002d670 w DF .text 0000000000000373 Base _ZNSt20__uninitialized_copyILb0EE13__uninit_copyIPK5CTxInPS2_EET0_T_S7_S6_
000000000002d4a0 w DF .text 00000000000001cd Base _ZNSt6vectorIS_IhSaIhEESaIS1_EE17_M_realloc_insertIJEEEvN9__gnu_cxx17__normal_iteratorIPS1_S3_EEDpOT_
0000000000053964 g DF .fini 0000000000000000 Base _fini
0000000000004000 g DF .init 0000000000000000 Base _init
00000000000393c0 w DF .text 000000000000009e Base _ZNSt6vectorIS_IhSaIhEESaIS1_EED2Ev
0000000000039b30 w DF .text 0000000000000160 Base _ZNSt6vectorIS_IhSaIhEESaIS1_EE13_M_insert_auxIS1_EEvN9__gnu_cxx17__normal_iteratorIPS1_S3_EEOT_
0000000000039a40 w DF .text 00000000000000e3 Base _ZNSt6vectorIS_IhSaIhEESaIS1_EE8_M_eraseEN9__gnu_cxx17__normal_iteratorIPS1_S3_EE
0000000000039c90 w DF .text 00000000000001e5 Base _ZNSt6vectorIS_IhSaIhEESaIS1_EE17_M_default_appendEm
00000000000230a0 w DF .text 0000000000000081 Base _ZNSt7__cxx1115basic_stringbufIcSt11char_traitsIcESaIcEED0Ev
0000000000027ef0 w DF .text 0000000000000062 Base _ZNSt12_Vector_baseIhSaIhEED1Ev
0000000000023020 w DF .text 0000000000000073 Base _ZNSt7__cxx1115basic_stringbufIcSt11char_traitsIcESaIcEED2Ev
0000000000038690 w DF .text 0000000000000062 Base _ZNSt6vectorIhSaIhEED2Ev
00000000000441e0 w DF .text 0000000000000178 Base _ZNSt6vectorIhSaIhEE17_M_realloc_insertIJRKhEEEvN9__gnu_cxx17__normal_iteratorIPhS1_EEDpOT_
0000000000043fc0 w DF .text 0000000000000129 Base _ZStplIcSt11char_traitsIcESaIcEENSt7__cxx1112basic_stringIT_T0_T1_EERKS8_PKS5_
The underlying problem is described in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=36022, and it was not considered as a bug.
I think that moving code that uses C++ template functions/classes out from the bitcoinconsensus.{h,cpp}
compilation unit, which were suggested in #24994, is a step in the right direction.
From the linked issue https://gcc.gnu.org/bugzilla/show_bug.cgi?id=36022#c4:
For C++, types that are to be used across shared libraries have to be visible
Seems to suggest that it is a good thing for these symbols to be exported so the library and its callers use the same symbols, particularly type_info
symbols, and I guess generally so symbols in shared libraries have the same visibility as symbols in static libraries. So linking dynamically doesn't turn shared symbols into hidden symbols.
I don't know what the best approach is, but I guess my question is what is the harm of exporting these symbols?
Seems to suggest that it is a good thing for these symbols to be exported so the library and its callers use the same symbols, particularly type_info symbols
We purposefully export only a C api to avoid exactly these type issues :) The user's/lib's c++ impls should be completely firewalled from each-other.
So I agree that this is annoying and maybe worth a hack to fix. But I'd like to exhaust all other options first.
@ryanofsky
my question is what is the harm of exporting these symbols?
I can add nothing to the best practises:
- https://gcc.gnu.org/wiki/Visibility
- https://gcc.gnu.org/onlinedocs/gcc/Code-Gen-Options.html#index-fvisibility:
symbol visibility should be viewed as part of the API interface contract
Adding one more bit to complicate and maybe explain why I described the boundary as a firewall... We build libbitcoinconsensus.so
with guix using -static-libstdc++
(side-note, we should perhaps move this into the build-system rather than guix).
So the idea here is that the entire c++ implementation should be compiled into the shared lib with no unresolved c++ stdlib symbols, and exporting only C symbols. That way the end-user can use any c++ impl (if any) they want without fear of collisions. And because there's no cross-boundary c++ calls, there's no benefit to having the shared typeinfo/exception symbols.
Arguably this is different than the libstdc++ bug that was closed as WONTFIX because (at least as far as I can tell) we're trying to do something completely safe.
If it turns out that tight firewall idea doesn't work because libstdc++ is stubborn about exporting symbols, we may need to either:
- revisit the idea of
-static-libstdc++
and - Use the
--exclude-libs ALL
@hebasto has suggested
Adding one more bit to complicate
Thanks, I'll pile on some more. I've alluded to this in other threads (#24994), but still haven't seen a summary of:
- What currently happens with
lld
+libstdc++
? - What currently happens with
lld
+libc++
? - Maybe even, what happens with
mold
and either standard library? - What happens under LTO, in any combination of the above?
There is much more to consider here than "What does ld
happen to do", and that means any "fix" should also work for/not break other toolchain combinations as well.
The discussion seemed to land on, nothing can be wrong with ld
/libstdc++
, because a 15 year old thread says "RESOLVED INVALID
". However, it's not clear to me if, in the present day, that thread is still relevant/the expected behaviour, and, it's also completely possible that other bugs/changes in (expected) behaviour have emerged in ld
, or libstdc++
, over the past 15 years, leaving the discussion there outdated.
I will also mention, that we currently skip exports checks for RISC-V binaries, because of std lib related symbol exports (#28095), so whatever the issue is here, it's also not specific to, or limited to libraries.
Our current release process uses a very particular toolchain. While LTO might be considered as its update in the near future, it is not clear how other combinations of tools are related to this problem? Unless we would prefer to alter the release toolchain instead of moving some code into a separated translation unit to fix the shared library symbol visibility.
Our current release process
This isn't specific to the release process. Anyone can, and should expect to be able to build "working" libs/bins outside of Guix.
it is not clear how other combinations of tools are related to this problem?
If you "fix" anything by doing something that isn't supported by those tools, i.e use a special link/compile flag, or rely on ld or libstdc++ specific behaviors, you're going to break things/be incompatible with anyone using any other tools. Also, any change in our source code/build system may have side-effects that could break things for users of those tools that otherwise already worked perfectly fine.
If you "fix' anything by doing something that isn't supported by those tools, i.e use a special link/compile flag, or rely on ld or libc++ specific behaviors, you're going to break things/be incompatible with anyone using any other tools.
Right. Requiring that a fix does not introduce any regressions is obvious. By the way, there were no comments regarding any introduced break in https://github.com/bitcoin/bitcoin/pull/24994.
I mean, what is the point to analyze the current behavior with all possible tool combinations:
What currently happens with
lld
+libstdc++
?What currently happens with
lld
+libc++
?Maybe even, what happens with
mold
and either standard library?
?
If any of those combinations works, I doubt we will modify the release build system to use this working toolchain.
Thanks, I'll pile on some more. I've alluded to this in other threads (#24994), but still haven't seen a summary of:
* What currently happens with `lld` + `libstdc++`? * What currently happens with `lld` + `libc++`? * Maybe even, what happens with [`mold`](https://github.com/rui314/mold) and either standard library? * What happens under LTO, in any combination of the above?
I've done some tests on Ubuntu 22.04 using a simple model code, compiling and testing as follows:
g++ -fPIC --shared -fvisibility=hidden -o test_lib.so -I . test_lib.cpp
nm -C -D test_lib.so | grep " W "
g++
with -fuse-ld={bfd,gold,lld,mold}
does export unwanted symbols.
clang++
with libc++ or libstdc++ does not export unwanted symbols.
Can we close this following #29189 ?
Yep.