mold
mold error when building gcc with lto 'Assertion `!is_v2' failed.'
Hello @rui314,
First, thanks for the great linker! I wanted to build gcc with LTO and bootstrap and ran into an error. I don't know whether it is related to the configuration, gcc, or mold, but here is the error:
make[3]: Leaving directory '/tmp/makepkg/gcc/src/gcc-build/gcc'
/tmp/makepkg/gcc/src/gcc-build/./prev-gcc/xg++ -B/tmp/makepkg/gcc/src/gcc-build/./prev-gcc/ -B/usr/x86_64-pc-linux-gnu/bin/ -nostdinc++ -B/tmp/makepkg/gcc/src/gcc-build/prev-x86_64-pc-linux-gnu/libstdc++-v3/src/.libs -B/tmp/makepkg/gcc/src/gcc-build/prev-x86_64-pc-linux-gnu/libstdc++-v3/libsupc++/.libs -I/tmp/makepkg/gcc/src/gcc-build/prev-x86_64-pc-linux-gnu/libstdc++-v3/include/x86_64-pc-linux-gnu -I/tmp/makepkg/gcc/src/gcc-build/prev-x86_64-pc-linux-gnu/libstdc++-v3/include -I/tmp/makepkg/gcc/src/gcc/libstdc++-v3/libsupc++ -L/tmp/makepkg/gcc/src/gcc-build/prev-x86_64-pc-linux-gnu/libstdc++-v3/src/.libs -L/tmp/makepkg/gcc/src/gcc-build/prev-x86_64-pc-linux-gnu/libstdc++-v3/libsupc++/.libs -no-pie -march=native -O3 -pipe -fstack-protector-strong --param=ssp-buffer-size=4 -fno-plt -fopenmp -pthread -Wno-error -w -fno-checking -flto=jobserver -frandom-seed=1 -DIN_GCC -fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wmissing-format-attribute -Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -DHAVE_CONFIG_H -static-libstdc++ -static-libgcc -Wl,-O1,--sort-common,--as-needed,-z,relro,-z,now -o cc1plus \
cp/cp-lang.o c-family/stub-objc.o cp/call.o cp/class.o cp/constexpr.o cp/constraint.o cp/coroutines.o cp/cp-gimplify.o cp/cp-objcp-common.o cp/cp-ubsan.o cp/cvt.o cp/cxx-pretty-print.o cp/decl.o cp/decl2.o cp/dump.o cp/error.o cp/except.o cp/expr.o cp/friend.o cp/init.o cp/lambda.o cp/lex.o cp/logic.o cp/mangle.o cp/mapper-client.o cp/mapper-resolver.o cp/method.o cp/module.o cp/name-lookup.o cp/optimize.o cp/parser.o cp/pt.o cp/ptree.o cp/rtti.o cp/search.o cp/semantics.o cp/tree.o cp/typeck.o cp/typeck2.o cp/vtable-class-hierarchy.o attribs.o incpath.o c-family/c-common.o c-family/c-cppbuiltin.o c-family/c-dump.o c-family/c-format.o c-family/c-gimplify.o c-family/c-indentation.o c-family/c-lex.o c-family/c-omp.o c-family/c-opts.o c-family/c-pch.o c-family/c-ppoutput.o c-family/c-pragma.o c-family/c-pretty-print.o c-family/c-semantics.o c-family/c-ada-spec.o c-family/c-ubsan.o c-family/known-headers.o c-family/c-attribs.o c-family/c-warn.o c-family/c-spellcheck.o i386-c.o glibc-c.o cc1plus-checksum.o libbackend.a main.o libcommon-target.a libcommon.a ../libcpp/libcpp.a ../libdecnumber/libdecnumber.a ../libcody/libcody.a \
libcommon.a ../libcpp/libcpp.a ../libbacktrace/.libs/libbacktrace.a ../libiberty/libiberty.a ../libdecnumber/libdecnumber.a -lisl -lmpc -lmpfr -lgmp -rdynamic -lz -lzstd
mold: elf/lto.cc:273: mold::PluginStatus mold::elf::get_symbols(const void*, int, mold::PluginSymbol*, bool) [with E = X86_64]: Assertion `!is_v2' failed.
collect2: fatal error: ld terminated with signal 6 [Aborted], core dumped
compilation terminated.
make[3]: *** [/tmp/makepkg/gcc/src/gcc/gcc/cp/Make-lang.in:136: cc1plus] Error 1
rm gfdl.pod gcc.pod gfortran.pod gcov-dump.pod gcov-tool.pod fsf-funding.pod gpl.pod cpp.pod gcov.pod lto-dump.pod gccgo.pod gdc.pod
make[2]: *** [Makefile:5005: all-stage2-gcc] Error 2
make[1]: *** [Makefile:30918: stage2-bubble] Error 2
make: *** [Makefile:31130: bootstrap] Error 2
==> ERROR: A failure occurred in build().
Aborting...
I used the latest mold commit 494b28cfb38c3291adeb7ea4ed1fc64f37846651
and gcc-12 (`gcc version 12.0.1 20220421 (experimental) GCC`), built on Arch Linux.
I built with the following configure flags:
--libdir=/usr/lib \
--libexecdir=/usr/lib \
--mandir=/usr/share/man \
--infodir=/usr/share/info \
--with-bugurl=https://bugs.archlinux.org/ \
--with-linker-hash-style=gnu \
--with-system-zlib \
--enable-__cxa_atexit \
--enable-cet=auto \
--enable-checking=release \
--enable-clocale=gnu \
--enable-default-pie \
--enable-default-ssp \
--enable-gnu-indirect-function \
--enable-gnu-unique-object \
--enable-linker-build-id
--enable-lto \
--enable-multilib \
--enable-plugin \
--enable-shared \
--enable-threads=posix \
--disable-libssp \
--enable-languages=c,c++,ada,fortran,go,lto,objc,obj-c++,d \
--enable-bootstrap \
--with-ld=/usr/bin/mold \
--with-build-config=bootstrap-lto \
--enable-link-serialization=1 \
--disable-libstdcxx-pch \
--disable-werror \
gdc_include_dir=/usr/include/dlang/gdc"
Maybe @marxin could help.
Thanks and Regards.
IIRC, @marxin tried to build gcc using mold with LTO, and I believe it worked at the time. Could it be a regression in gcc?
Well, note that the current master (the upcoming GCC 12.1) does not contain the implementation of get_symbols_v3.
So can you please explain to me what the crash means? Maybe get_symbols is used?
On second thought, it's likely an internal error in mold and not gcc's fault. I just want to confirm that you recently succeeded in building gcc with mold with LTO. If so, it's likely a mold regression.
Oh, to be honest, I haven't tested an LTO bootstrap of GCC (even with the queued patches prepared for GCC 13, like the get_symbols_v3 addition). I can reproduce the issue; it's really about an object from an archive that has file.is_alive == false.
OK, thank you for confirming! I'll take a look.
But I can confirm I can LTO bootstrap GCC with the GCC patch that introduces get_symbols_v3!
Oh, but I end up with some undefined symbols:
mold: error: undefined symbol: /tmp/cc1uZOWP.ltrans17.ltrans.o: add_path(char*, incpath_kind, int, bool)
mold: error: undefined symbol: /tmp/cc1uZOWP.ltrans17.ltrans.o: add_path(char*, incpath_kind, int, bool)
mold: error: undefined symbol: /tmp/cc1uZOWP.ltrans17.ltrans.o: add_path(char*, incpath_kind, int, bool)
mold: error: undefined symbol: /tmp/cc1uZOWP.ltrans17.ltrans.o: add_path(char*, incpath_kind, int, bool)
mold: error: undefined symbol: /tmp/cc1uZOWP.ltrans17.ltrans.o: split_quote_chain()
mold: error: undefined symbol: /tmp/cc1uZOWP.ltrans17.ltrans.o: register_include_chains(cpp_reader*, char const*, char const*, char const*, int, int, int)
mold: error: undefined symbol: /tmp/cc1uZOWP.ltrans17.ltrans.o: add_path(char*, incpath_kind, int, bool)
collect2: error: ld returned 1 exit status
make[3]: *** [/home/marxin/Programming/gcc/gcc/cp/Make-lang.in:136: cc1plus] Error 1
About register_include_chains: it's defined in libbackend.a:
$ nm libbackend.a
...
incpath.o:
...
00000000 T _Z23register_include_chainsP10cpp_readerPKcS2_S2_iii
but I can't see incpath.o in cc1plus.res, and the symbol is marked as undefined:
7771 1 UNDEF _Z8add_pathPc12incpath_kindib
That's why we end up with the error. Do you need any of the files or can you reproduce it locally?
@marxin What are your configure options?
./gcc/configure --enable-languages=c,c++,fortran,jit --prefix=/home/marxin/bin/gcc --disable-multilib --enable-host-shared --disable-libsanitizer --enable-valgrind-annotations --with-ld=`which ld.mold` --with-build-config=bootstrap-lto
Actually, I tried again and now run into the following error:
mold: error: duplicate symbol: libgcc.a(getf2.o): getf2_s.o: __getf2
mold: error: duplicate symbol: libgcc.a(letf2.o): letf2_s.o: __letf2
GCC Version used: gcc version 12.0.1 20220426 (experimental) (GCC) Mold Commit: https://github.com/rui314/mold/commit/17907964ac4dcea06290902cfbfe0d9b02ec6e59
It's a new issue, #475.
That problem should be fixed now, so please try again.
The original issue should also be fixed in the above commit.
Stage 1 of GCC is now open, and I've sent the get_symbols_v3 API addition patch:
https://gcc.gnu.org/pipermail/gcc-patches/2022-May/593900.html
and I was able to LTO bootstrap GCC with all languages enabled:
~/Programming/gcc/configure --enable-languages=all --prefix=/home/marxin/bin/gcc --enable-host-shared --with-ld=`which ld.mold` --with-build-config=bootstrap-lto
About the other changes mentioned in this thread: the GCC community is not happy about the new hook, ld_plugin_version:
https://gcc.gnu.org/pipermail/gcc-patches/2022-May/593901.html
So I'm leaving that one. Apparently, other linkers also have conditional behavior based on if GCC or Clang is used.
How can I distinguish a GCC plugin with v2-only support from one with v3 support?
Currently, mold restarts itself if a GCC LTO plugin is in use, in order to reset the internal state of the plugin. That happens before we call all_symbols_read_hook, which in turn happens before get_symbols_v2 or get_symbols_v3 is called.
If get_symbols_v3 is available, we don't need to restart the linker, but to skip the restart we need to know whether get_symbols_v3 is available before calling all_symbols_read_hook.
Or maybe we can restart the linker as soon as get_symbols_v2 is called for the first time. I'm not sure whether that is a safe point to call exec, though (doesn't GCC leave temporary files behind, for example?).
> How can I distinguish a GCC plugin with v2-only support from one with v3 support?
Oh, I forgot about this need. So if you want, I can suggest adding a new symbol supports_get_symbols_v3 that would tell you that. That's something local to the GCC plug-in and does not need a plugin API change. Would you be interested in that?
> Oh, I forgot about this need. So if you want I can suggest adding a new symbol supports_get_symbols_v3 that would tell you that. That's something local to GCC plug-in and does not need a plugin API change. Will you be interested in that?
Yes! That seems more robust than the workaround that I implemented in https://github.com/rui314/mold/commit/38f2b965dbd7ea40f4c155f82e5a0f27cad07e16.
All right, I've suggested that: https://gcc.gnu.org/pipermail/gcc-patches/2022-May/594012.html
I think you should instead try not advertising LDPT_GET_SYMBOLS or LDPT_GET_SYMBOLS_V2 in the onload transfer vector; if that gets you an LDPS_OK, you know they will not be called. If the onload fails, you can do the reverse and drop LDPT_GET_SYMBOLS_V3. You might need to unload/reload the plugin on failure, not sure. Looking at GCC's implementation, it doesn't clear variables at start and would do some redundant getenv work. But I think the plugin cannot assume that dlclose()/dlopen() will actually unmap/remap its image and thus that onload() starts fresh.
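A rough sketch of that probing idea, written from the linker's side. The types below are simplified stand-ins and not the real plugin-api.h definitions, `make_fake_plugin` is a hypothetical test double, and the sketch assumes that a plugin's onload actually fails when the hook it needs is missing from the transfer vector, which this thread does not confirm for existing plugins. It also ignores the question of plugin state left behind by a failed onload.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Simplified stand-ins for illustration only; the real tags, status codes
// and transfer vector layout are defined in plugin-api.h.
enum class Tag { AddSymbols, GetSymbolsV2, GetSymbolsV3 };
enum class Status { Ok, Err };
using TransferVector = std::vector<Tag>;
using Onload = std::function<Status(const TransferVector &)>;

enum class GetSymbolsApi { V3, V2, Unknown };

// Offer only the v3 hook first. If onload accepts that vector, the plugin
// cannot call v2. Otherwise retry while offering only the v2 hook.
GetSymbolsApi probe_get_symbols_api(const Onload &onload) {
  assert(static_cast<bool>(onload) && "onload must be callable");
  if (onload({Tag::AddSymbols, Tag::GetSymbolsV3}) == Status::Ok)
    return GetSymbolsApi::V3;
  // Assumption: calling onload a second time is harmless. A real plugin
  // may keep state from the failed first attempt.
  if (onload({Tag::AddSymbols, Tag::GetSymbolsV2}) == Status::Ok)
    return GetSymbolsApi::V2;
  return GetSymbolsApi::Unknown;
}

// Hypothetical test double: a plugin whose onload fails unless the
// `required` hook is present in the transfer vector.
Onload make_fake_plugin(Tag required) {
  return [required](const TransferVector &tv) {
    bool offered = std::find(tv.begin(), tv.end(), required) != tv.end();
    return offered ? Status::Ok : Status::Err;
  };
}
```

With the fake plugins, `probe_get_symbols_api(make_fake_plugin(Tag::GetSymbolsV2))` takes the second branch and reports the v2 API.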
As you wrote, since there's no guarantee about the plugin's state after an onload failure, we probably need to dlclose and dlopen it to reset the state. But dlclose/dlopen are not guaranteed to reset the state. musl libc, for example, doesn't unload a shared library on dlclose. So reloading a shared object file is not a reliable way to reset the internal state either.
Yes, ideally we'd extend the plugin API to make such retried onload() well-defined, for example by adding onload_v2 () that will reset state when called (with possibly leaving behavior undefined when any further operation has progressed already). But when adding onload_v2 () one could as well allow the plugin to communicate back the set of APIs used by adding an output parameter where it can specify the target vector entries that will be used.
That said, for current API and existing plugins it might be a workable heuristic to call onload() multiple times.
For changes to the API, a clean design is warranted; a global symbol that solely indicates whether _v3 is used isn't one.
Speaking of the plugin API itself, I found it very peculiar and hard to use. The plugin exports only the onload function, and other plugin functions are returned to the onload caller via a vector of function pointers. I don't see any benefit of exporting features from the linker plugin this way. IMO, the plugin could export one function for each functionality, e.g. add_input_file, all_symbols_read, register_get_symbols_callback_v2, etc. If it did, we could just use dlsym(handler, "register_get_symbols_callback_v3") to see if the plugin provides that function.
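For illustration, a minimal sketch of such a dlsym-based check using the POSIX dlopen/dlsym/dlerror calls. The helper names `exports_symbol` and `plugin_exports` are made up for this example, and no existing LTO plugin is known to export a symbol such as register_get_symbols_callback_v3; that name is the hypothetical one from the comment above.

```cpp
#include <cassert>
#include <dlfcn.h>

// Returns true if the shared object behind `handle` exports `name`.
bool exports_symbol(void *handle, const char *name) {
  assert(name != nullptr);
  dlerror();  // clear any stale error message
  void *addr = dlsym(handle, name);
  // dlerror() returns non-null only if the dlsym call above failed.
  return dlerror() == nullptr && addr != nullptr;
}

// Opens the plugin at `path` and checks for a marker or entry-point symbol.
// Note that dlclose() is not guaranteed to unload the object on every libc.
bool plugin_exports(const char *path, const char *name) {
  void *handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
  if (handle == nullptr)
    return false;
  bool found = exports_symbol(handle, name);
  dlclose(handle);
  return found;
}
```

A linker could call `plugin_exports(plugin_path, "register_get_symbols_callback_v3")` before deciding whether a restart is needed, assuming the plugin were redesigned to export such a symbol.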
So, if adding a symbol doesn't look clean, I'd suggest we redesign the whole plugin API. We should eliminate the global state from the plugin and export more symbols from the plugin.
That said, I doubt it would be worth the effort. Effectively, this plugin API is used only by GNU ld, GNU gold and mold. For these linkers, adding a marker symbol should suffice, and IMHO it's actually a cleaner solution than using a more complicated and unreliable mechanism to detect the presence of the v3 API.
I didn't design the API, but incremental changes should follow the design spirit. So instead of a new "flag" symbol you'd add a register_get_symbols_api_use API that the plugin then calls when available, specifying the API version of the get_symbols hook it will actually use. Or alternatively a broader register_get_api_usage which provides a get_api_usage () hook like
enum ld_plugin_status get_api_usage (ld_plugin_tag which);
which would return LDPS_OK for used and LDPS_ERR for not used (or not known) variants.
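To make the proposal concrete, here is a small sketch of how a linker might consume such a hook. It is only an illustration: `PluginTag` and `PluginStatus` are simplified stand-ins for ld_plugin_tag and ld_plugin_status, and `must_restart_for_lto` and `example_v3_plugin_usage` are hypothetical names, not part of mold or GCC.

```cpp
#include <cassert>

// Simplified stand-ins for ld_plugin_tag / ld_plugin_status.
enum class PluginTag { GetSymbols, GetSymbolsV2, GetSymbolsV3 };
enum class PluginStatus { Ok, Err };

// Shape of the proposed hook: Ok means "the plugin uses this API variant",
// Err means "not used (or not known)".
using GetApiUsage = PluginStatus (*)(PluginTag which);

// Plugin side (hypothetical): a plugin that only ever calls get_symbols_v3.
PluginStatus example_v3_plugin_usage(PluginTag which) {
  return which == PluginTag::GetSymbolsV3 ? PluginStatus::Ok
                                          : PluginStatus::Err;
}

// Linker side: the restart can be skipped only if the plugin registered the
// hook and reports that it uses the v3 variant. A null hook models an older
// plugin that never registered one, so we conservatively assume v2.
bool must_restart_for_lto(GetApiUsage get_api_usage) {
  if (get_api_usage == nullptr)
    return true;
  return get_api_usage(PluginTag::GetSymbolsV3) != PluginStatus::Ok;
}

// Usage example with the two cases discussed in this thread.
inline void usage_example() {
  assert(!must_restart_for_lto(example_v3_plugin_usage));
  assert(must_restart_for_lto(nullptr));
}
```

Because the hook would be registered during onload, the answer is available before all_symbols_read_hook is called, which is the ordering constraint described earlier in the thread.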
IMHO, it's an intricate way to obtain a single bit of information (whether or not a given plugin supports the v3 API), but defining a new callback will work for us. If it's implemented in the GCC LTO plugin, we are happy to use it.
Should the issue be reopened after reverting the commit? The issue will probably be present again, right? Or is only the v3 API affected?
I will check the current status, but reverting a patch shouldn't harm any existing GCC users. It's that mold now always assumes that gcc supports only the v2 API.
@rui314 Can you please experiment with the latest suggested plug-in extension patches: https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596856.html ?
Just a quick note: 2 of 3 patches are upstreamed, and I'm now waiting for your feedback about LDPT_GET_API_VERSION. Any estimate of when you can get to that?
@marxin Sorry I was working on mold/macOS. I'll try that this week.
I sent a reply to the gcc-patches mailing list.
https://gcc.gnu.org/pipermail/gcc-patches/2022-June/597518.html