Add -z nostart-stop-gc

Open MaskRay opened this issue 3 years ago • 5 comments

See https://maskray.me/blog/2021-01-31-metadata-sections-comdat-and-shf-link-order

In GNU ld and gold, a __start_/__stop_ reference from a live input section retains all input sections whose name is the corresponding C identifier. This is the -z nostart-stop-gc behavior.

Newer ld.lld uses the -z start-stop-gc behavior by default.

Some older packages that don't use SHF_GNU_RETAIN properly, either for their own reasons or because of assembler support issues, can use -z nostart-stop-gc.

MaskRay avatar Aug 03 '22 05:08 MaskRay

I don't think we need it because mold can garbage-collect C-identifier sections. I implemented the feature in b23c47ae6dbe285e02627803eac0604cfaaa1e78. What that patch does is mark section foo as alive if and only if there is a reference to __start_foo or __stop_foo. So, if a section is referenced, it'll be kept, and if not, it'll be garbage-collected.
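The rule described above can be sketched as a toy model. This is not mold's actual code: the dictionary layout and the helper names `is_c_identifier` and `start_stop_roots` are assumptions made for illustration. It shows only the "kept if and only if a __start_/__stop_ reference exists" test, with no regard to whether the referencing section is itself live.

```python
# Toy model, not mold's implementation. Each input section is a dict with a
# name and the set of symbol names it references (hypothetical layout).

def is_c_identifier(name):
    # Section names such as "foo_section" qualify; ".text" does not.
    return name.isascii() and name.isidentifier()

def start_stop_roots(sections):
    """Names of C-identifier sections for which some __start_/__stop_ reference exists."""
    referenced = set()
    for sec in sections:
        for sym in sec["refs"]:
            for prefix in ("__start_", "__stop_"):
                if sym.startswith(prefix):
                    referenced.add(sym[len(prefix):])
    return {
        sec["name"]
        for sec in sections
        if is_c_identifier(sec["name"]) and sec["name"] in referenced
    }

sections = [
    {"name": ".text.main", "refs": {"__start_foo_section", "__stop_foo_section"}},
    {"name": "foo_section", "refs": set()},
    {"name": "bar_section", "refs": set()},
]

kept = start_stop_roots(sections)
print(sorted(kept))  # foo_section is kept, bar_section is not
```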

rui314 avatar Aug 03 '22 06:08 rui314

Please take a look at this test case: https://github.com/rui314/mold/blob/main/test/elf/gc-sections-start-stop-symbols.sh

mold can correctly keep foo_section and discard bar_section. lld-14 seems to discard foo_section so I got the following error.

+ cc -fuse-ld=lld -o out/test/elf/x86_64/gc-sections-start-stop-symbols/exe out/test/elf/x86_64/gc-sections-start-stop-symbols/a.o out/test/elf/x86_64/gc-sections-start-stop-symbols/b.o -Wl,-gc-sections
ld.lld: error: undefined symbol: __start_foo_section
>>> referenced by -
>>>               out/test/elf/x86_64/gc-sections-start-stop-symbols/b.o:(main)
>>> the encapsulation symbol needs to be retained under --gc-sections properly; consider -z nostart-stop-gc (see https://lld.llvm.org/ELF/start-stop-gc)

It looks like mold's behavior is desirable. Am I missing something?

rui314 avatar Aug 03 '22 07:08 rui314

New lld's behavior (-z start-stop-gc, which differs from the traditional GNU ld behavior) is desirable (i.e. you may close the llvm-project issue). Having __start_foo_section references doesn't mean that all foo_section sections need to be retained. Each should be inspected and retained or discarded by the usual rules. To keep such a section live, use SHF_GNU_RETAIN.

For example, foo_section may be referenced by a text section, and the intention is that discarding the text section also discards the associated foo_section. If foo_section is forced live by __start_foo_section, there will be a dangling foo_section.
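A small mark phase makes the dangling case concrete. This is a sketch under stated assumptions, not any linker's code: the section ids, the section name `meta`, and the `start_stop_retains` switch are hypothetical, with the switch standing in for -z nostart-stop-gc (True) versus -z start-stop-gc (False).

```python
# Toy model only: section ids, names, and edges below are hypothetical.
SECTIONS = {
    "main":        {"name": ".text.main",   "edges": ["used"],
                    "syms": ["__start_meta", "__stop_meta"]},
    "used":        {"name": ".text.used",   "edges": ["meta_used"],   "syms": []},
    "unused":      {"name": ".text.unused", "edges": ["meta_unused"], "syms": []},
    "meta_used":   {"name": "meta", "edges": [], "syms": []},
    "meta_unused": {"name": "meta", "edges": [], "syms": []},
}

def mark(sections, roots, start_stop_retains):
    """Return the set of live section ids reachable from roots."""
    live = set()
    work = list(roots)
    while work:
        sid = work.pop()
        if sid in live:
            continue
        live.add(sid)
        sec = sections[sid]
        work.extend(sec["edges"])  # ordinary section-to-section references
        if start_stop_retains:
            # A __start_X/__stop_X reference from a live section pulls in
            # every input section named X.
            for sym in sec["syms"]:
                for prefix in ("__start_", "__stop_"):
                    if sym.startswith(prefix):
                        wanted = sym[len(prefix):]
                        work.extend(i for i, s in sections.items()
                                    if s["name"] == wanted)
    return live

forced = mark(SECTIONS, ["main"], start_stop_retains=True)
usual = mark(SECTIONS, ["main"], start_stop_retains=False)
print(sorted(forced - usual))  # metadata kept only because of __start_meta
```

In this model the text section `unused` is discarded either way, so the extra metadata section retained in the first run describes code that no longer exists in the output.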

MaskRay avatar Aug 04 '22 07:08 MaskRay

> For example, foo_section may be referenced by a text section, and the intention is that discarding the text section also discards the associated foo_section. If foo_section is forced live by __start_foo_section, there will be a dangling foo_section.

We can prevent such a dangling section from occurring if we do a precise mark-sweep, no? I didn't implement it because I thought the loss would be negligible, but we can keep section foo if and only if a live section refers to the symbol __start_foo or __stop_foo.

rui314 avatar Aug 04 '22 07:08 rui314

> but we can keep section foo if and only if a live section refers to the symbol __start_foo or __stop_foo.

Checking "live" is the -z nostart-stop-gc behavior. Again, it is not ideal for certain metadata uses. __start_/__stop_ have no relation to input sections. They are only defined when the output section exists (i.e. at least one input section is retained).

The loss is not negligible, but the harm can be reduced to a minimum if you don't let __start_ retain SHF_GROUP C identifier name sections. However, this makes SHF_GROUP sections unnecessarily different from non-SHF_GROUP sections.
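The compromise mentioned above can be sketched as a filter on which sections a __start_/__stop_ reference is allowed to retain. This is a toy model: the `in_group` flag is a hypothetical stand-in for SHF_GROUP, and `start_stop_retained` is not a function of any real linker. A section skipped here could still be kept if something else reaches it by the usual rules.

```python
# Toy model: "in_group" stands in for the SHF_GROUP flag (hypothetical layout).

def start_stop_retained(sections, name, skip_groups):
    """Ids of sections named `name` that a __start_/__stop_ reference retains."""
    return [
        sec["id"]
        for sec in sections
        if sec["name"] == name and not (skip_groups and sec["in_group"])
    ]

sections = [
    {"id": "meta.plain",  "name": "meta", "in_group": False},
    {"id": "meta.comdat", "name": "meta", "in_group": True},
]

all_kept = start_stop_retained(sections, "meta", skip_groups=False)
non_group = start_stop_retained(sections, "meta", skip_groups=True)
print(all_kept, non_group)
```

The second call shows the asymmetry noted above: two sections with the same name are treated differently only because one of them is a group member.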

MaskRay avatar Aug 04 '22 08:08 MaskRay