mold icon indicating copy to clipboard operation
mold copied to clipboard

Implement `--icf=safe`

Open rui314 opened this issue 3 years ago • 7 comments

Identical Code Folding (ICF) is a powerful optimization to reduce the size of a linker output. It merges functions that happen to be compiled to the identical machine code that behave exactly the same.

The downside of doing this is the optimization per se violates the specification of the C/C++ language specs. In these languages, taking pointers of two distinctive functions must result in two non-equivalent pointer values. However, if we optimize two distinctive functions into a single function, that resulting two pointers will have the same value.

Currently, mold supports only --icf=all. That instructs the linker to do the optimization anyway.

Other linkers, notably LLVM lld, supports --icf=safe. It uses .llvm_addrsig sections to identify functions that are safe to merge. Functions that are not mentioned in the section are not address-taken (no one takes its pointer), so they are safe to merge.

We should support --icf=safe in the same manner as LLVM lld.

rui314 avatar Apr 30 '22 09:04 rui314

.llvm_addrsig is currently an LLVM-specific section. If we implement the support of the section to mold, we should propose the same feature to gcc and possibly send a patch to gcc to implement it.

rui314 avatar May 01 '22 07:05 rui314

I'll try to implement this next week.

ishitatsuyuki avatar May 08 '22 06:05 ishitatsuyuki

I filed a feature request to GCC to ask if GCC can support .llvm_addrsig. Let's see how it goes.

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105625

rui314 avatar May 17 '22 03:05 rui314

@ishitatsuyuki Are you interested in adding a .addrsig directive support to the GNU assembler?

rui314 avatar May 17 '22 03:05 rui314

Are you interested in adding a .addrsig directive support to the GNU assembler?

Sure. Not really familiar with the GNU codebase, but I could certainly try that.

ishitatsuyuki avatar May 17 '22 03:05 ishitatsuyuki

Nice! Then please go ahead and send a patch to GNU binutils. I'm totally unfamiliar with binutils too. Let me reopen this issue to track this feature request on our side.

rui314 avatar May 17 '22 03:05 rui314

I've submitted the first revision of the patch to binutils, and I'm currently addressing the review comments.

One of the topics raised was that .llvm_addrsig's design doesn't work well with objcopy and ld -r:

In my opinion, .llvm_addrsig is a poor design. Peter Collingbourne received multiple objections to the idea when it was proposed on the generic ABI list, but he went ahead anyway. (Well, maybe the code was accepted into llvm during the discussion.) He was even told how to implement the functionality correctly, by a toolchain expert. Quoting Cary: "A much simpler (and zero-cost, from an object file size point of view) solution would be to add FPTR relocations to the psABI, and use those for any references that would impose the address significance constraint." https://groups.google.com/d/msg/generic-abi/MPr8TVtnVn4/g5yBRRB5AAAJ H.J. Lu also agreed that new relocations could be added.

https://sourceware.org/bugzilla/show_bug.cgi?id=23817

It doesn't look like we're receiving strong objection against adding the LLVM extensions in binutils yet, but we need to be aware that adopting .llvm_addrsig might not be the best for various ELF toolchain compatibility.

ishitatsuyuki avatar May 25 '22 08:05 ishitatsuyuki