ndk
ndk copied to clipboard
[BUG] Hoist relocation packing flags into the Clang driver
These aren't useful until minSdkVersion >= 28, but we should hoist the -Wl,--pack-dyn-relocs=android+relr and -Wl,--use-android-relr-tags into the driver. So they are used automatically when available.
why 28? i thought it was 23? that's certainly what's in the documentation --- https://android.googlesource.com/platform/bionic/+/master/android-changes-for-ndk-developers.md#relative-relocations-relr
is the confusion that pre-28 you also need -Wl,--use-android-relr-tags? (see internal bug http://b/147452927 "remove the need for -Wl,--use-android-relr-tags".)
ah, yeah, looks like this bug was filed a year before we learned that pcc@ had implemented the older form of relocation packing in lld. (follow the comments on the internal bug for the gory details.)
I may have filed this before we realized that LLD supported the API 23 stuff natively.
Yes, that :)
Drive-by comments (I come across here + I don't know where I should report the problems other than here:) )
--pack-dyn-relocs=android+relr may not be a good combination.
First, non -z options are binary format generic while --pack-dyn-relocs is ELF specific. (GNU ld maintainers will certainly object to it. See the --force-bti -> -z force-bti precedent)
Second, android+relr combines two not-too-related things. (In addition, relr+android does not work. Why?)
-z relr may be good for he --pack-dyn-relocs=relr part.
See https://reviews.llvm.org/D79664 and https://sourceware.org/pipermail/binutils/2020-May/111086.html (I want GNU ld to support at least the REL and RELR features)
Isn't that another argument for hoisting this into the driver? To avoid this being hard-coded in even more build systems than it already is?
On Fri, May 15, 2020, 11:02 Fangrui Song [email protected] wrote:
Drive-by comments (I come across here + I don't know where I should report the problems other than here:) )
--pack-dyn-relocs=android+relr may not be a good combination.
First, non -z options are binary format generic while --pack-dyn-relocs is ELF specific. Second, android+relr combines two not too related things. (In addition, relr+android does not work. Why?)
-z relr may be good for he --pack-dyn-relocs=relr part.
See https://reviews.llvm.org/D79664 and https://sourceware.org/pipermail/binutils/2020-May/111086.html (I want GNU ld to support at least the REL and RELR features)
— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/android/ndk/issues/909#issuecomment-629400830, or unsubscribe https://github.com/notifications/unsubscribe-auth/AMVLEWDM2OBAY6NNDPKPLITRRV7U5ANCNFSM4GXSQTXQ .
I think the point is more what we should hoist into the driver. The driver can't currently make very reliable assumptions about what linker is being used.
Optimistically triaging to r23. Here's a patch we can send once we no longer support gold/bfd:
From 133de77b4fd8317979616e7a7d46f5a1ff953f9c Mon Sep 17 00:00:00 2001
From: Dan Albert <[email protected]>
Date: Thu, 25 Jun 2020 16:43:01 -0700
Subject: [PATCH] Pack relocations for Android when possible.
First supported by API 28.
---
clang/lib/Driver/ToolChains/Linux.cpp | 11 +++++++++++
clang/test/Driver/linux-ld.c | 16 ++++++++++++++++
2 files changed, 27 insertions(+)
diff --git a/clang/lib/Driver/ToolChains/Linux.cpp b/clang/lib/Driver/ToolChains/Linux.cpp
index 180350476c3..841572e70b0 100644
--- a/clang/lib/Driver/ToolChains/Linux.cpp
+++ b/clang/lib/Driver/ToolChains/Linux.cpp
@@ -232,6 +232,17 @@ Linux::Linux(const Driver &D, const llvm::Triple &Triple, const ArgList &Args)
ExtraOpts.push_back("relro");
}
+ if (Triple.isAndroid() && !Triple.isAndroidVersionLT(28)) {
+ // Android supports relr packing starting with API 28 and had its own flavor
+ // (--pack-dyn-relocs=android) starting in API 23. It's possible to use both
+ // with --pack-dyn-relocs=android+relr, but we need to gather some data on
+ // the impact of that form before we can know if it's a good default.
+ //
+ // On the other hand, relr should always be an improvement.
+ ExtraOpts.push_back("--use-android-relr-tags");
+ ExtraOpts.push_back("--pack-dyn-relocs=relr");
+ }
+
// Android ARM/AArch64 use max-page-size=4096 to reduce VMA usage. Note, lld
// from 11 onwards default max-page-size to 65536 for both ARM and AArch64.
if ((Triple.isARM() || Triple.isAArch64()) && Triple.isAndroid()) {
diff --git a/clang/test/Driver/linux-ld.c b/clang/test/Driver/linux-ld.c
index ec539522c25..cf8752b4a27 100644
--- a/clang/test/Driver/linux-ld.c
+++ b/clang/test/Driver/linux-ld.c
@@ -1092,6 +1092,22 @@
// CHECK-ANDROID-HASH-STYLE-M: "{{.*}}ld{{(.exe)?}}"
// CHECK-ANDROID-HASH-STYLE-M: "--hash-style=gnu"
+// Check that we pass --pack-dyn-relocs=relr for API 28+ and not before.
+// RUN: %clang %s -### -o %t.o 2>&1 \
+// RUN: --target=armv7-linux-android27 \
+// RUN: | FileCheck --check-prefix=CHECK-ANDROID-RELR-27 %s
+// CHECK-ANDROID-RELR-27: "{{.*}}ld{{(.exe)?}}"
+// CHECK-ANDROID-RELR-27-NOT: "--pack-dyn-relocs=relr"
+// CHECK-ANDROID-RELR-27-NOT: "--pack-dyn-relocs=android+relr"
+//
+// RUN: %clang %s -### -o %t.o 2>&1 \
+// RUN: --target=armv7-linux-android28 \
+// RUN: | FileCheck --check-prefix=CHECK-ANDROID-RELR-28 %s
+// CHECK-ANDROID-RELR-28: "{{.*}}ld{{(.exe)?}}"
+// CHECK-ANDROID-RELR-28: "--use-android-relr-tags"
+// CHECK-ANDROID-RELR-28: "--pack-dyn-relocs=relr"
+// CHECK-ANDROID-RELR-28-NOT: "--pack-dyn-relocs=android+relr"
+
// RUN: %clang %s -### -o %t.o 2>&1 \
// RUN: --target=armv7-linux-android21 \
// RUN: | FileCheck --check-prefix=CHECK-ANDROID-NOEXECSTACK %s
--
2.27.0.111.gc72c7da667-goog
https://reviews.llvm.org/D95492
This has slipped a few times because upstream won't allow us to assume LLD for Android, which makes it impossible to add any default flags that are only understood by LLD. There's a plan to fix that (redo how Clang thinks about default linkers by making it a target specific property), but no one has had time to chase that. Doing that is actually a patch that I'd previously sent to LLVM but was rejected as not needed, so we need to convince upstream that it in fact is needed.
Beyond that, someone (@rprichard?) once pointed out to me that relocation packing isn't necessarily better for performance. It's likely a better default for most apps that are not cautiously trimming their public API surface, but in the ideal case where an app has only a single shared library (with all dependencies statically linked into that library) and the only public function in that library is JNI_OnLoad, there may be few enough relocations for the packing to be a net negative? Something someone should test before making this the default. I have some notes in my backlog from someone with some advice on doing that:
bionic/benchmarks/linker_relocation run_bench_with_ninja.sh script
That script will create a synthetic benchmark from libandroid_servers.so, but it can be adjusted to work with some other shared object.
but in the ideal case where an app has only a single shared library ... there may be few enough relocations for the packing to be a net negative?
normally i'd trust rprichard's gut over mine in all things linker, but ... if that were true, why was chrome so keen on relocation packing?
normally i'd trust rprichard's gut over mine in all things linker, but ... if that were true, why was chrome so keen on relocation packing?
Chrome has an abnormal number of relocations, aiui.
Chrome has an abnormal number of relocations, aiui.
exactly. and they might be bloated, but they're not idiots, so my assumption is that your "isn't necessarily better" argument is probably flawed :-)
Big things like Chrome are the type of app that I expect to benefit from this. It's the smaller ones where it would maybe be a net negative. Or maybe what you're getting at is that if it's small enough for packed relocations to be harmful, the load time is probably already fast enough for that harm to be unnoticeable. I'd buy that.
once pointed out to me that relocation packing isn't necessarily better for performance
Packed relocations don't add a significant fixed cost (run-time or size), so I'm not worried about small libraries.
My worry was that the for_all_packed_relocs() linker function would make each relocation slower to process, so loading a library with many relocations might be slower. OTOH, I'd guess that the CPU decoding time is mostly irrelevant compared to the time required to read the relocation data. (Especially if it has to be read from flash...)