patchelf
patchelf copied to clipboard
patchelf doesn't update __ehdr_start symbol
Hi, I noticed that patchelf doesn't update __ehdr_start symbol. It breaks binaries built with clang (version >=13 ) with -fprofile-instr-generate flag. Here is the reproducer:
$ cat foo.c
int main(){}
$ clang --version
clang version 13.0.1 (https://github.com/yugabyte/llvm-project.git 191e3a05a55c8671bcc88d7387c04c55a4310500)
Target: x86_64-unknown-linux-gnu
Thread model: posix
$ clang -fprofile-instr-generate foo.c -o foo
$ readelf -Wa foo |grep "PHDR\|ehdr"
PHDR 0x000040 0x0000000000400040 0x0000000000400040 0x0001f8 0x0001f8 R 0x8
142: 0000000000400000 0 NOTYPE LOCAL DEFAULT 1 __ehdr_start
$ patchelf --version
patchelf 0.15.0
$ patchelf --set-interpreter /somepath/lib/ld.so foo
$ readelf -Wa foo |grep "PHDR\|ehdr"
PHDR 0x000040 0x00000000003ff040 0x00000000003ff040 0x000230 0x000230 R 0x8
142: 0000000000400000 0 NOTYPE LOCAL DEFAULT 1 __ehdr_start
As you can see __ehdr_start still points to the old (PHDR - offset) value. clang __llvm_write_binary_ids uses __ehdr_start to calculate program header and might segfault if it's the wrong one.
This patch fixes the problem for me. Please advise if this is the right one.
diff --git a/src/patchelf.cc b/src/patchelf.cc
index 49accae..ef7bae4 100644
--- a/src/patchelf.cc
+++ b/src/patchelf.cc
@@ -1059,6 +1059,8 @@ void ElfFile<ElfFileParamNames>::rewriteHeaders(Elf_Addr phdrAddress)
}
+ auto shdrSymStr = tryFindSectionHeader(".strtab");
+ char* strTab = shdrSymStr ? (char *)fileContents->data() + rdi((*shdrSymStr).get().sh_offset) : nullptr;
/* Rewrite the .dynsym section. It contains the indices of the
sections in which symbols appear, so these need to be
remapped. */
@@ -1069,6 +1071,13 @@ void ElfFile<ElfFileParamNames>::rewriteHeaders(Elf_Addr phdrAddress)
for (size_t entry = 0; (entry + 1) * sizeof(Elf_Sym) <= rdi(shdr.sh_size); entry++) {
auto sym = (Elf_Sym *)(fileContents->data() + rdi(shdr.sh_offset) + entry * sizeof(Elf_Sym));
unsigned int shndx = rdi(sym->st_shndx);
+ // Update the value of __ehdr_start to (PHDR - offset)
+ if (rdi(shdr.sh_type) == SHT_SYMTAB && shdrSymStr &&
+ !strcmp(strTab + rdi(sym->st_name), "__ehdr_start")) {
+ debug("updating __ehdr_start from 0x%llx to 0x%llx\n",
+ rdi(sym->st_value), phdrAddress - (hdr()->e_phoff));
+ wri(sym->st_value, phdrAddress - (hdr()->e_phoff));
+ }
if (shndx != SHN_UNDEF && shndx < SHN_LORESERVE) {
if (shndx >= sectionsByOldIndex.size()) {
fprintf(stderr, "warning: entry %d in symbol table refers to a non-existent section, skipping\n", shndx);
Actually my patch doesn't work in general...
__ehdr_start is loaded from using pc-relative load instructions, which are not updated by patchelf
403e2e: 0f b7 0d 03 c2 ff ff movzwl -0x3dfd(%rip),%ecx # 400038 <__ehdr_start+0x38>
So the patched application loads garbage. In my case I got 'lucky' and the wrong read led to the 0 value and we just skipped the loop (so there were no crash). But sometimes it's not 0.
Ideally patchelf should not change load addresses at all so that __ehdr_start does not need to be changed.
I also had this issue, __llvm_write_binary_ids getting SEGV after using patchelf to replace SONAME/NEEDED. I ended up manually replacing the strings (ensuring that they have the same length).