patchelf icon indicating copy to clipboard operation
patchelf copied to clipboard

patchelf doesn't update __ehdr_start symbol

Open NatashaSerebryanaya opened this issue 3 years ago • 3 comments

Hi, I noticed that patchelf doesn't update __ehdr_start symbol. It breaks binaries built with clang (version >=13 ) with -fprofile-instr-generate flag. Here is the reproducer:

$ cat foo.c
int main(){}
$ clang --version
clang version 13.0.1 (https://github.com/yugabyte/llvm-project.git 191e3a05a55c8671bcc88d7387c04c55a4310500)
Target: x86_64-unknown-linux-gnu
Thread model: posix
$ clang -fprofile-instr-generate foo.c -o foo
$ readelf -Wa foo |grep "PHDR\|ehdr"
 PHDR           0x000040 0x0000000000400040 0x0000000000400040 0x0001f8 0x0001f8 R   0x8
   142: 0000000000400000     0 NOTYPE  LOCAL  DEFAULT    1 __ehdr_start
$ patchelf --version
patchelf 0.15.0   
$ patchelf --set-interpreter /somepath/lib/ld.so foo
$ readelf -Wa foo |grep "PHDR\|ehdr"
  PHDR           0x000040 0x00000000003ff040 0x00000000003ff040 0x000230 0x000230 R   0x8
   142: 0000000000400000     0 NOTYPE  LOCAL  DEFAULT    1 __ehdr_start

As you can see __ehdr_start still points to the old (PHDR - offset) value. clang __llvm_write_binary_ids uses __ehdr_start to calculate program header and might segfault if it's the wrong one.

This patch fixes the problem for me. Please advise if this is the right one.

diff --git a/src/patchelf.cc b/src/patchelf.cc
index 49accae..ef7bae4 100644
--- a/src/patchelf.cc
+++ b/src/patchelf.cc
@@ -1059,6 +1059,8 @@ void ElfFile<ElfFileParamNames>::rewriteHeaders(Elf_Addr phdrAddress)
     }
 
 
+    auto shdrSymStr = tryFindSectionHeader(".strtab");
+    char* strTab = shdrSymStr ? (char *)fileContents->data() + rdi((*shdrSymStr).get().sh_offset) : nullptr;
     /* Rewrite the .dynsym section.  It contains the indices of the
        sections in which symbols appear, so these need to be
        remapped. */
@@ -1069,6 +1071,13 @@ void ElfFile<ElfFileParamNames>::rewriteHeaders(Elf_Addr phdrAddress)
         for (size_t entry = 0; (entry + 1) * sizeof(Elf_Sym) <= rdi(shdr.sh_size); entry++) {
             auto sym = (Elf_Sym *)(fileContents->data() + rdi(shdr.sh_offset) + entry * sizeof(Elf_Sym));
             unsigned int shndx = rdi(sym->st_shndx);
+            // Update the value of __ehdr_start to (PHDR - offset)
+            if (rdi(shdr.sh_type) == SHT_SYMTAB && shdrSymStr &&
+                  !strcmp(strTab + rdi(sym->st_name), "__ehdr_start")) {
+                  debug("updating __ehdr_start from 0x%llx to 0x%llx\n",
+                           rdi(sym->st_value), phdrAddress - (hdr()->e_phoff));
+                  wri(sym->st_value, phdrAddress - (hdr()->e_phoff));
+            }
             if (shndx != SHN_UNDEF && shndx < SHN_LORESERVE) {
                 if (shndx >= sectionsByOldIndex.size()) {
                     fprintf(stderr, "warning: entry %d in symbol table refers to a non-existent section, skipping\n", shndx);

NatashaSerebryanaya avatar Aug 17 '22 02:08 NatashaSerebryanaya

Actually my patch doesn't work in general... __ehdr_start is loaded from using pc-relative load instructions, which are not updated by patchelf

 403e2e:       0f b7 0d 03 c2 ff ff    movzwl -0x3dfd(%rip),%ecx        # 400038 <__ehdr_start+0x38>

So the patched application loads garbage. In my case I got 'lucky' and the wrong read led to the 0 value and we just skipped the loop (so there were no crash). But sometimes it's not 0.

NatashaSerebryanaya avatar Aug 29 '22 21:08 NatashaSerebryanaya

Ideally patchelf should not change load addresses at all so that __ehdr_start does not need to be changed.

Mic92 avatar Oct 09 '22 12:10 Mic92

I also had this issue, __llvm_write_binary_ids getting SEGV after using patchelf to replace SONAME/NEEDED. I ended up manually replacing the strings (ensuring that they have the same length).

clee704 avatar Mar 17 '24 00:03 clee704