patchelf.rb icon indicating copy to clipboard operation
patchelf.rb copied to clipboard

[Discussion] on ELFs failing to patch.

Open rmNULL opened this issue 5 years ago • 11 comments

Few ELFs fail with NotImplementedError when patched by patchelf.rb. From what i understand this happens when the expansion can't fit in an existing LOAD. As this is a documented and spec'd crash, i'll just move on to examples and discussion.


Crashing ELFs :face_with_head_bandage:

All fail with new_load_method: NotImplementedError,

Note: will update the list if more cases are found.


Steps :dancers:

  1. download, extract bottle
  2. patchelf.rb --force-rpath --set-runpath "home/linuxbrew/.linuxbrew/Cellar/node/14.4.0/lib:/home/linuxbrew/.linuxbrew/lib:/home/linuxbrew/.linuxbrew/opt/icu4c/lib" --set-interpreter "/home/linuxbrew/.linuxbrew/lib/ld.so" go/1.14.4/libexec/bin/godoc

Few questions :roll_eyes:

(to @david942j )

  1. I see that mgap moves the offsets by extension size of all headers beyond the LOAD segment being patched. Will it be correct to assume, we can create a new writable LOAD segment next to an existing one, and relocate by shift_attributes.

  2. Related to previous question, can i have an example on how this may go bad( sorry about being lazy and not figuring out myself) https://github.com/david942j/patchelf.rb/blob/9b486693aa1ed7e41b5d2a7901cf35eaea0fed8d/lib/patchelf/mm.rb#L153

  3. trivia: want a confirmation that f in f-gap and m in m-gap stands for file and memory.


Other

Implementing this seems a good start before moving on to fix segmentation errors. The issue was made on request of @sjackman who wished the failing ELFs to be publicly listed.

rmNULL avatar Jul 15 '20 19:07 rmNULL

cc @sjackman @woodruffw

rmNULL avatar Jul 15 '20 19:07 rmNULL

@david942j Can you recommend a document that I ought to read to get up to speed on ELF file layout? In which section and segment is the RPATH stored?

❯❯❯ readelf -l /home/linuxbrew/.linuxbrew/bin/hello | grep dynamic
   07     .init_array .fini_array .jcr .dynamic .got .got.plt .data .bss 
   08     .init_array .fini_array .jcr .dynamic .got 
   09     .dynamic 

The .dynamic section appears to occur in three ELF segments, LOAD, DYNAMIC, and GNU_RELRO. Why is that?

sjackman avatar Jul 15 '20 21:07 sjackman

Sorry for late response, I haven't checked here for days.

I would need some time to recall what I was thinking about the new_load_method, so let me answer other easier questions first:

  1. trivia: want a confirmation that f in f-gap and m in m-gap stands for file and memory.

Yes they are.

Re @sjackman :

May not be a good suggestion but this is what I read for developing elftools - the ELF spec :p

An important concept is the "section" information is meaningless for an "executable" file. Sections are used in the "linking" stage, where the linker processes .o files into a .out (executable) file.

So the "correct" way to find what RPATH is:

  1. Check program headers to find the DYNAMIC segment
➔ readelf -l rpath.elf | grep DYNAMIC
  DYNAMIC        0x0000000000000d78 0x0000000000200d78 0x0000000000200d78
  1. At file offset 0xd78 is the dynamic table contains the list of tags, RPATH and RUNPATH information are stored here.
➔ readelf -d rpath.elf | grep RPATH
 0x000000000000000f (RPATH)              Library rpath: [/not_exists:/lib:/pusheen/is/fat]

RPATH and RUNPATH are not mandatory fields, so you may find binaries don't have these tags.

david942j avatar Jul 18 '20 03:07 david942j

  1. Related to previous question, can i have an example on how this may go bad( sorry about being lazy and not figuring out myself)

I think it just because this line assumes the segments list is not changed https://github.com/david942j/patchelf.rb/blob/9b486693aa1ed7e41b5d2a7901cf35eaea0fed8d/lib/patchelf/mm.rb#L159

david942j avatar Jul 18 '20 03:07 david942j

  1. I see that mgap moves the offsets by extension size of all headers beyond the LOAD segment being patched. Will it be correct to assume, we can create a new writable LOAD segment next to an existing one, and relocate by shift_attributes.

I think this is similar to what I want to do for new_load_method.

So to implement this method, we need to:

  1. Able to add a new segment entry to the segment header
    • this is non-trivia because this also implies the segment headers have to be moved to the new LOAD. e.g. receive DT (dynamic tags) changing request -> fgap, mgap methods failed -> new_load_method -> oh my, program headers need more space as well -> need more space for the new LOAD segment
  2. Decide where to put the LOAD segment for file and memory
    • for file it's easier, similar to the mgap method
    • for memory currently I feel the only choice is adding a LOAD before the first LOAD, which also implies we have to move the ELF header forwardly. IIRC Unix patchelf performs something similar to this.

If you want I can implement that TODO (but I would be pleased if you can help), just let me know if you have a deadline for the Google project :p

david942j avatar Jul 18 '20 04:07 david942j

If you want I can implement that TODO (but I would be pleased if you can help), just let me know if you have a deadline for the Google project :p

Sorry, it's gonna take a while to comprehend the strategy. I'll have a running(WIP) PR, if you can provide feedback and answer general queries it'll be helpful.

As for deadline, its somewhere in next 1 month and 10 days, we need this to be done and released somewhere in next 20-25 days so rest of the days can be spent integrating it into brew(which are small changes afaik) but testing usually consumes the bulk of time.

rmNULL avatar Jul 19 '20 15:07 rmNULL

@david942j please clear some doubts.

The following lines request to expand along with the existing string in strtab. https://github.com/david942j/patchelf.rb/blob/a5ddeb5f134bfa3b2fb974b0ef81ff00f8f758da/lib/patchelf/saver.rb#L187-L189

  • i can guess why this has to be done,we may have other content following it that may prevent us from inline patching in file(please confirm this)

the major doubt is when find_gap checks for free space it doesn't discount the sizes already considered.

All the numbers are in byte unit(won't repeat for brevity) e.g: say the strtab is of size 40 string to be patched is of size 12(including NUL) needed = 40 + 12 = 62

assume fgap = 16 , mgap = 32

In such scenario the patch will fail as we don't have a gap of 62. This shouldn't be happening as both fgap and mgap have sufficient space to fit the 12 byte string. The naive outlook suggests we fix this by moving the segments and creating a continuous free space, and readjusting address in headers. However this seems a non trivial task as we may have to handle edge cases i didn't even think of(top of my head, i see sections with relocatable address and progbits causing issue).

rmNULL avatar Jul 28 '20 07:07 rmNULL

Not sure if I understand your question..

i can guess why this has to be done,we may have other content following it that may prevent us from inline patching in file(please confirm this)

In such scenario the patch will fail as we don't have a gap of 62. This shouldn't be happening as both fgap and mgap have sufficient space to fit the 12 byte string.

IIUC you want to know why we need to consider the size of strtab_string (40 bytes in your example) when performing malloc_strtab!?

Say the original strtab_string is "libc.so.6\x00xyz\x00", and we'd like to add a string, maybe "libcrypto.so.1\x00". Then we need a string with content "libc.so.6\x00xyz\x00libcrpyto.so.1\x00". The original size of strtab_string should be counted.

Q: Why not append "libcrypto.so.1\x00" to the original "libc.so.6\x00xyz\x00" directly? A: There mostly are other headers / data after strtab_string, we can't simply overwrite those data.

Q: How about appending "libcrypto.so.1\x00" to the original "libc.so.6\x00xyz\x00" and moving all data / headers after strtab_string? (I just realized this should be what you are requesting) A: Can do. Though I'm not 100% sure whether adding a file offset to segments/sections is enough for this strategy.

david942j avatar Jul 28 '20 16:07 david942j

Q: Why not append "libcrypto.so.1\x00" to the original "libc.so.6\x00xyz\x00" directly? A: There mostly are other headers / data after strtab_string, we can't simply overwrite those data.

the confirmation i wanted.

Q: How about appending "libcrypto.so.1\x00" to the original "libc.so.6\x00xyz\x00" and moving all data / headers after strtab_string? (I just realized this should be what you are requesting) A: Can do. Though I'm not 100% sure whether adding a file offset to segments/sections is enough for this strategy.

ummm. So should we try this or follow the recipe you mentioned in the above reply.

Not sure if I understand your question..

will make a point to list summarized question after rambling.

rmNULL avatar Jul 28 '20 16:07 rmNULL

How about appending "libcrypto.so.1\x00" to the original "libc.so.6\x00xyz\x00" and moving all data / headers after strtab_string?

It's still possible we don't have enough space even adding this. The new_load_method is the most robust way to be used for all scenarios.

david942j avatar Jul 28 '20 17:07 david942j

A note to myself: patchelf.rb fails with NotImplementedError when patching the system glibc on Ubuntu 20.04 https://gitlab.com/david942j/libcdb/-/blob/master/libc/libc6_2.31-0ubuntu9_amd64/lib/x86_64-linux-gnu/libc-2.31.so

➔ bundle exec patchelf.rb --so libc.so.217 /lib/x86_64-linux-gnu/libc.so.6 libc.patch
Traceback (most recent call last):
	7: from /var/lib/gems/2.7.0/bin/patchelf.rb:23:in `<main>'
	6: from /var/lib/gems/2.7.0/bin/patchelf.rb:23:in `load'
	5: from /home/rev/patchelf.rb/bin/patchelf.rb:6:in `<top (required)>'
	4: from /home/rev/patchelf.rb/lib/patchelf/cli.rb:45:in `work'
	3: from /home/rev/patchelf.rb/lib/patchelf/patcher.rb:191:in `save'
	2: from /home/rev/patchelf.rb/lib/patchelf/saver.rb:43:in `save!'
	1: from /home/rev/patchelf.rb/lib/patchelf/mm.rb:52:in `dispatch!'
/home/rev/patchelf.rb/lib/patchelf/mm.rb:139:in `new_load_method': NotImplementedError (NotImplementedError)

david942j avatar Aug 27 '20 12:08 david942j