patchelf icon indicating copy to clipboard operation
patchelf copied to clipboard

Cleanstrtab

Open rohoog opened this issue 2 years ago • 7 comments
trafficstars

I have created the change that adds cleaning of the dynstr. All unreferenced symbols are removed if --clean-strtab option is given and the section is shrunk. It could solve issues #453 and #449.

However I found that the binary actually grows by about 4KB. I think this is due to this note section normalizing. I'm not sure why this is done and why this should be needed. Why not leave all sections before the one(s) that are modified alone?

rohoog avatar Mar 02 '23 21:03 rohoog

I also forgot to say. You might want to enable the CI in your github fork so that you can make sure all tests are passing.

brenoguim avatar Mar 06 '23 19:03 brenoguim

We are going to need to decide between my implementation and yours. Mine has the disadvantage of undoing the sharing, but I intend to implement sharing detection on it, which would then create sharing for the modifications. Yours persist the sharing, but would not add sharing. I suspect adding sharing detection on my implementation would be more fiting the algorithm, while in yours it would need a more significant rewrite. What do you think about it?

brenoguim avatar Mar 06 '23 19:03 brenoguim

Sorry, I didn't look into your implementation. Mine was a C++ learning experience for me. Do with it as you please. I addressed your review comments, except the split of the initial big commit and the adding of the test case (if you did a test-case, then that would probably work for my code too). Please have a look. Initially I also had the idea of adding the symbol sharing, until I found that the gnu binutils already does that, so I didn't bother. Since the 'patching' is about removing DT_NEEDED, I don't think there will be new opportunities for sym sharing.

I do find the growing of the binary disturbing. Also the number of program headers increases by the 'rewriteSections' method. Should I make a new issue for that?

rohoog avatar Mar 06 '23 21:03 rohoog

Since the 'patching' is about removing DT_NEEDED, I don't think there will be new opportunities for sym sharing.

Patchelf has many operations that will grow the dynstr table. For example, changing RPATH, or renaming symbols. So redoing the sharing could improve in those cases.

I do find the growing of the binary disturbing. Also the number of program headers increases by the 'rewriteSections' method. Should I make a new issue for that?

There is definitely room for improvement indeed. Patchelf will run a generic algorithm to change the place of modified sections, and to do that it needs to add more loads. But if you know that you did not grow any section, you can just tune the algorithm to avoid calling rewriteSections.

brenoguim avatar Mar 07 '23 00:03 brenoguim

My changes are all about shrinking sections, not growing. Do you mean that I don't need to call rewriteSections after shrinking? What else should I call to 'propagate' the shrinking into the section headers and program headers and yield a smaller binary?

r-hoogenboom avatar Mar 07 '23 08:03 r-hoogenboom

I understand that. I just meant that it would be useful to redo the sharing because some options in patchelf add new strings.

If you just set changed=true without rewriting the sections, the binary will have exactly the same size. At least it won't grow. Patchelf will just update the changed bytes in the file.

If you want to actually shrink, I guess this needs to be implemented.

It's complicated because if you want to keep the section in the same place as it was before and make the file smaller, that would mean all other sections would be loaded in an earlier address space, which breaks the binary. If you want to move the new smaller section to another place in the virtual space, you need to add a new program header table entry. And maybe there isn't room for that, forcing you to displace the next section into another location. But to displace sections, you might have to normalize note sections... So you get to the issue existing today.

brenoguim avatar Mar 07 '23 09:03 brenoguim

HMM, I expected that exactly this was the responsibility of rewriteSections. If it can deal with grown sections, why wouldn't it be able to deal with shrunk sections.

Note sections are just arbitrary 'advisory' non-critical things (like which assembler assembled the binary, which linker linked it, etc), right? Do they really refer to absolute locations of data in the binary?

I think patchelf is growing to be a full-blown linker (and un-linker) and by that, one that supports many architectures and bit-nesses. I think patchelf should leave stuff alone which it can, try to leave as much as possible to the linker that originally built the binary. But that's just my opinion.

r-hoogenboom avatar Mar 07 '23 10:03 r-hoogenboom