
proposal: debug/elf: add API to allow reading compressed ELF sections

Open brancz opened this issue 2 years ago • 2 comments

I would like to propose adding an API to the debug/elf package to fully read sections (compressed and uncompressed alike). Current APIs only allow reading uncompressed sections, or in select cases may handle compressed sections, but do not allow accessing the decompressed bytes of a compressed section.

Why?

I recently came across the dwz tool, which some Linux distributions (e.g. Fedora and Debian) use to optimize debuginfos (that is, debuginfos that have already been split off from a binary using objcopy --only-keep-debug $FILE $FILE.debug).

dwz deduplicates as much data as possible and then splits the original debuginfo into two files. The primary debuginfo file gains a .gnu_debugaltlink section that links to the second, supplementary file, which is also a valid ELF file and holds the deduplicated data. The primary debuginfo can then contain references that say where to read deduplicated data from in the supplementary file. For example, in the primary debuginfo's DWARF DIEs, a DW_AT_name might not be a string but an offset into the supplementary file's .debug_str or decompressed .zdebug_str section, directing us to read a null-terminated string at that offset.

Once the bytes of the section are available, writing the code to read null-terminated strings at an offset is simple and can be done separately. If it were just an uncompressed .debug_str, we could use the debug/elf package's functionality to read the section. The issue arises when the section is compressed, because the unexported field compressionOffset needs to be set appropriately to read its content successfully.
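As a rough illustration of the "read a null-terminated string at an offset" step, here is a minimal sketch. The helper name cstringAt is hypothetical, and it assumes the caller already holds the fully decompressed bytes of the string section.

```go
package main

import (
	"bytes"
	"fmt"
)

// cstringAt (hypothetical helper) returns the null-terminated string that
// starts at off in data. data is assumed to be the decompressed contents
// of a string section such as .debug_str.
func cstringAt(data []byte, off uint64) (string, error) {
	if off >= uint64(len(data)) {
		return "", fmt.Errorf("offset %d out of range (section size %d)", off, len(data))
	}
	// Find the terminating NUL relative to off.
	end := bytes.IndexByte(data[off:], 0)
	if end < 0 {
		return "", fmt.Errorf("no terminating NUL after offset %d", off)
	}
	return string(data[off : off+uint64(end)]), nil
}

func main() {
	// Synthetic section contents: two strings, each NUL-terminated.
	section := []byte("main\x00runtime.gc\x00")
	s, err := cstringAt(section, 5)
	fmt.Println(s, err)
}
```

The hard part is therefore not this lookup but getting the decompressed section bytes in the first place.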

For background: I work on the open-source continuous profiler Parca, which uses the debug/elf and debug/dwarf packages for symbolizing profiling data, and we'd like to support symbolizing these cases.

I could see a couple of different APIs that would let me achieve my goal, but the ability to read compressed sections would have a minimal API surface and would allow any very special cases that the Go runtime itself doesn't need to be handled separately.

Alternatively, I could imagine an API within the debug/dwarf package that exposes a reader to read strings at a given offset, but I feel that would be easy to misunderstand and misuse.

brancz avatar Feb 02 '23 19:02 brancz

Right now if you use the (*Section).Open method it will automatically decompress the section contents, if they are marked with the ELF standard SHF_COMPRESSED flag. Your mention of .zdebug_str makes me wonder whether dwz is generating files with the older mechanism in which the section name indicates that the contents are compressed. Right now we handle that case in the (*File).DWARF method. Perhaps we should also handle it in (*Section).Open: if the section name starts with ".z" and the section contents start with ZLIB then we automatically decompress the data.

Would that fix your problem?

ianlancetaylor avatar Feb 02 '23 22:02 ianlancetaylor

Just to reiterate: you're suggesting to essentially move the offset handling done in the (*File).DWARF function to the (*Section).Open function? That would absolutely solve my problem, and I'd be more than happy to submit a patch for it!

brancz avatar Feb 03 '23 09:02 brancz

OK, taking this out of the proposal process.

ianlancetaylor avatar Feb 03 '23 20:02 ianlancetaylor

This would be backwards incompatible and require a GODEBUG setting per #56986, right?

aarzilli avatar Feb 06 '23 07:02 aarzilli

It would be backwards incompatible, but not every single backward incompatibility requires a GODEBUG setting. We only need a GODEBUG setting if the change is likely to break a reasonable number of real programs. In this case that seems to me to be unlikely. There may well be programs that are prepared to uncompress the ELF section data themselves, but I would expect those programs to work correctly if they receive uncompressed data. Do you know of examples where that would not work?

ianlancetaylor avatar Feb 07 '23 05:02 ianlancetaylor

Do you know of examples where that would not work?

I do not. Delve, for example, will assume the section is not compressed if it doesn't start with ZLIB: https://github.com/go-delve/delve/blob/4303ae45a8e2996b30d2318f239677a771aef9c1/pkg/dwarf/godwarf/sections.go#L88. However, I don't think it would be too strange if some program existed that simply errored in that circumstance. The string ZLIB at the start of the section exists to determine the type of compression; if you didn't see ZLIB, you could also assume that a different compression is being used (no other compression algorithms are allowed at the moment, but someone could write that to future-proof their code).

The counterpoint to this is that moving decompression from debug/dwarf to debug/elf just provides a small convenience; users could always write the decompression themselves. Also, does the auto-decompression only go in debug/elf, or also in debug/macho and debug/pe? Compilers other than Go do not produce compressed zdebug sections in PE and Mach-O executables.

aarzilli avatar Feb 07 '23 09:02 aarzilli

The .zdebug compression is obsolete, as all new files should be using SHF_COMPRESSED. I don't think we need to introduce a GODEBUG setting because there might be hypothetical code that would be affected.

As far as I can see the suggested change doesn't affect debug/dwarf at all. The code that would change, besides (*elf.Section).Open, is (*elf.File).DWARF.

Given that (*elf.Section).Open already handles SHF_COMPRESSED sections, it seems appropriate to me that it should also handle .zdebug sections. The current approach seems inconsistent.
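From the caller's side, the consistent behaviour would look roughly like the sketch below. It uses only the existing elf.Open, (*File).Section and (*Section).Data calls; the helper name sectionBytes is hypothetical. Whether a legacy .zdebug_* section also comes back decompressed depends on the Go release in use, so treat that part as an assumption.

```go
package main

import (
	"debug/elf"
	"fmt"
	"os"
)

// sectionBytes (hypothetical helper) returns the contents of the named
// section. For SHF_COMPRESSED sections, Data returns uncompressed bytes.
func sectionBytes(path, name string) ([]byte, error) {
	f, err := elf.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()

	sec := f.Section(name) // nil if the file has no such section
	if sec == nil {
		return nil, fmt.Errorf("%s: no section named %s", path, name)
	}
	return sec.Data()
}

func main() {
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: prog <elf-file>")
		return
	}
	data, err := sectionBytes(os.Args[1], ".debug_str")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	fmt.Printf(".debug_str: %d bytes\n", len(data))
}
```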

ianlancetaylor avatar Feb 07 '23 23:02 ianlancetaylor

Change https://go.dev/cl/513875 mentions this issue: debug/elf: uncompress .zdebug sections in Open

gopherbot avatar Jul 28 '23 00:07 gopherbot

Already fixed by https://go.dev/cl/480895.

ianlancetaylor avatar Jul 31 '23 18:07 ianlancetaylor