
Missing line numbers when using gcc 11 or later

Open notwa opened this issue 1 year ago • 4 comments

line numbers are missing from backtraces and gdb fails to load any debug information. I suspect this happens when amalgamation builds are used with recent versions of gcc, but I haven't ruled out other possibilities (linker versions? Alpine-specific patches?).

I can reproduce this issue, for instance, using a Dockerfile:
FROM alpine:3.16

RUN apk add --no-cache gcc gdb

ADD https://justine.lol/cosmopolitan/cosmopolitan-amalgamation-2.0.1.zip \
    cosmopolitan-amalgamation-2.0.1.zip

ADD https://justine.lol/ape.elf /usr/bin/ape

RUN unzip cosmopolitan-amalgamation-2.0.1.zip && chmod +x /usr/bin/ape

RUN printf %s $'\
main() {\n\
\tprintf("hello world\\n");\n\
\t__die();\n\
}\n' >hello.c \
 && gcc -g -Os -static -nostdlib -nostdinc -fno-pie -no-pie -mno-red-zone \
    -fno-omit-frame-pointer -pg -mnop-mcount -mno-tls-direct-seg-refs \
    -o hello.com.dbg hello.c -fuse-ld=bfd -Wl,-T,ape.lds -Wl,--gc-sections \
    -include cosmopolitan.h crt.o ape-no-modify-self.o cosmopolitan.a

RUN objcopy -S -O binary hello.com.dbg hello.com

# never fail, so one may extract these files later.
RUN ./hello.com || true

RUN gcc --version \
 && gdb -ex 'p main' -ex quit hello.com.dbg
when built, this produces the output:
$ podman build -t cosmo-test .
[snip]
STEP 8/9: RUN ./hello.com || true
hello world
addr2line: DWARF error: can't find .debug_line_str section.
addr2line: DWARF error: can't find .debug_line_str section.
addr2line: DWARF error: can't find .debug_rnglists section.
0x0000000000402796: main at ??:?
0x000000000040281b: cosmo at ??:?
0x0000000000402443: _start at ??:?
--> 079f360a008
STEP 9/9: RUN gcc --version  && gdb -ex 'p main' -ex quit hello.com.dbg
gcc (Alpine 11.2.1_git20220219) 11.2.1 20220219
[snip]

GNU gdb (GDB) 11.2
[snip]
Reading symbols from hello.com.dbg...

warning: Loadable section ".tdata" outside of ELF segments
  in /hello.com.dbg
Dwarf Error: DW_FORM_line_strp used without required section
(No debugging symbols found in hello.com.dbg)
$1 = {<text variable, no debug info>} 0x40277f <main>
COMMIT cosmo-test

[snip]

(aside: in some cases, the addr2line errors are squelched, so all I had to work with for a while was just gdb's cryptic error message.) gdb produces the same error even on a Windows host. but here's the trick: when I replace FROM alpine:3.16 with FROM alpine:3.15, this issue disappears.

Alpine 3.15:  gcc (Alpine 10.3.1_git20211027) 10.3.1 20211027
Alpine 3.16:  gcc (Alpine 11.2.1_git20220219) 11.2.1 20220219

this might be an issue outside of cosmopolitan's scope, but at least my post may contain useful information. in the meantime, I'll switch to using the statically-built tools included in the repository.

notwa • Sep 06 '22 11:09

looking at this again, I think I got it. it doesn't seem to be an issue with the difference in linkers. gcc is producing different debug sections in object files as of version 11.

hello.o section names, from gcc 10 to gcc 11:

--- hello-gcc10.txt
+++ hello-gcc11.txt
@@ -7,9 +7,10 @@
 .debug_info
 .debug_abbrev
 .debug_aranges
-.debug_ranges
+.debug_rnglists
 .debug_line
 .debug_str
+.debug_line_str
 .comment
 .note.GNU-stack
 .note.gnu.property
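
A list like the one diffed above can be produced from `objdump -h` output. The following is a minimal sketch, not necessarily how the lists in this post were generated; the helper name `section_names` and the file names in the usage comment are placeholders for illustration.

```shell
# Print the section names from `objdump -h` output read on stdin.
# Section rows start with a numeric index; the name is the second field.
section_names() {
  awk '$1 ~ /^[0-9]+$/ { print $2 }'
}

# Usage (placeholder file names, requires binutils):
#   objdump -h hello-gcc10.o | section_names > hello-gcc10.txt
#   objdump -h hello-gcc11.o | section_names > hello-gcc11.txt
#   diff -u hello-gcc10.txt hello-gcc11.txt
```

Keeping the parser separate from the `objdump` call makes it easy to run on saved output from another machine.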

oh, the errors make sense to me now. what if it's as simple as editing ape.lds?

sed -i '/debug_line 0/a\\  .debug_line_str 0 : { *(.debug_line_str) }' ape.lds
sed -i '/debug_line 0/a\\  .debug_rnglists 0 : { *(.debug_rnglists) }' ape.lds
hello world
0x0000000000402796: main at //hello.c:3
0x000000000040281b: cosmo at libc/runtime/cosmo.S:77
0x0000000000402443: _start at libc/crt/crt.S:103

ah, it works! but what if we don't want those newfangled sections to begin with?

gcc -gdwarf-4 does the trick. this flag even seems to exist as early as gcc 4.8.5 (I am not going to check earlier versions).
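
For the build command from the original report, adding the flag might look like the sketch below. The names `COSMO_CFLAGS` and `build_hello` are made up for illustration; the flags themselves are the ones used earlier in this thread plus `-gdwarf-4`.

```shell
# Compiler flags from the report above, with -gdwarf-4 added so that gcc 11+
# emits DWARF 4 instead of DWARF 5 (its default since gcc 11).
COSMO_CFLAGS="-g -gdwarf-4 -Os -static -nostdlib -nostdinc -fno-pie -no-pie \
-mno-red-zone -fno-omit-frame-pointer -pg -mnop-mcount -mno-tls-direct-seg-refs"

# Build hello.com.dbg from hello.c; expects the unpacked amalgamation files
# (ape.lds, cosmopolitan.h, crt.o, ape-no-modify-self.o, cosmopolitan.a) in
# the current directory.
build_hello() {
  # COSMO_CFLAGS is intentionally unquoted so it splits into separate flags.
  gcc $COSMO_CFLAGS -o hello.com.dbg hello.c \
    -fuse-ld=bfd -Wl,-T,ape.lds -Wl,--gc-sections \
    -include cosmopolitan.h crt.o ape-no-modify-self.o cosmopolitan.a
}
```

Calling `build_hello` in the directory where the amalgamation zip was unpacked should then behave like the Dockerfile step, only with the older debug format.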

so, I'm thinking of potential solutions:

  • suggest -gdwarf-4 with amalgamation builds in the various places (i.e. README and landing page)
  • support debug_line_str and debug_rnglists in ape.lds

perhaps even both, but I'm pretty happy with just specifying the flag. plus, that'd fix potential issues with users still using the current release (2.0.1 as of writing).

also, I've attached hello.o produced by gcc 10 and gcc 11 if anyone would like to experiment with them without having to install anything. hello.tar.gz
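
One way to experiment with the two objects is to check which DWARF version each compilation unit declares. This is a rough sketch that assumes GNU readelf is available; `dwarf_versions` is a made-up helper name, and the file name in the usage comment is a placeholder, since the layout of the archive isn't described here.

```shell
# Read `readelf --debug-dump=info` output on stdin and print each distinct
# DWARF version found in the compilation unit headers.
dwarf_versions() {
  awk '$1 == "Version:" { print $2 }' | sort -u
}

# Usage (placeholder file name):
#   readelf --debug-dump=info hello.o | dwarf_versions
```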

notwa • Sep 12 '22 13:09

I've updated the README and added -gdwarf-4 to the website landing pages. I got the mono repo itself to build with GCC11. See config.mk for the link to the musl-cross-make toolchain I used.

I'm not sure what you're recommending with ape/ape.lds. The sed expressions didn't apply. Could you send a pull request changing what needs to be changed? Please preserve backwards compatibility. Also be sure to put Fixes #594 in the commit message.

Thanks!

jart • Sep 13 '22 11:09

oops, it looks like I was running sed on the ape.lds file included with releases, instead of the file from the repo. the release has some of its whitespace trimmed. I'll make a note to simply post diffs in the future.
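
For reference, a whitespace-tolerant version of the same edit could be written with awk instead of sed. This is only a sketch: `add_dwarf5_sections` is a hypothetical helper, and it assumes the linker script has a `.debug_line` output section line where the name is followed by `0`, with any amount of spacing in between.

```shell
# Copy a linker script from stdin to stdout, adding output sections for
# .debug_line_str and .debug_rnglists right after the .debug_line entry.
# The [ \t]+ in the pattern accepts one or more spaces or tabs, so the
# match does not depend on how the file is aligned.
add_dwarf5_sections() {
  awk '
    { print }
    /\.debug_line[ \t]+0/ && !added {
      print "  .debug_line_str 0 : { *(.debug_line_str) }"
      print "  .debug_rnglists 0 : { *(.debug_rnglists) }"
      added = 1
    }
  '
}

# Usage: write to a new file, then inspect the result before replacing anything.
#   add_dwarf5_sections < ape.lds > ape.lds.new
#   diff -u ape.lds ape.lds.new
```

The helper is not idempotent: running it twice on the same file adds the two lines twice, so it's worth checking the diff first.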

the changes to ape/ape.lds aren't strictly necessary when using the proposed flag (which you've already implemented, nice!). then again, copying these sections shouldn't hurt either. I'll make a PR once I check that it won't break anything obvious.

notwa • Sep 13 '22 12:09

Yes thank you, I'd definitely like to include that information if it exists. People shouldn't have to run sed on the compiled linker script.

jart • Sep 13 '22 13:09