libbacktrace
libbacktrace copied to clipboard
Absolute source file paths are required for debuginfod to work
This is a different take on #72 and #26.
Excerpt from the DEBUGINFOD(8) manual page about the requirement, with some reasoning about it:
/buildid/BUILDID/source/SOURCE/FILE
If the given buildid is known to the server, this request will result in a binary object that
contains the source file mentioned. The path should be absolute. Relative path names commonly
appear in the DWARF file's source directory, but these paths are relative to individual compila‐
tion unit AT_comp_dir paths, and yet an executable is made up of multiple CUs. Therefore, to
disambiguate, debuginfod expects source queries to prefix relative path names with the CU compi‐
lation-directory, followed by a mandatory "/".
Note: the caller may or may not elide ../ or /./ or extraneous /// sorts of path components in
the directory names. debuginfod accepts both forms. Specifically, debuginfod canonicalizes
path names according to RFC3986 section 5.2.4 (Remove Dot Segments), plus reducing any // to /
in the path.
For example:
#include <stdio.h> /buildid/BUILDID/source/usr/include/stdio.h
/path/to/foo.c /buildid/BUILDID/source/path/to/foo.c
../bar/foo.c AT_comp_dir=/zoo/ /buildid/BUILDID/source/zoo//../bar/foo.c
The problem is that libbacktrace is currently providing paths such as: /usr/src/debug/curl-7.83.0/lib/transfer.c
or ../mesa-22.0.2/src/mesa/state_tracker/st_atom_viewport.c
. The first path, which is absolute, can be used with debuginfod without any problems. However, the second path, a relative one, does not specify an exact location, due to reasons discussed in the manual page above.
Inspecting the libraries with objdump gives the following results:
objdump -W /usr/lib/libcurl.so.4
(...)
<0><21a289>: Abbrev Number: 59 (DW_TAG_compile_unit)
<21a28a> DW_AT_producer : (indirect string, offset: 0x4ab3): GNU C17 11.2.0 -march=x86-64 -mtune=generic -g -O2 -fvisibility=hidden -fno-plt -fexceptions -fstack-clash-protection -fcf-protection=full -flto -fPIC
<21a28e> DW_AT_language : 29 (C11)
<21a28f> DW_AT_name : (indirect line string, offset: 0x1ad9): /usr/src/debug/curl-7.83.0/lib/transfer.c
<21a293> DW_AT_comp_dir : (indirect line string, offset: 0x383): /usr/src/debug/build-curl/lib
<21a297> DW_AT_stmt_list : 0x5fdff
(...)
In this case, DW_AT_name
is absolute, so DW_AT_comp_dir
value does not need to be used.
objdump -W /usr/lib/dri/iris_dri.so
(...)
<0><864dd0>: Abbrev Number: 2 (DW_TAG_compile_unit)
<864dd1> DW_AT_producer : (indirect string, offset: 0x17742): GNU C11 11.2.0 -march=x86-64 -mtune=generic -mtls-dialect=gnu -g -g1 -O0 -O2 -std=c11 -fvisibility=hidden -flto -fno-math-errno -fno-trapping-math -fno-common -ffunction-sections -fdata-sections -fno-plt -fexceptions -fstack-clash-protection -fcf-protection=full -flto -fPIC
<864dd5> DW_AT_language : 29 (C11)
<864dd6> DW_AT_name : (indirect line string, offset: 0x45fd): ../mesa-22.0.2/src/mesa/state_tracker/st_atom_viewport.c
<864dda> DW_AT_comp_dir : (indirect line string, offset: 0x0): /usr/src/debug/build
<864dde> DW_AT_stmt_list : 0x8446d9
(...)
However, in this second example, the DW_AT_name
is relative, so DW_AT_comp_dir
should be prepended to it, producing the following result: /usr/src/debug/build/../mesa-22.0.2/src/mesa/state_tracker/st_atom_viewport.c
.
Is this something you are interested in implementing?
Are you suggesting that libbacktrace should pass an absolute path if possible when calling the callback functions? That seems reasonable. I kind of thought that libbacktrace did that already. It certainly tries to do so. Maybe there are some cases that it misses. Can you show me a test case? Thanks.
Yes, absolute paths should be expected.
I have created a simple test application:
// g++ test.cpp -g ../libbacktrace/.libs/libbacktrace.a
#include <execinfo.h>
#include <stdio.h>
#include "../libbacktrace/backtrace.h"
int SymCb( void* data, uintptr_t pc, const char* fn, int lineno, const char* func )
{
printf( "%s: %s:%i\n", func, fn, lineno );
return 0;
}
void ErrCb( void* data, const char* msg, int errnum )
{
}
int main()
{
uintptr_t addr;
backtrace( (void**)&addr, 1 );
auto state = backtrace_create_state( nullptr, 0, nullptr, nullptr );
backtrace_pcinfo( state, addr, SymCb, ErrCb, nullptr );
}
Unfortunately, it is returning an absolute path:
main: /home/wolf/test/test.cpp:20
This is happening with a relative DW_AT_name
:
<12> DW_AT_name : (indirect line string, offset: 0x10): test.cpp
<16> DW_AT_comp_dir : (indirect line string, offset: 0x0): /home/wolf/test
There must be some deeper interaction which is preventing proper DW_AT_comp_dir
insertion in some currently unknown edge cases.
Ok, I have repro. The code is the same as above, but you have to build it in a slightly different way:
[1:07 wolf@mimir:~/test]% mkdir build
[1:07 wolf@mimir:~/test]% cd build
[1:07 wolf@mimir:~/test/build]% g++ ../test.cpp -g ../../libbacktrace/.libs/libbacktrace.a
[1:07 wolf@mimir:~/test/build]% ./a.out
main: ../test.cpp:22
<12> DW_AT_name : (indirect line string, offset: 0x17): ../test.cpp
<16> DW_AT_comp_dir : (indirect line string, offset: 0x0): /home/wolf/test/build
There are four places in dwarf.c
, where directories are concatenated with filenames. All can be found by searching for '/'
. Program flow reaches the one in read_lnct
. Deeper inspection shows that a different kind of records is used here. Looking at the data read in read_line_header
into hdr
you can see the following values:
For the case returning absolute values:
Process 171530 stopped
* thread #1, name = 'a.out', stop reason = breakpoint 1.1
frame #0: 0x0000555555565351 a.out`read_line_header(state=0x00007ffff7ffa000, ddata=0x00007ffff7fb9890, u=0x00007ffff7ffae38, is_dwarf64=0, line_buf=0x00007fffffffe068, hdr=0x00007fffffffe1f0) at dwarf.c:3052:7
3049 return 0;
3050 }
3051
-> 3052 if (hdr_buf.reported_underflow)
3053 return 0;
3054
3055 return 1;
(lldb) p hdr->dirs_count
(size_t) $0 = 3
(lldb) p hdr->dirs[0]
(const char *) $1 = 0x00007ffff7fb616f "/home/wolf/test"
(lldb) p hdr->dirs[1]
(const char *) $2 = 0x00007ffff7fb6188 "/usr/include"
(lldb) p hdr->dirs[2]
(const char *) $3 = 0x00007ffff7fb6195 "../libbacktrace"
(lldb) p hdr->filenames_count
(size_t) $4 = 6
(lldb) p hdr->filenames[0]
(const char *) $5 = 0x00007ffff6764668 "/home/wolf/test/test.cpp"
(lldb) p hdr->filenames[1]
(const char *) $6 = 0x00007ffff6764688 "/home/wolf/test/test.cpp"
(lldb) p hdr->filenames[2]
(const char *) $7 = 0x00007ffff67646a8 "/usr/include/stdint.h"
(lldb) p hdr->filenames[3]
(const char *) $8 = 0x00007ffff67646c0 "../libbacktrace/backtrace.h"
(lldb) p hdr->filenames[4]
(const char *) $9 = 0x00007ffff67646e0 "/usr/include/execinfo.h"
(lldb) p hdr->filenames[5]
(const char *) $10 = 0x00007ffff67646f8 "/usr/include/stdio.h"
And for the case returning relative values:
(lldb) p hdr->dirs_count
(size_t) $0 = 4
(lldb) p hdr->dirs[0]
(const char *) $1 = 0x00007ffff7fb617f "/home/wolf/test/build"
(lldb) p hdr->dirs[1]
(const char *) $2 = 0x00007ffff7fb6195 ".."
(lldb) p hdr->dirs[2]
(const char *) $3 = 0x00007ffff7fb6198 "/usr/include"
(lldb) p hdr->dirs[3]
(const char *) $4 = 0x00007ffff7fb61a5 "../../libbacktrace"
(lldb) p hdr->filenames_count
(size_t) $5 = 6
(lldb) p hdr->filenames[0]
(const char *) $6 = 0x00007ffff6764668 "../test.cpp"
(lldb) p hdr->filenames[1]
(const char *) $7 = 0x00007ffff6764678 "../test.cpp"
(lldb) p hdr->filenames[2]
(const char *) $8 = 0x00007ffff6764688 "/usr/include/stdint.h"
(lldb) p hdr->filenames[3]
(const char *) $9 = 0x00007ffff67646a0 "../../libbacktrace/backtrace.h"
(lldb) p hdr->filenames[4]
(const char *) $10 = 0x00007ffff67646c0 "/usr/include/execinfo.h"
(lldb) p hdr->filenames[5]
(const char *) $11 = 0x00007ffff67646d8 "/usr/include/stdio.h"
In dwarf_lookup_pc
control reaches filename = ln->filename;
line, which sets the actual file name value returned by the function. The ln->filename
value is ../test.cpp
. This is the only relevant field in this structure. At the same time, there is the entry->u
data structure, which contains filename="../test.cpp"
and comp_dir="/home/wolf/test"
. The abs_filename
in this structure is nullptr
. Information from entry->u
is not used to fix the incomplete path data retrieved from ln
.