libbacktrace icon indicating copy to clipboard operation
libbacktrace copied to clipboard

Absolute source file paths are required for debuginfod to work

Open wolfpld opened this issue 2 years ago • 5 comments

This is a different take on #72 and #26.

Excerpt from the DEBUGINFOD(8) manual page about the requirement, with some reasoning about it:

   /buildid/BUILDID/source/SOURCE/FILE
       If  the  given  buildid is known to the server, this request will result in a binary object that
       contains the source file mentioned.  The path should be absolute.  Relative path names  commonly
       appear in the DWARF file's source directory, but these paths are relative to individual compila‐
       tion unit AT_comp_dir paths, and yet an executable is made up of multiple  CUs.   Therefore,  to
       disambiguate, debuginfod expects source queries to prefix relative path names with the CU compi‐
       lation-directory, followed by a mandatory "/".

       Note: the caller may or may not elide ../ or /./ or extraneous /// sorts of path  components  in
       the  directory  names.   debuginfod  accepts both forms.  Specifically, debuginfod canonicalizes
       path names according to RFC3986 section 5.2.4 (Remove Dot Segments), plus reducing any //  to  /
       in the path.

       For example:

       #include <stdio.h>               /buildid/BUILDID/source/usr/include/stdio.h
       /path/to/foo.c                   /buildid/BUILDID/source/path/to/foo.c
       ../bar/foo.c AT_comp_dir=/zoo/   /buildid/BUILDID/source/zoo//../bar/foo.c

The problem is that libbacktrace is currently providing paths such as: /usr/src/debug/curl-7.83.0/lib/transfer.c or ../mesa-22.0.2/src/mesa/state_tracker/st_atom_viewport.c. The first path, which is absolute, can be used with debuginfod without any problems. However, the second path, a relative one, does not specify an exact location, due to reasons discussed in the manual page above.

Inspecting the libraries with objdump gives the following results:

objdump -W /usr/lib/libcurl.so.4
(...)
 <0><21a289>: Abbrev Number: 59 (DW_TAG_compile_unit)
    <21a28a>   DW_AT_producer    : (indirect string, offset: 0x4ab3): GNU C17 11.2.0 -march=x86-64 -mtune=generic -g -O2 -fvisibility=hidden -fno-plt -fexceptions -fstack-clash-protection -fcf-protection=full -flto -fPIC
    <21a28e>   DW_AT_language    : 29   (C11)
    <21a28f>   DW_AT_name        : (indirect line string, offset: 0x1ad9): /usr/src/debug/curl-7.83.0/lib/transfer.c
    <21a293>   DW_AT_comp_dir    : (indirect line string, offset: 0x383): /usr/src/debug/build-curl/lib
    <21a297>   DW_AT_stmt_list   : 0x5fdff
(...)

In this case, DW_AT_name is absolute, so DW_AT_comp_dir value does not need to be used.

objdump -W /usr/lib/dri/iris_dri.so
(...)
<0><864dd0>: Abbrev Number: 2 (DW_TAG_compile_unit)
    <864dd1>   DW_AT_producer    : (indirect string, offset: 0x17742): GNU C11 11.2.0 -march=x86-64 -mtune=generic -mtls-dialect=gnu -g -g1 -O0 -O2 -std=c11 -fvisibility=hidden -flto -fno-math-errno -fno-trapping-math -fno-common -ffunction-sections -fdata-sections -fno-plt -fexceptions -fstack-clash-protection -fcf-protection=full -flto -fPIC
    <864dd5>   DW_AT_language    : 29   (C11)
    <864dd6>   DW_AT_name        : (indirect line string, offset: 0x45fd): ../mesa-22.0.2/src/mesa/state_tracker/st_atom_viewport.c
    <864dda>   DW_AT_comp_dir    : (indirect line string, offset: 0x0): /usr/src/debug/build
    <864dde>   DW_AT_stmt_list   : 0x8446d9
(...)

However, in this second example, the DW_AT_name is relative, so DW_AT_comp_dir should be prepended to it, producing the following result: /usr/src/debug/build/../mesa-22.0.2/src/mesa/state_tracker/st_atom_viewport.c.

Is this something you are interested in implementing?

wolfpld avatar May 01 '22 21:05 wolfpld

Are you suggesting that libbacktrace should pass an absolute path if possible when calling the callback functions? That seems reasonable. I kind of thought that libbacktrace did that already. It certainly tries to do so. Maybe there are some cases that it misses. Can you show me a test case? Thanks.

ianlancetaylor avatar May 02 '22 22:05 ianlancetaylor

Yes, absolute paths should be expected.

I have created a simple test application:

// g++ test.cpp -g ../libbacktrace/.libs/libbacktrace.a

#include <execinfo.h>
#include <stdio.h>
#include "../libbacktrace/backtrace.h"

int SymCb( void* data, uintptr_t pc, const char* fn, int lineno, const char* func )
{
    printf( "%s: %s:%i\n", func, fn, lineno );
    return 0;
}

void ErrCb( void* data, const char* msg, int errnum )
{
}

int main()
{
    uintptr_t addr;
    backtrace( (void**)&addr, 1 );

    auto state = backtrace_create_state( nullptr, 0, nullptr, nullptr );
    backtrace_pcinfo( state, addr, SymCb, ErrCb, nullptr );
}

Unfortunately, it is returning an absolute path: main: /home/wolf/test/test.cpp:20

This is happening with a relative DW_AT_name:

    <12>   DW_AT_name        : (indirect line string, offset: 0x10): test.cpp
    <16>   DW_AT_comp_dir    : (indirect line string, offset: 0x0): /home/wolf/test

There must be some deeper interaction which is preventing proper DW_AT_comp_dir insertion in some currently unknown edge cases.

wolfpld avatar May 02 '22 22:05 wolfpld

Ok, I have repro. The code is the same as above, but you have to build it in a slightly different way:

[1:07 wolf@mimir:~/test]% mkdir build
[1:07 wolf@mimir:~/test]% cd build
[1:07 wolf@mimir:~/test/build]% g++ ../test.cpp -g ../../libbacktrace/.libs/libbacktrace.a
[1:07 wolf@mimir:~/test/build]% ./a.out
main: ../test.cpp:22
    <12>   DW_AT_name        : (indirect line string, offset: 0x17): ../test.cpp
    <16>   DW_AT_comp_dir    : (indirect line string, offset: 0x0): /home/wolf/test/build

wolfpld avatar May 02 '22 23:05 wolfpld

There are four places in dwarf.c, where directories are concatenated with filenames. All can be found by searching for '/'. Program flow reaches the one in read_lnct. Deeper inspection shows that a different kind of records is used here. Looking at the data read in read_line_header into hdr you can see the following values:

For the case returning absolute values:

Process 171530 stopped
* thread #1, name = 'a.out', stop reason = breakpoint 1.1
    frame #0: 0x0000555555565351 a.out`read_line_header(state=0x00007ffff7ffa000, ddata=0x00007ffff7fb9890, u=0x00007ffff7ffae38, is_dwarf64=0, line_buf=0x00007fffffffe068, hdr=0x00007fffffffe1f0) at dwarf.c:3052:7
   3049         return 0;
   3050     }
   3051
-> 3052   if (hdr_buf.reported_underflow)
   3053     return 0;
   3054
   3055   return 1;
(lldb) p hdr->dirs_count
(size_t) $0 = 3
(lldb) p hdr->dirs[0]
(const char *) $1 = 0x00007ffff7fb616f "/home/wolf/test"
(lldb) p hdr->dirs[1]
(const char *) $2 = 0x00007ffff7fb6188 "/usr/include"
(lldb) p hdr->dirs[2]
(const char *) $3 = 0x00007ffff7fb6195 "../libbacktrace"
(lldb) p hdr->filenames_count
(size_t) $4 = 6
(lldb) p hdr->filenames[0]
(const char *) $5 = 0x00007ffff6764668 "/home/wolf/test/test.cpp"
(lldb) p hdr->filenames[1]
(const char *) $6 = 0x00007ffff6764688 "/home/wolf/test/test.cpp"
(lldb) p hdr->filenames[2]
(const char *) $7 = 0x00007ffff67646a8 "/usr/include/stdint.h"
(lldb) p hdr->filenames[3]
(const char *) $8 = 0x00007ffff67646c0 "../libbacktrace/backtrace.h"
(lldb) p hdr->filenames[4]
(const char *) $9 = 0x00007ffff67646e0 "/usr/include/execinfo.h"
(lldb) p hdr->filenames[5]
(const char *) $10 = 0x00007ffff67646f8 "/usr/include/stdio.h"

And for the case returning relative values:

(lldb) p hdr->dirs_count
(size_t) $0 = 4
(lldb) p hdr->dirs[0]
(const char *) $1 = 0x00007ffff7fb617f "/home/wolf/test/build"
(lldb) p hdr->dirs[1]
(const char *) $2 = 0x00007ffff7fb6195 ".."
(lldb) p hdr->dirs[2]
(const char *) $3 = 0x00007ffff7fb6198 "/usr/include"
(lldb) p hdr->dirs[3]
(const char *) $4 = 0x00007ffff7fb61a5 "../../libbacktrace"
(lldb) p hdr->filenames_count
(size_t) $5 = 6
(lldb) p hdr->filenames[0]
(const char *) $6 = 0x00007ffff6764668 "../test.cpp"
(lldb) p hdr->filenames[1]
(const char *) $7 = 0x00007ffff6764678 "../test.cpp"
(lldb) p hdr->filenames[2]
(const char *) $8 = 0x00007ffff6764688 "/usr/include/stdint.h"
(lldb) p hdr->filenames[3]
(const char *) $9 = 0x00007ffff67646a0 "../../libbacktrace/backtrace.h"
(lldb) p hdr->filenames[4]
(const char *) $10 = 0x00007ffff67646c0 "/usr/include/execinfo.h"
(lldb) p hdr->filenames[5]
(const char *) $11 = 0x00007ffff67646d8 "/usr/include/stdio.h"

wolfpld avatar May 03 '22 14:05 wolfpld

In dwarf_lookup_pc control reaches filename = ln->filename; line, which sets the actual file name value returned by the function. The ln->filename value is ../test.cpp. This is the only relevant field in this structure. At the same time, there is the entry->u data structure, which contains filename="../test.cpp" and comp_dir="/home/wolf/test". The abs_filename in this structure is nullptr. Information from entry->u is not used to fix the incomplete path data retrieved from ln.

wolfpld avatar Aug 16 '22 21:08 wolfpld