[llvm-debuginfo-analyzer] Weird output of `--compare`
Hello. Thank you for making this tool available for LLVM!
I downloaded and applied all the patches uploaded to Phabricator and tried to follow the user guide, and I have some questions. I'm following the "Comparison Mode" section of llvm-debuginfo-analyzer.rst.
I built test.cpp with both g++ and clang++ as instructed in that section. This is test.cpp, which is the same code as in the instructions:
using INTPTR = const int *;
int foo(INTPTR ParamPtr, unsigned ParamUnsigned, bool ParamBool) {
  if (ParamBool) {
    typedef int INTEGER;
    const INTEGER CONSTANT = 7;
    return CONSTANT;
  }
  return ParamUnsigned;
}
This is how I built the object files:
$ g++ -c -g -O0 test.cpp -o test-dwarf-gcc.o
$ clang++ -c -g -O0 test.cpp -o test-dwarf-clang.o
These are my clang++ and g++ versions:
$ g++ --version
g++ (Debian 11.3.0-3) 11.3.0
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
$ clang++ --version
clang version 14.0.6
Target: x86_64-unknown-linux-gnu
Thread model: posix
This is the analyzer's output on each of the object files, separately:
$ llvm-debuginfo-analyzer --attribute=level --print=symbols,types test-dwarf-gcc.o
Logical View:
[000] {File} 'test-dwarf-gcc.o'
[001] {CompileUnit} 'test.cpp'
[002] 1 {TypeAlias} 'INTPTR' -> '* const int'
[002] 2 {Function} extern not_inlined 'foo' -> 'int'
[003] {Parameter} 'ParamBool' -> 'bool'
[003] {Parameter} 'ParamPtr' -> 'INTPTR'
[003] {Parameter} 'ParamUnsigned' -> 'unsigned int'
[003] {Block}
[004] 4 {TypeAlias} 'INTEGER' -> 'int'
[004] 5 {Variable} 'CONSTANT' -> 'const INTEGER'
$ llvm-debuginfo-analyzer --attribute=level --print=symbols,types test-dwarf-clang.o
Logical View:
[000] {File} 'test-dwarf-clang.o'
[001] {CompileUnit} 'test.cpp'
[002] 1 {TypeAlias} 'INTPTR' -> '* const int'
[002] 2 {Function} extern not_inlined 'foo' -> 'int'
[003] {Block}
[004] 5 {Variable} 'CONSTANT' -> 'const INTEGER'
[003] 2 {Parameter} 'ParamBool' -> 'bool'
[003] 2 {Parameter} 'ParamPtr' -> 'INTPTR'
[003] 2 {Parameter} 'ParamUnsigned' -> 'unsigned int'
[003] 4 {TypeAlias} 'INTEGER' -> 'int'
It looks like gcc produces less debug info than clang here. So the output is a little different from what the instructions expect, but I think that's fine, and this is not what I'd like to ask about.
Then I ran the comparison mode as instructed in the "Logical View" section:
$ llvm-debuginfo-analyzer --attribute=level --compare=types --report=view --print=symbols,types test-dwarf-clang.o test-dwarf-gcc.o
Reference: 'test-dwarf-clang.o'
Target: 'test-dwarf-gcc.o'
Logical View:
[000] {File} 'test-dwarf-clang.o'
[001] {CompileUnit} 'test.cpp'
-[002] 1 {TypeAlias} 'INTPTR' -> '* const int'
-[002] 2 {Function} extern not_inlined 'foo' -> 'int'
[003] {Block}
[004] 5 {Variable} 'CONSTANT' -> 'const INTEGER'
[003] 2 {Parameter} 'ParamBool' -> 'bool'
[003] 2 {Parameter} 'ParamPtr' -> 'INTPTR'
[003] 2 {Parameter} 'ParamUnsigned' -> 'unsigned int'
-[003] 4 {TypeAlias} 'INTEGER' -> 'int'
+[002] 2 {Function} extern not_inlined 'foo' -> 'int'
[003] {Parameter} 'ParamBool' -> 'bool'
[003] {Parameter} 'ParamPtr' -> 'INTPTR'
[003] {Parameter} 'ParamUnsigned' -> 'unsigned int'
[003] {Block}
+[004] 4 {TypeAlias} 'INTEGER' -> 'int'
[004] 5 {Variable} 'CONSTANT' -> 'const INTEGER'
+[002] 1 {TypeAlias} 'INTPTR' -> '* const int'
This looks weird. Both files have the [002] 2 {Function} extern not_inlined 'foo' -> 'int' line, but somehow it is prepended with - in one place and + in another. Also test-dwarf-clang.o has more debug info on {Parameter} lines, but they are not listed as differences, i.e., not prepended with + or -.
And then I ran the commands in "Logical Elements" section:
$ llvm-debuginfo-analyzer --attribute=level --compare=types --report=list --print=symbols,types,summary test-dwarf-clang.o test-dwarf-gcc.o
Reference: 'test-dwarf-clang.o'
Target: 'test-dwarf-gcc.o'
(1) Missing Scopes:
-[002] 2 {Function} extern not_inlined 'foo' -> 'int'
(2) Missing Types:
-[003] 4 {TypeAlias} 'INTEGER' -> 'int'
-[002] 1 {TypeAlias} 'INTPTR' -> '* const int'
(1) Added Scopes:
+[002] 2 {Function} extern not_inlined 'foo' -> 'int'
(2) Added Types:
+[002] 1 {TypeAlias} 'INTPTR' -> '* const int'
+[004] 4 {TypeAlias} 'INTEGER' -> 'int'
----------------------------------------
Element Expected Missing Added
----------------------------------------
Scopes 4 1 1
Symbols 0 0 0
Types 2 2 2
Lines 0 0 0
----------------------------------------
Total 6 3 3
It's very confusing here. As I noted, [002] 2 {Function} extern not_inlined 'foo' -> 'int' line is not different in the two files, but it is listed in both "Missing Scopes" and "Added Scopes" section. Also [002] 1 {TypeAlias} 'INTPTR' -> '* const int' line also occurs in the both files but listed in both "Missing Types" and "Added Types" section.
A few more unrelated misc. questions: I'd like to know how high/low the debug info coverage is and any gaps or missing debug info, and I tried --attribute=coverage,gaps, but not sure if I am doing it correctly.
llvm-debuginfo-analyzer --attribute=coverage,gaps --print=all test.o
I tried something like this but I couldn't find the relevant info. It's possible that I just don't know how to use the tool. Is this a correct way of getting this information?
Thank you! cc @CarlosAlbertoEnciso
@llvm/issue-subscribers-debuginfo
@aheejin Thanks very much for taking the time to try the llvm-debuginfo-analyzer tool.
To debug the issue with the comparison
$ llvm-debuginfo-analyzer --attribute=level --compare=types --report=view --print=symbols,types test-dwarf-clang.o test-dwarf-gcc.o
it would be useful if you could attach the test-dwarf-clang.o and test-dwarf-gcc.o object files.
A few points from looking at the logical views:
[002] 1 {TypeAlias} 'INTPTR' -> '* const int'
This element appears in both logical views with the same attributes (level, name, type), but it is marked as added/missing, which is incorrect.
[003] {Parameter} 'ParamBool' -> 'bool'
[003] {Parameter} 'ParamPtr' -> 'INTPTR'
[003] {Parameter} 'ParamUnsigned' -> 'unsigned int'
[003] 2 {Parameter} 'ParamBool' -> 'bool'
[003] 2 {Parameter} 'ParamPtr' -> 'INTPTR'
[003] 2 {Parameter} 'ParamUnsigned' -> 'unsigned int'
For two functions to be logically the same, all of their children must be logically the same.
In the GCC logical view, the {Parameter} elements have no line numbers.
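The rule that a scope matches only when all of its children match can be sketched in a few lines of Python. This is an illustrative model only: the `Element` class, its fields, and `logically_same` are hypothetical names rather than the tool's internals, and children are assumed to appear in the same order on both sides.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Element:
    """Hypothetical stand-in for one logical element in a view."""
    kind: str
    name: str
    type_name: str = ""
    line: Optional[int] = None  # None models a missing line number
    children: List["Element"] = field(default_factory=list)


def logically_same(a: Element, b: Element) -> bool:
    """Two elements match only if their own attributes and all children match."""
    if (a.kind, a.name, a.type_name, a.line) != (b.kind, b.name, b.type_name, b.line):
        return False
    if len(a.children) != len(b.children):
        return False
    return all(logically_same(x, y) for x, y in zip(a.children, b.children))


def make_foo(param_line: Optional[int]) -> Element:
    """Build 'foo' with its three parameters at the given declaration line."""
    params = [
        Element("Parameter", "ParamBool", "bool", param_line),
        Element("Parameter", "ParamPtr", "INTPTR", param_line),
        Element("Parameter", "ParamUnsigned", "unsigned int", param_line),
    ]
    return Element("Function", "foo", "int", 2, params)


foo_clang = make_foo(param_line=2)    # parameters carry line 2
foo_gcc = make_foo(param_line=None)   # parameters shown without a line

print(logically_same(foo_clang, foo_clang))  # True
print(logically_same(foo_clang, foo_gcc))    # False
```

In this model, a difference in any child (here, the parameter line numbers) is enough to make the two 'foo' functions compare as different, even though their own printed lines are identical.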
[003] {Block}
[004] 4 {TypeAlias} 'INTEGER' -> 'int'
[004] 5 {Variable} 'CONSTANT' -> 'const INTEGER'
[003] {Block}
[004] 5 {Variable} 'CONSTANT' -> 'const INTEGER'
The {Block} element does not have the same number of children.
[004] 4 {TypeAlias} 'INTEGER' -> 'int'
[003] 4 {TypeAlias} 'INTEGER' -> 'int'
The {TypeAlias} elements are located at different lexical scopes: 4 and 3.
Also test-dwarf-clang.o has more debug info on {Parameter} lines, but they are not listed as differences, i.e., not prepended with + or -.
You are raising a very interesting point. For the comparison criteria, those {Parameter} elements are different as the line numbers are different. They should be marked as added/missing.
It's very confusing here. As I noted, [002] 2 {Function} extern not_inlined 'foo' -> 'int' line is not different in the two files, but it is listed in both "Missing Scopes" and "Added Scopes" section. Also [002] 1 {TypeAlias} 'INTPTR' -> '* const int' line also occurs in the both files but listed in both "Missing Types" and "Added Types" section.
As I stated in my previous comment, this seems incorrect.
-[002] 1 {TypeAlias} 'INTPTR' -> '* const int'
+[002] 1 {TypeAlias} 'INTPTR' -> '* const int'
The comparison is done twice:
reference: test-dwarf-clang.o
target: test-dwarf-gcc.o
It means the target (test-dwarf-gcc.o) is missing these elements.
(1) Missing Scopes:
-[002] 2 {Function} extern not_inlined 'foo' -> 'int'
(2) Missing Types:
-[003] 4 {TypeAlias} 'INTEGER' -> 'int'
reference: test-dwarf-gcc.o
target: test-dwarf-clang.o
It means the target (test-dwarf-clang.o) is adding these elements.
(1) Added Scopes:
+[002] 2 {Function} extern not_inlined 'foo' -> 'int'
(2) Added Types:
+[004] 4 {TypeAlias} 'INTEGER' -> 'int'
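The two passes can be modelled as two one-directional differences with the reference and target roles swapped. The sketch below is an assumption-laden illustration, not the tool's algorithm: elements are reduced to hypothetical `(level, line, kind, name)` tuples, and only the type aliases from the two views are modelled.

```python
from typing import List, Tuple

Element = Tuple[int, int, str, str]  # (level, line, kind, name), hypothetical model


def not_in_target(reference: List[Element], target: List[Element]) -> List[Element]:
    """Return the reference elements that have no equal element in the target."""
    return [e for e in reference if e not in target]


clang_types = [(2, 1, "TypeAlias", "INTPTR"), (3, 4, "TypeAlias", "INTEGER")]
gcc_types = [(2, 1, "TypeAlias", "INTPTR"), (4, 4, "TypeAlias", "INTEGER")]

# Pass 1: reference = clang, target = gcc -> reported with '-'
missing = not_in_target(clang_types, gcc_types)
# Pass 2: reference = gcc, target = clang -> reported with '+'
added = not_in_target(gcc_types, clang_types)

print(missing)  # [(3, 4, 'TypeAlias', 'INTEGER')]
print(added)    # [(4, 4, 'TypeAlias', 'INTEGER')]
```

Under this model 'INTEGER' shows up once in each list because its level differs between the two views, while 'INTPTR' appears in neither list, which is the expected behaviour for an element that is identical on both sides.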
A few more unrelated misc. questions: I'd like to know how high/low the debug info coverage is and any gaps or missing debug info, and I tried --attribute=coverage,gaps, but not sure if I am doing it correctly.
llvm-debuginfo-analyzer --attribute=coverage,gaps --print=all test.o
Your command line is correct. When lower-level detail is needed, the following attributes are quite useful:
--coverage: Symbol location coverage.
--gaps: Missing debug location (gaps).
--location: Symbol debug location.
--range: Debug location ranges.
--register: Processor register names.
--offset: Debug information offset.
Using test-dwarf-clang.o from the documentation:
llvm-debuginfo-analyzer.exe --attribute=level,coverage,gaps,range,offset --print=symbols test-dwarf-clang.o
Logical View:
[0x0000000000][000] {File} 'test-dwarf-clang.o'
[0x000000000b][001] {CompileUnit} 'test.cpp'
[0x000000000b][002] {Range} Lines 2:9 [0x0000000000:0x000000003a]
[0x000000002a][002] 2 {Function} extern not_inlined 'foo' -> [0x0000000099]'int'
[0x000000002a][003] {Range} Lines 2:9 [0x0000000000:0x000000003a]
[0x0000000071][003] {Block}
[0x0000000071][004] {Range} Lines 5:8 [0x000000001c:0x000000002f]
[0x000000007e][004] 5 {Variable} 'CONSTANT' -> [0x00000000c3]'const INTEGER'
[0x000000007e][005] {Coverage} 100.00%
[0x000000007f][005] {Location}
[0x000000007f][006] {Entry} fbreg -28
[0x0000000063][003] 2 {Parameter} 'ParamBool' -> [0x00000000bc]'bool'
[0x0000000063][004] {Coverage} 100.00%
[0x0000000064][004] {Location}
[0x0000000064][005] {Entry} fbreg -21
[0x0000000047][003] 2 {Parameter} 'ParamPtr' -> [0x00000000a0]'INTPTR'
[0x0000000047][004] {Coverage} 100.00%
[0x0000000048][004] {Location}
[0x0000000048][005] {Entry} fbreg -16
[0x0000000055][003] 2 {Parameter} 'ParamUnsigned' -> [0x00000000b5]'unsigned int'
[0x0000000055][004] {Coverage} 100.00%
[0x0000000056][004] {Location}
[0x0000000056][005] {Entry} fbreg -20
Note: The --location attribute is set if any of --coverage, --gaps or --register is specified.
Oh, I meant to attach the files and forgot it. I zipped test-dwarf-clang.o and test-dwarf-gcc.o and attached it here. (Github apparently doesn't support .o files)
Thank you for checking my report!
1. So do you mean the things I reported (repeating lines and weird missing/added types) are indeed bugs in the tool?
2. This snippet is from your last comment. What does "Coverage" here mean? What coverage is 100% here?
[0x0000000055][003] 2 {Parameter} 'ParamUnsigned' -> [0x00000000b5]'unsigned int'
[0x0000000055][004] {Coverage} 100.00%
[0x0000000056][004] {Location}
[0x0000000056][005] {Entry} fbreg -20
3. I tried to use `--gaps`, but it doesn't seem to generate any new info. Any hints?
4. Do you have any plan to support Wasm binaries?
Oh, I meant to attach the files and forgot it. I zipped test-dwarf-clang.o and test-dwarf-gcc.o and attached it here. (Github apparently doesn't support .o files)
Thanks for the attached object files.
1. So do you mean the things I reported (repeating lines and weird missing/added types) are indeed bugs in the tool?
There are confirmed issues:
a) Missing line numbers on the {Parameter} elements, due to incorrect handling of a DWARF attribute form.
[003] {Parameter} 'ParamBool' -> 'bool'
[003] {Parameter} 'ParamPtr' -> 'INTPTR'
[003] {Parameter} 'ParamUnsigned' -> 'unsigned int'
b) These {TypeAlias} elements should not be tagged as added/missing:
-[002] 1 {TypeAlias} 'INTPTR' -> '* const int'
+[002] 1 {TypeAlias} 'INTPTR' -> '* const int'
c) The {Function} element is not being discovered:
{Function} extern not_inlined 'foo' -> 'int'
2. This snippet is from your last comment. What does "Coverage" here mean? What coverage is 100% here?
[0x0000000055][003] 2 {Parameter} 'ParamUnsigned' -> [0x00000000b5]'unsigned int'
[0x0000000055][004] {Coverage} 100.00%
[0x0000000056][004] {Location}
[0x0000000056][005] {Entry} fbreg -20
{Coverage} gives an indication of the debug information quality, which matters most when debugging optimized code. For this specific case, it means that, when debugging, ParamUnsigned is visible for the whole duration of its parent scope.
The {Location} gives the debug location. In this case it is {Entry} fbreg -20: the parameter is stored at offset -20 from the DW_AT_frame_base attribute of its containing function, which is the base of the frame for the function.
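As a small worked example of what fbreg -20 means, the address of the parameter is the frame base plus the signed offset. The frame base value below is made up for illustration; at run time a debugger would obtain it by evaluating the function's DW_AT_frame_base.

```python
def fbreg_address(frame_base: int, offset: int) -> int:
    """Address of a variable whose location is described as 'fbreg <offset>'."""
    return frame_base + offset


frame_base = 0x7FFD0000  # hypothetical frame base, for illustration only
print(hex(fbreg_address(frame_base, -20)))  # 0x7ffcffec
```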
3. I tried to use `--gaps`, but it doesn't seem to generate any new info. Any hints?
In general, the --gaps option produces output for optimized builds. I would suggest using the tool on a bigger object file.
This is a logical view snippet for OpenSteer (https://github.com/meshula/OpenSteer) used for testing:
https://github.com/meshula/OpenSteer/blob/710432341807f5597b62fa168f194f9ef2640c8e/src/PolylineSegmentedPathwaySingleRadius.cpp#L90
90 OpenSteer::PolylineSegmentedPathwaySingleRadius&
91 OpenSteer::PolylineSegmentedPathwaySingleRadius::operator=( PolylineSegmentedPathwaySingleRadius other )
92 {
93 swap( other );
94 return *this;
95 }
[002] {Source} 'src/polylinesegmentedpathwaysingleradius.cpp'
[002] 91 {Function} extern not_inlined 'operator=' -> '& PolylineSegmentedPathwaySingleRadius'
[003] {Range} Lines 92:94 [0x0000025370:0x0000025393]
[003] {Parameter} 'this' -> '* PolylineSegmentedPathwaySingleRadius'
[004] {Coverage} 100.00%
[004] {Location}
[005] {Entry} fbreg -8
[003] 91 {Parameter} 'other' -> 'OpenSteer::PolylineSegmentedPathwaySingleRadius'
[004] {Coverage} 71.43%
[004] {Location} Lines 92:94 [0x0000025370:0x0000025389]
[005] {Entry} breg4+0 RSI+0
[004] {Location} Lines 94:94 [0x000002538a:0x0000025393]
[005] {Entry} missing
It shows:
{Parameter} 'other', {Coverage} 71.43%
{Location} Lines 92:94 [0x0000025370:0x0000025389]
breg4+0 RSI+0
{Location} Lines 94:94 [0x000002538a:0x0000025393]
missing
Basically, other is visible in the debugger only within the address range [0x0000025370:0x0000025389].
There is a gap in the address range [0x000002538a:0x0000025393].
Its parent scope 'operator=' covers the address range [0x0000025370:0x0000025393].
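The 71.43% figure can be reproduced with simple arithmetic on the ranges above: the bytes covered by a valid location divided by the bytes spanned by the parent scope. This is a sketch that happens to match the reported number, not a statement of how the tool computes it; `coverage_percent` is a hypothetical helper.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # (low address, high address)


def coverage_percent(scope: Range, valid_locations: List[Range]) -> float:
    """Percentage of the scope's address span that has a valid location."""
    scope_lo, scope_hi = scope
    covered = sum(hi - lo for lo, hi in valid_locations)
    return 100.0 * covered / (scope_hi - scope_lo)


scope = (0x25370, 0x25393)      # parent scope 'operator='
valid = [(0x25370, 0x25389)]    # 'other' has a location only in this range

print(f"{coverage_percent(scope, valid):.2f}%")  # 71.43%
```

The remaining part of the scope, [0x000002538a:0x0000025393], is the range reported with a missing entry.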
4. Do you have any plan to support Wasm binaries?
There are no plans to support Wasm.
@aheejin Using the test cases from https://github.com/llvm/llvm-project/issues/57040#issuecomment-1211329454:
The GCC version generates the following debug_abbrev contents:
DW_TAG_formal_parameter DW_CHILDREN_no
...
DW_AT_decl_file DW_FORM_implicit_const 1
DW_AT_decl_line DW_FORM_implicit_const 2
DW_TAG_typedef DW_CHILDREN_no
...
DW_AT_decl_file DW_FORM_implicit_const 1
DW_AT_decl_line DW_FORM_data1
The Clang version generates
DW_TAG_formal_parameter DW_CHILDREN_no
...
DW_AT_decl_file DW_FORM_data1
DW_AT_decl_line DW_FORM_data1
DW_TAG_typedef DW_CHILDREN_no
...
DW_AT_decl_file DW_FORM_data1
DW_AT_decl_line DW_FORM_data1
The llvm-debuginfo-analyzer does not support DW_FORM_implicit_const, causing the {Parameter} elements to have no line declaration data. I will upload a patch to add such support.
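To illustrate why this form needs special handling: with DW_FORM_implicit_const the attribute value lives in the abbreviation declaration, and the DIE itself carries no bytes for it, whereas DW_FORM_data1 stores one byte in the DIE. The sketch below is a simplified model; `read_attribute` and its parameters are hypothetical and do not mirror the LLVM API. Only the two form codes are taken from the DWARF 5 specification.

```python
from typing import Optional, Tuple

DW_FORM_data1 = 0x0B
DW_FORM_implicit_const = 0x21


def read_attribute(form: int, implicit_value: Optional[int],
                   die_bytes: bytes, offset: int) -> Tuple[int, int]:
    """Return (value, next_offset) for one attribute of a DIE."""
    if form == DW_FORM_implicit_const:
        # The value comes from the abbreviation; nothing is read from the DIE.
        if implicit_value is None:
            raise ValueError("DW_FORM_implicit_const needs a value in the abbreviation")
        return implicit_value, offset
    if form == DW_FORM_data1:
        # One unsigned byte stored in the DIE itself.
        return die_bytes[offset], offset + 1
    raise NotImplementedError(f"form 0x{form:02x} is not handled in this sketch")


# GCC style: DW_AT_decl_line DW_FORM_implicit_const 2 (no bytes in the DIE)
gcc_line, gcc_next = read_attribute(DW_FORM_implicit_const, 2, b"", 0)
# Clang style: DW_AT_decl_line DW_FORM_data1 (one byte in the DIE)
clang_line, clang_next = read_attribute(DW_FORM_data1, None, b"\x02", 0)

print(gcc_line, gcc_next)      # 2 0
print(clang_line, clang_next)  # 2 1
```

A reader that only looks at the DIE bytes would find nothing for the GCC case, which is consistent with the missing line numbers on the {Parameter} elements.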
To determine whether two LVElement objects are the same, the comparison module uses the line number, filename, level, name, qualified name and type attributes.
In the case of the object generated by GCC (g++ (Debian 11.3.0-3) 11.3.0), as the .debug_line is DWARF5, the llvm-debuginfo-analyzer (ELF reader) expects the file index to be 0-indexed and does an internal adjustment, which causes the comparison mismatch, as the resulting file indexes are outside the line table boundaries.
The ELF reader uses the .debug_line DWARF version to decide whether the internal file index requires any adjustment, as it expects them to be 1-indexed.
For the following test case: test.cpp
void foo(void *ParamPtr) { }
* GCC (GNU C++17 11.3.0) - All DW_AT_decl_file use index 1.
.debug_info:
format = DWARF32, version = 0x0005
DW_TAG_compile_unit
DW_AT_name ("test.cpp")
DW_TAG_subprogram ("foo")
DW_AT_decl_file (1)
DW_TAG_formal_parameter ("ParamPtr")
DW_AT_decl_file (1)
.debug_line:
Line table prologue: format (DWARF32), version (5)
include_directories[0] = "..."
file_names[0]: name ("test.cpp"), dir_index (0)
file_names[1]: name ("test.cpp"), dir_index (0)
Additional discussions here: https://www.mail-archive.com/[email protected]/msg00883.html
@aheejin I have uploaded a new series of patches that fixes the issues described:
- {Parameter} missing line information.
- Incorrect {Function} matching.
You can download a single patch that combines all individual patches: https://reviews.llvm.org/D126875
Happy to help with any additional questions or issues.
Thank you for the fixes! (Sorry for the delayed response)
@aheejin Created a patch that supports wasm binaries: https://github.com/llvm/llvm-project/pull/82588