jdk icon indicating copy to clipboard operation
jdk copied to clipboard

8293422: DWARF emitted by Clang cannot be parsed

Open chhagedorn opened this issue 2 years ago • 15 comments

The DWARF debugging symbols emitted by Clang is different from what GCC is emitting. While GCC produces a complete .debug_aranges section (which is required in the DWARF parser), Clang does not. As a result, the DWARF parser cannot find the necessary information to proceed and create the line number information:

The .debug_aranges section contains address range to compilation unit offset mappings. The parsing algorithm can just walk through all these entries to find the correct address range that contains the library offset of the current pc. This gives us the compilation unit offset into the .debug_info section from where we can proceed to parse the line number information.

Without a complete .debug_aranges section, we fail with an assertion that we could not find the correct entry. Since JDK-8293402, we will still get the complete stack trace at least. Nevertheless, we should still fix this assertion failure of course. But that would require a different parsing approach. We need to parse the entire .debug_info section instead to get to the correct compilation unit. This, however, would require a lot more work.

I therefore suggest to disable DWARF parsing for Clang for now and file an RFE to support Clang in the future with a different parsing approach. I'm using the __clang__ ifdef to bail out in get_source_info() and disable the gtests. I've noticed that we are currently running the gtests with NOT PRODUCT which I think is not necessary - the gtests should also work fine with product builds. I've corrected this as well but that could also be done separately.

Thanks, Christian


Progress

  • [x] Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • [x] Change must not contain extraneous whitespace
  • [x] Commit message must refer to an issue

Issue

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk pull/10287/head:pull/10287
$ git checkout pull/10287

Update a local copy of the PR:
$ git checkout pull/10287
$ git pull https://git.openjdk.org/jdk pull/10287/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 10287

View PR using the GUI difftool:
$ git pr show -t 10287

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/10287.diff

chhagedorn avatar Sep 15 '22 11:09 chhagedorn

:wave: Welcome back chagedorn! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

bridgekeeper[bot] avatar Sep 15 '22 11:09 bridgekeeper[bot]

@chhagedorn The following label will be automatically applied to this pull request:

  • hotspot

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

openjdk[bot] avatar Sep 15 '22 12:09 openjdk[bot]

From https://discourse.llvm.org/t/clang-does-not-produce-full-debug-aranges-section-with-thinlto/64898/8, one needs to use -gdwarf-aranges with clang. However I am not sure whether we will hit that bug with "thin LTO" mentioned there (or others) if we use it.

tschatzl avatar Sep 19 '22 07:09 tschatzl

Thanks Thomas for that link. I was not aware of this -gdwarf-aranges flag. I've tried it out and it indeed seems to work. But I was not able to build with -flto=thin as it resulted in build failures. So, I'm not sure what would happen and if it's even possible to build with it in general. Nevertheless, I suggest to go with that -gdwarf-aranges flag solution for now and remove the previously suggested bailout fix for Clang.

However, -gdwarf-aranges (and -gdwarf-4 which I think we should also add to avoid getting the unsupported DWARF 5 format) was only added in Clang 5.0. But we must support down to 3.5 according to: https://github.com/openjdk/jdk/blob/cb72f80925965c73e32c44ce3196866272306d7f/doc/building.md?plain=1#L353-L354

I therefore changed the previous complete bailout fix to a bailout fix for Clang versions older than 5.0.

I've noticed that Clang is emitting a full relative path for the filename in the form of src/hotspot/share/compiler/compilerThread.cpp:58 with debug builds (it only emits the filename with release builds). I therefore added an additional method strip_path_prefix() to get rid of the path prefix.

chhagedorn avatar Sep 21 '22 07:09 chhagedorn

Thanks Thomas for that link. I was not aware of this -gdwarf-aranges flag. I've tried it out and it indeed seems to work. But I was not able to build with -flto=thin as it resulted in build failures.

I only thought that maybe we do compile with -flto=thin already, and so we could not use these flags.

So, I'm not sure what would happen and if it's even possible to build with it in general. Nevertheless, I suggest to go with that -gdwarf-aranges flag solution for now and remove the previously suggested bailout fix for Clang.

I agree.

However, -gdwarf-aranges (and -gdwarf-4 which I think we should also add to avoid getting the unsupported DWARF 5 format) was only added in Clang 5.0. But we must support down to 3.5 according to:

https://github.com/openjdk/jdk/blob/cb72f80925965c73e32c44ce3196866272306d7f/doc/building.md?plain=1#L353-L354

I therefore changed the previous complete bailout fix to a bailout fix for Clang versions older than 5.0.

I've noticed that Clang is emitting a full relative path for the filename in the form of src/hotspot/share/compiler/compilerThread.cpp:58 with debug builds (it only emits the filename with release builds). I therefore added an additional method strip_path_prefix() to get rid of the path prefix.

Okay.

tschatzl avatar Sep 21 '22 10:09 tschatzl

@chhagedorn This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8293422: DWARF emitted by Clang cannot be parsed

Reviewed-by: tschatzl, ihse, stuefe

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 6 new commits pushed to the master branch:

  • 59d8f67a7100c371405e80194498d8e425addf8e: 8297265: G1: Remove unnecessary null-check in RebuildCodeRootClosure::do_code_blob
  • 2fc340a7030e895c264c39fc8690af108a6ad921: 8148041: Test java/awt/Mouse/TitleBarDoubleClick/TitleBarDoubleClick fails on Ubuntu with mouseReleased event after double click on title bar
  • e4206618ac82222f8f61e348cfa68db0d708fe90: 8297238: RISC-V: C2: Use Matcher::vector_element_basic_type when checking for vector element type in predicate
  • 891c706a103042043f5ef6fcf56720ccbcfc7e19: 8295276: AArch64: Add backend support for half float conversion intrinsics
  • 3c0949824e06f2b3d44f1bde9d2292a7627b0197: 8297241: Update sun/java2d/DirectX/OnScreenRenderingResizeTest/OnScreenRenderingResizeTest.java
  • 45d1807ad3248805f32b1b94b02ac368e0d6bcc0: 6312651: Compiler should only use verified interface types for optimization

Please see this link for an up-to-date comparison between the source branch of this pull request and the master branch. As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

openjdk[bot] avatar Sep 22 '22 11:09 openjdk[bot]

Thanks Thomas for your review!

chhagedorn avatar Oct 11 '22 11:10 chhagedorn

/cc build

vnkozlov avatar Oct 11 '22 16:10 vnkozlov

@vnkozlov The build label was successfully added.

openjdk[bot] avatar Oct 11 '22 16:10 openjdk[bot]

We currently require clang 3.5 (when building on linux I presume; Xcode has their very own peculiar way of versioning clang). Maybe this is too old? Should we instead raise the bar to require clang 5.0, so we can skip the test?

magicus avatar Oct 12 '22 11:10 magicus

Yeah, clang 5.0 was released in 2017. I'd recommend you just hard-code in the dwarf flags and instead apply this:

diff --git a/make/autoconf/toolchain.m4 b/make/autoconf/toolchain.m4
index adb4e182dcc..e1c097916d8 100644
--- a/make/autoconf/toolchain.m4
+++ b/make/autoconf/toolchain.m4
@@ -50,7 +50,7 @@ TOOLCHAIN_DESCRIPTION_microsoft="Microsoft Visual Studio"
 TOOLCHAIN_DESCRIPTION_xlc="IBM XL C/C++"
 
 # Minimum supported versions, empty means unspecified
-TOOLCHAIN_MINIMUM_VERSION_clang="3.5"
+TOOLCHAIN_MINIMUM_VERSION_clang="5.0"
 TOOLCHAIN_MINIMUM_VERSION_gcc="6.0"
 TOOLCHAIN_MINIMUM_VERSION_microsoft="19.28.0.0" # VS2019 16.8, aka MSVC 14.28
 TOOLCHAIN_MINIMUM_VERSION_xlc=""

magicus avatar Oct 12 '22 11:10 magicus

That would be an option to bump that minimum version to get rid of the ifdefs for clang 5.0. But since this has quite an impact, are we allowed to just do that in this change or is it required to have a separate task for that with approvals first? We would also need to change the docs to reflect that change etc.

chhagedorn avatar Oct 12 '22 13:10 chhagedorn

If is is a big change or not depends on how it affects builds on macos. I assume that clang functionality that was published in 2017 is already incorporated in the minimum supported version of Xcode on mac. (This needs to be verified, though)

For clang on linux; Oracle do not regularly build linux with clang, nor do our GHA build scripts. I don't know if anyone regularly tests this.

My guess is that the 3.5 limit was put in place when the clang for linux support was added, and it has not been modified since.

But you are probably right that it should be a separate PR, if not for anything else so to get proper attention to it. Do you want me to publish such a PR first, or do you want to continue with the current conditionals, and clean them up afterwards if/when we go to clang 5.0+?

magicus avatar Oct 12 '22 14:10 magicus

If is is a big change or not depends on how it affects builds on macos. I assume that clang functionality that was published in 2017 is already incorporated in the minimum supported version of Xcode on mac. (This needs to be verified, though)

For clang on linux; Oracle do not regularly build linux with clang, nor do our GHA build scripts. I don't know if anyone regularly tests this.

My guess is that the 3.5 limit was put in place when the clang for linux support was added, and it has not been modified since.

Okay, I see, thanks for the summary!

But you are probably right that it should be a separate PR, if not for anything else so to get proper attention to it.

Yes, it's probably better to get more attention for this update.

Do you want me to publish such a PR first, or do you want to continue with the current conditionals, and clean them up afterwards if/when we go to clang 5.0+?

I think both is fine. But since I already have the patch ready, I suggest to move forward with it and then come back later to clean it up.

chhagedorn avatar Oct 13 '22 07:10 chhagedorn

Thanks Magnus for your review of the build changes!

May a get a second review of the DWARF parser code changes?

Thanks, Christian

chhagedorn avatar Oct 18 '22 15:10 chhagedorn

@chhagedorn This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!

bridgekeeper[bot] avatar Nov 15 '22 17:11 bridgekeeper[bot]

@chhagedorn Are you waiting for an additional Hotspot review?

magicus avatar Nov 16 '22 08:11 magicus

Yes, I think it would be good to get a second review of the DWARF parser changes. Maybe @tstuefe?

chhagedorn avatar Nov 16 '22 08:11 chhagedorn

Yes, I think it would be good to get a second review of the DWARF parser changes. Maybe @tstuefe?

I'm a bit swamped, but try to take a look later today.

tstuefe avatar Nov 16 '22 08:11 tstuefe

Thank you Thomas! Take your time, there is no hurry.

chhagedorn avatar Nov 16 '22 09:11 chhagedorn

I've pushed an update and changed the algorithm in the following way to address @tstuefe review comments:

  • Just read and ignore the filenames from DWARF which do not correspond to the one we are looking for (i.e. when current_index != file_index).
  • Reading the filename of interest (i.e. when current_index == file_index):
    • Read single chars and stop once the null terminator is found.
    • Reset buffer when file separator is found to skip the prefix path.
    • On filename buffer overflow: Keep reading, we could still be reading a path prefix and reset the buffer again when finding a file separator.
    • If filename does not fit into the provided buffer, use a generic overflow message <OVERFLOW>. If that does not fit either, use the minimal filename L. This allows to at least have the source information L:line_no which almost always is already enough to get to the actual source code location. Doing it in this way lets the parser succeed instead of failing.

I've added some additional tests for the overflow scenarios and enforced get_source_info() to only accept buffers with a length of at least 2 to always allow the minimal filename L.

Submitted some testing again in our CI (results look good so far) and additionally by running gtests with Clang slowdebug, fastdebug and release builds.

I've done some additional manual testing, both with GCC and Clang builds. I've also played around by changing the buffer size in vmError.cpp. It works as expected:

  • buffer size 15, emitting <OVERFLOW> for filenames being too long:
V  [libjvm.so+0x8905fc]  CompileWrapper::~CompileWrapper()+0x56  (compile.cpp:492)
V  [libjvm.so+0x8921e6]  Compile::Compile(ciEnv*, ciMethod*, int, Options, DirectiveSet*)+0x1652  (compile.cpp:864)
V  [libjvm.so+0x77d171]  C2Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0x179  (c2compiler.cpp:113)
V  [libjvm.so+0x8b0fc4]  CompileBroker::invoke_compiler_on_method(CompileTask*)+0x8e8  (<OVERFLOW>:2237)
V  [libjvm.so+0x8afc3d]  CompileBroker::compiler_thread_loop()+0x3ed  (<OVERFLOW>:1916)
V  [libjvm.so+0x8d047c]  CompilerThread::thread_entry(JavaThread*, JavaThread*)+0x72  (<OVERFLOW>:58)
V  [libjvm.so+0xc63342]  JavaThread::thread_main_inner()+0x144  (javaThread.cpp:699)
V  [libjvm.so+0xc631fa]  JavaThread::run()+0x182  (javaThread.cpp:684)
V  [libjvm.so+0x1337633]  Thread::call_run()+0x195  (thread.cpp:224)
V  [libjvm.so+0x10e38d7]  thread_native_entry(Thread*)+0x19b  (os_linux.cpp:710)
  • buffer size 2, emitting L for all filenames (being too long):
V  [libjvm.so+0x8905fc]  CompileWrapper::~CompileWrapper()+0x56  (compile.cpp:492)
V  [libjvm.so+0x8921e6]  Compile::Compile(ciEnv*, ciMethod*, int, Options, DirectiveSet*)+0x1652  (L:864)
V  [libjvm.so+0x77d171]  C2Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0x179  (L:113)
V  [libjvm.so+0x8b0fc4]  CompileBroker::invoke_compiler_on_method(CompileTask*)+0x8e8  (L:2237)
V  [libjvm.so+0x8afc3d]  CompileBroker::compiler_thread_loop()+0x3ed  (L:1916)
V  [libjvm.so+0x8d047c]  CompilerThread::thread_entry(JavaThread*, JavaThread*)+0x72  (L:58)
V  [libjvm.so+0xc6339c]  JavaThread::thread_main_inner()+0x144  (L:699)
V  [libjvm.so+0xc63254]  JavaThread::run()+0x182  (L:684)
V  [libjvm.so+0x1337693]  Thread::call_run()+0x195  (L:224)
V  [libjvm.so+0x10e3937]  thread_native_entry(Thread*)+0x19b  (L:710)

chhagedorn avatar Nov 18 '22 15:11 chhagedorn

Thanks Thomas for your review and your feedback! Testing looked good.

@tschatzl do you also agree with the new approach?

chhagedorn avatar Nov 21 '22 07:11 chhagedorn

@chhagedorn this pull request can not be integrated into master due to one or more merge conflicts. To resolve these merge conflicts and update this pull request you can run the following commands in the local repository for your personal fork:

git checkout JDK-8293422
git fetch https://git.openjdk.org/jdk master
git merge FETCH_HEAD
# resolve conflicts and follow the instructions given by git merge
git commit -m "Merge master"
git push

openjdk[bot] avatar Nov 21 '22 08:11 openjdk[bot]

Thanks @tschatzl for reviewing it again!

/integrate

chhagedorn avatar Nov 21 '22 12:11 chhagedorn

Going to push as commit 8b8d8481bc05eec70a1df832668322e5c17694d8. Since your change was applied there have been 6 commits pushed to the master branch:

  • 59d8f67a7100c371405e80194498d8e425addf8e: 8297265: G1: Remove unnecessary null-check in RebuildCodeRootClosure::do_code_blob
  • 2fc340a7030e895c264c39fc8690af108a6ad921: 8148041: Test java/awt/Mouse/TitleBarDoubleClick/TitleBarDoubleClick fails on Ubuntu with mouseReleased event after double click on title bar
  • e4206618ac82222f8f61e348cfa68db0d708fe90: 8297238: RISC-V: C2: Use Matcher::vector_element_basic_type when checking for vector element type in predicate
  • 891c706a103042043f5ef6fcf56720ccbcfc7e19: 8295276: AArch64: Add backend support for half float conversion intrinsics
  • 3c0949824e06f2b3d44f1bde9d2292a7627b0197: 8297241: Update sun/java2d/DirectX/OnScreenRenderingResizeTest/OnScreenRenderingResizeTest.java
  • 45d1807ad3248805f32b1b94b02ac368e0d6bcc0: 6312651: Compiler should only use verified interface types for optimization

Your commit was automatically rebased without conflicts.

openjdk[bot] avatar Nov 21 '22 12:11 openjdk[bot]

@chhagedorn Pushed as commit 8b8d8481bc05eec70a1df832668322e5c17694d8.

:bulb: You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

openjdk[bot] avatar Nov 21 '22 12:11 openjdk[bot]