jdk icon indicating copy to clipboard operation
jdk copied to clipboard

8312132: Add tracking of multiple address spaces in NMT

Open jdksjolen opened this issue 11 months ago • 38 comments

Hi,

This PR introduces a new abstraction to NMT, named MemoryFileTracker. Today, NMT does not track any memory outside of the virtual memory address space. This means that if you allocated memory in something such as a memory-backed file and use mmap to map into that memory, then you'll have trouble reporting this to NMT. This is the situation that ZGC is in, and that is what this patch attempts to fix.

MemoryFileTracker

The MemoryFileTracker adds the ability of adding new virtual memory address spaces to NMT and committing memory to these, the basic API is:

    static MemoryFile* make_device(const char* descriptive_name);
    static void free_device(MemoryFile* device);

    static void allocate_memory(MemoryFile* device, size_t offset, size_t size,
                                MEMFLAGS flag, const NativeCallStack& stack);
    static void free_memory(MemoryFile* device, size_t offset, size_t size);

It is easiest to see how this is used by looking at what ZGC's ZNMT class does:

void ZNMT::reserve(zaddress_unsafe start, size_t size) {
  MemTracker::record_virtual_memory_reserve((address)start, size, CALLER_PC, mtJavaHeap);
}
void ZNMT::commit(zoffset offset, size_t size) {
  MemTracker::allocate_memory_in(ZNMT::_device, static_cast<size_t>(offset), size, mtJavaHeap, CALLER_PC);
}
void ZNMT::uncommit(zoffset offset, size_t size) {
  MemTracker::free_memory_in(ZNMT::_device, (size_t)offset, size);
}

void ZNMT::map(zaddress_unsafe addr, size_t size, zoffset offset) {
  // NMT doesn't track mappings at the moment.
}
void ZNMT::unmap(zaddress_unsafe addr, size_t size) {
  // NMT doesn't track mappings at the moment.
}

As you can see, any mapping between reserved regions and device-allocated memory is not recorded in NMT. This means that in detailed mode you only get reserved regions printed for the reserved memory, the device-allocated memory is reported separately. When performing summary reporting any memory allocated via these devices is added to the corresponding MEMFLAGS as committed memory.

This patch is also acting as a base on which we deploy multiple new backend ideas to NMT. These ideas are:

  1. Implement VMA tracking using a balanced binary tree approach. Today's VirtualMemoryTracker's usage of linked lists is slow and brittle, we'd like to move away from it. Our Treap-based approach in this patch gives a performance boost such that we see 25x better performance in a benchmark. The idea and draft of this is due to Thomas Stüfe, kudos to him. My approach uses a few recursive functions which aren't in tail-call position, these reaches a call-stack depth equivalent to the depth of the tree. With a tree of 20 million nodes we will have a depth of log2(20 * 10**6) = 25, which is perfectly fine.
  2. Separate storage of the NativeCallStacks in a closed addressing hashtable. NCS:s are large and there are relatively few of them occurring many times, so caching them makes sense. NCS:s are never deleted and the bucket array is never resized.

Example detailed output

We add a section Memory file details when producing a detailed report.


Memory file details

Memory map of ZGC heap backing file

[0x0000000000000000 - 0x0000000065200000] allocated 1696595968 for Java Heap
    [0x00007f3ecbdb8aea]ZPhysicalMemoryManager::commit(ZPhysicalMemory&)+0x9a
    [0x00007f3ecbdb211b]ZPageAllocator::commit_page(ZPage*)+0x33
    [0x00007f3ecbdb2ad5]ZPageAllocator::alloc_page_finalize(ZPageAllocation*)+0x7b
    [0x00007f3ecbdb2c1b]ZPageAllocator::alloc_page(ZPageType, unsigned long, ZAllocationFlags, ZPageAge)+0xc7


Missing features

There is no detailed diffing of this data.


Progress

  • [x] Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • [x] Change must not contain extraneous whitespace
  • [x] Commit message must refer to an issue

Issue

  • JDK-8312132: Add tracking of multiple address spaces in NMT (Enhancement - P4)

Reviewers

Contributors

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/18289/head:pull/18289
$ git checkout pull/18289

Update a local copy of the PR:
$ git checkout pull/18289
$ git pull https://git.openjdk.org/jdk.git pull/18289/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 18289

View PR using the GUI difftool:
$ git pr show -t 18289

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/18289.diff

Webrev

Link to Webrev Comment

jdksjolen avatar Mar 13 '24 21:03 jdksjolen

:wave: Welcome back jsjolen! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

bridgekeeper[bot] avatar Mar 13 '24 21:03 bridgekeeper[bot]

@jdksjolen This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8312132: Add tracking of multiple address spaces in NMT

Co-authored-by: Thomas Stuefe <[email protected]>
Reviewed-by: stefank, stuefe

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 100 new commits pushed to the master branch:

  • 244f6ac222fa98fba4fb99bf5bccd36e3e6c5de1: 8307824: Clean up Finalizable.java and finalize terminology in vmTestbase/nsk/share
  • a706e35b12addff987b489059be8f240c60fae75: 8332039: Cannot invoke "com.sun.source.util.DocTreePath.getTreePath()" because "path" is null
  • 612b6896d28cebf61ef024709ff3afb5e3ee0dde: 8043226: Better diagnostics for non-applicable type annotations
  • dce97031555dcf689fecda16e444e7e8e9d5b270: 8333226: Regressions 2-3% in Compress ZGC after 8331253
  • b101dcb609eae00b406f387cd90e58487d5868df: 8333312: Incorrect since tags on new ClassReader and ConstantPool methods
  • e0bab786402d70e9a74d1816c029c772ea01f697: 8326951: since-checker - missing @ since tags
  • 31f70391e5f22ff5803d16b52c1e1248b6253d8c: 8316131: runtime/cds/appcds/TestParallelGCWithCDS.java fails with JNI error
  • 4a1cdd5ba947ffc88c1100966e68826eb35ed441: 8333486: Parallel: Remove unused methods in psParallelCompact
  • 664c993c41753843293388a6ff1481a94a5b4c22: 8331731: ubsan: relocInfo.cpp:155:30: runtime error: applying non-zero offset to null pointer
  • 8d3de45f4dfd60dc4e2f210cb0c085fcf6efb8e2: 8325168: JShell should support Markdown comments
  • ... and 90 more: https://git.openjdk.org/jdk/compare/43a2f17342af8f5bf1f5823df9fa0bf0bdfdfce2...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

openjdk[bot] avatar Mar 13 '24 21:03 openjdk[bot]

@jdksjolen The following label will be automatically applied to this pull request:

  • hotspot

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

openjdk[bot] avatar Mar 13 '24 21:03 openjdk[bot]

Webrevs

mlbridge[bot] avatar Mar 20 '24 15:03 mlbridge[bot]

/contributor add tstuefe

jdksjolen avatar Mar 22 '24 16:03 jdksjolen

@jdksjolen tstuefe was not found in the census.

Syntax: /contributor (add|remove) [@user | openjdk-user | Full Name <email@address>]. For example:

  • /contributor add @openjdk-bot
  • /contributor add duke
  • /contributor add J. Duke <[email protected]>

User names can only be used for users in the census associated with this repository. For other contributors you need to supply the full name and email address.

openjdk[bot] avatar Mar 22 '24 16:03 openjdk[bot]

/contributor add @tstuefe

jdksjolen avatar Mar 25 '24 11:03 jdksjolen

Or maybe /contributor add @stuefe

jdksjolen avatar Mar 25 '24 11:03 jdksjolen

@jdksjolen Contributor Thomas Stuefe <[email protected]> successfully added.

openjdk[bot] avatar Mar 25 '24 11:03 openjdk[bot]

@jdksjolen @stuefe was not found in the census.

Syntax: /contributor (add|remove) [@user | openjdk-user | Full Name <email@address>]. For example:

  • /contributor add @openjdk-bot
  • /contributor add duke
  • /contributor add J. Duke <[email protected]>

User names can only be used for users in the census associated with this repository. For other contributors you need to supply the full name and email address.

openjdk[bot] avatar Mar 25 '24 11:03 openjdk[bot]

@stefank, thank you for the style change suggestions. They've now been applied.

jdksjolen avatar Mar 26 '24 11:03 jdksjolen

⚠️ @jdksjolen This pull request contains merges that bring in commits not present in the target repository. Since this is not a "merge style" pull request, these changes will be squashed when this pull request in integrated. If this is your intention, then please ignore this message. If you want to preserve the commit structure, you must change the title of this pull request to Merge <project>:<branch> where <project> is the name of another project in the OpenJDK organization (for example Merge jdk:master).

openjdk[bot] avatar Mar 26 '24 13:03 openjdk[bot]

OK, now it should be fixed. Sorry about that.

jdksjolen avatar Mar 26 '24 13:03 jdksjolen

I've updated the code to have data on both sides, but now a bunch of my tests are failing. So, I'll have to do a bunch of work to understand why those are now failing. I'm still performing merging, however.

jdksjolen avatar Mar 26 '24 15:03 jdksjolen

Right, so the tests pass again but it's not clear to me that having the metadata state in both directions really can be considered necessary/more elegant. The merging is cleaned up a bit.

I had a go at removing VMATree as a friend class of the Treap, which unfortunately made the build break. Still working on that. I need to regain some clarity on the necessity of merging. I think that we can change os::commit_memory() such that it requires a MEMFLAGS, in which case we can get rid of this merging. That will be some work. I could change the PR such that the API requires a MEMFLAGS for all operations.

Thoughts?

I'm off on Easter vacation, will be back on Tuesday.

jdksjolen avatar Mar 26 '24 19:03 jdksjolen

Right, the refactoring to remove the friend declaration has completely fumbled the code. I'll probably force a revert on this to the state before that or do a git bisect to find the bugs. Right now the code is basically borked.

Last good hash: 7445999ee296872320f91146e1004026ba1133c7

jdksjolen avatar Apr 05 '24 15:04 jdksjolen

Right, the refactoring to remove the friend declaration has completely fumbled the code. I'll probably force a revert on this to the state before that or do a git bisect to find the bugs. Right now the code is basically borked.

Last good hash: 7445999

God, sorry. Do as you think is best.

I plan to look at this PR, but probably it will not be this week.

Love your commit messages btw.

tstuefe avatar Apr 09 '24 13:04 tstuefe

Right, the refactoring to remove the friend declaration has completely fumbled the code. I'll probably force a revert on this to the state before that or do a git bisect to find the bugs. Right now the code is basically borked. Last good hash: 7445999

God, sorry. Do as you think is best.

I plan to look at this PR, but probably it will not be this week.

Love your commit messages btw.

Nah, it's alright. It was literally that my getter right() returned _left, this was of course impossible for me to see after staring at the code for too long. Fixing that did lead to a good commit message, I'm happy you appreciate them :-).

jdksjolen avatar Apr 09 '24 14:04 jdksjolen

I decided on the more generic IntervalChange, IntervalState and StateType names. This is naming, I'm perfectly fine switching it to something else.

jdksjolen avatar Apr 09 '24 14:04 jdksjolen

I've removed the merge() functionality as I'm expecting Afshin's PR to get approved which will render adapting flags unnecessary. I did the most obvious size optimization such that each in/out state only takes 16 bytes instead of 24.

jdksjolen avatar Apr 15 '24 21:04 jdksjolen

@tstuefe ,

I'll ping you (like I just did) in the comments when I've got a new version for the Treap stuff.

jdksjolen avatar Apr 17 '24 08:04 jdksjolen

@jdksjolen Please do not rebase or force-push to an active PR as it invalidates existing review comments. Note for future reference, the bots always squash all changes into a single commit automatically as part of the integration. See OpenJDK Developers’ Guide for more information.

openjdk[bot] avatar Apr 19 '24 12:04 openjdk[bot]

Windows debug builds failing with following linker errors:

 === Output from failing command(s) repeated here ===
* For target hotspot_variant-server_libjvm_gtest_objs_BUILD_GTEST_LIBJVM_link:
   Creating library d:\a\jdk\jdk\build\windows-aarch64\hotspot\variant-server\libjvm\gtest\objs\jvm.lib and object d:\a\jdk\jdk\build\windows-aarch64\hotspot\variant-server\libjvm\gtest\objs\jvm.exp
jvm.exp : error LNK2001: unresolved external symbol "const GrowableArrayCHeap<struct `public: struct VMATree::SummaryDiff __cdecl VMATree::register_mapping(unsigned __int64,unsigned __int64,enum VMATree::StateType,struct VMATree::Metadata &)'::`2'::AddressState,12>::`vftable'" (??_7?$GrowableArrayCHeap@UAddressState@?1??register_mapping@VMATree@@QEAA?AUSummaryDiff@3@_K0W4StateType@3@AEAUMetadata@3@@Z@$0M@@@6B@)
jvm.exp : error LNK2001: unresolved external symbol "const GrowableArrayView<struct `public: struct VMATree::SummaryDiff __cdecl VMATree::register_mapping(unsigned __int64,unsigned __int64,enum VMATree::StateType,struct VMATree::Metadata &)'::`2'::AddressState>::`vftable'" (??_7?$GrowableArrayView@UAddressState@?1??register_mapping@VMATree@@QEAA?AUSummaryDiff@3@_K0W4StateType@3@AEAUMetadata@3@@Z@@@6B@)
jvm.exp : error LNK2001: unresolved external symbol "const GrowableArrayWithAllocator<struct `public: struct VMATree::SummaryDiff __cdecl VMATree::register_mapping(unsigned __int64,unsigned __int64,enum VMATree::StateType,struct VMATree::Metadata &)'::`2'::AddressState,class GrowableArrayCHeap<struct `public: struct VMATree::SummaryDiff __cdecl VMATree::register_mapping(unsigned __int64,unsigned __int64,enum VMATree::StateType,struct VMATree::Metadata &)'::`2'::AddressState,12> >::`vftable'" (??_7?$GrowableArrayWithAllocator@UAddressState@?1??register_mapping@VMATree@@QEAA?AUSummaryDiff@3@_K0W4StateType@3@AEAUMetadata@3@@Z@V?$GrowableArrayCHeap@UAddressState@?1??register_mapping@VMATree@@QEAA?AUSummaryDiff@3@_K0W4StateType@3@AEAUMetadata@3@@Z@$0M@@@@@6B@)
d:\a\jdk\jdk\build\windows-aarch64\hotspot\variant-server\libjvm\gtest\jvm.dll : fatal error LNK1120: 3 unresolved externals
* For target hotspot_variant-server_libjvm_objs_BUILD_LIBJVM_link:
   Creating library d:\a\jdk\jdk\build\windows-aarch64\hotspot\variant-server\libjvm\objs\jvm.lib and object d:\a\jdk\jdk\build\windows-aarch64\hotspot\variant-server\libjvm\objs\jvm.exp
jvm.exp : error LNK2001: unresolved external symbol "const GrowableArrayCHeap<struct `public: struct VMATree::SummaryDiff __cdecl VMATree::register_mapping(unsigned __int64,unsigned __int64,enum VMATree::StateType,struct VMATree::Metadata &)'::`2'::AddressState,12>::`vftable'" (??_7?$GrowableArrayCHeap@UAddressState@?1??register_mapping@VMATree@@QEAA?AUSummaryDiff@3@_K0W4StateType@3@AEAUMetadata@3@@Z@$0M@@@6B@)
jvm.exp : error LNK2001: unresolved external symbol "const GrowableArrayView<struct `public: struct VMATree::SummaryDiff __cdecl VMATree::register_mapping(unsigned __int64,unsigned __int64,enum VMATree::StateType,struct VMATree::Metadata &)'::`2'::AddressState>::`vftable'" (??_7?$GrowableArrayView@UAddressState@?1??register_mapping@VMATree@@QEAA?AUSummaryDiff@3@_K0W4StateType@3@AEAUMetadata@3@@Z@@@6B@)
jvm.exp : error LNK2001: unresolved external symbol "const GrowableArrayWithAllocator<struct `public: struct VMATree::SummaryDiff __cdecl VMATree::register_mapping(unsigned __int64,unsigned __int64,enum VMATree::StateType,struct VMATree::Metadata &)'::`2'::AddressState,class GrowableArrayCHeap<struct `public: struct VMATree::SummaryDiff __cdecl VMATree::register_mapping(unsigned __int64,unsigned __int64,enum VMATree::StateType,struct VMATree::Metadata &)'::`2'::AddressState,12> >::`vftable'" (??_7?$GrowableArrayWithAllocator@UAddressState@?1??register_mapping@VMATree@@QEAA?AUSummaryDiff@3@_K0W4StateType@3@AEAUMetadata@3@@Z@V?$GrowableArrayCHeap@UAddressState@?1??register_mapping@VMATree@@QEAA?AUSummaryDiff@3@_K0W4StateType@3@AEAUMetadata@3@@Z@$0M@@@@@6B@)
d:\a\jdk\jdk\build\windows-aarch64\support\modules_libs\java.base\server\jvm.dll : fatal error LNK1120: 3 unresolved externals

jdksjolen avatar Apr 21 '24 11:04 jdksjolen

Fixed a bug that Afshin found w.r.t. summary accounting. The double-arrow thing really helped out there, so I'm happy with keeping that.

jdksjolen avatar Apr 22 '24 14:04 jdksjolen

Taking a look...

gerard-ziemski avatar Apr 22 '24 21:04 gerard-ziemski

Hi @tstuefe,

Cleaned up Treap significantly, it looks way better now! Thanks for the ideas.

Good. Will try to take a look on Thursday or Friday.

tstuefe avatar Apr 23 '24 05:04 tstuefe

@jdksjolen Are you going to update the code in response to Afshin feedback sometime soon?

I find it a bit hard to look at the code tagged with so many comments, so if you are thinking about updating it sometime soon, I'd prefer to wait reviewing it.

gerard-ziemski avatar Apr 24 '24 14:04 gerard-ziemski

First pass through working through the most of Afshin's comments. Thanks!

jdksjolen avatar Apr 25 '24 10:04 jdksjolen

@jdksjolen Are you going to update the code in response to Afshin feedback sometime soon?

I find it a bit hard to look at the code tagged with so many comments, so if you are thinking about updating it sometime soon, I'd prefer to wait reviewing it.

Hi Gerard,

Go to "Files changed", scroll to a comment and press "i", all comments should now be hidden.

jdksjolen avatar Apr 26 '24 11:04 jdksjolen

I will take a closer look later today, and also clean out some of my obsolete comments. GH interface is really not made for complex patches.

tstuefe avatar Apr 30 '24 06:04 tstuefe