8291714: Implement a Multi-Reader Single-Writer mutex for Hotspot
May I please have a review for this PR which implements a MRWMutex
class for Hotspot?
This PR does 3 things:
- Adds a port of ZGC's MRSW mutex (see [0]) to Hotspot
- Adds some new utilities for writing multi-threaded tests
- Adds some tests for the MRSW mutex using these new utilities
The ticket has some comments which might be worth checking out: https://bugs.openjdk.org/browse/JDK-8291714
[0] Original source code here: https://github.com/openjdk/zgc/blob/zgc_generational/src/hotspot/share/gc/z/zJNICritical.cpp
Progress
- [ ] Change must be properly reviewed (1 review required, with at least 1 Reviewer)
- [x] Change must not contain extraneous whitespace
- [x] Commit message must refer to an issue
Issue
- JDK-8291714: Implement a Multi-Reader Single-Writer mutex for Hotspot
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk pull/9838/head:pull/9838
$ git checkout pull/9838
Update a local copy of the PR:
$ git checkout pull/9838
$ git pull https://git.openjdk.org/jdk pull/9838/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 9838
View PR using the GUI difftool:
$ git pr show -t 9838
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/9838.diff
:wave: Welcome back jdksjolen! A progress list of the required criteria for merging this PR into master
will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.
/label add hotspot
@jdksjolen The following label will be automatically applied to this pull request:
-
hotspot-runtime
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.
@jdksjolen The hotspot label was successfully added.
Webrevs
- 04: Full - Incremental (5bb6a520)
- 03: Full (da53eaa2)
- 02: Full - Incremental (d659314d)
- 01: Full - Incremental (e842c508)
- 00: Full (30883e04)
Multi-Reader Single-Writer mutex
Sorry to bikeshed but such a thing is called a ReadWriteLock or a ReadersWriterLock or something similar - can we please use familiar nomenclature here (even rwlock as per POSIX). "mutex" is a contraction of mutual-exclusion and as readers are not mutually exclusive it is an oxymoron to talk about a "multi-reader mutex". Thanks.
64 bit cmpxchg isn't available for some 32 bit CPUs. Maybe it could be int32_t?
Right. AFAICS only ARMv6 would be affected by this - does anyone still build for that? (grep for SUPPORTS_NATIVE_CX8). But a 32-bit count should be more than adequate.
Forgot to mention the code in the header file seems rather involved for a .hpp file - should these be in a .inline.hpp file instead?
Also did you establish this RW-lock will in fact be usable in this form for UL?
Also did you establish this RW-lock will in fact be usable in this form for UL?
I have rewritten the relevant code (the LogOutputList class) using this RW-lock and it can run the VM in both async and sync mode. It passes all the tests for the class, and I've sent off a Mach5 job for it. I have read the code and it seems reasonable that this should work.
Multi-Reader Single-Writer mutex
Sorry to bikeshed but such a thing is called a ReadWriteLock or a ReadersWriterLock or something similar - can we please use familiar nomenclature here (even rwlock as per POSIX). "mutex" is a contraction of mutual-exclusion and as readers are not mutually exclusive it is an oxymoron to talk about a "multi-reader mutex". Thanks.
For what it's worth, C++ Standard Library uses "mutex" to refer to an object that gets locked, and uses "lock" to refer to an RAII object that locks a "mutex". So it has, for example, std::shared_mutex, which is multi-reader/single-writer. The idea being that a "mutex" provides some kind of mutual exclusion - in the case of shared_mutex readers (multiple) are excluded from any writer, and writers are exclusive from each other. I'd prefer that to calling these new things "locks".
HotSpot Mutex has a rank, used (in debug builds) to detect potential deadlock situations. These new mutexes don't have that feature, and don't interact with Mutex in that way. The rank system has pros and cons, but I'm not sure just ignoring it here is a good thing.
What is the source of the algorithm being used here? Is there a correctness proof? How does it compare with, for example, Terekhov's algorithm?
HotSpot Mutex has a rank,
True, but I would put the read/write lock in the same basket as Semaphore, being completely independent of Mutex/Monitor. The onus is on the user to use them with care. A shared ranking mechanism would be tricky I think, and read locks don't fit well with ranking schemes.
How does it compare with, for example, Terekhov's algorithm?
I'm not familiar with this algorithm, do you have a link?
What is the source of the algorithm being used here?
I'll have to defer to @fisk for this question.
For what it's worth, C++ Standard Library uses "mutex" to refer to an object that gets locked, and uses "lock" to refer to an RAII object that locks a "mutex".
So their "lock" would be our MutexLocker? The far more dominant terminology is that a mutex is a lock.
How does it compare with, for example, Terekhov's algorithm?
I'm not familiar with this algorithm, do you have a link?
One place I found it. https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2406.html#shared_mutex search for "shared_mutex Reference Implementation"
What's proposed by this change has some similarities. This moves some state manipulation out from under the lock by using atomic operations on the state. It only has one condvar, possibly due to the limitations of our portability layer. Terekhov's algorithm is fair in the sense that neither readers nor writers get preferential treatment; it defers to the OS to decide scheduling.
For what it's worth, C++ Standard Library uses "mutex" to refer to an object that gets locked, and uses "lock" to refer to an RAII object that locks a "mutex".
So their "lock" would be our MutexLocker? The far more dominant terminology is that a mutex is a lock.
Depends on which community. I think the C++ convention makes a lot of sense, esp. given the RAII approach to locking.
HotSpot Mutex has a rank,
True, but I would put the read/write lock in the same basket as Semaphore, being completely independent of Mutex/Monitor. The onus is on the user to use them with care. A shared ranking mechanism would be tricky I think, and read locks don't fit well with ranking schemes.
I would not put them in the same basket as semaphore. Semaphores are a signaling mechanism, with no concept of scope or (shared) ownership (though semaphores sometimes get conscripted into the implementation of such mechanisms).
I don't see any interesting difference between an unshared locking of a shared_mutex and locking an ordinary mutex. If ranks (or anything of a similar nature) make sense for the latter, then they make sense for the former and ought to be the same mechanism.
I agree that read locks with ranking could have problems. Though I wonder if "use with care" might solve most of those problems.
For some reason I can't believe that we don't have this feature already in the VM, or at least, I was expecting this to be a simple wrapper over globalCounter.hpp, or SingleWriterSynchronizer.hpp (whose name I can never say). If it's the same as the ZGC version, maybe the ZGC version should be moved or reconciled into the utilities directory.
GlobalCounter and SingleWriterSynchronizer are RCU mechanisms. While there are some similarities in use-cases, RCU is not equivalent to a shared mutex. I think there are use-cases for shared mutex that aren't appropriate (or even feasible?) for RCU. My impression is this came up because UL has its own attempt at an RCU mechanism that is broken in both implementation and usage. (That's my 2nd-hand impression; I've not really looked at it.) One solution would be to change UL to use one of our other RCU mechanisms and fix any usage problems. Another is to introduce a shared mutex and use that. Generational ZGC has a shared mutex to deal with a specific use-case, and this PR is proposing to generalize that.
What's proposed by this change has some similarities. This moves some state manipulation out from under the lock by using atomic operations on the state. It only has one condvar, possibly due to the limitations of our portability layer. Terekhov's algorithm is fair in the sense that neither readers nor writers get preferential treatment; it defers to the OS to decide scheduling.
Yeah, these are very similar solutions, seems like the major difference is the granularity of information provided by a condvar, locking instead of using atomics, and bit twiddling instead of just using the sign of a signed number.
I just pushed my first clean up of this code based on your comments.
Here's a short summary:
- Rename to ReadWriteLock
- Move to utilities
- Remove template, use a conditional branch + lambda for ThreadBlockInVM (painful way of doing it, stefank has a good idea with the nullptr strategy)
- Clarify some comments
- Remove nullptr case from Locker, as it's not used in this code anyway
- Use int32_t instead of int64_t for counter
Some of your comments are still not handled. Off the top of my head, I'm still not documenting everything that dholmes is interested in. @robehn has mentioned to me that there are issues when a safepointed thread holds a lock that the VM thread wants to hold, so that needs to be handled (he has also pointed out how to handle this).
Kim mentioned that UL might want to use GlobalCounter instead. This is viable, but contention on the global counter may lead to performance degradation for other users, AFAIK.
Finally, I will be simplifying this PR by pulling out the additions to threadHelper.inline.hpp into its own PR.
Thank you for the additional review, @dholmes-ora . I fixed those and added some more documentation.
The changes to threadHelper.inline.hpp are being merged through a separate patch (see JDK-8292679, https://github.com/openjdk/jdk/pull/9962), and so are not in need of review.
I don't have time to review, just chiming in that I look forward to use RWL in NMT, to simplify the lockless MST.
@tstuefe I have a memory of there being a lock like this in NMT, but it must have been removed.
If there's an upcoming use of this RWL lock, then we should integrate it first and not bundle it with the client code.
@tstuefe I have a memory of there being a lock like this in NMT, but it must have been removed.
There is a lock for queries. MST is lockless for normal operations, which makes the table a bit cumbersome and not ideally sorted (bucket chains are sorted in reverse of what would be good). Normally this is not a big problem since those chains are short.
If there's an upcoming use of this RWL lock, then we should integrate it first and not bundle it with the client code.
Certainly. I was not planning to do it right away, just voicing my support for this RWL idea.
I have to look again at this anyway; maybe I'm overlooking something. E.g. I'm not sure if the MST needs to be functional beyond the VM's normal lifetime. If it does, it is better to leave it lockless.
@jdksjolen Please do not rebase or force-push to an active PR as it invalidates existing review comments. All changes will be squashed into a single commit automatically when integrating. See OpenJDK Developers’ Guide for more information.
As no current review comments were alive, I rebased onto master and force pushed. This reduces the PR a bit, as parts of it (the additions to threadHelper.inline.hpp) were merged in another PR and as such could be removed.
@jdksjolen please do not ever force push in a live PR. There is no need, you can just do your merge locally and then push the merge changesets.
Latest commit addresses most of Kim's review comments. It also makes the current thread optional, allowing it to be null. This is mainly useful if the lock is used very early in initialization.
I don't expect this to be a popular opinion, but the more I think about it the less I like the direct use of PlatformMonitor here rather than using Monitor. The "extra cruft" that comes with Monitor is there for reasons, and I'm not convinced that bypassing all that is good.