
cryptonote_core: introducing rwlock mechanism

Open 0xFFFC0000 opened this issue 1 year ago • 38 comments

Draft: #9173

This is a low-level explanation of the design decisions. For simple use, just use the RWLOCK and RLOCK macros, which take care of all the details.

RecursiveReadWriteLock Design

Since read/write operations can be called recursively, we needed a new locking mechanism to handle that. The design decisions are exactly the same as for canonical read-write locks, with a few extra decisions to fit our other needs. Here are the general ideas behind this lock:

  1. Writes and reads are exclusive: if any read(s) are in progress, acquiring the write lock will block.
  2. You can have up to UINT32_MAX concurrent reads.
  3. Once you hold the write lock, you can start read operation(s) inside it too, meaning a write transaction can contain read transaction(s). But a read transaction cannot contain a write unless all reads have released the lock.
  4. RWLOCK and RLOCK macros are provided which abstract all the details. Generally, you only need to use these macros and should not need to touch the low-level locking API.
  5. RWLOCK and RLOCK are valid for the enclosing scope.

API Design

As I mentioned, a write transaction can include a read transaction too. Here is a low-level example of acquiring and releasing the lock:

// The next line will acquire the write lock
// and return true, meaning you are required to call end_write.
bool rw_release_required = m_blockchain.start_write();
// ...
    // The next line will acquire the read lock
    // and return false, meaning you are not required to call end_read.
    bool r_release_required = m_blockchain.start_read();
    // ...
    if (r_release_required) m_blockchain.end_read();
// ...
if (rw_release_required) m_blockchain.end_write();
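
For reference, here is a rough sketch of the low-level surface these calls imply; the actual class lives in the PR, and the signatures below are inferred from the usage above, not copied from it:

// Sketch only: inferred from the examples in this issue.
class RecursiveReadWriteLock
{
public:
  // Each start_* returns true when this call actually acquired the lock
  // (i.e. it is the outermost acquisition for this thread), in which case
  // the caller is required to invoke the matching end_*.
  bool start_read();
  void end_read();
  bool start_write();
  void end_write();
};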

Or, if we use the macros to take care of this:

{
    // The write lock will be acquired on the next line
    // and released when the scope ends.
    RWLOCK(m_blockchain);
    // ...
    {
        // The read lock will be acquired on the next line
        // and released when the scope ends.
        RLOCK(m_blockchain);
        // ...
    }
}

Another example of recursive reads:

{
    // The read lock will be acquired on the next line
    // and released when the scope ends.
    RLOCK(m_blockchain);
    // ...
    {
        // The read lock will be acquired on the next line
        // and released when the scope ends.
        RLOCK(m_blockchain);
        // ...
    }
}

Another example of recursive writes:

{
    // The write lock will be acquired on the next line
    // and released when the scope ends.
    RWLOCK(m_blockchain);
    // ...
    {
        // The write lock will be acquired on the next line
        // and released when the scope ends.
        RWLOCK(m_blockchain);
        // ...
    }
}

Testing

This PR contains a testing suite for the lock. The test runs 4 different cases with 10, 50, 100, and 1000 threads, each with two iterations. In each test, a random number of writers and readers start, and each reader and writer runs a random number of cycles. Each cycle waits and then randomly (~20% chance) decides whether to recurse.
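
A minimal sketch of the shape of one such reader test, assuming the RecursiveReadWriteLock surface sketched earlier; the waits and the ~20% recursion chance mirror the description above, but the harness itself (all names included) is illustrative rather than the PR's actual test code. Writers are analogous, using start_write/end_write.

#include <chrono>
#include <random>
#include <thread>
#include <vector>

RecursiveReadWriteLock g_lock;  // the lock under test

// One reader cycle: wait, then recurse with roughly 20% probability.
void reader_cycle(std::mt19937 &rng, int depth)
{
  const bool release_required = g_lock.start_read();
  std::this_thread::sleep_for(std::chrono::microseconds(rng() % 100));
  if (depth < 5 && rng() % 100 < 20)  // ~20% chance to take the read lock recursively
    reader_cycle(rng, depth + 1);
  if (release_required)
    g_lock.end_read();
}

void run_case(unsigned n_threads)
{
  std::vector<std::thread> threads;
  for (unsigned i = 0; i < n_threads; ++i)
    threads.emplace_back([] {
      std::mt19937 rng(std::random_device{}());
      for (unsigned cycle = rng() % 50; cycle > 0; --cycle)  // random number of cycles
        reader_cycle(rng, 0);
    });
  for (auto &t : threads)
    t.join();  // a deadlock would hang here, which is what the suite detects
}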

0xFFFC0000 avatar Feb 19 '24 10:02 0xFFFC0000

I think you answered in IRC already, but why is this needed? The LMDB class already has proper locking, so what is this doing?

https://github.com/monero-project/monero/issues/9193#issuecomment-1960024902

And if not, maybe the LMDB class needs to be updated instead of adding yet another layer of locking on top of the existing locks.

I believe if you look at the code carefully, you will see it does not add another layer. It replaces an old locking mechanism, which does not support read-write semantics, with a newer one.

0xFFFC0000 avatar Feb 29 '24 20:02 0xFFFC0000

I believe if you look at the code carefully, you will see it does not add another layer. It replaces an old locking mechanism, which does not support read-write semantics, with a newer one.

Yes, I skimmed the code way too quickly. But still, same question, any idea what these mutexes are protecting?

vtnerd avatar Mar 01 '24 13:03 vtnerd

I believe if you look at the code carefully, you will see it does not add another layer. It replaces an old locking mechanism, which does not support read-write semantics, with a newer one.

Yes, I skimmed the code way too quickly. But still, same question, any idea what these mutexes are protecting?

It provides reader-writer locking for the Blockchain object itself in blockchain.cpp.

0xFFFC0000 avatar Mar 01 '24 13:03 0xFFFC0000

It provides reader-writer locking for the Blockchain object itself in blockchain.cpp.

That object shouldn't need protection, unless the pointer itself changes (like with init). It has locking internally where needed.

It looks like a reader lock isn't needed at all for these other functions; I don't think this is helping anything and is just further cluttering the code.

vtnerd avatar Mar 01 '24 14:03 vtnerd

It provides reader-writer locking for the Blockchain object itself in blockchain.cpp.

That object shouldn't need protection, unless the pointer itself changes (like with init). It has locking internally where needed.

It looks like a reader lock isn't needed at all for these other functions; I don't think this is helping anything and is just further cluttering the code.

No, we still need locking. For example, take a value like this [1].

moneromoooo: If there is no locking but the db txn, monerod could start adding two blocks at once, and this would get written by two threads.
moneromoooo: The main good a rw lock gives us is allowing various RPC requests to come and be served in parallel.
moneromoooo: Currently, they're serialized.
moneromoooo: So the lock is (still) needed, and the conversion to rw is an optimization.

Thanks to @moneromooo-monero for his detailed explanation.

Update: the link was fixed; it should point to m_timestamps_and_difficulties_height.

  1. https://github.com/monero-project/monero/blob/7b7958bbd9d76375c47dc418b4adabba0f0b1785/src/cryptonote_core/blockchain.h#L1170

0xFFFC0000 avatar Mar 01 '24 15:03 0xFFFC0000

As an extra optimization: there are a few methods we can remove locking from, which I still have to do. For example, bool Blockchain::have_tx_keyimg_as_spent(const crypto::key_image &key_im), where the caller should take care of locking if they want consistency. I am going to update those five or six functions. But overall, the extra locking in those functions has no impact on correctness, so it is not directly related to our discussion.

Update: Done. Details at [1].

  1. https://github.com/monero-project/monero/pull/9181#issuecomment-1974702927

0xFFFC0000 avatar Mar 01 '24 15:03 0xFFFC0000

I returned those 8 functions to unlocked status, as @moneromooo-monero mentioned in our private discussion.

Explanation: these functions do not themselves guarantee consistency; if the caller wants consistency between calls, they have to do the locking themselves (see the sketch after the list below).

I left the original warning message:

  // WARNING: this function does not take m_blockchain_lock, and thus should only call read only
  // m_db functions which do not depend on one another (ie, no getheight + gethash(height-1), as
  // well as not accessing class members, even read only (ie, m_invalid_blocks). The caller must
  // lock if it is otherwise needed.

List

  1. bool Blockchain::have_tx(const crypto::hash &id) const [1]
  2. bool Blockchain::have_tx_keyimg_as_spent(const crypto::key_image &key_im) const [2]
  3. uint64_t Blockchain::get_current_blockchain_height() const [3]
  4. crypto::hash Blockchain::get_tail_id() const [4]
  5. crypto::hash Blockchain::get_block_id_by_height(uint64_t height) const [5]
  6. difficulty_type Blockchain::block_difficulty(uint64_t i) const [6]
  7. bool Blockchain::have_block_unlocked(const crypto::hash& id, int *where) const [7]
  8. size_t Blockchain::get_total_transactions() const [8]

Ref:

  1. https://github.com/monero-project/monero/compare/70ce45ae01f9ae68b98a21869782e3e200c2e790..21b68cd92740eccc41848b918d51df535c3b5925#diff-3dcea545b20bc1145f2766e64ca91d4e547d6072462970fd67830e1be6cdf793R114
  2. https://github.com/monero-project/monero/compare/70ce45ae01f9ae68b98a21869782e3e200c2e790..21b68cd92740eccc41848b918d51df535c3b5925#diff-3dcea545b20bc1145f2766e64ca91d4e547d6072462970fd67830e1be6cdf793R124
  3. https://github.com/monero-project/monero/compare/70ce45ae01f9ae68b98a21869782e3e200c2e790..21b68cd92740eccc41848b918d51df535c3b5925#diff-3dcea545b20bc1145f2766e64ca91d4e547d6072462970fd67830e1be6cdf793R270
  4. https://github.com/monero-project/monero/compare/70ce45ae01f9ae68b98a21869782e3e200c2e790..21b68cd92740eccc41848b918d51df535c3b5925#diff-3dcea545b20bc1145f2766e64ca91d4e547d6072462970fd67830e1be6cdf793R707
  5. https://github.com/monero-project/monero/compare/70ce45ae01f9ae68b98a21869782e3e200c2e790..21b68cd92740eccc41848b918d51df535c3b5925#diff-3dcea545b20bc1145f2766e64ca91d4e547d6072462970fd67830e1be6cdf793R771
  6. https://github.com/monero-project/monero/compare/70ce45ae01f9ae68b98a21869782e3e200c2e790..21b68cd92740eccc41848b918d51df535c3b5925#diff-3dcea545b20bc1145f2766e64ca91d4e547d6072462970fd67830e1be6cdf793R2466
  7. https://github.com/monero-project/monero/compare/70ce45ae01f9ae68b98a21869782e3e200c2e790..21b68cd92740eccc41848b918d51df535c3b5925#diff-3dcea545b20bc1145f2766e64ca91d4e547d6072462970fd67830e1be6cdf793R2815
  8. https://github.com/monero-project/monero/compare/70ce45ae01f9ae68b98a21869782e3e200c2e790..21b68cd92740eccc41848b918d51df535c3b5925#diff-3dcea545b20bc1145f2766e64ca91d4e547d6072462970fd67830e1be6cdf793R2859
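
To make the caller-side locking concrete, a hedged sketch (the surrounding code is hypothetical; macro usage as in the examples at the top of this issue): if a caller needs two of the unlocked calls above to observe a consistent view, it takes the read lock around both.

{
    // Hold the read lock across both calls so the height and the hash
    // refer to the same blockchain state (the getheight + gethash(height-1)
    // case the warning above describes).
    RLOCK(m_blockchain);
    const uint64_t height = m_blockchain.get_current_blockchain_height();
    const crypto::hash top_id = m_blockchain.get_block_id_by_height(height - 1);
    // ... use height and top_id together ...
}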

0xFFFC0000 avatar Mar 02 '24 08:03 0xFFFC0000

After carefully reading through your discussion with @vtnerd, unfortunately I still could not form something like a "high level" understanding of the purpose and the aim of this PR. I feel that such an understanding is necessary to judge the "value" of this PR, and for going into any review with correct expectations.

I suspect the following:

With the current code, there are no critical problems with locking. Things are always locked when they should get locked. There are no known races and/or deadlocks; basically, nothing known that you could call a "bug". But because all locks are currently exclusive, even for operations that could run in parallel without any problems, performance of DB access is lower than it could be, as all access gets strictly serialized. By introducing a distinction between non-exclusive read locks and exclusive read/write locks, performance gains can be achieved. For introducing this distinction, no new layers or other drastic complications of the code are needed.

Is my understanding correct?

rbrunner7 avatar Mar 03 '24 09:03 rbrunner7

With the current code, there are no critical problems with locking. Things are always locked when they should get locked. There are no known races and/or deadlocks; basically, nothing known that you could call a "bug". But because all locks are currently exclusive, even for operations that could run in parallel without any problems, performance of DB access is lower than it could be, as all access gets strictly serialized. By introducing a distinction between non-exclusive read locks and exclusive read/write locks, performance gains can be achieved. For introducing this distinction, no new layers or other drastic complications of the code are needed.

Is my understanding correct?

Important: please keep in mind that in the following comment, when I talk about serialization, I mean it in the context of synchronization [1].

@rbrunner7 Yes, you are correct.

  1. The purpose of a read-write lock is to allow fine-grained control over locking [2]. Multiple concurrent readers do not pose any serialization issue, but a writer running together with readers, or multiple writers running together, does pose a serious serialization problem.
  2. The PR replaces the old locking mechanism we had. The old mechanism is an ordinary recursive_mutex, with no support for read-write semantics. This means we lock the entire Blockchain even when it is not required.
  3. Yes, as you said, it is correct that there is no specific bug; that is why I didn't label/mention "bug" for this PR. But it is an extremely inefficient design.
  4. And no new layers or other drastic complications of the code are added.
  5. Keep in mind this is an improvement to the locking mechanism. In the long term, we want a much more fine-grained (read-write) locking mechanism. Even inside Blockchain we sometimes lock an entire method when that is not required. There are suggestions to introduce a single locking mechanism to take care of all of this in a single layer, with arguments for doing it in BlockchainLMDB or Blockchain or elsewhere. But we are nowhere close to that point right now, so that is a long-term goal.

Let me go through a complete example; keep in mind this is for illustration purposes and might omit some details. Imagine 2 threads are listening to RPC:

  1. Thread 1 receives an on_mining_status request. It calls get_difficulty_for_next_block. Now look at the code for get_difficulty_for_next_block: it writes to m_timestamps_and_difficulties_height, so you need a lock to protect that value. Right now we are using this, which is basically boost::recursive_mutex, a simple recursive lock. Now take a look at src/cryptonote_core/blockchain.cpp:884 in the PR (a simplified sketch follows the references below).

Ref:

  1. https://en.wikipedia.org/wiki/Synchronization_(computer_science)
  2. https://en.wikipedia.org/wiki/Readers%E2%80%93writer_lock
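
A simplified sketch of what that spot looks like with this PR's macro; the body is condensed for illustration, and every name here except m_timestamps_and_difficulties_height and the macro is approximate rather than copied from the PR:

difficulty_type Blockchain::get_difficulty_for_next_block()
{
  // Exclusive lock: this method writes the cached
  // m_timestamps_and_difficulties_height, so concurrent readers must be
  // excluded. With the old boost::recursive_mutex, any other thread's
  // purely read-only RPC request would have been blocked here as well.
  RWLOCK(m_lock);
  // ... update m_timestamps_and_difficulties_height and recompute ...
  return next_diff;  // illustrative
}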

0xFFFC0000 avatar Mar 03 '24 09:03 0xFFFC0000

@0xFFFC0000 is correct in that a reader lock is necessary to access the db, since another function could modify the pointer, etc. Although that doesn't seem to happen often, as this would've needed changing a long time ago.

vtnerd avatar Mar 03 '24 14:03 vtnerd

@0xFFFC0000 is correct in that a reader lock is necessary to access the db, since another function could modify the pointer, etc. Although that doesn't seem to happen often, as this would've needed changing a long time ago.

It is not about the pointer. Take a look at the example I provided to rbrunner, or look at the code.

0xFFFC0000 avatar Mar 03 '24 14:03 0xFFFC0000

It is not about the pointer. Take a look at the example I provided to rbrunner, or look at the code.

You added locks where none were present before; that's why I keep going back to that example. But yes, it also changed to reader locks where a full mutex was present before.

vtnerd avatar Mar 03 '24 14:03 vtnerd

Or at least so I thought - I can't find any now.

vtnerd avatar Mar 03 '24 14:03 vtnerd

Sorry for spamming - ::scan_outputkeys_for_indexes was one such example I was thinking of.

vtnerd avatar Mar 03 '24 14:03 vtnerd

I've written this before but I'll also add a comment here. There was or is an issue where the daemon would fall behind if there's really heavy RPC traffic. I've had my node fall behind like 5+ blocks; after some time it would catch up again. The assumption was that it's due to the current lock mechanism. moneromooo suggested a while ago that a read/write lock might help. That's why I suggested that @0xFFFC0000 implement this.

This daemon-falling-behind issue happens rarely and is difficult to reproduce, so I don't know if this PR improves things, but I just wanted to add context on why this was implemented.

selsta avatar Mar 03 '24 14:03 selsta

It is not about the pointer. Take a look at the example I provided to rbrunner, or look at the code.

You added locks where none were present before; that's why I keep going back to that example. But yes, it also changed to reader locks where a full mutex was present before.

Those 8 methods are different. I have put them here [1]. The design we had was to not use any locks there and put the burden of locking/consistency on the caller. That is understandable.

But IMHO we should use the reader lock there too. However, since I don't want to introduce any semantic change to the code, I have already removed them.

Overall that is an extremely tiny part of the PR; maybe 0.5%, or less!

  1. https://github.com/monero-project/monero/pull/9181#issuecomment-1974702927

0xFFFC0000 avatar Mar 03 '24 14:03 0xFFFC0000

I've written this before but I'll also add a comment here. There was or is an issue where the daemon would fall behind if there's really heavy RPC traffic. I've had my node fall behind like 5+ blocks; after some time it would catch up again. The assumption was that it's due to the current lock mechanism. moneromooo suggested a while ago that a read/write lock might help. That's why I suggested that @0xFFFC0000 implement this.

This daemon-falling-behind issue happens rarely and is difficult to reproduce, so I don't know if this PR improves things, but I just wanted to add context on why this was implemented.

This PR might (or might not) directly improve that falling-behind issue. But this is necessary work that we eventually have to take care of. The first step is to introduce a read-write lock. Then we have to reduce the locking surface. In my explanation to @rbrunner7 I mentioned:

  1. For example, there are methods where we lock the entire method, which is unnecessary and decreases performance.
  2. In some of the methods we will be able to use RLOCK instead of RWLOCK.

0xFFFC0000 avatar Mar 03 '24 14:03 0xFFFC0000

Sorry for spamming - ::scan_outputkeys_for_indexes was one such example I was thinking of.

No worries. We are evaluating a serious PR, and these discussions are normal.

Yes, scan_outputkeys_for_indexes, that one has an extra RLOCK. That is a leftover from my tests and should be removed. I don't think any other function has extra locking, since I have intentionally tried not to introduce any logical/semantic changes.

0xFFFC0000 avatar Mar 03 '24 15:03 0xFFFC0000

It is not about the pointer. Take a look at the example I provided to rbrunner, or look at the code.

You added locks where none were present before; that's why I keep going back to that example. But yes, it also changed to reader locks where a full mutex was present before.

Please let me know, because that would be a serious mistake in the code and completely against my design goals. One of my main goals was to not introduce any logical/semantic changes to the code while changing the underlying locking behaviour.

0xFFFC0000 avatar Mar 03 '24 15:03 0xFFFC0000

Updated the reader_writer implementation. Changelog:

  1. Rebased to master and incorporated recent changes from master.
  2. Added a queue to keep the order of operations. Reads are parallel and writes are exclusive, but now the order between read(s) and a write is preserved, so starvation cannot happen (a conceptual sketch follows this list).
  3. Fixed the mistake in ::scan_outputkeys_for_indexes.
  4. Fixed the issue mentioned by @vtnerd about accessing a variable before locking.
  5. Added a few other minor test cases.
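
A conceptual sketch of the ordering idea in item 2, using a ticket counter so readers and writers are admitted in arrival order and a waiting writer cannot starve. This only illustrates the concept; the PR's actual queue implementation differs, and recursion handling is omitted here.

#include <condition_variable>
#include <cstdint>
#include <mutex>

class fair_rw_lock_sketch
{
  std::mutex m;
  std::condition_variable cv;
  uint64_t next_ticket = 0;    // handed out to each arriving reader/writer
  uint64_t now_serving = 0;    // the ticket currently allowed to acquire
  uint32_t active_readers = 0;
  bool writer_active = false;

public:
  void lock_read()
  {
    std::unique_lock<std::mutex> l(m);
    const uint64_t t = next_ticket++;
    cv.wait(l, [&]{ return t == now_serving && !writer_active; });
    ++active_readers;
    ++now_serving;             // admit the next arrival (reader or writer)
    cv.notify_all();
  }
  void unlock_read()
  {
    std::unique_lock<std::mutex> l(m);
    if (--active_readers == 0)
      cv.notify_all();         // a waiting writer may now proceed
  }
  void lock_write()
  {
    std::unique_lock<std::mutex> l(m);
    const uint64_t t = next_ticket++;
    // Readers that arrived later hold larger tickets, so they cannot
    // overtake this writer: no starvation.
    cv.wait(l, [&]{ return t == now_serving && !writer_active && active_readers == 0; });
    writer_active = true;
    ++now_serving;
  }
  void unlock_write()
  {
    std::unique_lock<std::mutex> l(m);
    writer_active = false;
    cv.notify_all();
  }
};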

0xFFFC0000 avatar Mar 04 '24 20:03 0xFFFC0000

I took a quick look at the code and I don't see any issues (yet). I'll check more thoroughly later. What would be really nice is if the unit test were compiled with the -fsanitize=thread -Og -fno-omit-frame-pointer -g compiler flags. Thread sanitizer is a must for this kind of code.

SChernykh avatar Mar 05 '24 14:03 SChernykh

I took a quick look at the code and I don't see any issues (yet). I'll check more thoroughly later. What would be really nice is if the unit test were compiled with the -fsanitize=thread -Og -fno-omit-frame-pointer -g compiler flags. Thread sanitizer is a must for this kind of code.

Thanks, great suggestion. A few things to consider:

  1. We currently compile the test inside the unit_tests executable. So one approach is to separate the lock test and compile it with its own flags.
  2. Another approach is to use the preprocessor to detect the compiler and only add those flags when the compiler is GCC. (I have to check whether these flags can be enabled on a per-function basis with something like __attribute__.)
  3. I have to rename the test, since it is basically a deadlock test rather than a consistency test.

0xFFFC0000 avatar Mar 05 '24 15:03 0xFFFC0000

I think it's better to separate it into another executable. But you can try to hack CMakeLists.txt with something like:

if (CMAKE_CXX_COMPILER_ID MATCHES GNU OR CMAKE_CXX_COMPILER_ID MATCHES Clang)
  set_source_files_properties(reader_writer_lock_tests.cpp PROPERTIES COMPILE_FLAGS "-fsanitize=thread -Og -fno-omit-frame-pointer -g")
endif()

But I'm not sure if such a mix will work properly.

P.S. After a bit more thinking, it's still better to make it a separate binary and run the test with a timeout in CI builds, because it can potentially deadlock.

SChernykh avatar Mar 05 '24 15:03 SChernykh

I think it's better to separate it into another executable. But you can try to hack CMakeLists.txt with something like:

if (CMAKE_CXX_COMPILER_ID MATCHES GNU OR CMAKE_CXX_COMPILER_ID MATCHES Clang)
  set_source_files_properties(reader_writer_lock_tests.cpp PROPERTIES COMPILE_FLAGS "-fsanitize=thread -Og -fno-omit-frame-pointer -g")
endif()

But I'm not sure if such a mix will work properly.

P.S. After a bit more thinking, it's still better to make it a separate binary and run the test with a timeout in CI builds, because it can potentially deadlock.

A separate binary is preferable. Sadly, GTest doesn't provide any kind of timeout, so we have to make it separate.

About the CMake hack: I am not sure, since all of those source files are going to be combined into an object file and optimized by the linker, so passing those flags on a per-source-file basis might not work. Worth trying, though. Will try it.

0xFFFC0000 avatar Mar 05 '24 15:03 0xFFFC0000

Gtest doesn't provide a timeout, but you can set the timeout in .yml files for the CI in https://github.com/monero-project/monero/tree/master/.github/workflows

SChernykh avatar Mar 05 '24 15:03 SChernykh

Gtest doesn't provide a timeout, but you can set the timeout in .yml files for the CI in https://github.com/monero-project/monero/tree/master/.github/workflows

Yes, I agree with you. A separate binary is preferable here.

0xFFFC0000 avatar Mar 05 '24 15:03 0xFFFC0000

Looks like this implementation still suffers from reproducible writer starvation (though I haven't debugged the reason this time). Here's a test case I wrote for it: https://github.com/jeffro256/monero/commit/3ae55c6b85157886b86e303b8ae6701584e1149d. I wrote an alternative implementation for the recursive read/write lock in this commit here: https://github.com/jeffro256/monero/commit/98cb62ae26a9cd8349b98f04cad6ecf64885dad0. It only needs 1 boost::shared_mutex per instance and a map of integers per thread (thread-local, doesn't need to be synchronized). There isn't any wait queue or additional mutexes or condition variables, so it should be a little more lightweight and easier to review IMO. It also fully implements Lockable and SharedLockable C++14 concepts.
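
A minimal sketch of the thread-local-counter idea for the read side (not the commit's actual code; write recursion and write-to-read nesting need extra bookkeeping omitted here):

#include <boost/thread/shared_mutex.hpp>
#include <map>

class recursive_shared_mutex_sketch
{
  boost::shared_mutex m_mutex;

  // Per-thread read recursion depth for each lock instance; thread_local,
  // so it needs no synchronization of its own.
  static std::map<const recursive_shared_mutex_sketch*, int> &depths()
  {
    static thread_local std::map<const recursive_shared_mutex_sketch*, int> d;
    return d;
  }

public:
  void lock_shared()
  {
    int &depth = depths()[this];
    if (depth == 0)            // only the outermost call touches the real mutex
      m_mutex.lock_shared();
    ++depth;
  }

  void unlock_shared()
  {
    int &depth = depths()[this];
    if (--depth == 0)
    {
      depths().erase(this);
      m_mutex.unlock_shared();
    }
  }
};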

jeffro256 avatar Mar 05 '24 22:03 jeffro256

Looks like this implementation still suffers from reproducible writer starvation (though I haven't debugged the reason this time). Here's a test case I wrote for it: jeffro256@3ae55c6. I wrote an alternative implementation for the recursive read/write lock in this commit here: jeffro256@98cb62a. It only needs 1 boost::shared_mutex per instance and a map of integers per thread (thread-local, doesn't need to be synchronized). There isn't any wait queue or additional mutexes or condition variables, so it should be a little more lightweight and easier to review IMO. It also fully implements Lockable and SharedLockable C++14 concepts.

Just to double-check: are you sure you are using the new version? Because the lock I see in your test is the older one.

0xFFFC0000 avatar Mar 05 '24 22:03 0xFFFC0000

Looks like this implementation still suffers from reproducible writer starvation (though I haven't debugged the reason this time). Here's a test case I wrote for it: jeffro256@3ae55c6.

I think your description is not accurate. I tried your test with the new implementation but was not able to show writer starvation. (Just make sure you are updating your clone.)

Here is a modified version of your test [1], which includes serialized prints to debug the order of requests and acquires. It demonstrates that once the writer requests the lock, requests proceed in order, as they should. You can increase the number of readers, and no reader is able to successfully lock after a writer has requested the lock. Please let me know if you think I am making a mistake somewhere.

I wrote an alternative implementation for the recursive read/write lock in this commit here: jeffro256@98cb62a. It only needs 1 boost::shared_mutex per instance and a map of integers per thread (thread-local, doesn't need to be synchronized). There isn't any wait queue or additional mutexes or condition variables, so it should be a little more lightweight and easier to review IMO. It also fully implements Lockable and SharedLockable C++14 concepts.

I have to debug this lock. As we discussed in our private conversation, the problem is not writing a lock, which is a very simple task; the problem is supporting every use case of the recursive locking/unlocking we have in the APIs, while not changing much of the code inside Blockchain. I will create a different draft PR with your lock, and if it passes the tests, we can benchmark it. To be fair though, I have not optimized mine, so there are many optimization opportunities I will add. ;)

  1. https://paste.debian.net/1309586/

0xFFFC0000 avatar Mar 05 '24 23:03 0xFFFC0000

This repo contains performance benchmarks for two lock versions.

Mine is not optimized, and the queue's overhead does show itself in the numbers, while his is much simpler. Overall this is not an accurate benchmark, but it can give us a clue. For testing purposes, I will next use @jeffro256's lock in the PR to see if it passes all the tests of the Blockchain class in blockchain.cpp.

[benchmark results image]

reader_writer_lock is my version, which has passed the tests. recursive_shared_mutex is @jeffro256's submitted version, which hasn't been tested yet.

0xFFFC0000 avatar Mar 07 '24 21:03 0xFFFC0000