rippled
rippled copied to clipboard
TaggedCache: Fix swept object memory release which can occur within the …
…lock
High Level Overview of Change
When a std::shared_ptr is destructed, even if there is no other shared_ptr referencing the same object, the object memory is not necessarily released. This is because std::weak_ptr referencing the same object still need to reference the control block, and the control block is deleted only when the last std::weak_ptr is deleted.
When the object is allocated with std::make_shared, both the control block and the object are allocated within the same malloc block, and that malloc block is released only when no std::shared_ptr or std::weak_ptr reference it. Many commonly used objects are allocated with std::make_shared, including the SHAMap tree nodes.
As a result, the release of the memory used by many objects is still occurring within the lock in the TaggedCache (because it occurs when the std::weak_ptr is destructed).
Note: even prior to this change, the destructor for the shared objects is indeed called outside the lock, since the object destructor is called when the last strong reference (by
shared_ptr) is destructed. It is only the actual memory release which occurred within the lock when thestd::weak_ptris destroyed.
Context of Change
This is a performance issue. It has very likely been present for a very long time. I think this change should increase rippled's performance, as the TaggedCache lock will be held for a shorter time.
Type of Change
- [ x] Bug fix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
- [ ] Refactor (non-breaking change that only restructures code)
- [ ] Tests (You added tests for code that already exists, or your new feature included in this PR)
- [ ] Documentation Updates
- [ ] Release
This change gathers both pointer types ( std::shared_ptr and std::weak_ptr) in vectors which are destroyed outside of the lock, ensuring that the memory deallocation occurs outside the lock, for both object memory and control block, regardless of whether they are allocated together or not.
I ran the unittests (no failure), and verified that rippled syncs correctly. In my opinion the change is simple and straightforward enough that it is unlikely to cause any problem.
I haven't reviewed this closely, but I think we need to clarify at least the comment wording and maybe the commit message:
An object pointed to via std::shared_ptr will go "out of scope" when the last strong pointer to the object is released and that object's destructor WILL be called at that time.
Now, the underlying memory may not be released until the last weak pointer to the object goes away since, as you say, a std::weak_ptr to an object needs to reference the control block. So if a control block and the object where allocated in one go (e.g. via std::make_shared) the memory can only be released in one go as well.
Yes @nbougalis you are 100% correct, thanks for clarifying, I have updated the PR description.
Thanks @nbougalis I merged the commits and updated the message.
Is [[maybe_unused]] preferable to boost::ignore_unused()?
On Tue, Jul 26, 2022 at 6:19 PM Howard Hinnant @.***> wrote:
@.**** commented on this pull request.
In src/ripple/basics/TaggedCache.h https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_XRPLF_rippled_pull_4242-23discussion-5Fr930535303&d=DwMCaQ&c=ZYMNz-RjyvnVQtP-Q-05BQ&r=xf7_p1Sq69lFp8AHmF5CLmVUpm17oePTPvV4it5Y9ZQ&m=Hxze0eXR13FWKIhcENKCh8PhCsF8ekW83APAexBetgLv3zQM3ZR6rEDBN3-S1_Wb&s=Zi-J5e9Mx8RyhsdYWhYsKJGC_f2xysEB1F_V65W3kXI&e= :
@@ -722,17 +727,15 @@ class TaggedCache clock_type::time_point const& when_expire, clock_type::time_point const& now, typename KeyOnlyCacheType::map_type& partition,
std::vector<std::shared_ptr<mapped_type>>& stuffToSweep,std::atomic<int>& allRemovals,std::lock_guard<std::recursive_mutex> const& lock)
SweptPointersVector& stuffToSweep,Hmm... making guidelines up on the fly, which is never a good idea. But perhaps remove the variable name if it is definitely never used, and use [[maybe_unused]] if it might or might_not be used, such as in an assert or under a ifdef.
— Reply to this email directly, view it on GitHub https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_XRPLF_rippled_pull_4242-23discussion-5Fr930535303&d=DwMCaQ&c=ZYMNz-RjyvnVQtP-Q-05BQ&r=xf7_p1Sq69lFp8AHmF5CLmVUpm17oePTPvV4it5Y9ZQ&m=Hxze0eXR13FWKIhcENKCh8PhCsF8ekW83APAexBetgLv3zQM3ZR6rEDBN3-S1_Wb&s=Zi-J5e9Mx8RyhsdYWhYsKJGC_f2xysEB1F_V65W3kXI&e=, or unsubscribe https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_notifications_unsubscribe-2Dauth_AB26YHLW2QYISSYOTXCC4QTVWCFA7ANCNFSM534XPXLQ&d=DwMCaQ&c=ZYMNz-RjyvnVQtP-Q-05BQ&r=xf7_p1Sq69lFp8AHmF5CLmVUpm17oePTPvV4it5Y9ZQ&m=Hxze0eXR13FWKIhcENKCh8PhCsF8ekW83APAexBetgLv3zQM3ZR6rEDBN3-S1_Wb&s=7HOMTUo6Qzyy_B13HOFgUcWUz0uMIBsgMSz5QYoQeI4&e= . You are receiving this because your review was requested.Message ID: @.***>
Yes, in C++17 and later, because it is in the C++ language.
pre-C++17 I would actually prefer:
(void)foo;
over:
boost::ignore_unused(foo);
The void solution is a decades old pattern that is well understood and doesn't depend on a library. I don't know what problem boost::ignore_unused solves.
@HowardHinnant @nbougalis @ximinez @mtrippled I think I would like to abandon this PR.
Because of the effort that was made to release the SHAMAPTreeNodes outside the lock, I was under the impression that holding the cache lock for the minimum time possible was of the utmost importance, and that is why I proposed this change. It seemed like a good idea, but I did not test whether it the change had any impact on performance.
However, after submitting the PR, I experimented with replacing the hardened_partitioned_hash_map with another hash map of mine which doesn't hold a global lock, but a lock for each partition (basically this commit ), and this should have eliminated most locking, but the time for rippled to sync increased slightly from 4 minutes to almost 5 minutes (maybe this is not the right benchmark), so I am not sure anymore that reducing slightly the locking duration is critical.
Please let me know what you guys think. Thanks!
@greg7mdp I think that this improvement is beneficial, but probably not measurably so. It can't hurt--I always assumed that an expired weak_ptr had already been destroyed, but I guess that's not necessarily the case. I don't see a problem with keeping this, and I've already approved it.
Also, @greg7mdp you can measure the impact of your change with normal operation by grepping for "sweep lock duration" from the debug log. That measures the duration between the first partition lock being acquired and the last partition lock being freed. There are several caches like this, but the worst one is the "tree node cache". Any decrease in lock duration is a good thing, and this is a way to easily measure any changes.
I agree with @mtrippled. I don't think this should be abandoned unless the sweep messages indicate degraded performance.
OK, fair enough and thanks @mtrippled and @ximinez , I'll make the requested changes.