Intrusive shamap inner final
High Level Overview of Change
This PR finalises the work authored by Scott Determan (https://github.com/seelabs) and is based on the original PR (https://github.com/XRPLF/rippled/pull/4815).
Context of Change
There are two goals:
- Synchronise this change with the most recent
developbranch. - Address outstanding questions raised in the original PR.
Type of Change
- [ ] Bug fix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
- [x] Refactor (non-breaking change that only restructures code)
- [x] Performance (increase or change in throughput and/or latency)
- [ ] Tests (you added tests for code that already exists, or your new feature included in this PR)
- [ ] Documentation update
- [ ] Chore (no impact to binary, e.g.
.gitignore, formatting, dropping support for older tooling) - [ ] Release
Codecov Report
Attention: Patch coverage is 86.05769% with 116 lines in your changes missing coverage. Please review.
Project coverage is 78.1%. Comparing base (
2bc5cb2) to head (d4a4344). Report is 1 commits behind head on develop.
Additional details and impacted files
@@ Coverage Diff @@
## develop #5152 +/- ##
========================================
Coverage 78.1% 78.1%
========================================
Files 790 795 +5
Lines 67988 68447 +459
Branches 8251 8277 +26
========================================
+ Hits 53092 53482 +390
- Misses 14896 14965 +69
| Files with missing lines | Coverage Δ | |
|---|---|---|
| include/xrpl/basics/TaggedCache.h | 100.0% <100.0%> (+12.8%) |
:arrow_up: |
| src/xrpld/app/ledger/detail/LedgerMaster.cpp | 44.0% <ø> (ø) |
|
| src/xrpld/app/ledger/detail/TransactionMaster.cpp | 73.8% <ø> (ø) |
|
| src/xrpld/app/main/Application.h | 100.0% <ø> (ø) |
|
| src/xrpld/ledger/detail/CachedView.cpp | 100.0% <ø> (ø) |
|
| src/xrpld/nodestore/Database.h | 69.2% <ø> (ø) |
|
| src/xrpld/shamap/SHAMap.h | 100.0% <ø> (ø) |
|
| src/xrpld/shamap/SHAMapAccountStateLeafNode.h | 100.0% <100.0%> (ø) |
|
| src/xrpld/shamap/SHAMapInnerNode.h | 88.2% <ø> (ø) |
|
| src/xrpld/shamap/SHAMapLeafNode.h | 100.0% <ø> (ø) |
|
| ... and 16 more |
:rocket: New features to boost your workflow:
- :snowflake: Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
- :package: JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.
Analysis of Reference Count Ranges for Intrusive Smart Pointers
Background
Following the conversation in the original PR (Intrusive shamap inner (SHAMapTreeNode memory reduction) by seelabs · Pull Request #4815 · XRPLF/rippled ), it was raised that unlike the standard library shared_ptr and weak_ptr, the newly introduced intrusive versions have narrower ranges for storing reference counts. The proposed change sets ranges as:
- For strong references: 65535
- For weak references: 16383
Questions
- The task is to do a code audit and prepare tests to check possible maximum reference number counts that can occur in the current version of rippled.
- Decide if the proposed ranges are enough for the current version and the near future. It is possible to increase ranges in the future, while the move to intrusive smart pointers will still be beneficial.
Code audit
Strong references
From analyzing the code:
Theoretical Maximum = (shareChild calls) X (number of ledgers containing the same node)
where
shareChild calls - the shareChild calls during tree traversal (walkSubTree).
number of ledgers containing the same node - while generating a ledger, the same transaction might be added to several versions of the ledger until one of them gets accepted by consensus. Therefore, the same node may get referenced from multiple trees representing different ledger versions.
Worst-case scenario:
- shareChild calls during tree traversal = 2
- Given network of 35 validators
- 5-second interval to reach a consensus
- 15-second interval deadline before network reset
Theoretical Maximum value: 210 = 2 X (15 / 5) X 35
Weak references
Class WeakIntrusive is not used explicitly or implicitly at the moment. The only place where the weak pointer is used is in the conversion from strong to weak when sweeping the TaggedCache. This means that the number of weak reference counts can never be higher than the number of strong references.
Tests
Temporary code changes
- [WIP ]Intrusive shamap inner final by vlntb · Pull Request #5152 · XRPLF/rippled
- [WIP ]Intrusive shamap inner final by vlntb · Pull Request #5152 · XRPLF/rippled
Test runs
- 12 rippled sessions ranging in duration from 1 hr to 24 hrs
- Network: livenet
- State: proposing
Test results
- Maximum number of strong references: 387
- Maximum number of weak references: 1
The test result of 387 references being observed is much higher than the theoretical maximum. This suggests:
There may be excessive copying of nodes during the initialization phase or traversal.
Certain caching mechanisms (like TaggedCache or similar) can have an effect on the reference count.
Conclusion
- Strong Reference Count Range: The proposed limit of 65535 is more than adequate, and the logic for calculating the theoretical maximum (210) is sound. The observed discrepancy (387) highlights a need to investigate potential inefficiencies in node copying or caching.
- Weak Reference Count Range: The proposed limit of 16383 is also sufficient, and the observed maximum (1) confirms that weak references are minimal under current usage patterns.
- Actionable Insight: The excessive copying or caching logic leading to 387 references warrants further investigation to improve efficiency.
The theoretical maximal value for string references is calculated to be 210. Experimental evidence also from the readme detects a value above the theoretical maximum: 387. I ran a server for about an hour and detected a max of 1908.
These are all well below the limits of 65535, so this limit is probably safe. But it wouldn't hurt to revisit the theoretical maximum and discover why it is incorrect.
I did additional digging following a comment from @HowardHinnant. What I didn't take into account is that rippled is processing transaction or ledger data in a concurrent environment. I identified four types of routines that can happen in parallel:
- InboundLedgersImp::gotLedgerData
- SHAMap::walkTowardsKey
- SHAMap::flushDirty
- LedgerMaster::gotFetchPack
Two of those routines are executed from the JobQueue and can be parallized further based on the node_size configuration parameter. The difference in this parameter explains the difference in the result that Howard and I received. Howard had his node_size set as huge, resulting in 8 threads in the JobQueue pool. In my setup, I had it defined as medium, resulting in 4 threads.
Based on those findings, we should update the Theoretical Maximum value.
- Theoretical Maximum value for a single thread: 210 = 2 X (15 / 5) X 35.
- InboundLedgersImp::gotLedgerData can be executed across a maximum of 8 threads.
- SHAMap::walkTowardsKey - 1 thread
- SHAMap::flushDirty - 1 thread
- LedgerMaster::gotFetchPack - can be executed across a maximum of 8 threads.
Giving overall Theoretical Maximum value as: 8 x 210 + 210 + 210 + 8 x 210 = 3780.
This is still significantly lower than the allocated 65535 range.
See the original PR and review comments here: https://github.com/XRPLF/rippled/pull/4815
Should these flags be defined in the top level CMakeLists.txt instead of cmake/RippledCompiler.cmake? Eventually, I'd like to move all this to CMakePresets.json anyway.
-Wno-subobject-linkage produces
cc1: warning: command-line option ‘-Wno-subobject-linkage’ is valid for C++/ObjC++ but not for C
nit: Why check each compiler if they're mutually exclusive (if/if/if)?
nit: the if ( i.e. <CMakeFunction>[::space::]rippled but the docs and most newer CMake specify no space so I've been trying to stick to that and slowly convert it everywhere I see it.
Should these flags be defined in the top level
CMakeLists.txtinstead ofcmake/RippledCompiler.cmake? Eventually, I'd like to move all this toCMakePresets.jsonanyway.
-Wno-subobject-linkageproducescc1: warning: command-line option ‘-Wno-subobject-linkage’ is valid for C++/ObjC++ but not for Cnit: Why check each compiler if they're mutually exclusive (if/if/if)?
nit: the
if (i.e. [::space::] is all over the CMake forrippledbut the docs and most newer CMake specify no space so I've been trying to stick to that and slowly convert it everywhere I see it.
- The reason for having those flags on the current level is that other
add_compile_optionsare already defined there. We can, in the future, move everything to the level up, but we should do this for everything at once. - Addressed exclusivity comment - moved to if-elseif-ifelse.
- Removed space after
ifand an empty line to follow the convention.