rippled
Add configurable NuDB block size feature
High Level Overview of Change
This PR implements configurable NuDB block size support in rippled, allowing operators to optimize storage performance based on their hardware configuration. The feature adds a new nudb_block_size configuration parameter that enables block sizes from 4K to 32K bytes, with comprehensive validation and backward compatibility.
Solution to Issue: https://github.com/XRPLF/rippled/issues/5361
Context of Change
As XRPL network demand grows and ledger sizes increase, the default 4K NuDB block size becomes a performance bottleneck, especially on high-performance storage systems. Modern SSDs and enterprise storage often perform better with larger block sizes, but rippled previously had no way to configure this parameter.
Performance Impact Demonstrated:
- Eliminated transaction queue overflows (0 vs. 958 overflows; tested on a one-year-old validator vs. a new one)
- Improved system stability with minimal state transitions
FIO Benchmarks:
- Writes 16K recordsize
RANDOM WRITE: bw=3738MiB/s (3920MB/s), 3738MiB/s-3738MiB/s (3920MB/s-3920MB/s), io=110GiB (118GB), run=30001-30001msec
Seq. WRITE: bw=6371MiB/s (6681MB/s), 6371MiB/s-6371MiB/s (6681MB/s-6681MB/s), io=128GiB (137GB), run=20573-20573msec
- Writes 4K recordsize
RANDOM WRITE: bw=1038MiB/s (1089MB/s), 1038MiB/s-1038MiB/s (1089MB/s-1089MB/s), io=30.4GiB (32.7GB), run=30002-30002msec
Seq. WRITE: bw=1683MiB/s (1765MB/s), 1683MiB/s-1683MiB/s (1765MB/s-1765MB/s), io=49.3GiB (53.0GB), run=30001-30001msec
Full report: ZFS-16-4-K-Fio-Bench.txt
Type of Change
☑️ New feature (non-breaking change which adds functionality)
☑️ Performance (increase in throughput and/or latency)
☑️ Documentation update
API Impact
☑️ libxrpl change (any change that may affect libxrpl or dependents of libxrpl)
Implementation Details
Code Changes:
- Added `parseBlockSize()` function with robust validation (power of 2, 4K–32K range)
- New configuration parameter: `nudb_block_size` in the `[node_db]` section
- Comprehensive error handling with informative log messages
- Backward compatibility maintained (defaults to 4K if not specified)
- Updated documentation in `rippled-example.cfg` with usage guidelines
Configuration Example:
[node_db]
type=NuDB
path=/var/lib/rippled/db/nudb
nudb_block_size=16384 # 16K blocks for high-performance systems
Validation Rules:
- Must be power of 2 between 4096 and 32768 bytes
- Invalid values fall back to 4096 with warning logs
- Startup logging confirms active block size
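The validation rules above can be sketched as follows. This is a minimal, hypothetical stand-in for illustration, not the actual code in NuDBFactory.cpp:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical simplified sketch of the validation described above:
// the value must be a power of two in [4096, 32768]; invalid values
// fall back to the 4096-byte default.
constexpr std::size_t kDefaultBlockSize = 4096;

std::size_t
parseBlockSize(std::size_t value)
{
    // A power of two has exactly one bit set: value & (value - 1) == 0.
    bool const isPowerOfTwo = value != 0 && (value & (value - 1)) == 0;
    if (!isPowerOfTwo || value < 4096 || value > 32768)
    {
        // rippled would emit a warning log here before falling back.
        return kDefaultBlockSize;
    }
    return value;
}
```

For example, a configured value of 16384 is accepted unchanged, while 5000 (not a power of two) or 65536 (above the 32K limit) would fall back to 4096.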
Performance Recommendations:
- 4096: Default, balanced performance
- 8192: Good compromise for most systems
- 16384: Optimal for high-performance validators
- 32768: Maximum performance on enterprise SSDs / Storage Clusters
This enhancement enables rippled operators to optimize their node performance based on specific hardware capabilities while maintaining full backward compatibility.
Codecov Report
:x: Patch coverage is 90.90909% with 3 lines in your changes missing coverage. Please review.
:white_check_mark: Project coverage is 79.5%. Comparing base (83ee378) to head (2913b18).
:warning: Report is 1 commit behind head on develop.
| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/xrpld/nodestore/Backend.h | 0.0% | 2 Missing :warning: |
| src/xrpld/nodestore/backend/NuDBFactory.cpp | 96.8% | 1 Missing :warning: |
Additional details and impacted files
@@ Coverage Diff @@
## develop #5468 +/- ##
=========================================
- Coverage 79.5% 79.5% -0.0%
=========================================
Files 817 817
Lines 72177 72209 +32
Branches 8279 8280 +1
=========================================
+ Hits 57359 57384 +25
- Misses 14818 14825 +7
| Files with missing lines | Coverage Δ | |
|---|---|---|
| src/xrpld/nodestore/backend/NuDBFactory.cpp | 67.6% <96.8%> (+6.0%) | :arrow_up: |
| src/xrpld/nodestore/Backend.h | 12.5% <0.0%> (-4.2%) | :arrow_down: |
@bthomee @legleux
I have added test coverage and changed the max block size limit to 32K, because that is NuDB's current limit.
Do I need to provide anything further before it can be reviewed?
Thanks in advance!
It seems to do everything it says on the tin from my end. I'm not sure what the real-world performance implications for operators could be; I'm still running some tests with this commit on the devnet.
I've pinged the performance team to take a look also.
Thank you very much, highly appreciate your fast response.
It highly depends on the setup.
In my case, as described above, I have my validator running on top of a ZFS mirrored NVMe cluster where I can create datasets/vdevs on demand with specific block sizes, so I can align my filesystem with the application's recommended block sizes to have ideal performance/disk utilization.
Many enterprise NVMe drives operate with larger native block sizes (4K, 8K, or even 16K+). Using a misaligned 4K block size in NuDB, with suboptimal layers in between (RAID, volume manager, filesystem), means the hardware and software are not used to their fullest, which can in turn affect multiple aspects of performance.
In any case, since NuDB offers the ability to set the block size (as almost any serious database software does, for exactly these reasons), it only makes sense to expose it in rippled's config options.
What happens if one changes the block size setting when one has an existing database?
Just tested it: nothing happens.
The implementation applies the block size when the DB is being created, but does nothing when one is already present.
So if you want to change the block size NuDB uses, you will need to delete the existing DB.
This is also mentioned in the comments.
https://github.com/treeol/rippled/blob/086b9f62d436006b0c485dc6b9c6e2a0fade7bf4/cfg/rippled-example.cfg#L1005
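The create-vs-open behavior described above can be illustrated with a deliberately simplified sketch. This uses a hypothetical miniature header format, not NuDB's actual file layout: the block size is persisted when the store is first created, and on every later open the persisted value wins while the configured one is ignored.

```cpp
#include <cstdint>
#include <cstdio>
#include <fstream>
#include <string>

// Hypothetical miniature "store": the block size is written into a
// header when the file is first created, mirroring how NuDB fixes the
// block size at creation time.
void
createStore(std::string const& path, std::uint32_t blockSize)
{
    std::ofstream out(path, std::ios::binary);
    out.write(reinterpret_cast<char const*>(&blockSize), sizeof(blockSize));
}

// On open, the persisted header value wins and the configured value is
// ignored, which is why changing nudb_block_size has no effect on an
// existing database.
std::uint32_t
openStore(std::string const& path, std::uint32_t configured)
{
    std::ifstream in(path, std::ios::binary);
    std::uint32_t stored = 0;
    if (in.read(reinterpret_cast<char*>(&stored), sizeof(stored)))
        return stored;  // existing DB: the header decides
    createStore(path, configured);
    return configured;  // new DB: the config decides
}
```

Under this model, a store created with 4096-byte blocks still reports 4096 even when later opened with `nudb_block_size=16384` configured, matching the behavior observed above.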
Thank you; clearly I should have RTFM :-). Do you have any further insight into the potential increase[d] memory usage and reduce[d] efficiency for smaller databases? I note from NuDB's README the comment that "The implementation tries to fit as many entries as possible in a key file record, to maximize the amount of useful work performed per I/O." Do you know if there's some inherent sparseness required of these blocks that would cause storage usage to scale proportionally to both block and content size? I'm sort of wondering why we wouldn't just use the max size supported (especially given it's only 32K) and call it a day.
For context: I'm on the 'performance engineering' team for XRPL here at Ripple, and we've been asked to take a look at this change. Given this requires recreating the DB to change the block sizes, we're going to have to do [and thus schedule] some work on our end to allow us to dump + recreate the database state we use for our testing scenarios, so turn-around isn't going to be as fast as I'd like, but I'll do what I can to try and get results back in a time frame that's at least somewhat satisfying.
> ...I'm sort of wondering why we wouldn't just use the max size supported (especially given it's only 32K) and call it a day.
@lmaisons
Long story short, there's no inherent sparseness enforced in NuDB; it always tries to fill blocks as much as possible.
Fixing it at 32K would punish operators whose storage can't provide 32K blocks.
Also, the best value to use is one that matches your underlying storage system's native block size.
Hope that's clarifying.
@ximinez
Thank you very much for the review, comments and your changes!
I have cherry-picked the whole commit; the build and tests ran successfully.
Indeed, using a block size that aligns with the environment (hardware and software such as zfs/recordsize) not only improves performance but also maximizes the service's potential throughput.
We could also argue that currently all server SSDs running rippled are underperforming, as they very likely have LBA sizes larger than 4K.
I've been running my validator with this patch for quite some time now, and it has resolved the issue of occasionally missing ledgers. Before this modification, I could only achieve consistent performance (no missing Ledgers) by using a non-redundant/striped disk configuration.
That said, I should mention that since 2.5.0 my validator regularly misses ledgers daily, but this appears to affect all validators equally on 2.5.0. Still, my validator has one of the lowest rates of missed ledgers.
@Bronek
I also added the experimental message in rippled-example.cfg, but to be completely honest, I don't consider this feature experimental, since it's not a new NuDB capability; we're simply making it accessible within rippled.
Thank you - I understand this is not new to NuDB, and we will remove the warning in a future release once operators have collected enough experience running rippled with different block sizes. For now, using a non-default block size in rippled is experimental, even if proven to work very well in a small number of configurations.
I just noticed that Github is complaining about your commits not being signed. Please set up a signing key, sign the commits in this PR, and set up signature verification with Github.
Thank you very much for your reply. Well, I apologize for the inconvenience, but I messed it up while signing my last commit and fixing the signing after merging from the develop branch... I'm not sure now if this is the right approach. Could we maybe change the branch for this PR?
No worries. The good news is that you can basically start over. Notice that Github has status updates indicating when you did force pushes. The first one indicates that you started with commit f50a0a9. So you can reset to that commit, and start over.
Rough outline:
git reset --hard f50a0a9
git rebase --interactive --rebase-merges
Now go through the list of commits and:
- Remove all the lines that start with `merge`
- Change all the `pick` lines to `edit`
It'll stop at each of your commits. When it does, sign the commit and continue.
git commit --amend --no-edit -S
git rebase --continue
And when it's all done, force push back to this branch.
We don't normally advocate rewriting commit history on non-draft PRs, but since you have to do that to sign anyway, may as well do the whole shebang.
Thank you very much!
I think it looks good now. The commit is signed and the last of this branch.
If there is anything else I can do, please let me know.
Note, we require all of the commits to be signed. You will have to do git commit --amend --no-edit -S for every single commit while you are rebasing.
Another option, since you're rebasing anyway, is to squash all your commits into one, then sign the one.
This was imho the cleanest approach: I rebased, cherry-picked all needed commits, and signed/pushed.
@treeol This looks ready to merge, but part of our process requires the original author to sign off that it's ready after all the approvals are done. Could you just leave a quick comment indicating that this is good to go? And if that first commit message needs to be updated, please also leave the updated commit message here as a comment. Thanks!
Thank you @ximinez and everyone involved for approving, contributing, and helping get this PR merged.
Also from my side, this is good to go.
The initial commit message looks good, so imho no changes/adjustments needed.