
Add configurable NuDB block size feature

Open treeol opened this pull request 5 months ago • 1 comment

High Level Overview of Change

This PR implements configurable NuDB block size support in rippled, allowing operators to optimize storage performance based on their hardware configuration. The feature adds a new nudb_block_size configuration parameter that enables block sizes from 4K to 32K bytes, with comprehensive validation and backward compatibility.

Solution to Issue: https://github.com/XRPLF/rippled/issues/5361

Context of Change

As XRPL network demand grows and ledger sizes increase, the default 4K NuDB block size becomes a performance bottleneck, especially on high-performance storage systems. Modern SSDs and enterprise storage often perform better with larger block sizes, but rippled previously had no way to configure this parameter.

Performance Impact Demonstrated:

  • Eliminated transaction queue overflows (0 vs. 958 overflows, comparing a one-year-old validator with a new one)
  • Improved system stability with minimal state transitions

FIO Benchmarks:

  • Writes 16K recordsize

RANDOM WRITE: bw=3738MiB/s (3920MB/s), 3738MiB/s-3738MiB/s (3920MB/s-3920MB/s), io=110GiB (118GB), run=30001-30001msec

Seq. WRITE: bw=6371MiB/s (6681MB/s), 6371MiB/s-6371MiB/s (6681MB/s-6681MB/s), io=128GiB (137GB), run=20573-20573msec

  • Writes 4K recordsize

RANDOM WRITE: bw=1038MiB/s (1089MB/s), 1038MiB/s-1038MiB/s (1089MB/s-1089MB/s), io=30.4GiB (32.7GB), run=30002-30002msec

Seq. WRITE: bw=1683MiB/s (1765MB/s), 1683MiB/s-1683MiB/s (1765MB/s-1765MB/s), io=49.3GiB (53.0GB), run=30001-30001msec

Full report: ZFS-16-4-K-Fio-Bench.txt

Type of Change

☑️ New feature (non-breaking change which adds functionality)
☑️ Performance (improvement in throughput and/or latency)
☑️ Documentation update

API Impact

☑️ libxrpl change (any change that may affect libxrpl or dependents of libxrpl)

Implementation Details

Code Changes:

  • Added parseBlockSize() function with robust validation (power of 2, 4K-32K range); a sketch follows this list
  • New configuration parameter: nudb_block_size in [node_db] section
  • Comprehensive error handling with informative log messages
  • Backward compatibility maintained (defaults to 4K if not specified)
  • Updated documentation in rippled-example.cfg with usage guidelines
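
A minimal sketch of that validation logic (illustrative only; the PR's actual parseBlockSize() may differ in naming, fallback behavior, and error reporting):

#include <cstddef>
#include <exception>
#include <string>

// Hypothetical sketch, not the PR's exact code: parse the raw
// nudb_block_size value from [node_db] and validate it. Invalid input
// falls back to the 4096-byte default; the caller is expected to log
// a warning in that case.
constexpr std::size_t defaultBlockSize = 4096;
constexpr std::size_t minBlockSize = 4096;
constexpr std::size_t maxBlockSize = 32768;  // current NuDB limit

std::size_t
parseBlockSize(std::string const& value)
{
    std::size_t size = 0;
    try
    {
        size = std::stoull(value);
    }
    catch (std::exception const&)
    {
        return defaultBlockSize;  // non-numeric or unparsable string
    }

    // Must be a power of two within [4096, 32768].
    bool const isPowerOfTwo = size != 0 && (size & (size - 1)) == 0;
    if (!isPowerOfTwo || size < minBlockSize || size > maxBlockSize)
        return defaultBlockSize;

    return size;
}

Falling back rather than aborting matches the backward-compatibility goal: a config without the parameter, or with a bad value, behaves exactly as before.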

Configuration Example:

[node_db]
type=NuDB
path=/var/lib/rippled/db/nudb
nudb_block_size=16384  # 16K blocks for high-performance systems

Validation Rules:

  • Must be a power of 2 between 4096 and 32768 bytes
  • Invalid values fall back to 4096 with warning logs
  • Startup logging confirms active block size

Performance Recommendations:

  • 4096: Default, balanced performance
  • 8192: Good compromise for most systems
  • 16384: Optimal for high-performance validators
  • 32768: Maximum performance on enterprise SSDs / Storage Clusters

This enhancement enables rippled operators to optimize their node performance based on specific hardware capabilities while maintaining full backward compatibility.

treeol avatar Jun 03 '25 16:06 treeol

Codecov Report

Patch coverage is 90.90909% with 3 lines in your changes missing coverage. Please review. Project coverage is 79.5%. Comparing base (83ee378) to head (2913b18). Note: the report is 1 commit behind head on develop.

Files with missing lines                       Patch %   Lines
src/xrpld/nodestore/Backend.h                  0.0%      2 missing
src/xrpld/nodestore/backend/NuDBFactory.cpp    96.8%     1 missing
Additional details and impacted files


@@            Coverage Diff            @@
##           develop   #5468     +/-   ##
=========================================
- Coverage     79.5%   79.5%   -0.0%     
=========================================
  Files          817     817             
  Lines        72177   72209     +32     
  Branches      8279    8280      +1     
=========================================
+ Hits         57359   57384     +25     
- Misses       14818   14825      +7     
Files with missing lines                       Coverage Δ
src/xrpld/nodestore/backend/NuDBFactory.cpp    67.6% <96.8%> (+6.0%)
src/xrpld/nodestore/Backend.h                  12.5% <0.0%> (-4.2%)

... and 4 files with indirect coverage changes



codecov[bot] avatar Jun 06 '25 05:06 codecov[bot]

@bthomee @legleux

I have added test coverage and changed the maximum block size to 32K because that is the current limit of NuDB.

Do I need to provide anything further before it can be reviewed?

Thanks in advance!

treeol avatar Jul 14 '25 13:07 treeol

It seems to do everything it says on the tin from my end. I'm not sure what the real-world performance implications for operators could be. I'm still running some tests with this commit on the devnet.

I've pinged the performance team to take a look also.

legleux avatar Jul 17 '25 19:07 legleux

Thank you very much, highly appreciate your fast response.

It seems to do everything it says on the tin from my end. I'm not sure what the real-world performance implications for operators could be. I'm still running some tests with this commit on the devnet.

I've pinged the performance team to take a look also.

It highly depends on the setup.

In my case, as described above, I have my validator running on top of a ZFS mirrored NVMe cluster where I can create datasets/vdevs on demand with specific block sizes, so I can align my filesystem with the application's recommended block sizes to have ideal performance/disk utilization.

Many enterprise NVMe drives operate with larger native block sizes (4K, 8K, or even 16K+); using a misaligned 4K block size in NuDB, with suboptimal layers in between (RAID, volume manager, filesystem), means the hardware and software are not used to their fullest, which in turn can affect several other aspects of performance.

In any case, since NuDB offers the ability to set the block size (as almost any serious database software does, for exactly these reasons), it only makes sense to add it to rippled's config options.
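
To make the alignment point concrete, here is a minimal sketch (not part of this PR; the path is just an example, and the exact meaning of f_bsize varies by filesystem) of how to query the filesystem's preferred I/O size on a POSIX system:

#include <sys/statvfs.h>
#include <cstdio>

int main()
{
    // Hypothetical example path; substitute your own node_db path.
    struct statvfs vfs;
    if (statvfs("/var/lib/rippled/db/nudb", &vfs) != 0)
    {
        std::perror("statvfs");
        return 1;
    }
    // f_bsize is the filesystem's preferred I/O block size; on many
    // setups this is the value nudb_block_size should align with.
    std::printf("preferred I/O block size: %lu bytes\n",
                static_cast<unsigned long>(vfs.f_bsize));
    return 0;
}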

treeol avatar Jul 17 '25 21:07 treeol

What happens if one changes the block size setting when one has an existing database?

lmaisons avatar Jul 18 '25 00:07 lmaisons

What happens if one changes the block size setting when one has an existing database?

Just tested it: nothing happens.

The implementation applies the block size when the database is created but does nothing when one is already present.

So if you want to change the block size NuDB uses, you will need to delete the existing database.

This is also mentioned in the comments.

https://github.com/treeol/rippled/blob/086b9f62d436006b0c485dc6b9c6e2a0fade7bf4/cfg/rippled-example.cfg#L1005
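
For context, a minimal standalone NuDB sketch (file names and key size are illustrative) showing why: the block size is passed at creation time and recorded in the store's file headers, so opening an existing store simply reads the stored value back:

#include <nudb/nudb.hpp>
#include <cstdint>

int main()
{
    nudb::error_code ec;
    // The block size is fixed when the data/key files are created;
    // opening an existing store reads it from the headers, so a new
    // configuration value has no effect on an existing database.
    nudb::create<nudb::xxhasher>(
        "db.dat", "db.key", "db.log",
        1,                      // appnum
        nudb::make_salt(),      // salt
        sizeof(std::uint64_t),  // key size (illustrative)
        16384,                  // block size, e.g. 16K
        0.5f,                   // load factor
        ec);
    return ec ? 1 : 0;
}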

treeol avatar Jul 18 '25 01:07 treeol

Thank you; clearly I should have RTFM :-). Do you have any further insight into the potential increase[d] memory usage and reduce[d] efficiency for smaller databases? I note from NuDB's readme that they comment: "The implementation tries to fit as many entries as possible in a key file record, to maximize the amount of useful work performed per I/O." Do you know if there's some inherent sparseness required of these blocks that would cause storage usage to scale proportionally to both block and content size? I'm sort of wondering why we wouldn't just use the max size supported (especially given it's only 32K) and call it a day.

For context: I'm on the 'performance engineering' team for XRPL here at Ripple, and we've been asked to take a look at this change. Given this requires recreating the DB to change the block size, we're going to have to do [and thus schedule] some work on our end to allow us to dump and recreate the database state we use for our testing scenarios, so turn-around isn't going to be as fast as I'd like, but I'll do what I can to try to get results back in a time frame that's at least somewhat satisfying.

lmaisons avatar Jul 18 '25 02:07 lmaisons

...I'm sort of wondering why we wouldn't just use the max size supported (especially given it's only 32K) and call it a day.

@lmaisons

Long story short, there's no inherent sparseness enforced in NuDB; it always tries to fill blocks as much as possible.

Fixing it at 32K would penalize operators whose storage cannot provide a 32K block size.

Also, the best value to use is one that matches your underlying storage system's native block size.

Hope that clarifies things.

treeol avatar Jul 21 '25 15:07 treeol

@ximinez

Thank you very much for the review, comments and your changes!

I have cherry-picked the whole commit; the build and tests ran successfully.

Indeed, using a block size that aligns with the environment (hardware, and software such as ZFS recordsize) not only improves performance but also maximizes the service's potential throughput.

We could also argue that currently all server SSDs running rippled are underperforming, as they very likely have LBA sizes larger than 4K.

I've been running my validator with this patch for quite some time now, and it has resolved the issue of occasionally missed ledgers. Before this modification, I could only achieve consistent performance (no missed ledgers) by using a non-redundant/striped disk configuration.

I should mention that since 2.5.0 my validator misses some ledgers daily, but this appears to affect all validators equally on 2.5.0. Even so, my validator still has one of the lowest rates of missed ledgers.

@Bronek

I also added the experimental message in rippled-example.cfg, but to be completely honest, I don't consider this feature experimental, since it's not a new NuDB capability; we're simply making it accessible within rippled.

treeol avatar Aug 12 '25 18:08 treeol

@Bronek

I also added the experimental message in rippled-example.cfg, but to be completely honest, I don't consider this feature experimental, since it's not a new NuDB capability; we're simply making it accessible within rippled.

Thank you - I understand this is not new to NuDB, and we will remove the warning in a future release once operators have collected enough experience running rippled with different block sizes. For now, using a different block size in rippled is experimental, even if it has proven to work very well in a small number of configurations.

Bronek avatar Aug 12 '25 18:08 Bronek

I just noticed that GitHub is complaining about your commits not being signed. Please set up a signing key, sign the commits in this PR, and set up signature verification with GitHub.

Thank you very much for your reply. Well, I apologize for the inconvenience, but I messed it up while signing my last commit and fixing the signing after the merge from the develop branch... I'm not sure now whether this is the right approach. Can we maybe change the branch for this PR?

treeol avatar Aug 28 '25 00:08 treeol

Thank you very much for your reply. Well, I apologize for the inconvenience, but I messed it up while signing my last commit and fixing the signing after the merge from the develop branch... I'm not sure now whether this is the right approach. Can we maybe change the branch for this PR?

No worries. The good news is that you can basically start over. Notice that GitHub shows status updates indicating when you did force pushes. The first one indicates that you started with commit f50a0a9. So you can reset to that commit and redo the history from there.

Rough outline:

git reset --hard f50a0a9
git rebase --interactive --rebase-merges

Now go through the list of commits and

  1. Remove all the lines that start with merge
  2. Change all the pick lines to edit

It'll stop at each of your commits. When it does, sign the commit and continue.

git commit --amend --no-edit -S
git rebase --continue

And when it's all done, force push back to this branch.

We don't normally advocate rewriting commit history on non-draft PRs, but since you have to do that to sign anyway, may as well do the whole shebang.

ximinez avatar Aug 28 '25 23:08 ximinez

Thank you very much for your reply. Well, I apologize for the inconvenience, but I messed it up while signing my last commit and fixing the signing after the merge from the develop branch... I'm not sure now whether this is the right approach. Can we maybe change the branch for this PR?

No worries. The good news is that you can basically start over. Notice that GitHub shows status updates indicating when you did force pushes. The first one indicates that you started with commit f50a0a9. So you can reset to that commit and redo the history from there.

Rough outline:

git reset --hard f50a0a9
git rebase --interactive --rebase-merges

Now go through the list of commits and

  1. Remove all the lines that start with merge
  2. Change all the pick lines to edit

It'll stop at each of your commits. When it does, sign the commit and continue.

git commit --amend --no-edit -S
git rebase --continue

And when it's all done, force push back to this branch.

We don't normally advocate rewriting commit history on non-draft PRs, but since you have to do that to sign anyway, may as well do the whole shebang.

Thank you very much!

I think it looks good now. The commit is signed and is the last one on this branch.

If there is anything else I can do, please let me know.

treeol avatar Aug 29 '25 06:08 treeol

I think it looks good now. The commit is signed and is the last one on this branch.

If there is anything else I can do, please let me know.

Note, we require all of the commits to be signed. You will have to do git commit --amend --no-edit -S for every single commit while you are rebasing.

Bronek avatar Aug 29 '25 08:08 Bronek

I think it looks good now. The commit is signed and is the last one on this branch. If there is anything else I can do, please let me know.

Note, we require all of the commits to be signed. You will have to do git commit --amend --no-edit -S for every single commit while you are rebasing.

Another option, since you're rebasing anyway, is to squash all your commits into one, then sign that one.

ximinez avatar Aug 29 '25 18:08 ximinez

I think it looks good now. The commit is signed and is the last one on this branch. If there is anything else I can do, please let me know.

Note, we require all of the commits to be signed. You will have to do git commit --amend --no-edit -S for every single commit while you are rebasing.

Another option, since you're rebasing anyway, is to squash all your commits into one, then sign that one.

This was imho the cleanest approach: I rebased, cherry-picked all the needed commits, and signed/pushed them.

treeol avatar Sep 01 '25 07:09 treeol

@treeol This looks ready to merge, but part of our process requires the original author to sign off that it's ready after all the approvals are done. Could you just leave a quick comment indicating that this is good to go? And if that first commit message needs to be updated, please also leave the updated commit message here as a comment. Thanks!

ximinez avatar Oct 20 '25 17:10 ximinez

Thank you @ximinez and everyone involved for approving, contributing, and helping to get this PR merged.

Also from my side, this is good to go.

The initial commit message looks good, so imho no changes/adjustments are needed.

treeol avatar Oct 20 '25 21:10 treeol