Use Validator List (VL) cache files in more scenarios
High Level Overview of Change
- If any `[validator_list_keys]` are not available after all `[validator_list_sites]` have had a chance to be queried, then fall back to loading cache files. Currently, cache files are only used if no sites are defined, or the request to one of them has an error. It does not include cases where not enough sites are defined, or if a site returns an invalid VL (or something else entirely). (A sketch of this fallback decision follows the list.)
- Resolves #5320
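Purely for illustration, here is a minimal C++ sketch of that fallback rule. The types and names below (`SiteStatus`, `PublisherKey`, `cacheFilesToLoad`) are simplified stand-ins, not rippled's actual `ValidatorSite`/`ValidatorList` interfaces; the sketch only captures the decision: once every configured site has reported some refresh status, any publisher key that still has no usable list falls back to its cache file, if one exists.

```cpp
// Hedged sketch: simplified stand-in types, not rippled's real interfaces.
#include <optional>
#include <string>
#include <vector>

struct SiteStatus
{
    std::string uri;
    std::optional<std::string> lastRefreshStatus;  // unset until queried
};

struct PublisherKey
{
    std::string key;        // an entry from [validator_list_keys]
    bool listAvailable;     // true once a valid VL has been accepted
    std::string cacheFile;  // e.g. "cache.<key>"; empty if none exists
};

// Return the cache files to load as fallback "sites": only consulted after
// every configured site has had its chance (success, failure, or timeout).
std::vector<std::string>
cacheFilesToLoad(
    std::vector<SiteStatus> const& sites,
    std::vector<PublisherKey> const& keys)
{
    for (auto const& site : sites)
        if (!site.lastRefreshStatus)
            return {};  // at least one site has not been queried yet

    std::vector<std::string> files;
    for (auto const& k : keys)
        if (!k.listAvailable && !k.cacheFile.empty())
            files.push_back(k.cacheFile);
    return files;
}
```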
Context of Change
Validator list cache files are only used if the request to a site fails. That doesn't cover enough possible cases.
Type of Change
- [X] Bug fix (non-breaking change which fixes an issue)
- [X] New feature (non-breaking change which adds functionality)
API Impact
None
Before / After
Makes the following changes (a simplified sketch of this flow follows the list):
- If a validator site request times out (`ValidatorSite::onRequestTimeout`), and there is not yet a `lastRefreshStatus`, it sets one indicating the timeout.
- In `ValidatorSite::setTimer`, which determines which request to send next, if all of the sites have a `lastRefreshStatus`, it calls `missingSite`.
- `missingSite` is unchanged. It calls `ValidatorList::loadLists`, which returns the cache file names for all lists (`[validator_list_keys]`) which are unavailable and for which a cache file exists. Those file names are then passed to `ValidatorSite::load`, which adds them to the `sites_` list. Because those "sites" are new, they will be tried next.
- `ValidatorSite::load` checks for duplicate URIs. Since `sites_` is a `vector`, the check is not very efficient, but this function is only called at startup and via `missingSite`, so it won't be called often, and not in any critical path.
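For readers who prefer code to prose, here is a compact sketch of the flow in the bullets above. Member names follow the description, but signatures and state are simplified placeholders, and the real `missingSite`/`ValidatorList::loadLists` plumbing is reduced to a caller-supplied list of cache URIs (see the selection sketch earlier).

```cpp
// Placeholder sketch of the described ValidatorSite changes; not real code.
#include <algorithm>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

class ValidatorSiteSketch
{
    struct Site
    {
        std::string uri;
        std::optional<std::string> lastRefreshStatus;
    };
    std::vector<Site> sites_;

public:
    // onRequestTimeout: record a status if this site has none yet, so the
    // timer logic can tell the site has been given its chance.
    void
    onRequestTimeout(std::size_t idx)
    {
        if (!sites_[idx].lastRefreshStatus)
            sites_[idx].lastRefreshStatus = "timeout";
    }

    // setTimer: once every site has some lastRefreshStatus, fall back to
    // the cache files for publisher lists that are still unavailable
    // (passed in here as file:// URIs for simplicity).
    void
    setTimer(std::vector<std::string> const& cacheUris)
    {
        bool const allTried = std::all_of(
            sites_.begin(), sites_.end(), [](Site const& s) {
                return s.lastRefreshStatus.has_value();
            });
        if (allTried)
            load(cacheUris);
    }

    // load: add new "sites", skipping duplicate URIs. The linear scan is
    // acceptable because this runs only at startup and on the fallback path.
    void
    load(std::vector<std::string> const& uris)
    {
        for (auto const& uri : uris)
        {
            bool const dup = std::any_of(
                sites_.begin(), sites_.end(), [&](Site const& s) {
                    return s.uri == uri;
                });
            if (!dup)
                sites_.push_back({uri, std::nullopt});
        }
    }
};
```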
Test Plan
Reproduce the scenario from #5320. Verify that the UNL becomes available within a few seconds of startup.
Codecov Report
:white_check_mark: All modified and coverable lines are covered by tests.
:white_check_mark: Project coverage is 79.5%. Comparing base (92281a4) to head (f78c633).
Additional details and impacted files
```
@@           Coverage Diff            @@
##           develop   #5323   +/-   ##
=======================================
  Coverage     79.5%   79.5%
=======================================
  Files          816     816
  Lines        72180   72191    +11
  Branches      8295    8264    -31
=======================================
+ Hits         57364   57387    +23
+ Misses       14816   14804    -12
```
| Files with missing lines | Coverage Δ | |
|---|---|---|
| src/xrpld/app/misc/detail/ValidatorSite.cpp | 93.9% <100.0%> (+1.4%) | :arrow_up: |
I updated my validators.txt to include:
```
[validator_list_sites]
# https://vl.ripple.com/
https://vl.xrplf.org/
https://vl.xrplf.org/

[validator_list_keys]
# Ripple
ED2677ABFFD1B33AC6FBC3062B71F1E8397C1505E1C42C64D11AD1B28FF73F4734
# XRPLF
ED45D1840EE724BE327ABE9146503D5848EFD5F38B6D5FEDE71E80ACCE5E6E738B
```
Notice that it includes the keys for both Ripple and XRPLF, but the URL for Ripple is commented out, and the URL for XRPLF is duplicated.
```
$ rippled -q validator_list_sites
{
   "result" : {
      "status" : "success",
      "validator_sites" : [
         {
            "last_refresh_status" : "same_sequence",
            "last_refresh_time" : "2025-Feb-26 22:40:44.2977547 UTC",
            "next_refresh_time" : "2025-Feb-26 22:45:43.7251469 UTC",
            "refresh_interval_min" : 5,
            "uri" : "https://vl.xrplf.org/"
         },
         {
            "last_refresh_status" : "accepted",
            "last_refresh_time" : "2025-Feb-26 22:25:44.4012575 UTC",
            "next_refresh_time" : "2025-Feb-27 22:25:44.4017257 UTC",
            "refresh_interval_min" : 1440,
            "uri" : "file://[...]/cache.ED2677ABFFD1B33AC6FBC3062B71F1E8397C1505E1C42C64D11AD1B28FF73F4734"
         }
      ]
   }
}
```
Note that there are two sites listed here: XRPLF, and the local cache.
```
$ rippled -q validators
{
   "result" : {
      "publisher_lists" : [
         {
            "available" : true,
            "expiration" : "2026-Jan-17 10:09:21.0000000 UTC",
            "list" : [
               [...]
            ],
            "pubkey_publisher" : "ED2677ABFFD1B33AC6FBC3062B71F1E8397C1505E1C42C64D11AD1B28FF73F4734",
            "seq" : 81,
            "uri" : "file://[...]/cache.ED2677ABFFD1B33AC6FBC3062B71F1E8397C1505E1C42C64D11AD1B28FF73F4734",
            "version" : 2
         },
         {
            "available" : true,
            "expiration" : "2026-Jan-18 00:00:00.0000000 UTC",
            "list" : [
               [...]
            ],
            "pubkey_publisher" : "ED45D1840EE724BE327ABE9146503D5848EFD5F38B6D5FEDE71E80ACCE5E6E738B",
            "seq" : 2025011701,
            "uri" : "https://vl.xrplf.org/",
            "version" : 1
         },
         [...]
```
Note that both lists are available, and that both indicate the same uri as shown in validator_list_sites (which does not necessarily need to be the case).
> Notice that it includes the keys for both Ripple and XRPLF, but the URL for Ripple is commented out, and the URL for XRPLF is duplicated.

What is the scenario that is tested here? Ripple's publisher list was cached at a previous point in time. Presently, that publisher list is unavailable (due to being commented out in the config file). Your changes to the code have resorted to using the cache to load the ValidatorList. Is this an accurate description of the PR?

In what other situations are you going to be using the cache?
> What is the scenario that is tested here?
It's demonstrating two things:
- That the `[validator_list_sites]` and `[validator_list_keys]` are independent. This was already true.
- That if all the sites included in `[validator_list_sites]` are queried and any of the `[validator_list_keys]` still do not have a valid UNL available, the cache file will be loaded. As noted in the PR description, the original behavior will not load the cache in that scenario: "Currently, cache files are only used if no sites are defined, or the request to one of them has an error. It does not include cases where not enough sites are defined, or if a site returns an invalid VL (or something else entirely)."
> Ripple's publisher list was cached at a previous point in time. Presently, that publisher list is unavailable (due to being commented out in the config file). Your changes to the code have resorted to using the cache to load the ValidatorList. Is this an accurate description of the PR?
Yep.
> In what other situations are you going to be using the cache?
"Currently, cache files are only used if no sites are defined, or the request to one of them has an error."
Since this PR has been sitting for so long, I'm thinking that it should be expanded to simply always try to load the VL cache files on startup (a rough sketch follows this list). There are two main reasons:
- The node can immediately start listening for trusted validations - it doesn't even have to wait for the web download to finish.
- It reduces the risk of processing old VLs, because the node will know about the most recent VL it wrote to a cache.
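For concreteness, a hypothetical sketch of that direction: at startup, look for a `cache.<publisher key>` file for every configured key and return whatever exists, so it can be applied before any site is queried. The function name and directory argument are illustrative, not rippled's actual API; the file naming follows the `file://[...]/cache.<key>` URIs shown earlier.

```cpp
// Hypothetical sketch only: names and layout are illustrative.
#include <filesystem>
#include <string>
#include <vector>

// Find the cached VL file for each configured publisher key, if one exists,
// so the lists can be applied immediately at startup.
std::vector<std::filesystem::path>
findCachedLists(
    std::filesystem::path const& cacheDir,
    std::vector<std::string> const& publisherKeys)
{
    std::vector<std::filesystem::path> found;
    for (auto const& key : publisherKeys)
    {
        // Each publisher's cached VL is written to cache.<hex public key>.
        auto const file = cacheDir / ("cache." + key);
        if (std::filesystem::exists(file))
            found.push_back(file);
    }
    return found;
}
```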