NOT_FOUND: beacon state at slot
Description
After updating to v8.0.1, I encounter a "NOT FOUND" error when trying to check the balance of a slot from the previous day. I haven't modified any config values. Are there any additional config values that need to be added?
curl post /eth/v1/beacon/states/13092597/validator_balances with ["0x..."]
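For reference, the request above could be written out roughly as follows. This is only a sketch: the base URL `http://localhost:5052` is assumed from the port used elsewhere in this thread, `build_balances_request` is a hypothetical helper name, and `"0x..."` stands in for the elided validator pubkey.

```python
import json
import urllib.request


def build_balances_request(state_id, validator_ids, base_url="http://localhost:5052"):
    """Build (but do not send) a POST request for validator balances at a state.

    base_url is an assumption; point it at your own beacon node's HTTP API.
    """
    url = f"{base_url}/eth/v1/beacon/states/{state_id}/validator_balances"
    # The request body is a JSON array of validator pubkeys or indices.
    body = json.dumps(list(validator_ids)).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_balances_request(13092597, ["0x..."])
# Sending it would be urllib.request.urlopen(req); a 404 response like the one
# reported below surfaces in Python as urllib.error.HTTPError.
```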
Version
v8.0.1
Present Behaviour
{"code":404,"message":"NOT_FOUND: beacon state at slot 13092597","stacktraces":[]}
Expected Behaviour
Return normal response with balances.
Steps to resolve
State pruning has been enabled by default for a while. This shouldn't have changed in v8 unless you were updating from something like v5. If you need historical states, you need to opt in to storing them. The easiest thing to do is run an archive node to keep all states (needs around 500GB total): https://lighthouse-book.sigmaprime.io/advanced_checkpoint_sync.html#how-to-run-an-archived-node
You can also do some tricks to just store more recent states: https://lighthouse-book.sigmaprime.io/faq.html#bn-partial-history
I am using the --reconstruct-historic-states option now, but the state is not found.
@MinwooJ Can you share output from curl -s http://localhost:5052/lighthouse/database/info | jq?
It could be that state reconstruction hasn't finished. Maybe you checkpoint synced recently and lost the DB?
State reconstruction takes about a week
{
  "schema_version": 28,
  "config": {
    "block_cache_size": 0,
    "state_cache_size": 128,
    "state_cache_headroom": 1,
    "compression_level": 1,
    "historic_state_cache_size": 1,
    "cold_hdiff_buffer_cache_size": 16,
    "hot_hdiff_buffer_cache_size": 1,
    "compact_on_init": false,
    "compact_on_prune": true,
    "prune_payloads": true,
    "backend": "LevelDb",
    "hierarchy_config": {
      "exponents": [
        5,
        9,
        11,
        13,
        16,
        18,
        21
      ]
    },
    "prune_blobs": true,
    "epochs_per_blob_prune": 256,
    "blob_prune_margin_epochs": 0
  },
  "split": {
    "slot": "13101632",
    "state_root": "0x422df43d8d9c8389a44cad0f9fd84e510740e4d627c4d9fa6c345d837618b6bb",
    "block_root": "0x966848eac4eb70b3d97c30975c5c2ccba88690a501914275bb193e3f6f190ea4"
  },
  "anchor": {
    "anchor_slot": "13060672",
    "oldest_block_slot": "9653696",
    "oldest_block_parent": "0xe7f40110077a9dbb94d82a97cb792349a3ba9781833a498b154759ddf02924cf",
    "state_upper_limit": "14680064",
    "state_lower_limit": "0"
  },
  "blob_info": {
    "oldest_blob_slot": "12964128",
    "blobs_db": true
  }
}
Yes, I have resynced recently. This is what I got. How can I check whether reconstruction is done? I could not tell from the logs that reconstruction had not finished.
Based on the document you shared, I understand that archiving occurs for slots after the state-upper-limit (a multiple of 2^21). Currently, it is set to 14,680,064 (2^21 * 7), but I think setting it to 12,582,912 (2^21 * 6) might work for me. Is there a way to configure this state-upper-limit value?
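The arithmetic here can be checked with a short sketch. It assumes the state upper limit is the first multiple of 2^21 at or after the anchor slot, which is consistent with the `anchor_slot` and `state_upper_limit` values in the database info above. The helper names are hypothetical and not part of Lighthouse.

```python
SNAPSHOT_INTERVAL = 2 ** 21  # 2,097,152 slots between aligned snapshots


def snapshot_at_or_before(slot):
    """Largest multiple of 2^21 that is <= slot."""
    return (slot // SNAPSHOT_INTERVAL) * SNAPSHOT_INTERVAL


def snapshot_at_or_after(slot):
    """Smallest multiple of 2^21 that is >= slot (ceiling division)."""
    return -(-slot // SNAPSHOT_INTERVAL) * SNAPSHOT_INTERVAL


anchor_slot = 13060672  # "anchor_slot" from the database info above

print(snapshot_at_or_after(anchor_slot))   # 14680064, i.e. 2**21 * 7
print(snapshot_at_or_before(anchor_slot))  # 12582912, i.e. 2**21 * 6
```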
Is there a way to configure this state-upper-limit value?
Yeah, read the section on partial history: https://lighthouse-book.sigmaprime.io/faq.html#bn-partial-history. You need to download that state and its block from somewhere (a public endpoint should work), and then sync from it while using the flag --reconstruct-historic-states. In this case the flag just acts to tell Lighthouse to not immediately begin pruning.
Yes, I have resynced recently. This is what I got. How can I check whether reconstruction is done? I could not tell from the logs that reconstruction had not finished.
In the logs right now you should see Downloading historical blocks. Once that finishes, it would start logging messages like State reconstruction in progress. You can speed up the block backfill with --disable-backfill-rate-limiting, although this can lead to performance degradation if you don't have enough IOPS.
Thank you for your response. I have one more question.
After re-syncing Lighthouse, does archiving start from the point when it is synced to the latest slot, or does it begin only after the reconstruct is completed? I'm asking because even a slot from an hour ago is showing "NOT FOUND."
Did you resync from the 2^21-aligned snapshot? It sounds like you might have checkpoint synced again from a more recent state, which won't result in any additional states being available (unless you do full reconstruction, which starts from slot 0).
In fact "reconstruction" always starts from slot 0. However, when reconstruction is enabled it causes the "anchor" to be initialised with a state upper limit equal to the next snapshot (multiple of 2^21) on or after the checkpoint. This has the effect of retaining states after that point (the state upper limit) while syncing forwards to the head. If you look at /lighthouse/database/info you can see what the state upper limit is set to. If it's a slot in the future then your node doesn't have any historical states. If you checkpoint sync correctly from that past state, whose slot is a multiple of 2^21, you should see that the state upper limit is equal to that multiple of 2^21. You will have to wait for forwards sync to reach the head in this case though, which could take multiple days.
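To make that concrete, here is a rough sketch of how the `/lighthouse/database/info` output could be read to guess whether a given slot's state is retained. It is an interpretation of the explanation above, not Lighthouse's actual lookup logic: it assumes historical states exist at or below `state_lower_limit` and at or above `state_upper_limit`, and it additionally assumes that states at or after the `split` slot are still held in the hot database. `state_expected_available` is a hypothetical name.

```python
def state_expected_available(slot, anchor, split_slot):
    """Guess whether the state at `slot` is retained, given database info.

    Assumptions (not confirmed by the Lighthouse source):
      - slots >= split_slot are served from the hot database;
      - older slots are retained only if <= state_lower_limit
        or >= state_upper_limit.
    """
    lower = int(anchor["state_lower_limit"])
    upper = int(anchor["state_upper_limit"])
    if slot >= split_slot:
        return True
    return slot <= lower or slot >= upper


# Values copied from the database info posted earlier in this thread.
anchor = {"state_lower_limit": "0", "state_upper_limit": "14680064"}
split_slot = 13101632

# The slot from the original report falls in the gap, so a 404 is expected.
print(state_expected_available(13092597, anchor, split_slot))

# After syncing from the 2^21-aligned snapshot, the upper limit would be
# 12582912 and the same slot would fall inside the retained range.
aligned_anchor = {"state_lower_limit": "0", "state_upper_limit": "12582912"}
print(state_expected_available(13092597, aligned_anchor, split_slot))
```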
Hope that helps!
@michaelsproul Thank you for your answers.
Did you resync from the 2^21-aligned snapshot?
No, I just used the "--checkpoint-sync-url=https://beaconstate.ethstaker.cc" and "--reconstruct-historic-states" options. How can I use a 2^21-aligned snapshot? Which option should be set? I would like to try a 2^21-aligned snapshot for quick partial archiving.
Hi @MinwooJ, you can download the checkpoint state and block from my server here:
- http://sproul.xyz/eth2/slot_12582912/block_12582912.ssz
- http://sproul.xyz/eth2/slot_12582912/state_12582912.ssz
Then provide these with --checkpoint-state state_12582912.ssz --checkpoint-block block_12582912.ssz --reconstruct-historic-states flags. Unfortunately I couldn't find the blobs for this slot anywhere, so I've made a branch here to make Lighthouse skip the blob check:
- https://github.com/sigp/lighthouse/pull/8470
It should be merged to unstable soon. Then you can build the unstable branch, if you prefer not to build my branch.
Usually, you could download these states from a public archive node like QuickNode. Nimbus also provides public nodes here:
https://github.com/status-im/nimbus-eth2/?tab=readme-ov-file#quickly-test-your-tooling-against-nimbus
@michaelsproul After "Downloading historical blocks" is completed, I still receive a NOT_FOUND response. Do I need to wait until "State reconstruction in progress slot: 6685184, remaining: 6480223" is also completed?
Do I need to wait until "State reconstruction in progress slot: 6685184, remaining: 6480223" is also completed?
Yes. Keep an eye on the state_lower_limit in the /lighthouse/database/info response.
@michaelsproul Thank you for the answer! What should the state_lower_limit be set to? Also, is there a way to estimate how long the state reconstruction in progress will take?
What should the state_lower_limit be set to?
It should be climbing as your node reconstructs more states.
Also, is there a way to estimate how long the state reconstruction in progress will take?
You can measure the time between two State reconstruction in progress logs and calculate based on that. Lighthouse doesn't include an estimate.
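A minimal sketch of that calculation, assuming you note the timestamp and the `remaining` value from two State reconstruction in progress log lines and that the rate stays roughly constant. The function name is hypothetical, and the second sample below is made up for illustration (only the first `remaining` value comes from this thread).

```python
from datetime import datetime, timedelta


def estimate_time_left(t1, remaining1, t2, remaining2):
    """Linear estimate of time left from two (timestamp, remaining) samples."""
    elapsed = (t2 - t1).total_seconds()
    processed = remaining1 - remaining2  # slots reconstructed between samples
    if elapsed <= 0 or processed <= 0:
        raise ValueError("need two samples in order, with progress between them")
    slots_per_second = processed / elapsed
    return timedelta(seconds=remaining2 / slots_per_second)


# First sample uses the `remaining` value quoted above; the timestamps and
# the second sample are hypothetical placeholders.
first = (datetime(2025, 1, 1, 12, 0, 0), 6480223)
second = (datetime(2025, 1, 1, 12, 1, 0), 6470223)

print(estimate_time_left(first[0], first[1], second[0], second[1]))
```

The estimate is only as good as the assumption of a constant rate, so it is worth re-sampling occasionally.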