Safe non-finalized checkpoint sync
Issue Addressed
Allows Lighthouse to bootstrap into a state for a checkpoint that is not finalized. This feature could save Ethereum Mainnet when shit hits the fan (= long period of non-finality).
Why can't we just checkpoint sync to a non-finalized checkpoint today?
- You can't import any new block into fork-choice
- You can't propose blocks
- You can't sync
Why, and how to solve it? Keep reading :)
Proposed Changes
Let's consider a node that wants to bootstrap from Checkpoint A.
The node has 3 checkpoints of interest, and we will use the following naming conventions:
- Finalized checkpoint: The network's finalized checkpoint, or finalized_checkpoint.on_chain()
- Justified checkpoint: The network's justified checkpoint, or justified_checkpoint.on_chain()
- Checkpoint A: The checkpoint sync checkpoint, or anchor state, or local irreversible checkpoint, matching both finalized_checkpoint.local() and justified_checkpoint.local()
Different parts of Lighthouse want to use either the network checkpoints (on_chain) or the local view of the node (local). To force consumers to think about which one they want, the fork-choice now exposes ForkChoiceCheckpoint instead of just Checkpoint:
```rust
pub enum ForkChoiceCheckpoint {
    Local {
        local: Checkpoint,
        on_chain: Checkpoint,
    },
    OnChain(Checkpoint),
}
```
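As a rough, hedged sketch (not necessarily the exact API added by this PR), consumers could read whichever view they need through accessors like these, assuming Lighthouse's types::Checkpoint:

```rust
// Hedged sketch: illustrative accessors matching the `.local()` / `.on_chain()`
// naming used above; the exact methods in the PR may differ.
impl ForkChoiceCheckpoint {
    /// The node's local view: the anchor / irreversible checkpoint when it
    /// diverges from the network's view, otherwise the network checkpoint.
    pub fn local(&self) -> Checkpoint {
        match self {
            ForkChoiceCheckpoint::Local { local, .. } => *local,
            ForkChoiceCheckpoint::OnChain(checkpoint) => *checkpoint,
        }
    }

    /// The network's view, regardless of any local override.
    pub fn on_chain(&self) -> Checkpoint {
        match self {
            ForkChoiceCheckpoint::Local { on_chain, .. } => *on_chain,
            ForkChoiceCheckpoint::OnChain(checkpoint) => *checkpoint,
        }
    }
}
```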
The most relevant places where we use these checkpoints are:
| Item | Checkpoint to use |
|---|---|
| Gossip verification, reject objects older than the finalized checkpoint | local |
| Fork-choice irreversible checkpoint, reject blocks that do not descend from this checkpoint | local |
| Fork-choice filter tree function, only heads that descend from this block are viable | local |
| Fork-choice filter tree function, only heads with correct finalization and justification | on_chain |
| Casper FFG votes, source checkpoint | on_chain |
| Status message | on_chain |
| Sync forward range sync start epoch | local |
| Beacon HTTP API finalized + justified tag | on_chain |
Let me justify each one. On unstable, every item would use local, and I'll explain why that breaks.
Gossip verification, reject objects older than the finalized checkpoint
We can't import blocks or objects that don't descend from our anchor state because we don't have the pre-states. We need to use local since we may not have the on_chain finalized state available.
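As a hedged sketch of that rule (the function name and wiring are illustrative, not the PR's actual code):

```rust
// Illustrative only: gossip verification rejects objects older than the
// *local* finalized checkpoint, since pre-states older than the anchor are
// simply not available in the store.
fn is_prior_to_local_finalization(
    object_epoch: Epoch,
    finalized_checkpoint: &ForkChoiceCheckpoint,
) -> bool {
    object_epoch < finalized_checkpoint.local().epoch
}
```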
Fork-choice irreversible checkpoint, reject blocks that do not descend from this checkpoint
Same as above: reject blocks that conflict with our local irreversible checkpoint.
Fork-choice filter tree function, only heads that descend from this block are viable
While we could use the on_chain justified checkpoint here, we don't have its ProtoNode available. To reduce the diff in the fork-choice code, we use the local one. However, it's always true that justified_checkpoint.local is equal to or a descendant of justified_checkpoint.on_chain.
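A hedged sketch of that descendancy filter, assuming the proto-array's is_descendant helper (illustrative; the real call sites in fork choice differ):

```rust
// Illustrative only: a head candidate is viable only if it descends from the
// block of the *local* justified checkpoint, whose ProtoNode is guaranteed to
// exist after checkpoint sync (the on-chain one may be absent).
fn descends_from_local_justified(
    proto_array: &ProtoArrayForkChoice,
    head_block_root: Hash256,
    justified_checkpoint: &ForkChoiceCheckpoint,
) -> bool {
    proto_array.is_descendant(justified_checkpoint.local().root, head_block_root)
}
```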
Fork-choice filter tree function, only heads with correct finalization and justification
Our ProtoNode objects track the finalized and justified checkpoints of their states. Those are the on_chain ones, so to make those blocks viable we need to compare against the on_chain checkpoints. Otherwise we end up with a fork-choice that looks like the dump below, where all nodes imported after the anchor block are not viable (see the sketch after the dump). Note that the block at slot 643 has non-matching justified and finalized checkpoints.
Dump of debug/fork-choice on unstable:
{
"justified_checkpoint": {
"epoch": "20",
"root": "0x44a053199e37647e4dd6a21ad2def14d17e0af13ea9ba9d467a3dc99fad817a2"
},
"finalized_checkpoint": {
"epoch": "20",
"root": "0x44a053199e37647e4dd6a21ad2def14d17e0af13ea9ba9d467a3dc99fad817a2"
},
"fork_choice_nodes": [
{
"slot": "640",
"block_root": "0x44a053199e37647e4dd6a21ad2def14d17e0af13ea9ba9d467a3dc99fad817a2",
"parent_root": null,
"justified_epoch": "20",
"finalized_epoch": "20",
"weight": "1574400000000",
"validity": "valid",
"execution_block_hash": "0xa668211691c40797287c258d328cec77d0edd12956eebd5c64440adb489eaca0"
},
{
"slot": "643",
"block_root": "0x8b9b01aadc0f7ead90ed7043da3cedfb17a967a7552a9d45dfcd5ce4cc4ae073",
"parent_root": "0x44a053199e37647e4dd6a21ad2def14d17e0af13ea9ba9d467a3dc99fad817a2",
"justified_epoch": "16",
"finalized_epoch": "15",
"weight": "1574400000000",
"validity": "optimistic",
"execution_block_hash": "0xdc9799a99f3e07da0b45ce04586bbcb999910907bea8c86c68e3faf03b878f86"
},
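For concreteness, here is a hedged sketch of the viability comparison described above, loosely following the shape of the consensus-spec node_is_viable_for_head (field and function names are illustrative):

```rust
// Illustrative only: ProtoNodes store the on-chain justified/finalized epochs
// of their states, so viability must compare against the on-chain checkpoints.
// Comparing against the local (anchor) checkpoints instead would mark every
// block imported after the anchor as non-viable, as in the dump above.
fn node_has_correct_justification(
    node_justified_epoch: Epoch,
    node_finalized_epoch: Epoch,
    justified_checkpoint: &ForkChoiceCheckpoint,
    finalized_checkpoint: &ForkChoiceCheckpoint,
) -> bool {
    node_justified_epoch == justified_checkpoint.on_chain().epoch
        && node_finalized_epoch == finalized_checkpoint.on_chain().epoch
}
```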
Casper FFG votes, source checkpoint
We must use the on_chain ones to prevent surround votes and for our votes to be includable on-chain.
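A minimal sketch of that choice (illustrative names, not the PR's exact code):

```rust
// Illustrative only: attestation data uses the on-chain justified checkpoint
// as the FFG source, so our votes stay includable on-chain and cannot
// surround votes made with the network's view of justification.
fn ffg_source(justified_checkpoint: &ForkChoiceCheckpoint) -> Checkpoint {
    justified_checkpoint.on_chain()
}
```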
Status message
It's best to tell other nodes that we share the same view of finality. Otherwise we look to them like we are "behind", and they may try to fetch from us a finalized chain that doesn't finalize at that epoch.
Sync forward range sync start epoch
Range sync assumes that the split slot == finalized slot, but we only need to sync blocks descending from the split slot (= anchor slot).
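A hedged sketch of the start-epoch choice (illustrative; the actual sync code is more involved):

```rust
// Illustrative only: forward range sync only needs blocks descending from the
// split/anchor slot, so it starts at the *local* finalized (anchor) epoch
// rather than the network's finalized epoch.
fn range_sync_start_epoch(finalized_checkpoint: &ForkChoiceCheckpoint) -> Epoch {
    finalized_checkpoint.local().epoch
}
```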
Other changes
LH currently has a manual finalization mechanism via an HTTP API call. It triggers a special store finalization routine and forces gossip to filter by split slot too. I have changed the manual finalization route to advance the fork-choice local irreversible checkpoint and run the regular finalization routine. Overall this looks simpler and more maintainable.
Test of ff431acdc704fb46b025d4b687d2e5637b039f65
Started a kurtosis network with 6 participants on latest unstable
```yaml
participants:
  - el_type: geth
    el_image: ethereum/client-go:latest
    cl_type: lighthouse
    cl_image: sigp/lighthouse:latest-unstable
    cl_extra_params:
      - --target-peers=7
    vc_extra_params:
      - --use-long-timeouts
      - --long-timeouts-multiplier=3
    count: 6
    validator_count: 16
network_params:
  electra_fork_epoch: 0
  seconds_per_slot: 3
  genesis_delay: 400
global_log_level: debug
snooper_enabled: false
additional_services:
  - dora
  - spamoor
  - prometheus_grafana
  - tempo
```
Let the network run for ~15 epochs and stopped 50% of the validators. Let the network run without finality for many epochs, then started a docker build of this branch:
```yaml
version: "3.9"

services:
  cl-lighthouse-syncer:
    image: "sigp/lighthouse:non-fin"
    command: >
      lighthouse beacon_node
      --debug-level=debug
      --datadir=/data/lighthouse/beacon-data
      --listen-address=0.0.0.0
      --port=9000
      --http
      --http-address=0.0.0.0
      --http-port=4000
      --disable-packet-filter
      --execution-endpoints=http://172.16.0.89:8551
      --jwt-secrets=/jwt/jwtsecret
      --suggested-fee-recipient=0x8943545177806ED17B9F23F0a21ee5948eCaa776
      --disable-enr-auto-update
      --enr-address=172.16.0.21
      --enr-tcp-port=9000
      --enr-udp-port=9000
      --enr-quic-port=9001
      --quic-port=9001
      --metrics
      --metrics-address=0.0.0.0
      --metrics-allow-origin=*
      --metrics-port=5054
      --enable-private-discovery
      --testnet-dir=/network-configs
      --boot-nodes=enr:-OK4QHWvCmwiaEj8437Z6Wlk32gLVM5Hbw9n6PesII42toDOPmdquevxog8OS8SMMru3VjRvo3qOk80qCXzOUEa8ecoDh2F0dG5ldHOIAAAAAAAAAMCGY2xpZW501opMaWdodGhvdXNlijguMC4wLXJjLjKEZXRoMpASBoYoYAAAOP__________gmlkgnY0gmlwhKwQABKEcXVpY4IjKYlzZWNwMjU2azGhAmc8-hvS_9yO5fBwlBhgTYVDSdOtFJW7uVpTmYkVcZBWiHN5bmNuZXRzAIN0Y3CCIyiDdWRwgiMo
      --target-peers=3
      --execution-timeout-multiplier=3
      --checkpoint-block=/blocks/block_640.ssz
      --checkpoint-state=/blocks/state_640.ssz
      --checkpoint-blobs=/blocks/blobs_640.ssz
    environment:
      - RUST_BACKTRACE=full
    extra_hosts:
      # Allow container to reach host service (Linux-compatible)
      - "host.docker.internal:host-gateway"
    volumes:
      - configs:/network-configs
      - jwt:/jwt
      - /root/kurtosis-non-fin/blocks:/blocks
    ports:
      - "33400:4000/tcp" # HTTP API
      - "33554:5054/tcp" # Metrics
      - "33900:9000/tcp" # Libp2p TCP
      - "33900:9000/udp" # Libp2p UDP
      - "33901:9001/udp" # QUIC
    networks:
      kt:
        ipv4_address: 172.16.0.88
    shm_size: "64m"

  lcli-mock-el:
    image: sigp/lcli
    command: >
      lcli mock-el
      --listen-address 0.0.0.0
      --listen-port 8551
      --jwt-output-path=/jwt/jwtsecret
    volumes:
      - jwt:/jwt
    ports:
      - "33851:8551"
    networks:
      kt:
        ipv4_address: 172.16.0.89

networks:
  kt:
    external: true
    name: kt-quiet-crater

volumes:
  configs:
    name: files-artifact-expansion--e56f64e9c6aa4409b27b11e37d1ab4d3--bc0964a8b6c54745ba6473aaa684a81e
    external: true
  jwt:
    name: files-artifact-expansion--870bc5edd3eb44598f50a70ada54cd31--bc0964a8b6c54745ba6473aaa684a81e
    external: true
```
All logs below are from the cl-lighthouse-syncer container
The node starts with checkpoint sync at a checkpoint more recent than the latest finalized one, specifically epoch 20. The node range synced to head without issues. See the Synced log with finalized_checkpoint: 0xc1edeaf0997ead34936cf20372084f0348ffb79366d453c19f2ef0a1536e766a/15/local/0x44a053199e37647e4dd6a21ad2def14d17e0af13ea9ba9d467a3dc99fad817a2/20
Then I triggered manual finalization into a more recent non-finalized checkpoint. It now logs finalized_checkpoint: 0xc1edeaf0997ead34936cf20372084f0348ffb79366d453c19f2ef0a1536e766a/15/local/0xe155e7846c20f1db50d53e9257c9eaa48c07dc5426312aded75be46672c3d022/727
Oct 25 13:42:24.501 INFO Synced peers: "3", exec_hash: "0x1ec25768c49daf9edf41a93a23ce3c2419c0412d016a0a32b135922947dd91a0 (unverified)", finalized_checkpoint: 0xc1edeaf0997ead34936cf20372084f0348ffb79366d453c19f2ef0a1536e766a/15/local/0x44a053199e37647e4dd6a21ad2def14d17e0af13ea9ba9d467a3dc99fad817a2/20, epoch: 732, block: "0x2e69210f7caba54c127b79b1959ab5fdb9fe04948365ce4a46d44e1f103fd7e1", slot: 23439
Oct 25 13:42:27.127 DEBUG Processed HTTP API request elapsed_ms: 63.17743300000001, status: 200 OK, path: /lighthouse/finalize, method: POST
Oct 25 13:42:27.501 INFO Synced peers: "3", exec_hash: "0x1ec25768c49daf9edf41a93a23ce3c2419c0412d016a0a32b135922947dd91a0 (unverified)", finalized_checkpoint: 0xc1edeaf0997ead34936cf20372084f0348ffb79366d453c19f2ef0a1536e766a/15/local/0xe155e7846c20f1db50d53e9257c9eaa48c07dc5426312aded75be46672c3d022/727, epoch: 732, block: " … empty", slot: 23440
Oct 25 13:42:27.668 DEBUG Starting database pruning split_prior_to_migration: Split { slot: Slot(640), state_root: 0x890ac4381ba8306416f7cc2c8af52d7af0292b5a5b78e429921c98d917256d1b, block_root: 0x44a053199e37647e4dd6a21ad
2def14d17e0af13ea9ba9d467a3dc99fad817a2 }, new_finalized_checkpoint: Checkpoint { epoch: Epoch(727), root: 0xe155e7846c20f1db50d53e9257c9eaa48c07dc5426312aded75be46672c3d022 }, new_finalized_state_root: 0x61dd07935440fe95b50a5bad21760d8ae96cc
58439ba6047b5c41caa89ee06f6
Oct 25 13:42:27.878 DEBUG Extra pruning information new_finalized_checkpoint: Checkpoint { epoch: Epoch(727), root: 0xe155e7846c20f1db50d53e9257c9eaa48c07dc5426312aded75be46672c3d022 }, new_finalized_state_root: 0x61dd0793
5440fe95b50a5bad21760d8ae96cc58439ba6047b5c41caa89ee06f6, split_prior_to_migration: Split { slot: Slot(640), state_root: 0x890ac4381ba8306416f7cc2c8af52d7af0292b5a5b78e429921c98d917256d1b, block_root: 0x44a053199e37647e4dd6a21ad2def14d17e0af1
3ea9ba9d467a3dc99fad817a2 }, newly_finalized_blocks: 22625, newly_finalized_state_roots: 22625, newly_finalized_states_min_slot: 640, required_finalized_diff_state_slots: [Slot(23264), Slot(23040), Slot(22528), Slot(16384), Slot(640)], kept_s
ummaries_for_hdiff: [(0x890ac4381ba8306416f7cc2c8af52d7af0292b5a5b78e429921c98d917256d1b, Slot(640)), (0x77de49b559e30ac0ffd9f1fdff87c19a57803fc8036b62f387dcc55885a83f47, Slot(16384)), (0xa99ffc63658d2cd51b8d0f9e59caa755f376cd46e1d08ecc0acc9f
f77eab4192, Slot(22528)), (0x5d8a9d02d3f88d1c8a51f8ed533933e593bee96f335756336bbc654c4a898542, Slot(23040))], state_summaries_count: 22989, state_summaries_dag_roots: [(0x890ac4381ba8306416f7cc2c8af52d7af0292b5a5b78e429921c98d917256d1b, DAGSt
ateSummary { slot: Slot(640), latest_block_root: 0x44a053199e37647e4dd6a21ad2def14d17e0af13ea9ba9d467a3dc99fad817a2, latest_block_slot: Slot(640), previous_state_root: 0xc210447f3b63eca6d76073d7f647a22a069b6c69c0dd275a723f2ed8acf566fd })], fi
nalized_and_descendant_state_roots_of_finalized_checkpoint: 277, blocks_to_prune: 0, states_to_prune: 22708
Oct 25 13:42:28.162 DEBUG Database pruning complete new_finalized_state_root: 0x61dd07935440fe95b50a5bad21760d8ae96cc58439ba6047b5c41caa89ee06f6
Oct 25 13:42:28.164 INFO Starting database compaction old_finalized_epoch: 20, new_finalized_epoch: 727
Then I restarted the validators and the network finalized. See that the node pruned from the latest manual finalization. It now logs finalized_checkpoint: 0x13700c4b236867eb7bf1a6752fe6729118cfcf5d70f5bc1916e70ecea542d01a/736
Oct 25 13:51:14.585 DEBUG Starting database pruning split_prior_to_migration: Split { slot: Slot(23264), state_root: 0x61dd07935440fe95b50a5bad21760d8ae96cc58439ba6047b5c41caa89ee06f6, block_root: 0xe155e7846c20f1db50d53e9257c9eaa48c07dc5426312aded75be46672c3d022 }, new_finalized_checkpoint: Checkpoint { epoch: Epoch(736), root: 0x13700c4b236867eb7bf1a6752fe6729118cfcf5d70f5bc1916e70ecea542d01a }, new_finalized_state_root: 0x8de7833fb5b0a73211982b32bfa5c1e7b79154386150a04d6be60afd62b92988
Oct 25 13:53:51.500 INFO Synced peers: "3", exec_hash: "0x875c7aa7b85bd792a7842858a34c507e70f5eb05af80cfe170c891cacbe15c19 (unverified)", finalized_checkpoint: 0x13700c4b236867eb7bf1a6752fe6729118cfcf5d70f5bc1916e70ecea542d01a/736, epoch: 739, block: "0x70782ea183f9c5edfbf9709e9728b2a8c775b2f72e70b9e715f1d815cc236471", slot: 23668
Then I stopped 50% of the validators again and triggered manual finalization. See that it transitions from finalized_checkpoint: 0x7c51f4c364f60561e9b39931264b54c7656dd74fb99c5a048c09c16126bb70ce/740 to finalized_checkpoint: 0x7c51f4c364f60561e9b39931264b54c7656dd74fb99c5a048c09c16126bb70ce/740/local/0xff5d2b08ebaf5bec69c1e9251d9ad788c6350e0a6ece69a27b13fa477af76729/746
Oct 25 14:09:24.500 INFO Synced peers: "3", exec_hash: "0xa106339a1ccf80a10e1c5cdd2b50e8d5c58cb87f66c2574527d054d710886a5c (unverified)", finalized_checkpoint: 0x7c51f4c364f60561e9b39931264b54c7656dd74fb99c5a048c09c16126bb70ce/740, epoch: 749, block: " … empty", slot: 23979
Oct 25 14:09:26.157 DEBUG Processed HTTP API request elapsed_ms: 0.6780459999999999, status: 200 OK, path: /lighthouse/finalize, method: POST
Oct 25 14:09:27.111 DEBUG Starting database pruning split_prior_to_migration: Split { slot: Slot(23680), state_root: 0xce62a2b2dd6f00d9aacb3469050dcd1d1035e84457adba3d85f6142be3a2013c, block_root: 0x7c51f4c364f60561e9b39931264b54c7656dd74fb99c5a048c09c16126bb70ce }, new_finalized_checkpoint: Checkpoint { epoch: Epoch(746), root: 0xff5d2b08ebaf5bec69c1e9251d9ad788c6350e0a6ece69a27b13fa477af76729 }, new_finalized_state_root: 0x77c39c0504220d7a563a9403e25a05ccf46d7294c4282787f564559a191a6f2d
Oct 25 14:09:27.504 INFO Synced peers: "3", exec_hash: "0xa106339a1ccf80a10e1c5cdd2b50e8d5c58cb87f66c2574527d054d710886a5c (unverified)", finalized_checkpoint: 0x7c51f4c364f60561e9b39931264b54c7656dd74fb99c5a048c09c16126bb70ce/740/local/0xff5d2b08ebaf5bec69c1e9251d9ad788c6350e0a6ece69a27b13fa477af76729/746, epoch: 749, block: " … empty", slot: 23980
Restarted the validators, and the network finalized again:
Oct 25 14:21:39.500 INFO Synced peers: "3", exec_hash: "0x2ed485a642ad02a2f11208c8af1aafbc7f2e961344fb47424c61d64fee71f43e (unverified)", finalized_checkpoint: 0x67160358278de47d9d7a4c88bd07391c7470edc4a431047145d4edd183de7915/755, epoch: 757, block: "0xca984edf928634d9f7460b1cfe64ed73b80f99d8ed8cebda2a752e9a6cb3c995", slot: 24224
Notes: I tested manual finalization and checkpoint syncing only into blocks that are the first in an epoch. In prior tests, using non-aligned blocks broke, and I still don't know why.
Test failure is caused by this:
#[serde(skip)]
/// The `genesis` field is not serialized or deserialized by `serde` to ensure it is defined
/// via the CLI at runtime, instead of from a configuration file saved to disk.
pub genesis: ClientGenesis,
I'm going to get rid of that skip and the comment, seeing as it isn't relevant while we don't have config files, and it just makes this feature impossible to test.
Skipping it doesn't provide any safety either: it just means it gets initialized to Default::default().
Holesky tests
TODO
New node non-finalized checkpoint sync
Choose a recent epoch. Compute slot = the first slot of that epoch. Query the beacon API /eth/v1/beacon/states/${slot}/root to obtain state_root.
Find a synced Holesky node with an exposed beacon API that is reachable from the node you are running the test from. Run the node with these additional flags:
--checkpoint-sync-url=http://beacon.node
--checkpoint-sync-state-id=$state_root
Now the notifier should log a line similar to this, where the finalized_checkpoint log field includes the word "local" and the epoch we chose above.
Oct 25 14:09:27.504 INFO Synced peers: "3", exec_hash: "0xa106339a1ccf80a10e1c5cdd2b50e8d5c58cb87f66c2574527d054d710886a5c (unverified)", finalized_checkpoint: 0x7c51f4c364f60561e9b39931264b54c7656dd74fb99c5a048c09c16126bb70ce/740/local/0xff5d2b08ebaf5bec69c1e9251d9ad788c6350e0a6ece69a27b13fa477af76729/746, epoch: 749, block: " … empty", slot: 23980
Existing node manual finalization
Find a recent checkpoint: choose a recent epoch and check that the first slot of that epoch was not missed; otherwise use the latest block of a previous epoch. Let's call that block block. Then find the block_root of block by querying the beacon API for that block's slot.
Now trigger manual finalization against a running node with (replace the variables we just computed):
curl http://beacon.node:4000/lighthouse/finalize -H "Content-Type: application/json" --data '{"epoch": "${epoch}", "block_root": "${block_root}"}'
Now the notifier should log a line similar to this, where the finalized_checkpoint log field includes the word "local" and the block_root we computed above.
Oct 25 14:09:27.504 INFO Synced peers: "3", exec_hash: "0xa106339a1ccf80a10e1c5cdd2b50e8d5c58cb87f66c2574527d054d710886a5c (unverified)", finalized_checkpoint: 0x7c51f4c364f60561e9b39931264b54c7656dd74fb99c5a048c09c16126bb70ce/740/local/0xff5d2b08ebaf5bec69c1e9251d9ad788c6350e0a6ece69a27b13fa477af76729/746, epoch: 749, block: " … empty", slot: 23980
Some required checks have failed. Could you please take a look @dapplion? 🙏
We need to test this in Diamond before merging
Holesky tests
I did some testing using the commit: Lighthouse/v8.0.1-381c8a3
New node checkpoint sync
For a new node, start the beacon node with the flags:
--checkpoint-sync-state-id 0x9f37ffa731049fd876bb374c9a41a39c96dd94ad27d74287c5a1c34be242035d \
--checkpoint-sync-url https://bn:5052 \
I can see the logs:
INFO Synced peers: "4", exec_hash: "0xaf2c9c3891a8465caa9e26bcdc88ba0d87bd8b17ec04782577318a2cd8bb9249 (unverified)", finalized_checkpoint: 0xffda251733659448537ee8adfd096f47aabc21e45f257afa22b749311987efc5/171015/local/0x1f044b59b434783de5519eadbb2999905fb77cad8d9673795110b0352ea00f3c/181245, epoch: 181251, block: "0x4cc99461ec7ecd610ba2d196e026972068ad932cdbf2d4d2d20d35677b9ac92c", slot: 5800044
which looks good. I tested a few times using different checkpoint states and it mostly works, except one time it showed an invalid execution payload as below:
(I couldn't reproduce it though, even though I retried using the same checkpoint slot 5796416, the first slot of epoch 181138: https://light-holesky.beaconcha.in/epoch/181138)
Dec 11 14:48:54.001 INFO Syncing peers: "1", distance: "128 slots (25 mins)", est_time: "--"
Dec 11 14:49:01.516 WARN Invalid execution payload validation_error: Some("Invalid block without parent: InvalidTxSignature: Signature is invalid.."), latest_valid_hash: None, execution_block_hash: 0x0862fa9d21bca83d9532b44ef65f8e856b50b78191e1fb26be1a7a0e8f13c8b9, root: 0x0e9c36455b757ee048b208c223c24c78e52751172b9518ca874d096a79af92ab, graffiti: "RH7435LHf1f2", proposer_index: 597165, slot: 5796418, method: "new_payload"
Nethermind logs at the same time:
Dec 11 14:53:44 Ubuntu-2404-noble-amd64-base nethermind[510028]: 11 Dec 14:53:44 | Received New Block: 4811535 (0x0862fa...13c8b9) | limit 60,000,000 | Extra Data: reth/v1.9.2/linux
Dec 11 14:53:44 Ubuntu-2404-noble-amd64-base nethermind[510028]: 11 Dec 14:53:44 | Rejected invalid block 4811535 (0x0862fa9d21bca83d9532b44ef65f8e856b50b78191e1fb26be1a7a0e8f13c8b9), ExtraData: reth/v1.9.2/linux, reason: orphaned block is invalid
~~For an existing node, I am not sure if I am doing it correctly. I used the node running with manual finalization as above. In this example, the node is manually finalized at epoch 181245. I get a newer checkpoint and run (e.g., epoch 181247):~~
curl https://bn:5052/lighthouse/finalize -H "Content-Type: application/json" --data '{"state_root": "0x87f4630f1b706e1e8d6c386eb739cb27b0005bea44057252cdb4e3f46fe70946", "epoch": "181247", "block_root": "0xde2579b6ab9aa599cdb08a56bfde875c9ee4e0ef97d57af8c7a506682863987a"}' | jq
~~It shows the output data after the query, but I don't see the log change; the Synced log is still showing 181245 for the finalized_checkpoint field. Is this expected? I would have thought it would be updated to 181247.~~
See comment below
Manual finalization
I used the wrong node in the above, so now I understand how to do it correctly. Trigger manual finalization using the command:
curl http://localhost:5052/lighthouse/finalize -H "Content-Type: application/json" --data '{"state_root": "0x1bf6cb99dabded92d05f61990d619df425d7c7958eaeb02226bfc03e7e23b0e0", "epoch": "182160", "block_root": "0x09f183b1762f1ea8c998e7690f8ad5e64e64c76a2152e4559edc07a4c5bec8ec"}' | jq
I can see
DEBUG Processed HTTP API request elapsed_ms: 3427.99746, status: 200 OK, path: /lighthouse/finalize, method: POST
DEBUG Extra pruning information new_finalized_checkpoint: Checkpoint { epoch: Epoch(182160)
DEBUG Extra pruning information new_finalized_checkpoint: Checkpoint { epoch: Epoch(182160)
DEBUG Pruning block new_finalized_state_root: 0x1bf6cb99dabded92d05f61990d619df425d7c7958eaeb02226bfc03e7e23b0e0
DEBUG Pruning hot state new_finalized_state_root: 0x1bf6cb99dabded92d05f61990d619df425d7c7958eaeb02226bfc03e7e23b0e0
which looks great. The log also shows the updated manually-finalized epoch: Synced ...../182160. I tried a few manual finalizations to forward epochs and they all work.
Should we add a check to prevent triggering manual finalization to a checkpoint prior to the current finalized epoch? For example, the node is manually finalized at epoch 182160. If one triggers another manual finalization to a checkpoint before 182160, should we prevent that? Right now, if I trigger it at a prior epoch, 182154, it processes the request. We see in the log:
CRIT Error updating finalization error: MissingBeaconState(0xaed7be8f200bcf9a0f75297988cbe3ecf2b001c2d0cf749bf7f5de34f47175bf), slot: 5829518
INFO Synced peers: "30", exec_hash: "0x6eb711e75251e8560a246283b43a51f1c6d028254570c2020949bb5093870c8e (unverified)", finalized_checkpoint: 0xffda251733659448537ee8adfd096f47aabc21e45f257afa22b749311987efc5/171015/local/0x2b064ddc02e80e9cfcaf1f2a282fb328ce1334aae11fdfb85609e8bfa7c5bbf9/182154
Although it says error updating finalization, the epoch has apparently been updated to 182154 according to the log.
> Should we add a check to prevent triggering manual finalization to a checkpoint prior to the current finalized epoch?
The state_root in the query is actually not used anymore in the latest version. I removed the state_root from the query and updated the test instructions.