go-spacemesh
Blocks received too early cause API to report syncedLayer that's too high
My node has been returning 863 for the syncedLayer all day, despite the fact that the verified and top layers are lower. This hasn't changed even as the top and verified layers have increased. This could be due to a bug in #2299.
Seeing this on tn127/v0.1.28.
```json
{
  "status": {
    "connectedPeers": "8",
    "isSynced": true,
    "syncedLayer": {
      "number": 863
    },
    "topLayer": {
      "number": 608
    },
    "verifiedLayer": {
      "number": 607
    }
  }
}
```
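As a quick illustration of why this output is suspicious, here is a small Go sketch that parses the response above and flags the invariant violation (syncedLayer running ahead of topLayer). The struct names just mirror the JSON fields and are not the real go-spacemesh API types.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
)

// Illustrative types mirroring the JSON status payload above;
// not the actual go-spacemesh API types.
type layer struct {
	Number uint32 `json:"number"`
}

type nodeStatus struct {
	ConnectedPeers string `json:"connectedPeers"`
	IsSynced       bool   `json:"isSynced"`
	SyncedLayer    layer  `json:"syncedLayer"`
	TopLayer       layer  `json:"topLayer"`
	VerifiedLayer  layer  `json:"verifiedLayer"`
}

func main() {
	raw := `{"status":{"connectedPeers":"8","isSynced":true,
	  "syncedLayer":{"number":863},"topLayer":{"number":608},
	  "verifiedLayer":{"number":607}}}`

	var resp struct {
		Status nodeStatus `json:"status"`
	}
	if err := json.Unmarshal([]byte(raw), &resp); err != nil {
		log.Fatal(err)
	}
	s := resp.Status
	// verifiedLayer <= topLayer is expected; syncedLayer running far
	// ahead of topLayer is the anomaly reported in this issue.
	if s.SyncedLayer.Number > s.TopLayer.Number {
		fmt.Printf("suspicious: syncedLayer %d is ahead of topLayer %d\n",
			s.SyncedLayer.Number, s.TopLayer.Number)
	}
}
```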
It turns out that this is not an API issue after all. I can't explain how, but two smeshers (IDs 5b74c and f638f) began generating and sending valid blocks 24 hours early in this testnet. They aren't managed miners and they aren't nodes I'm running. Maybe their system clocks are fast by 24 hours. (go-spacemesh shouldn't allow this, since it performs an NTP check every so often, but in theory someone could've removed that check from the code.)
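For context, the kind of drift check mentioned above could look roughly like this; it's a sketch using the github.com/beevik/ntp package and a made-up tolerance, not the actual go-spacemesh implementation.

```go
package main

import (
	"fmt"
	"log"
	"time"

	"github.com/beevik/ntp"
)

// maxClockDrift is a hypothetical tolerance; go-spacemesh's actual
// threshold and NTP logic may differ.
const maxClockDrift = 10 * time.Second

// checkClockDrift queries an NTP server and returns an error if the
// local clock is off by more than the tolerance. A node whose clock is
// ~24h fast would fail this check; as speculated above, a modified node
// could simply skip it.
func checkClockDrift() error {
	resp, err := ntp.Query("pool.ntp.org")
	if err != nil {
		return err
	}
	if off := resp.ClockOffset; off > maxClockDrift || off < -maxClockDrift {
		return fmt.Errorf("local clock is off by %v, refusing to start", off)
	}
	return nil
}

func main() {
	if err := checkClockDrift(); err != nil {
		log.Fatal(err)
	}
	log.Println("clock within tolerance")
}
```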
There's no check or protection in the code against valid early blocks, so nodes that receive them just store them and update the latest layer (Mesh.SetLatestLayer) from that block's layer. We need to discuss how to handle this adversarial case: it shouldn't be possible to have a valid ATX and to produce seemingly valid blocks an epoch before everyone else.
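One possible shape for that protection, as a rough sketch: compare the block's layer against the local layer clock and reject anything more than a small tolerance in the future, before it ever reaches Mesh.SetLatestLayer. The types and the layerClock interface below are illustrative stand-ins, not the actual go-spacemesh handler code.

```go
package main

import (
	"errors"
	"fmt"
)

// Illustrative stand-ins for go-spacemesh types; not the real handler code.
type LayerID uint32

type Block struct {
	ID    string
	Layer LayerID
}

// layerClock reports the layer the local node believes is current,
// derived from genesis time and layer duration.
type layerClock interface {
	CurrentLayer() LayerID
}

// maxEarlyLayers is a hypothetical tolerance for clock skew between
// honest peers; anything further ahead of the local clock is rejected.
const maxEarlyLayers = 1

var errBlockFromFuture = errors.New("block layer is too far in the future")

// validateBlockTiming is the missing guard: a syntactically valid block
// from a far-future layer should be rejected (or quarantined) instead of
// being stored and fed into the latest-layer counter.
func validateBlockTiming(clock layerClock, b *Block) error {
	current := clock.CurrentLayer()
	if b.Layer > current+maxEarlyLayers {
		return fmt.Errorf("%w: block %s is for layer %d, current layer is %d",
			errBlockFromFuture, b.ID, b.Layer, current)
	}
	return nil
}

// fixedClock is a trivial clock implementation for the example.
type fixedClock LayerID

func (c fixedClock) CurrentLayer() LayerID { return LayerID(c) }

func main() {
	clock := fixedClock(607)
	early := &Block{ID: "example", Layer: 863}
	if err := validateBlockTiming(clock, early); err != nil {
		// The early block is dropped instead of advancing the latest layer.
		fmt.Println(err)
	}
}
```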
You can see the pattern very clearly in this query: http://199.223.235.142:5601/app/kibana#/discover?_g=(filters:!(),refreshInterval:(pause:!t,value:0),time:(from:'2021-04-05T06:12:26.063Z',to:'2021-04-06T23:00:00.000Z'))&_a=(columns:!(name,sm.L,sm.M,sm.N,sm.block_id,sm.epoch_id,sm.layer_id,sm.miner_id),filters:!(),index:sm,interval:auto,query:(language:kuery,query:'sm.M:%22got%20new%20block%22'),sort:!())
It looks like the tn127 explorer backend is also getting 862 as the current layer from the backing nodes' API.
@lrettig still relevant?
I think so - we need to test this as part of adversarial testing (early blocks). CC @dshulyak
there is another issue for this