go-spacemesh

Blocks received too early cause API to report syncedLayer that's too high

Open lrettig opened this issue 4 years ago • 4 comments

My node has been returning 863 for the syncedLayer all day, despite the fact that the verified and top layers are lower. This hasn't changed even as the top and verified layers have increased. This could be due to a bug in #2299.

Seeing this on tn127/v0.1.28.

{
  "status": {
    "connectedPeers": "8",
    "isSynced": true,
    "syncedLayer": {
      "number": 863
    },
    "topLayer": {
      "number": 608
    },
    "verifiedLayer": {
      "number": 607
    }
  }
}
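
The inconsistency in the response above can be checked mechanically. The following is a minimal sketch, assuming the JSON shape shown in this report and assuming that a healthy node never reports a syncedLayer above its topLayer. The type and function names (`nodeStatus`, `syncedLayerTooHigh`) are hypothetical and not part of go-spacemesh.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// layer matches the {"number": N} objects in the status response.
type layer struct {
	Number uint32 `json:"number"`
}

// nodeStatus is a hypothetical struct covering only the fields this check needs.
// Fields not listed here (e.g. connectedPeers) are ignored by encoding/json.
type nodeStatus struct {
	Status struct {
		IsSynced      bool  `json:"isSynced"`
		SyncedLayer   layer `json:"syncedLayer"`
		TopLayer      layer `json:"topLayer"`
		VerifiedLayer layer `json:"verifiedLayer"`
	} `json:"status"`
}

// syncedLayerTooHigh reports whether syncedLayer exceeds topLayer.
// Treating that as an error condition is an assumption of this sketch.
func syncedLayerTooHigh(raw []byte) (bool, error) {
	var s nodeStatus
	if err := json.Unmarshal(raw, &s); err != nil {
		return false, err
	}
	return s.Status.SyncedLayer.Number > s.Status.TopLayer.Number, nil
}

func main() {
	// The response quoted in this report, compacted onto one line.
	raw := []byte(`{"status":{"connectedPeers":"8","isSynced":true,"syncedLayer":{"number":863},"topLayer":{"number":608},"verifiedLayer":{"number":607}}}`)
	tooHigh, err := syncedLayerTooHigh(raw)
	fmt.Println(tooHigh, err)
}
```

With the values from this report (863 versus 608), the check flags the response as inconsistent.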

lrettig avatar Apr 06 '21 20:04 lrettig

It turns out that this is not an API issue after all. I can't explain how, but there are two smeshers (the IDs are 5b74c and f638f) that began generating and sending valid blocks 24 hrs early in this testnet. They aren't managed miners and they aren't the nodes I'm running. Maybe the system clocks are fast by 24 hrs. (go-spacemesh would not allow this, as it performs an NTP check every so often, but in theory someone could've removed this check from the code.)

There's no check or protection in the code against valid early blocks. So the nodes that receive them just store them, and update the latest layer (Mesh.SetLatestLayer) using the layer for that block. We need to discuss how to handle this adversarial case. It shouldn't be possible to have a valid ATX, and to be able to produce seemingly-valid blocks, an epoch before everyone else.

You can see the pattern very clearly in this query: http://199.223.235.142:5601/app/kibana#/discover?_g=(filters:!(),refreshInterval:(pause:!t,value:0),time:(from:'2021-04-05T06:12:26.063Z',to:'2021-04-06T23:00:00.000Z'))&_a=(columns:!(name,sm.L,sm.M,sm.N,sm.block_id,sm.epoch_id,sm.layer_id,sm.miner_id),filters:!(),index:sm,interval:auto,query:(language:kuery,query:'sm.M:%22got%20new%20block%22'),sort:!())

lrettig avatar Apr 06 '21 23:04 lrettig

It looks like the 127 explorer backend is also getting 862 as the current layer from the backing node's API.

avive avatar Apr 07 '21 07:04 avive

@lrettig still relevant?

moshababo avatar Jun 26 '22 13:06 moshababo

I think so - we need to test this as part of adversarial testing (early blocks). CC @dshulyak

lrettig avatar Jun 27 '22 21:06 lrettig

There is another issue for this.

dshulyak avatar Sep 03 '23 08:09 dshulyak