go-spacemesh

Record ATX timeliness, don't reward late ATXs

Open lrettig opened this issue 3 years ago • 6 comments

Description

Verifying tortoise and self-healing rely upon knowing whether ATXs (referenced by voting blocks) were received on time, in order to properly assign weights to votes. (See #2540.) To know whether an ATX was timely (i.e., received on time), we also need to know whether it was received over gossip or sync. (ATXs received over sync should always be considered timely. Those received over gossip are only timely if they were received before the start of the following (target) epoch.)

There's an easy and a hard way we might try to accomplish this.

  • easy: decide that, for ATXs received via sync, we just set the timestamp to nil (or to some other predetermined token value), and always consider these on time (do we ever need to know when an ATX received via sync was received?)
  • hard: add a new bit somewhere in the atxdb to store this information

Either way, this will require fairly extensive changes to the way ATXs are received, processed, and stored in the DB. (Currently this happens as part of block syntactic validation, and the code paths are the same for blocks received via gossip and via sync.)

Affected code

code related to receiving ATXs via gossip and sync

This issue appears in commit hash: activation/atxdb.go

Related files (optionally with line numbers):

activation/atxdb.go
layerfetcher/layers.go

lrettig, Jul 10 '21 23:07

I don't understand what we gain from this distinction. As I see it there are only two things a node can know about when an ATX was published:

  1. I've seen this ATX myself before the deadline.
  2. I know nothing (I could have a slow connection, my neighbors could be malicious, the ATX was published late, etc.).

Late gossiping vs. syncing can be easily manipulated by malicious nodes. It mixes layers of abstraction in ways that we'll curse for years for no apparent benefit.

I vote to close this issue.

noamnelke, Jul 11 '21 10:07

According to @tal-m,

"For verifying tortoise... we don't count votes with an incorrect beacon value (i.e., different from ours for that epoch) or votes with ATXs that weren't received on time (unless we're syncing, in which case we expect everything to be late). So those just have weight 0."

This is the important part:

"unless we're syncing, in which case we expect everything to be late"

To the extent that we care about whether ATXs were received on time (for purposes of weighting tortoise votes, or anything else), I think we do need to record this information. Otherwise, in case 2 that you describe, there is no way to differentiate between an ATX that was published and/or received late while we were otherwise online and fully synced, which should therefore be given a weight of zero, and one that was received while syncing, which should therefore be given the benefit of the doubt (and a nonzero weight).
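The weighting rule quoted above could be sketched roughly as follows. `VoteWeight` and its parameters are hypothetical names used only for illustration; the real tortoise code may structure this quite differently.

```go
package main

// VoteWeight is a hypothetical helper that returns the weight a vote
// contributes under the quoted rule:
//   - a vote with a beacon value different from ours counts for nothing;
//   - a vote whose ATX was late counts for nothing, unless the ATX was
//     received while we were syncing (when everything is expected to be late).
func VoteWeight(atxWeight uint64, beaconMatches, atxOnTime, receivedWhileSyncing bool) uint64 {
	if !beaconMatches {
		return 0
	}
	if !atxOnTime && !receivedWhileSyncing {
		return 0
	}
	return atxWeight
}
```

The `receivedWhileSyncing` argument is exactly the piece of information that is unavailable unless it is recorded when the ATX arrives.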

lrettig, Jul 11 '21 15:07

@lane: Mea culpa! I was wrong and @noamnelke is correct.

An ATX's on-time status shouldn't matter for block validation, only for determining the rewarded blocks. Otherwise, honest parties could continue to disagree for an entire epoch about the vote counts. (We still need to postpone votes for incorrect beacon values, however.)

For reward purposes, we do need a bit for each ATX specifying whether it's in the active set for the current epoch (this bit can be discarded once the epoch is over). This bit is used in computing the Hare input: honest parties should vote against blocks corresponding to late ATXs.
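A minimal sketch of how such a per-ATX bit might feed into the Hare input, assuming a hypothetical `Block` type and a map from ATX ID to the active-set bit (neither is taken from the go-spacemesh codebase):

```go
package main

// Block is a hypothetical, stripped-down block: its ID and the ATX it references.
type Block struct {
	ID    string
	ATXID string
}

// HareInput returns the IDs of the blocks an honest party would vote for:
// those whose ATX has its active-set bit set for the current epoch.
// Blocks referencing late (or unknown) ATXs are left out, i.e. voted against.
func HareInput(blocks []Block, inActiveSet map[string]bool) []string {
	var input []string
	for _, b := range blocks {
		if inActiveSet[b.ATXID] {
			input = append(input, b.ID)
		}
	}
	return input
}
```

Because the map only needs to cover the current epoch, it can be dropped once the epoch ends, matching the remark that the bit can be discarded.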

As @noamnelke points out, we shouldn't care whether ATXs were received over gossip or via sync as this could be gameable by an adversary, and also differs between honest parties. However, if we weren't online at the last epoch boundary, we don't have verifiable information about how to set this bit (we receive everything "late", since we're syncing in the middle of an epoch).

There are several ways to deal with this:

  1. Easy, not very secure: sync the bit along with the ATXs, trusting the peer from which we receive the ATX.
  2. Slightly more complex: sync the bit, but take a majority among connected peers.
  3. Even more complex, but probably the "right" way to do it: use a pointwise-majority of the active sets committed by the first cluster of blocks in the epoch (cluster means enough layers to have >800 blocks). That is, since the first block each ID generates in an epoch has to point to an active set, we can interpret this as a vote on the "good" ATXs as seen by that ID. If all honest parties saw an ATX as good, it will be part of all their active sets, and due to the honest majority assumption will pass the pointwise majority test. If all honest parties saw an ATX as late (or didn't see it), the majority of blocks will "vote" against this ATX, so it won't make it into the active set.
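Option 3 can be sketched as a simple tally. The function name `PointwiseMajority` is hypothetical, and the sketch assumes each active set counts as one unweighted vote and that the caller has already selected the first cluster of blocks; whether votes should instead be weighted is not settled in this thread.

```go
package main

// PointwiseMajority takes the active sets committed by the first cluster of
// blocks in an epoch (one slice of ATX IDs per block) and returns the ATXs
// that appear in a strict majority of those sets.
func PointwiseMajority(activeSets [][]string) map[string]bool {
	counts := make(map[string]int)
	for _, set := range activeSets {
		// Count each ATX at most once per active set.
		seen := make(map[string]bool, len(set))
		for _, atx := range set {
			if !seen[atx] {
				seen[atx] = true
				counts[atx]++
			}
		}
	}
	good := make(map[string]bool)
	for atx, n := range counts {
		if 2*n > len(activeSets) {
			good[atx] = true
		}
	}
	return good
}
```

Option 2 would use the same kind of tally, with each connected peer's reported bit taking the place of an active set.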

I think option (1) is sufficient for now, and probably even for mainnet launch. The potential severity of the attack depends on the total weight of honest parties that join mid-epoch: if this weight is small, then an adversary can't really do anything. (We would consider the newly joined parties that were successfully attacked as adversarial for the purpose of reward distribution, but if their weight combined with the adversary's is still less than 1/3, we're OK.)
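The 1/3 condition above amounts to a one-line check. `WithinTolerance` and its parameter names are hypothetical, and the weights are assumed to be small enough that the multiplication does not overflow.

```go
package main

// WithinTolerance reports whether the adversary's weight plus the weight of
// newly joined parties that were successfully attacked stays strictly below
// one third of the total weight.
func WithinTolerance(adversary, attackedJoiners, total uint64) bool {
	// Integer form of (adversary + attackedJoiners) / total < 1/3.
	// Assumes 3*(adversary+attackedJoiners) fits in a uint64.
	return 3*(adversary+attackedJoiners) < total
}
```

For example, with a total weight of 100, an adversary at 20 and attacked joiners at 10 is still tolerated, while an adversary at 30 and attacked joiners at 4 is not.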

tal-m, Jul 11 '21 21:07

@noamnelke close?

moshababo, Jun 26 '22 13:06

Cleaned this up a bit. Let's revisit as part of adversarial testing. I want to keep it open for now until @tal-m's concerns are addressed. Noam says: "I have a half-baked proposal for how to incentivize ATX publication throughout the epoch, and I think it can possibly fix Tal’s concern here."

lrettig, Aug 09 '22 18:08

@lrettig why did you add the Hare Protocol tag?

jonZlotnik, Aug 19 '22 17:08

This should be baked into whatever new design we end up with for ensuring consensus on the Hare active set. See @tal-m's proposal here: https://community.spacemesh.io/t/grading-atxs-for-the-active-set/335

We may get this "for free".

CC @countvonzero @selfdual-brain

lrettig, Jan 26 '23 22:01