go-spacemesh
Allow tortoise active set size to decay more quickly
Description
Right now, if many miners disappear for any reason, self-healing ensures that consensus will eventually be re-established and tortoise will begin verifying layers again, but this may take a very long time. If it happens early in epoch N, then tortoise will be stuck for the rest of epoch N, as well as epoch N+1 (since most of the miners active in epoch N probably already submitted ATXs for N+1), and won't be able to heal until epoch N+2, when the size of the active set is finally reduced. With two-week epochs, this means that, in an extreme case, tortoise could be stuck for as long as a month.
Rather than waiting this long with a sudden cliff at the end of epoch N+1, we could cause the size of the active set (total miner weight) to decay exponentially, allowing the vote threshold to be crossed more quickly so that consensus can heal. This is a tradeoff between safety and liveness. It should decay gradually at first, then more quickly if the situation continues.
CC @tal-m
Affected code
Verifying tortoise, self-healing
> Right now, if many miners disappear for any reason, self-healing ensures that consensus will eventually be re-established and tortoise will begin verifying layers again, but this may take a very long time.
If a majority of miners disappear forever, self-healing will not ensure this: there will always be a shortage of weight, and the global threshold won't be crossed.
@dshulyak still relevant?
> there will be always shortage of weight, and global threshold won't be crossed
I don't think this is true. As I pointed out above, total weight will go down in subsequent epochs, and thresholds will eventually be crossed again (but it will take a while).
This issue can be transferred to the research repository or to the forum's research topic, as the design isn't anywhere near ready for implementation.
I will close it. If it comes up later in research discussions, we will consider it again.