ouroboros-network

Leader VRF value no longer settling ties

Open JaredCorduan opened this issue 2 years ago • 47 comments

I do not know whether or not we've found a bug or just discovered the consequences of an intentional decision.


The final tie breaker for "slot battles" is the leader VRF value:

https://github.com/input-output-hk/ouroboros-network/blob/9249a70ed9e2365f3963e47cb31b4b1589bca8f6/ouroboros-consensus-protocol/src/Ouroboros/Consensus/Protocol/Praos/Common.hs#L62-L68

In the TPraos protocol (used prior to the Vasil HF), csvLeaderVRF was the leader VRF value. In the Praos protocol, however, csvLeaderVRF is being set to the single VRF value in the block header (prior to the range extension).
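For readers who do not want to open the link, the comparison can be sketched roughly as follows. This is an illustrative Python sketch, not the Haskell instance itself: the class and field names are hypothetical, and the first two comparison steps (chain length, then the operational certificate counter when the issuer is the same) are assumptions about the linked code. Only the last step, preferring the lower VRF value, is what this issue is about.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChainSelectView:
    """Hypothetical stand-in for the chain selection view of a chain tip."""
    chain_length: int
    issuer: str        # identifies the pool that issued the tip block
    issue_no: int      # operational certificate counter (assumed step)
    tiebreak_vrf: int  # the value stored in csvLeaderVRF


def prefer(a: ChainSelectView, b: ChainSelectView) -> ChainSelectView:
    """Return the preferred candidate; `a` is kept on a complete tie."""
    # Step 1 (assumed): the longer chain always wins.
    if a.chain_length != b.chain_length:
        return a if a.chain_length > b.chain_length else b
    # Step 2 (assumed): same issuer, so the higher certificate counter wins.
    if a.issuer == b.issuer and a.issue_no != b.issue_no:
        return a if a.issue_no > b.issue_no else b
    # Step 3 (the subject of this issue): the lower VRF value wins.
    if a.tiebreak_vrf != b.tiebreak_vrf:
        return a if a.tiebreak_vrf < b.tiebreak_vrf else b
    return a
```

Which number ends up in `tiebreak_vrf` (the leader value or the raw header value) is exactly what differs between TPraos and Praos.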

This removes a small advantage that small pools previously enjoyed. Small pools are more likely to win this tie breaker, since by being a small pool they need a smaller leader VRF value in order to win the leader check. Using the VRF value before the range extension is applied removes this small advantage.
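To make the "small pools need a smaller value" point concrete, here is a small sketch. It assumes the Praos leader check has the form "normalised leader value < 1 - (1 - f)^sigma", uses f = 0.05 as the active slot coefficient, and assumes that the leader values of two pools that both passed the check are independent and uniform below their own thresholds. The function names are hypothetical.

```python
def leader_threshold(sigma: float, f: float = 0.05) -> float:
    """Threshold a normalised leader value must stay below for relative stake sigma."""
    return 1.0 - (1.0 - f) ** sigma


def small_pool_win_probability(sigma_small: float, sigma_big: float,
                               f: float = 0.05) -> float:
    """Chance the smaller pool has the lower leader value, given both are leaders.

    With X uniform on [0, a) and Y uniform on [0, b), a <= b,
    P(X < Y) = 1 - a / (2 * b).
    """
    t_small = leader_threshold(sigma_small, f)
    t_big = leader_threshold(sigma_big, f)
    if t_small > t_big:
        raise ValueError("sigma_small must not exceed sigma_big")
    return 1.0 - t_small / (2.0 * t_big)
```

Under these assumptions two equally sized pools each win half of their battles, while a pool with a twentieth of its opponent's stake wins roughly 97% of them when the lower leader value decides the tie.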


The Evidence:

The view, PraosChainSelectView, is populated by the BlockSupportsProtocol class method selectView, which uses the ProtocolHeaderSupportsProtocol class method pHeaderVRFValue to set csvLeaderVRF in the view.

  • In TPraos, pHeaderVRFValue uses the leader VRF value: https://github.com/input-output-hk/ouroboros-network/blob/9249a70ed9e2365f3963e47cb31b4b1589bca8f6/ouroboros-consensus-shelley/src/Ouroboros/Consensus/Shelley/Protocol/TPraos.hs#L111
  • In Praos, pHeaderVRFValue uses the raw header VRF value: https://github.com/input-output-hk/ouroboros-network/blob/9249a70ed9e2365f3963e47cb31b4b1589bca8f6/ouroboros-consensus-shelley/src/Ouroboros/Consensus/Shelley/Protocol/Praos.hs#L141

This was discovered here: https://github.com/cardano-community/cncli/issues/19

JaredCorduan avatar Sep 30 '22 22:09 JaredCorduan

In case it was an intentional decision (to randomise slot battle outcomes) the following is worth noting:

  • Leader VRF value is something independent of block contents.
  • Whereas (I assume) block VRF depends on block contents.

Therefore the way things are currently enables stake pools to game the system by running custom software to re-order, or select transactions, in order to generate a low block VRF result to maximise their chances of winning a potential slot battle.

It is probably not good to have an extra incentive for block producers to manipulate transactions.

TerminadaPool avatar Oct 01 '22 03:10 TerminadaPool

It is probably not good to have an extra incentive for block producers to manipulate transactions.

you are absolutely correct @TerminadaPool , and we've made sure not to incentivize this kind of behavior.

Whereas (I assume) block VRF depends on block contents.

it does not! (to your point above). the value of the VRF in the block header depends on:

  • the epoch nonce
  • the slot
  • the private VRF key
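The point of the list above can be illustrated with a toy function whose only inputs are those three items. This is a sketch only: HMAC-SHA512 is used as a stand-in for the real VRF (which it is not), and the function name and the byte encoding of the slot are assumptions made for illustration.

```python
import hashlib
import hmac


def toy_vrf_output(private_key: bytes, epoch_nonce: bytes, slot: int) -> bytes:
    """Toy stand-in for the header VRF output (NOT the real VRF construction).

    Note what is absent from the parameters: the block body. Re-ordering or
    selecting transactions cannot change the result.
    """
    message = epoch_nonce + slot.to_bytes(8, "big")
    return hmac.new(private_key, message, hashlib.sha512).digest()
```

Calling it twice with the same key, nonce and slot gives the same bytes no matter which transactions the producer later puts in the block.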

JaredCorduan avatar Oct 01 '22 03:10 JaredCorduan

Hi, I would appreciate it if you could elaborate on why, in these 2 cases, CCYT lost 2 blocks to a pool with low VRF after the Vasil hard fork (VHF). Is it completely random now?

https://pooltool.io/realtime/7825147 https://pooltool.io/realtime/7700005

HT-Moh avatar Oct 01 '22 15:10 HT-Moh

@HT-Moh Yes, it is currently completely random without any slight advantage to smaller pools as was the case in Alonzo. This issue was opened to see if the change was intentional or unintentional. If unintentional, it can be filed as a bug and corrected in the node.

AndrewWestberg avatar Oct 02 '22 22:10 AndrewWestberg

@AndrewWestberg thanks

HT-Moh avatar Oct 04 '22 08:10 HT-Moh

My 2 cents: this was known for a while by the community and nobody ever said it was a bug. I always assumed that it was a design decision since for a small pool losing a slot battle is very noticeable versus a bigger pool.

Here is some documentation about it near bottom of page ( that website is down right now, so linking wayback machine URL ) https://web.archive.org/web/20220822194606/https://cardanofaq.com/books/general-operations/page/what-are-slot-battles

reqlez avatar Oct 05 '22 20:10 reqlez

I think you might be missing the point @reqlez. Slot battles used to be determined in favour of the lowest VRF score (which would favour smaller pools), but now the determination is random. Small pools no longer have the slot battle advantage your link refers to.

TerminadaPool avatar Oct 05 '22 21:10 TerminadaPool

I think you might be missing the point @reqlez. Slot battles used to be determined in favour of the lowest VRF score (which would favour smaller pools), but now the determination is random. Small pools no longer have the slot battle advantage your link refers to.

i'm fully aware. i'm just saying, that i believe the lowest VRF score preference ( how it was before ) was a design decision versus a bug. But I do not have any information from IOG that would suggest that, of course. The reason i believe that, is because i feel it makes sense to give smaller pools a preference, because a lost block is not very noticeable to a big pool, but would hurt a smaller pool quite a bit. What I do not know, is what are the potential security issues with this. I mean... it worked like this for 2 years, and seemed fine. Has anybody found a way to abuse this reliably?

reqlez avatar Oct 05 '22 22:10 reqlez

It was a deliberate design feature to determine slot battles in preference of the lower VRF score.

But the changes in the most recent version 1.35.3, resulted in the removal of this feature. This was why I was surprised to notice the change and the rest of the community was also surprised. The truth hasn't come out yet about whether the removal of this feature was intentional or not.

TerminadaPool avatar Oct 05 '22 22:10 TerminadaPool

Here is some more context that everyone may not be aware of.

At the Vasil hard fork, we switched from using the TPraos ledger protocol ("transitional Praos") to Praos. The TPraos block headers contained two VRF values: one for the leader check and one for the contribution to the next epoch nonce. The VRF check is a bit costly, and so our cryptographers came up with an optimization, namely the range extension. We made use of this optimization in Praos, and hence the Praos block headers only include one VRF value, and we use the range extension on this single value to produce the two values that we need (leader nonce and entropy nonce).
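The range extension described above can be sketched as follows. This is a minimal illustration that assumes the extension is a domain-separated hash of the single VRF output; the one-byte tags, the choice of BLAKE2b-256 and the function names are assumptions for the sake of the example, not a transcription of the node code.

```python
import hashlib


def range_extend(vrf_output: bytes, tag: bytes) -> bytes:
    """Derive a tagged 32-byte value from the single header VRF output."""
    return hashlib.blake2b(tag + vrf_output, digest_size=32).digest()


def leader_value(vrf_output: bytes) -> int:
    """Value compared against the stake-dependent leader threshold (assumed tag b"L")."""
    return int.from_bytes(range_extend(vrf_output, b"L"), "big")


def entropy_value(vrf_output: bytes) -> bytes:
    """Contribution to the next epoch nonce (assumed tag b"N")."""
    return range_extend(vrf_output, b"N")
```

One VRF evaluation and verification then yields both values, which is the saving the optimization is after.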

TPraos was settling "slot battles" with the leader nonce in the block header. Praos is settling ties with the single nonce in the block header. The question that I've raised in this issue is: would there be any negative consequences to instead settling these ties in Praos by range extending the single VRF value in the block header to the leader value.

JaredCorduan avatar Oct 05 '22 23:10 JaredCorduan

Do we have any ETA for this fix? This should have higher priority, as it is already affecting the ratings and ROI % of smaller pools. In my personal experience I have already lost 5 substantial delegators to large pools. Thank you.

cardanoinvest avatar Oct 13 '22 08:10 cardanoinvest

@cardanoinvest I assume that a fix for this will be unlikely to be rolled out separately. This is because the larger stake pools will have no incentive to upgrade since it will slightly disadvantage them. If only part of the network upgrades then slot battles will become non-deterministic (since upgraded nodes will award a different winner half of the time) resulting in more chain forks needing to be settled by longest chain rule.
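The "half of the time" figure can be checked with a rough simulation. The sketch below assumes the patched rule compares a hash of the raw header VRF value (tag and hash function are assumptions), draws raw values uniformly at random, and ignores the fact that real contenders have already passed a stake-dependent leader check, so it only approximates the real disagreement rate.

```python
import hashlib
import random


def leader_value(raw_vrf: bytes) -> bytes:
    """Assumed range extension: tagged BLAKE2b-256 of the raw header VRF value."""
    return hashlib.blake2b(b"L" + raw_vrf, digest_size=32).digest()


def disagreement_rate(trials: int = 10_000, seed: int = 0) -> float:
    """Fraction of random slot battles where the two rules pick different winners."""
    rng = random.Random(seed)
    disagreements = 0
    for _ in range(trials):
        a = rng.getrandbits(512).to_bytes(64, "big")
        b = rng.getrandbits(512).to_bytes(64, "big")
        unpatched_picks_a = a < b                            # lower raw value wins
        patched_picks_a = leader_value(a) < leader_value(b)  # lower leader value wins
        disagreements += unpatched_picks_a != patched_picks_a
    return disagreements / trials
```

Because the hash ordering is unrelated to the ordering of the raw values, the two rules disagree in roughly half of the simulated battles.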

Yes, it is a blow for small pools especially when the mandatory 340 Ada min fee is also working against them recruiting delegation.

TerminadaPool avatar Oct 13 '22 08:10 TerminadaPool

@TerminadaPool if they fix and release it as 1.35.4 with a strong recommendation to update, and send a note to exchanges to update, I believe it will become the main version, because 1.35.5, 1.35.6 will come anyway with that fix implemented. It's not like we have any other major alternative Cardano node development team.

cardanoinvest avatar Oct 13 '22 08:10 cardanoinvest

@cardanoinvest If you update your node and the majority doesn't then your node might produce a block on the non-consensus fork causing your block to become orphaned.

TerminadaPool avatar Oct 13 '22 09:10 TerminadaPool

This only affects slot battles. Whether someone is running a 1.35.4 with the patch or 1.35.3 without won't make much difference as only 5% of blocks are battles. What matters is whether the node making the block AFTER yours is upgraded. You cannot reduce or improve your chances by being on the same or different version of the node coming after your block. All you can do is upgrade yourself to try and be nice to any smaller pool making a block before yours. Whether your block gets adopted or not is out of your hands.

AndrewWestberg avatar Oct 13 '22 15:10 AndrewWestberg

Agreed. But wouldn't this sort of fix better fit a CUE event anyway? In that case the upgrade discussion would be moot no?

kiriakos avatar Oct 14 '22 12:10 kiriakos

This is indeed hurting small pool operators' performance. Since this directly affects decentralization of the Cardano network, we should definitely address this issue. In my opinion, this should have higher priority than the minpool fee debate.

daehan-koreapool avatar Oct 14 '22 19:10 daehan-koreapool

There is a misconception that small pools decentralize. Making slot battle random instead of favoring small pools is an incentive in the right direction, to consolidate pools. A single pool with 30MM stake is better than 10 pools with 3MM stake.

[Image: Group]

We need to stop incentivizing thousands of pools with low stake and begin incentivizing K desirable pools.

kaskjabhdlf avatar Oct 15 '22 16:10 kaskjabhdlf

There is a misconception that small pools decentralize. Making slot battle random instead of favoring small pools is an incentive in the right direction, to consolidate pools. A single pool with 30MM stake is better than 10 pools with 3MM stake.

We need to stop incentivizing thousands of pools with low stake and begin incentivizing K desirable pools.

Could you please explain this misconception? You are saying a concentration of resources in fewer places helps with decentralization??

gusosborne avatar Oct 16 '22 10:10 gusosborne

You know the pools in the above photo are part of a large group because they are marketed that way. If each of those pools had different metadata, website, etc. they would still be part of a group. Not all groups are so blatant. Assuming that any pool with low pledge is independent because it's marketed that way is naive. Many people (especially ITN OGs) are running tons of pools with low pledge and low stake to farm minPoolCost. Slot battle preference paired with giant minPoolCost are two incentives for SPO to run many small pools instead of one large pool.

By making slot battles random you incentivize delegators and operators to consolidate their stake into one large pool instead of dozens of small pools.

kaskjabhdlf avatar Oct 16 '22 13:10 kaskjabhdlf

Well, if they are splitting into multiple small pools they still have to pay for the resources (CPU, RAM, etc.), so those 340 should cover it. They still make it decentralized.

cardanoinvest avatar Oct 16 '22 13:10 cardanoinvest

I've now had a chance to talk to the researchers, the cryptographers, and the folks that did the implementation work.

It was always intended that ties be settled uniformly random. The behavior in Praos is expected, the behavior in TPraos is not.

The Praos paper does not make assumptions about how tie-breaking is done; tie-breaking is assumed to be controlled by the adversary. The problem is that this is an unintended incentive mechanism. This incentive was not intentionally added to TPraos, nor was it intentionally removed from Praos. The assumption was always that ties were being resolved fairly.

@kaskjabhdlf is right to remind us that we cannot equate "good for small pools" with "better for decentralization". Though @cardanoinvest is almost certainly correct that the small pool advantage is dwarfed by other advantages, the fact is that the incentive mechanism that was designed by our research already takes into account the fact that we want decentralized block production.

I'm going to close this issue now, not because I want to end the discussion, but because my original question has been answered. Please feel free to keep the discussion going here, elsewhere on GitHub, or Discord, etc.

JaredCorduan avatar Oct 16 '22 18:10 JaredCorduan

@JaredCorduan Regardless of intentions, "uniformly random" does not have uniform consequences. The impact of slot battles already negatively impacts a small pool much more than a large pool. It's a huge hit to ROI numbers for a small pool to lose even a single block. For a pool near saturation, it's insignificant.
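A back-of-the-envelope calculation shows the asymmetry. The numbers are hypothetical (a small pool expecting 1 block per epoch and a large pool expecting 60), and the sketch only counts blocks; it makes no claim about how the reward formula translates blocks into ROI.

```python
def fraction_of_epoch_blocks_lost(expected_blocks: float, blocks_lost: int = 1) -> float:
    """Share of a pool's epoch block production wiped out by lost slot battles."""
    if expected_blocks <= 0:
        raise ValueError("expected_blocks must be positive")
    return min(1.0, blocks_lost / expected_blocks)


# Hypothetical pools: one block per epoch versus sixty blocks per epoch.
small_pool_loss = fraction_of_epoch_blocks_lost(expected_blocks=1)
large_pool_loss = fraction_of_epoch_blocks_lost(expected_blocks=60)
```

With these numbers a single lost battle costs the small pool its entire epoch, and the large pool under 2% of its blocks.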

I believe we will see decentralization decrease and fewer new pools being able to compete with established pools. I would ask you to re-consider the decision to not fix it, otherwise we will likely see our first community-supported fork of the IOG code.

AndrewWestberg avatar Oct 16 '22 20:10 AndrewWestberg

@AndrewWestberg I guess we all can manage to gather large enough community to make that fork, even @CharlesHoskinson "RATS" pools is small enough to be negatively affected by this. I would like to hear his comments on that issue.

cardanoinvest avatar Oct 16 '22 20:10 cardanoinvest

It was always intended that ties be settled uniformly random.

@JaredCorduan - wouldn't that incentivise those operators to create more adversarial forks (as in run multiple BP nodes - was evident in ITN as chain selection was primary reason for those being run) to try and get an advantage? It's not gonna be uniformly random if there are ways to tilt (even if not guarantee) results into a favour by creating forks.

rdlrt avatar Oct 16 '22 20:10 rdlrt

I believe we will see decentralization decrease and fewer new pools being able to compete with established pools. I would ask you to re-consider the decision to not fix it,

I will re-open this issue if y'all like (just let me know), but my intention with opening it to begin with was to get to the bottom of why the change happened. According to the folks who are responsible for settling ties with the leader nonce, there is nothing to fix anymore. If the community likes the accidental behavior better, that's worth discussing too.

wouldn't that incentivise those operators to create more adversarial forks (as in run multiple BP nodes - was evident in ITN as chain selection was primary reason for those being run) to try and get an advantage? It's not gonna be uniformly random if there are ways to tilt (even if not guarantee) results into a favour by creating forks.

multiple BP nodes with the same keys? that won't help, that doesn't change how consensus chooses between forks. or maybe you mean splitting up your pool into multiple pools? it's not clear that would give you enough of an advantage to be worth it.


Let me just say that I'm not the person that anyone needs to convince one way or the other, I'm not a subject expert here.

JaredCorduan avatar Oct 16 '22 21:10 JaredCorduan

@JaredCorduan Thank you for raising that issue. I feel like there is very little information about it in the community; we have small, confused SPOs wondering where their only block went. And the developers changed that parameter without any vote from the community and without any prior notification. And that parameter drastically changes the playing field.

I don't have any good contacts in the Cardano community, so I ask you and others: please make this louder. And can we also get Charles to comment on this?

cardanoinvest avatar Oct 16 '22 21:10 cardanoinvest

You know the pools in the above photo are part of a large group because they are marketed that way. If each of those pools had different metadata, website, etc. they would still be part of a group. Not all groups are so blatant. Assuming that any pool with low pledge is independent because it's marketed that way is naive. Many people (especially ITN OGs) are running tons of pools with low pledge and low stake to farm minPoolCost. Slot battle preference paired with giant minPoolCost are two incentives for SPO to run many small pools instead of one large pool.

By making slot battles random you incentivize delegators and operators to consolidate their stake into one large pool instead of dozens of small pools.

I was just asking for an explanation of the “misconception” you declared above.

Incentives aside, I still see no valid explanation for how making things more centralized could make the network more decentralized. Please explain that to me.

You seem to assume that the majority of small pools are owned by only a few operators. Even if that was true (which I do not believe at all), it would still provide a larger network with more decentralization- no?

gusosborne avatar Oct 16 '22 21:10 gusosborne

multiple BP nodes with the same keys? that won't help, that doesn't change how consensus chooses between forks.

When results for chain election are random, if 2 BPs from the same pool are running (for instance with different tx sets due to upstream peers), then for their own block the battle of A vs B now becomes A vs B vs C (where A and C are for the same slot from the same pool, but with different block hashes). This was common in ITN, and the incentives to do so were curbed in mainnet due to the rule of preferring smaller pools.

The researchers (seems blackbox) did approve of the same back when change was made, see here

rdlrt avatar Oct 16 '22 21:10 rdlrt

@rdlrt This is not the case. It's now using the single vrf value from the block. I've been calling it the block_vrf, but it really has nothing to do with the contents of the block. It's just using a random value from an earlier step of the slot leader selection instead of the final step which is the leader_vrf. As long as you're using the same pool keys, two pools (dual leaders) will produce the same vrf value. It's this vrf value that is then hashed again to get the leader_vrf.
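The same-keys argument can be sketched as follows. HMAC-SHA512 again stands in for the real VRF, and the tuple layout and function names are hypothetical. The sketch only shows that an extra block C from the same pool carries the same VRF value as A, so adding it to the battle does not change which pool wins the tie-break.

```python
import hashlib
import hmac


def toy_vrf_output(private_key: bytes, epoch_nonce: bytes, slot: int) -> bytes:
    """Toy stand-in for the header VRF output; it never sees the block body."""
    message = epoch_nonce + slot.to_bytes(8, "big")
    return hmac.new(private_key, message, hashlib.sha512).digest()


def slot_battle_winner(candidates: list) -> str:
    """Return the pool id of the candidate with the lowest VRF value.

    Each candidate is a (pool_id, vrf_output, block_hash) tuple.
    """
    return min(candidates, key=lambda candidate: candidate[1])[0]


nonce, slot = b"epoch-nonce", 42
# A and C: two block producers of the same pool, same keys, different contents.
block_a = ("pool1", toy_vrf_output(b"key-1", nonce, slot), "hash-A")
block_c = ("pool1", toy_vrf_output(b"key-1", nonce, slot), "hash-C")
block_b = ("pool2", toy_vrf_output(b"key-2", nonce, slot), "hash-B")
```

Whether the candidate set is A and B, or A, B and C, the winning pool is the same, because the block hash never enters the comparison.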

AndrewWestberg avatar Oct 16 '22 21:10 AndrewWestberg