ouroboros-network
Leader VRF value no longer settling ties
I do not know whether or not we've found a bug or just discovered the consequences of an intentional decision.
The final tie breaker for "slot battles" is the leader VRF value:
https://github.com/input-output-hk/ouroboros-network/blob/9249a70ed9e2365f3963e47cb31b4b1589bca8f6/ouroboros-consensus-protocol/src/Ouroboros/Consensus/Protocol/Praos/Common.hs#L62-L68
In the `TPraos` protocol (used prior to the Vasil HF), `csvLeaderVRF` was the leader VRF value. In the `Praos` protocol, however, `csvLeaderVRF` is set to the single VRF value in the block header (prior to the range extension).
This removes a small advantage that small pools previously enjoyed. Small pools are more likely to win this tie breaker: being small, they need a smaller leader VRF value in order to win the leader check. Using the VRF value before the range extension is applied removes this small advantage.
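The tie-breaking rule described above can be sketched in simplified form. Everything below is illustrative: the field names loosely mirror `PraosChainSelectView`, but the types and the `compareCandidates` function are hypothetical stand-ins, not the real chain selection code.

```haskell
import Data.Ord (comparing, Down(..))
import Data.Word (Word64)

-- Hypothetical, heavily simplified view. The real
-- PraosChainSelectView also carries the slot, issuer, and
-- operational certificate counter; only the two fields relevant
-- to this discussion are modeled here.
data SelectView = SelectView
  { svChainLength :: Word64  -- longer chain is preferred first
  , svLeaderVRF   :: Word64  -- on a tie, the LOWER VRF value wins
  }

-- Compare two candidate chains: prefer the longer chain; when the
-- lengths are equal (a "slot battle"), prefer the candidate with
-- the lower leader VRF value (Down reverses the comparison).
compareCandidates :: SelectView -> SelectView -> Ordering
compareCandidates a b =
     comparing svChainLength a b
  <> comparing (Down . svLeaderVRF) a b
```

Since small pools need a small leader VRF value to pass the leader check at all, a rule like this systematically favours them in slot battles; replacing `svLeaderVRF` with the pre-range-extension header VRF value removes that bias.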
The Evidence:
The view, `PraosChainSelectView`, is populated by the `BlockSupportsProtocol` class method `selectView`, which uses the `ProtocolHeaderSupportsProtocol` class method `pHeaderVRFValue` to set `csvLeaderVRF` in the view.
- In `TPraos`, `pHeaderVRFValue` uses the leader VRF value: https://github.com/input-output-hk/ouroboros-network/blob/9249a70ed9e2365f3963e47cb31b4b1589bca8f6/ouroboros-consensus-shelley/src/Ouroboros/Consensus/Shelley/Protocol/TPraos.hs#L111
- In `Praos`, `pHeaderVRFValue` uses the raw header VRF value: https://github.com/input-output-hk/ouroboros-network/blob/9249a70ed9e2365f3963e47cb31b4b1589bca8f6/ouroboros-consensus-shelley/src/Ouroboros/Consensus/Shelley/Protocol/Praos.hs#L141
This was discovered here: https://github.com/cardano-community/cncli/issues/19
In case it was an intentional decision (to randomise slot battle outcomes), the following is worth noting:
- Leader VRF value is something independent of block contents.
- Whereas (I assume) block VRF depends on block contents.
Therefore the current behaviour enables stake pools to game the system by running custom software to re-order or select transactions in order to generate a low block VRF result, maximising their chances of winning a potential slot battle.
It is probably not good to have an extra incentive for block producers to manipulate transactions.
> It is probably not good to have an extra incentive for block producers to manipulate transactions.
You are absolutely correct @TerminadaPool, and we've made sure not to incentivize this kind of behavior.
> Whereas (I assume) block VRF depends on block contents.
It does not! (to your point above). The value of the VRF in the block header depends on:
- the epoch nonce
- the slot
- the private VRF key
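The point of the list above is that the header VRF value is a function only of the epoch nonce, the slot, and the pool's private VRF key, so the block body never enters into it. A minimal model of this dependency (the hash below is a toy FNV-1a stand-in, not the real VRF, and every name is hypothetical):

```haskell
import Data.Bits (xor)
import Data.Char (ord)
import Data.Word (Word64)

-- Toy stand-in for VRF evaluation: a 64-bit FNV-1a hash of the
-- inputs. The real VRF is a keyed cryptographic function; this
-- only models WHICH inputs determine the output.
toyHash :: String -> Word64
toyHash = foldl step 0xcbf29ce484222325
  where step h c = (h `xor` fromIntegral (ord c)) * 0x100000001b3

-- The header VRF value depends only on the epoch nonce, the slot,
-- and the pool's VRF key. There is no block-body parameter at all:
-- re-ordering transactions cannot change this value.
headerVRF :: String -> Word64 -> String -> Word64
headerVRF epochNonce slot vrfKey =
  toyHash (epochNonce ++ "/" ++ show slot ++ "/" ++ vrfKey)
```

This is why the gaming concern raised earlier does not apply: two headers built by the same pool for the same slot carry the same VRF value regardless of their transaction contents.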
Hi, I'd appreciate it if you could elaborate on why, in these two cases, CCYT lost 2 blocks to a pool with a low VRF after the Vasil HF. Is it completely random now?
https://pooltool.io/realtime/7825147 https://pooltool.io/realtime/7700005
@HT-Moh Yes, it is currently completely random without any slight advantage to smaller pools as was the case in Alonzo. This issue was opened to see if the change was intentional or unintentional. If unintentional, it can be filed as a bug and corrected in the node.
@AndrewWestberg thanks
My 2 cents: this was known for a while by the community and nobody ever said it was a bug. I always assumed that it was a design decision since for a small pool losing a slot battle is very noticeable versus a bigger pool.
Here is some documentation about it near bottom of page ( that website is down right now, so linking wayback machine URL ) https://web.archive.org/web/20220822194606/https://cardanofaq.com/books/general-operations/page/what-are-slot-battles
I think you might be missing the point @reqlez. Slot battles used to be determined in favour of the lowest VRF score (which would favour smaller pools), but now the determination is random. Small pools no longer have the slot battle advantage your link refers to.
> I think you might be missing the point @reqlez. Slot battles used to be determined in favour of the lowest VRF score (which would favour smaller pools), but now the determination is random. Small pools no longer have the slot battle advantage your link refers to.
I'm fully aware. I'm just saying that I believe the lowest VRF score preference (how it was before) was a design decision rather than a bug. But I do not have any information from IOG that would suggest that, of course. The reason I believe that is because I feel it makes sense to give smaller pools a preference: a lost block is not very noticeable to a big pool, but would hurt a smaller pool quite a bit. What I do not know is what the potential security issues with this are. I mean... it worked like this for 2 years, and seemed fine. Has anybody found a way to abuse this reliably?
It was a deliberate design feature to determine slot battles in preference of the lower VRF score.
But the changes in the most recent version, 1.35.3, resulted in the removal of this feature. This is why I was surprised to notice the change, and the rest of the community was also surprised. The truth hasn't come out yet about whether the removal of this feature was intentional or not.
Here is some more context that everyone may not be aware of.
At the Vasil hard fork, we switched from using the `TPraos` ledger protocol ("transitional Praos") to `Praos`. The `TPraos` block headers contained two VRF values: one for the leader check and one for the contribution to the next epoch nonce. The VRF check is a bit costly, and so our cryptographers came up with an optimization, namely the range extension. We made use of this optimization in `Praos`, and hence the `Praos` block headers only include one VRF value; we use the range extension on this single value to produce the two values that we need (leader nonce and entropy nonce).
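The range extension idea above can be sketched as follows: a single VRF output is expanded into the two needed values by hashing it under distinct domain-separation tags. The tag characters and the hash below are illustrative stand-ins (a toy FNV-1a, not the real cryptographic construction), and the function names are hypothetical:

```haskell
import Data.Bits (xor)
import Data.Char (ord)
import Data.Word (Word64)

-- Toy hash standing in for the real cryptographic hash function.
toyHash :: String -> Word64
toyHash = foldl step 0xcbf29ce484222325
  where step h c = (h `xor` fromIntegral (ord c)) * 0x100000001b3

-- Range extension sketch: derive the leader value and the entropy
-- (nonce) value from the single header VRF output by prefixing
-- distinct tags before hashing, so the two derived values are
-- independent of each other.
leaderValue, nonceValue :: String -> Word64
leaderValue vrfOutput = toyHash ('L' : vrfOutput)
nonceValue  vrfOutput = toyHash ('N' : vrfOutput)
```

Under this scheme only one VRF evaluation (the costly step) is needed per header; the two derived values come from cheap hashing.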
`TPraos` was settling "slot battles" with the leader nonce in the block header. `Praos` is settling ties with the single nonce in the block header. The question that I've raised in this issue is: would there be any negative consequences to instead settling these ties in `Praos` by range extending the single VRF value in the block header to the leader value?
Do we have any ETA for this fix? It should have higher priority, as it is already affecting the ratings and ROI % of smaller pools. In my personal experience I have already lost 5 substantial delegators to large pools. Thank you.
@cardanoinvest I assume that a fix for this is unlikely to be rolled out separately. This is because the larger stake pools will have no incentive to upgrade, since it will slightly disadvantage them. If only part of the network upgrades, then slot battles will become non-deterministic (since upgraded nodes will award a different winner half of the time), resulting in more chain forks needing to be settled by the longest chain rule.
Yes, it is a blow for small pools especially when the mandatory 340 Ada min fee is also working against them recruiting delegation.
@TerminadaPool if they fix and release it as 1.35.4 with a strong recommendation to update, and send a note to exchanges to update, I believe it will become the main version. Besides, 1.35.5 and 1.35.6 will come anyway with that fix implemented. It's not like we have any other major alternative Cardano node development team.
@cardanoinvest If you update your node and the majority doesn't then your node might produce a block on the non-consensus fork causing your block to become orphaned.
This only affects slot battles. Whether someone is running a 1.35.4 with the patch or 1.35.3 without won't make much difference as only 5% of blocks are battles. What matters is whether the node making the block AFTER yours is upgraded. You cannot reduce or improve your chances by being on the same or different version of the node coming after your block. All you can do is upgrade yourself to try and be nice to any smaller pool making a block before yours. Whether your block gets adopted or not is out of your hands.
Agreed. But wouldn't this sort of fix better fit a CUE event anyway? In that case the upgrade discussion would be moot no?
This is indeed hurting small pool operators' performance. Since this directly affects decentralization of the Cardano network, we should definitely address this issue. In my opinion, this should have higher priority than the minPoolCost debate.
There is a misconception that small pools decentralize. Making slot battle random instead of favoring small pools is an incentive in the right direction, to consolidate pools. A single pool with 30MM stake is better than 10 pools with 3MM stake.
We need to stop incentivizing thousands of pools with low stake and begin incentivizing K desirable pools.
> There is a misconception that small pools decentralize. Making slot battle random instead of favoring small pools is an incentive in the right direction, to consolidate pools. A single pool with 30MM stake is better than 10 pools with 3MM stake.
> We need to stop incentivizing thousands of pools with low stake and begin incentivizing K desirable pools.
Could you please explain this misconception? You are saying a concentration of resources in fewer places helps with decentralization??
You know the pools in the above photo are part of a large group because they are marketed that way. If each of those pools had different metadata, website, etc. they would still be part of a group. Not all groups are so blatant. Assuming that any pool with low pledge is independent because it's marketed that way is naive. Many people (especially ITN OGs) are running tons of pools with low pledge and low stake to farm minPoolCost. Slot battle preference paired with a giant minPoolCost are two incentives for SPOs to run many small pools instead of one large pool.
By making slot battles random you incentivize delegators and operators to consolidate their stake into one large pool instead of dozens of small pools.
Well, if they're splitting into multiple small pools they still have to pay for the resources (CPU/RAM, etc.), so that 340 should cover it. They still make it decentralized.
I've now had a chance to talk to the researchers, the cryptographers, and the folks that did the implementation work.
It was always intended that ties be settled uniformly random. The behavior in Praos is expected, the behavior in TPraos is not.
The Praos paper does not make assumptions about how tie-breaking is done; tie-breaking is assumed to be controlled by the adversary. The problem is that this is an unintended incentive mechanism. This incentive was not intentionally added to TPraos, nor was it intentionally removed from Praos. The assumption was always that ties were being resolved fairly.
@kaskjabhdlf is right to remind us that we cannot equate "good for small pools" with "better for decentralization". Though @cardanoinvest is almost certainly correct that the small pool advantage is dwarfed by other advantages, the fact is that the incentive mechanism designed by our researchers already takes into account the fact that we want decentralized block production.
I'm going to close this issue now, not because I want to end the discussion, but because my original question has been answered. Please feel free to keep the discussion going here, elsewhere on GitHub, or Discord, etc.
@JaredCorduan Regardless of intentions, "uniformly random" does not have uniform consequences. Slot battles already negatively impact a small pool much more than a large pool. It's a huge hit to ROI numbers for a small pool to lose even a single block. For a pool near saturation, it's insignificant.
I believe we will see decentralization decrease and fewer new pools being able to compete with established pools. I would ask you to re-consider the decision to not fix it, otherwise we will likely see our first community-supported fork of the IOG code.
@AndrewWestberg I guess we can all manage to gather a large enough community to make that fork; even @CharlesHoskinson's "RATS" pool is small enough to be negatively affected by this. I would like to hear his comments on this issue.
> It was always intended that ties be settled uniformly random.
@JaredCorduan - wouldn't that incentivise those operators to create more adversarial forks (as in, run multiple BP nodes - this was evident in the ITN, as chain selection was the primary reason for those being run) to try and get an advantage? It's not going to be uniformly random if there are ways to tilt (even if not guarantee) results in one's favour by creating forks.
> I believe we will see decentralization decrease and fewer new pools being able to compete with established pools. I would ask you to re-consider the decision to not fix it,
I will re-open this issue if y'all like (just let me know), but my intention with opening it to begin with was to get to the bottom of why the change happened. According to the folks who are responsible for settling ties with the leader nonce, there is nothing to fix anymore. If the community likes the accidental behavior better, that's worth discussing too.
> wouldn't that incentivise those operators to create more adversarial forks (as in run multiple BP nodes - was evident in ITN as chain selection was primary reason for those being run) to try and get an advantage? It's not gonna be uniformly random if there are ways to tilt (even if not guarantee) results into a favour by creating forks.
multiple BP nodes with the same keys? that won't help, that doesn't change how consensus chooses between forks. or maybe you mean splitting up your pool into multiple pools? it's not clear that would give you enough of an advantage to be worth it.
Let me just say that I'm not the person that anyone needs to convince one way or the other, I'm not a subject expert here.
@JaredCorduan Thank you for raising this issue. I feel like there is very little information about it in the community; we have small, confused SPOs wondering where their only block went. And developers changed that behaviour without any vote from the community and without any prior notification. And that change drastically alters the playing field.
I don't have any good contacts in the Cardano community, so I ask you and others: please make this more widely known. And can we also get Charles to comment on this?
> You know the pools in the above photo are part of a large group because they are marketed that way. If each of those pools had different metadata, website, etc. they would still be part of a group. Not all groups are so blatant. Assuming that any pool with low pledge is independent because it's marketed that way is naive. Many people (especially ITN OGs) are running tons of pools with low pledge and low stake to farm minPoolCost. Slot battle preference paired with giant minPoolCost are two incentives for SPO to run many small pools instead of one large pool.
> By making slot battles random you incentives delegators and operators to consolidate their stake into one large pool instead of dozens of small pools.
I was just asking for an explanation of the “misconception” you declared above.
Incentives aside, I still see no valid explanation for how making things more centralized could make the network more decentralized. Please explain that to me.
You seem to assume that the majority of small pools are owned by only a few operators. Even if that were true (which I do not believe at all), it would still provide a larger network with more decentralization, no?
> multiple BP nodes with the same keys? that won't help, that doesn't change how consensus chooses between forks.
When results for chain selection are random, if 2 BPs from the same pool are running (for instance with different tx sets due to upstream peers), then for their own block the battle of A vs B now becomes A vs B vs C (where A and C are for the same slot from the same pool, but with different block hashes). This was common in the ITN, and the incentive to do so was curbed on mainnet by the rule of preferring smaller pools.
The researchers (seemingly a black box) did approve of the same back when the change was made; see here
@rdlrt This is not the case. It's now using the single VRF value from the block header. I've been calling it the block_vrf, but it really has nothing to do with the contents of the block. It's just using a random value from an earlier step of the slot leader selection instead of the final step, which is the leader_vrf. As long as you're using the same pool keys, two block producers (dual leaders) will produce the same VRF value. It's this VRF value that is then hashed again to get the leader_vrf.