Remove `TriggerHardForkAtEpoch`
Due to #395, we have good evidence that no use case is fundamentally/crucially relying on `TriggerHardForkAtEpoch`. Hence, we would like to remove it to simplify the possible trajectories of the HFC (also see this HWW section).
Goals of this ticket:
- Remove the `TriggerHardForkAtEpoch` constructor, and adapt documentation accordingly.
- Introduce a different way for users to specify that they wish to directly skip to a later era. (The current interface/configuration options could be kept as is for a migration period.)
- Provide guidance to migrate away from the remaining use cases of `TriggerHardForkAtEpoch` mentioned in #395.
Today, we discussed the following idea that should supersede TriggerXXXHardForkAtEpoch (where XXX is an era): Introducing configuration that would cause the ledger to increase the (major) protocol version to a specific value.
For example, the node .yaml file could contain configuration like

    ProtocolVersionUpdates:
      - epoch: 5
        protocolVersion: 10

and the Ledger would hence update the major protocol version to 10 in epoch 5[^1].
This way, "scripted" era transitions on a testnet (e.g. Babbage at protocol version 8 for the first two epochs, then Conway at protocol version 9 for two epochs, then Conway at protocol version 10 for Chang+1 which unlocks new Plutus functionalities) are easily possible without actually having to submit any governance-related transactions. In particular, intra-era HFs (like to Chang+1 in the example) are not possible with the current TriggerHardForkAtEpoch mechanism.
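To make the scripted scenario above concrete, here is a minimal sketch of how such a schedule could be interpreted. It is not Ledger code: the function name, the dictionary representation of `ProtocolVersionUpdates`, and the "never lower the version" policy are all assumptions made for illustration (the footnote leaves that policy open), and the era mapping covers only the three versions named in the example.

```python
from typing import Dict, List

# Eras for the versions named in the example scenario only (not a full table).
SCENARIO_ERAS = {8: "Babbage", 9: "Conway", 10: "Conway"}


def major_version_by_epoch(
    initial: int, updates: Dict[int, int], num_epochs: int
) -> List[int]:
    """Return the major protocol version in force during epochs 0..num_epochs-1.

    `updates` maps an epoch number to the major version requested for it,
    mirroring the hypothetical `ProtocolVersionUpdates` configuration.
    Assumed policy: an update that would lower the version is ignored.
    """
    versions = []
    current = initial
    for epoch in range(num_epochs):
        if epoch in updates:
            current = max(current, updates[epoch])
        versions.append(current)
    return versions


# Babbage (8) for two epochs, Conway (9) for two epochs, then Chang+1 (10).
schedule = {2: 9, 4: 10}
versions = major_version_by_epoch(initial=8, updates=schedule, num_epochs=6)
for epoch, version in enumerate(versions):
    print(epoch, version, SCENARIO_ERAS[version])
```

Note that the step from 9 to 10 stays within Conway, which is exactly the intra-era transition that an era-indexed trigger cannot express.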
The downside of this approach is that it makes Consensus/the HFC simpler at the expense of making Ledger more complex, so this definitely requires further thought/discussion on whether it pulls its weight.
[^1]: This requires a bit of thought re what should happen if the protocol version is already larger than 10, or whether this update should take precedence over a regular governance-induced update of the protocol version.
Edit by @nfrisby: the footnote that Esgen wrote above is not accidental complexity; it's unavoidable complexity of the functionality people are incorrectly assuming that TriggerHardForkAtEpoch provides.
I'd summarize the state of https://github.com/IntersectMBO/ouroboros-consensus/issues/395 and this 416 Issue as follows.
- See Issue 395 for reports of how this version is being used: mostly to skip eras in epoch 0, but also a few scheduled transitions in epoch 1 or 2 eg.
- The Slack thread https://input-output-rnd.slack.com/archives/CFKLUH4R0/p1731494758341649 additionally demonstrates an example where someone was confused that the `TriggerHardForkAtEpoch` flags in the node configuration caused an era transition but did not increment the major protocol version. Especially due to the logic of intra-era hard forks branching on the protocol version itself, this strongly indicates to me that it's entirely non-sensical for the HFC to break the synchronization of the ledger's era with the protocol version protocol parameter.
So I see three options.
- The HFC mucks with the ledger's state to update the protocol version parameter when transitioning due to a given `TriggerHardForkAtEpoch` configuration. (I do not like this; it's upside down.)
- We upstream the `TriggerHardForkAtEpoch` interpretation into the Ledger itself instead of the HFC. (I would like this, but Ledger will probably hesitate.) (This is the previous comment on this 416 Issue.)
- The downstream users that want to schedule transitions at certain epochs no longer get to do so via configuration data and must instead submit the necessary governance transactions at the necessary time. (I can't imagine the downstream users will like this, but perhaps Ledger and Consensus can try to make the necessary preparations as painless as possible.) (IE the description of this 416 Issue itself.)
Edit: There's a misnomer involved here. The Hard Fork Combinator does not handle every hard fork 🤦, due to intra-era hard forks. Obvious in hindsight: it's the maintenance of the major protocol version that is fundamental, not the HFC's current era.
I opened https://github.com/IntersectMBO/cardano-ledger/issues/4898 to prompt the Ledger to weigh in on the above comment ^^^.
After a round of discussions with Alexey on https://github.com/IntersectMBO/cardano-ledger/issues/4898, he and I developed a plan. But when we proposed it to the known users of TestXxxHardForkAtEpoch within IOG, they all replied in a way that actually suggests the simplest possible option for Ledger and Consensus is viable: we remove the feature entirely and they use on-chain governance instead. My "I can't imagine" above was wrong.
New plan: I'll draft a CIP with migration instructions.
Prior to a CIP, I'm writing it up for "internal" users (who are plausibly the only users) on this PR https://github.com/IntersectMBO/ouroboros-consensus/pull/1408.
While doing so, I finally understood enough of the detailed dynamics (thanks to an explanation from Esgen) to realize that this task ...

> Introduce a different way for users to specify that they wish to directly skip to a later era. (The current interface/configuration options could be kept as is for a migration period.)

... is merely a flag indicating whether to skip Byron.
For the Cardano ledger, at least, the table maintained alongside CIP-0059 is not optional. The Ledger rules themselves assume that eg Alonzo can only run with version 5 or 6.
- If `SkipByron` is absent or not set, then the Shelley Genesis file must contain `protocolVersion: {major: 2, minor: 0}` and the Cardano node should initialize in Byron with protocol version 0. (IE this is what `mainnet` did/requires.)
- If `SkipByron` is set, then the Shelley Genesis file's protocol version must be >=2.0, and the Cardano node should skip to the corresponding era at the onset of slot 0, as determined by the CIP-0059 table.
(Edit: it's a somewhat different question to determine what the general HFC interface for skipping eras should be, but for Cardano specifically the above seems natural. And still would even if the Ledger Team changed/parameterized their definitions so that eg Alonzo could start with any protocol version: skipping is for tests, and it seems least confusing for the protvers on the testnet to have the same meaning as those on the corresponding mainnet.)
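The two `SkipByron` rules above can be sketched as a small validation function. This is an illustration only, not node or Ledger code: `initial_state` and `CIP59_ERAS` are hypothetical names, and the version-to-era mapping is a hand transcription that should be checked against the table maintained alongside CIP-0059 (the thread itself only confirms that Alonzo runs with version 5 or 6).

```python
from typing import Tuple

# Hand-transcribed major protocol version -> era mapping.
# Assumption: verify against the CIP-0059 table before relying on it.
CIP59_ERAS = {
    0: "Byron", 1: "Byron",
    2: "Shelley",
    3: "Allegra",
    4: "Mary",
    5: "Alonzo", 6: "Alonzo",
    7: "Babbage", 8: "Babbage",
    9: "Conway", 10: "Conway",
}


def initial_state(
    skip_byron: bool, shelley_genesis_version: Tuple[int, int]
) -> Tuple[str, int]:
    """Return (initial era, initial major protocol version) for a new chain.

    `shelley_genesis_version` is the (major, minor) pair from the Shelley
    Genesis file. Raises ValueError for inconsistent configurations.
    """
    major, minor = shelley_genesis_version
    if not skip_byron:
        # Without the flag, the chain starts in Byron at version 0 and the
        # Shelley Genesis file must say exactly 2.0.
        if (major, minor) != (2, 0):
            raise ValueError("without SkipByron, Shelley Genesis must use 2.0")
        return ("Byron", 0)
    # With the flag, the genesis version selects the starting era.
    if major < 2:
        raise ValueError("with SkipByron, the protocol version must be >= 2.0")
    if major not in CIP59_ERAS:
        raise ValueError(f"no era known for major protocol version {major}")
    return (CIP59_ERAS[major], major)
```

Under these assumptions, `initial_state(False, (2, 0))` describes a mainnet-style start in Byron, while `initial_state(True, (6, 0))` starts directly in Alonzo.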
That looks great, reduced opportunity for inconsistent configuration! In particular, on the long-running preview testnet which skips to a later era (preprod does not do that), this logic will do the right thing, as it is configured to skip to Alonzo on slot/epoch 0 (src), and uses 6 as its major prot ver in the Shelley genesis file (src), and 6 is one of the two versions associated with Alonzo.
I just now:
- merged https://github.com/IntersectMBO/ouroboros-consensus/pull/1408 ie the `EraTransitionGovernance.md` document
- opened https://github.com/IntersectMBO/cardano-node-tests/issues/2927 in `cardano-node-tests` with reference to the `EraTransitionGovernance.md` document
- opened https://github.com/IntersectMBO/cardano-db-sync/issues/1952 in `cardano-db-sync` with reference to the `EraTransitionGovernance.md` document
As far as I know, when those two issues are done, we can actually remove support for AtEpoch: >0 in the HFC.
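For downstream users checking whether they still depend on `AtEpoch: >0`, a rough helper like the following could flag the affected settings. It is a sketch under assumptions: the configuration is taken to be already parsed into a dictionary, the key pattern is inferred from the `TestXxxHardForkAtEpoch` naming used in this thread, and the keys in the example config are purely illustrative.

```python
import re
from typing import Any, Dict, List, Tuple

# Assumed key shape, inferred from the "TestXxxHardForkAtEpoch" naming.
AT_EPOCH_KEY = re.compile(r"^Test[A-Za-z]+HardForkAtEpoch$")


def scheduled_transitions(config: Dict[str, Any]) -> List[Tuple[str, int]]:
    """Return (key, epoch) pairs that schedule a transition after epoch 0.

    Entries set to 0 only skip eras at startup and are not reported.
    """
    found = []
    for key, value in config.items():
        is_epoch = isinstance(value, int) and not isinstance(value, bool)
        if AT_EPOCH_KEY.match(key) and is_epoch and value > 0:
            found.append((key, value))
    return sorted(found)


# Illustrative config: two era skips in epoch 0 and one scheduled transition.
example_config = {
    "TestShelleyHardForkAtEpoch": 0,
    "TestAllegraHardForkAtEpoch": 0,
    "TestMaryHardForkAtEpoch": 2,
    "SomeUnrelatedSetting": 5,
}
print(scheduled_transitions(example_config))
```

Any entry reported here would need to be replaced by on-chain governance, as described in the `EraTransitionGovernance.md` document.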
I set this to status "Help needed" in the Consensus Team Backlog, since we're now blocked on those downstream Issues.