fabric: single-orderer network: after the orderer is restarted, it becomes a follower and never starts an election to become the leader

Description

SETUP

Hyperledger Fabric version: 2.2
Consensus: RAFT
Blockchain networks: 1
Organizations: 2 (each org runs on a different Azure Kubernetes cluster)
Each org is on a separate channel (so 2 channels, named cefiuschannel and ibnpublic)
Each org has 1 peer
Orderers: 1

ISSUE ON HAND

We were trying to update the blockchain certificates before their expiry, for which we did the following:

a. Increased the validity of the new orderer certificates from 1 year to 3 years.
b. Restarted the Certificate Authority (CA).
c. Restarted the orderer pod.

After this, we expected the following to happen:

The orderer should have restarted successfully and started an election that would make it the leader (as there is only 1 orderer).
After that, the certificate update commands should have executed successfully.

However, we are experiencing the following:

Orderer does restart successfully BUT it immediately becomes a Follower without even attempting an election
The logs show the following:

2023-04-10 05:29:46.377 UTC 076a INFO [orderer.common.cluster] Configure -> Entering, channel: cefiuschannel, nodes: []
2023-04-10 05:29:46.377 UTC 076b INFO [orderer.common.cluster] Configure -> Exiting
2023-04-10 05:29:46.377 UTC 076c DEBU [orderer.consensus.etcdraft] start -> Starting raft node: #peers: 1 channel=cefiuschannel node=1
2023-04-10 05:29:46.377 UTC 076d INFO [orderer.consensus.etcdraft] start -> Starting raft node to join an existing channel channel=cefiuschannel node=1
2023-04-10 05:29:46.377 UTC 076e INFO [orderer.consensus.etcdraft] becomeFollower -> 1 became follower at term 0 channel=cefiuschannel node=1
2023-04-10 05:29:46.377 UTC 076f INFO [orderer.consensus.etcdraft] newRaft -> newRaft 1 [peers: [], term: 0, commit: 0, applied: 0, lastindex: 0, lastterm: 0] channel=cefiuschannel node=1
2023-04-10 05:29:46.377 UTC 0770 INFO [orderer.consensus.etcdraft] becomeFollower -> 1 became follower at term 1 channel=cefiuschannel node=1
2023-04-10 05:29:46.377 UTC 0771 INFO [orderer.consensus.etcdraft] Start -> Starting Raft node channel=ibnpublic node=1
2023-04-10 05:29:46.377 UTC 0772 INFO [orderer.common.cluster] Configure -> Entering, channel: ibnpublic, nodes: []
2023-04-10 05:29:46.377 UTC 0773 INFO [orderer.common.cluster] Configure -> Exiting
2023-04-10 05:29:46.377 UTC 0774 DEBU [orderer.consensus.etcdraft] start -> Starting raft node: #peers: 1 channel=ibnpublic node=1
2023-04-10 05:29:46.378 UTC 0775 INFO [orderer.consensus.etcdraft] start -> Starting raft node to join an existing channel channel=ibnpublic node=1
2023-04-10 05:29:46.378 UTC 0776 INFO [orderer.consensus.etcdraft] becomeFollower -> 1 became follower at term 0 channel=ibnpublic node=1
2023-04-10 05:29:46.378 UTC 0777 INFO [orderer.consensus.etcdraft] newRaft -> newRaft 1 [peers: [], term: 0, commit: 0, applied: 0, lastindex: 0, lastterm: 0] channel=ibnpublic node=1
2023-04-10 05:29:46.378 UTC 0778 INFO [orderer.consensus.etcdraft] becomeFollower -> 1 became follower at term 1 channel=ibnpublic node=1
2023-04-10 05:29:46.428 UTC 0779 INFO [orderer.common.server] Main -> Starting orderer:
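
To make it easier to see what the log above says about Raft roles, here is a minimal Python sketch that splits a pasted (possibly flattened) orderer log into entries and lists the role changes per channel. It assumes entries start with a timestamp of the form shown above and that role changes are logged as "<id> became <role> at term <n>"; the helper names (split_entries, role_changes, channels_without_leader) and the SAMPLE string are illustrative only, not part of Fabric.

```python
import re
from collections import defaultdict

# Each entry is assumed to start with "YYYY-MM-DD HH:MM:SS.mmm UTC ".
ENTRY_START = re.compile(r"(?=\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3} UTC )")
# Role changes as they appear in the log, e.g. "1 became follower at term 1".
ROLE_CHANGE = re.compile(
    r"\b[0-9a-f]+ became (follower|pre-candidate|candidate|leader) at term (\d+)"
)
CHANNEL = re.compile(r"channel=(\S+)")


def split_entries(raw):
    """Split a log pasted as one long line (or wrapped oddly) into entries."""
    flat = " ".join(raw.split())  # normalise line breaks and repeated spaces
    return [e.strip() for e in ENTRY_START.split(flat) if e.strip()]


def role_changes(raw):
    """Map channel name -> list of (role, term) in the order they were logged."""
    changes = defaultdict(list)
    for entry in split_entries(raw):
        role = ROLE_CHANGE.search(entry)
        channel = CHANNEL.search(entry)
        if role and channel:
            changes[channel.group(1)].append((role.group(1), int(role.group(2))))
    return dict(changes)


def channels_without_leader(raw):
    """Channels for which this log never shows the node becoming leader."""
    return sorted(
        ch for ch, seen in role_changes(raw).items()
        if not any(role == "leader" for role, _ in seen)
    )


# Three entries taken from the log in this issue, flattened as in the report.
SAMPLE = (
    "2023-04-10 05:29:46.377 UTC 076e INFO [orderer.consensus.etcdraft] becomeFollower -> "
    "1 became follower at term 0 channel=cefiuschannel node=1 "
    "2023-04-10 05:29:46.377 UTC 0770 INFO [orderer.consensus.etcdraft] becomeFollower -> "
    "1 became follower at term 1 channel=cefiuschannel node=1 "
    "2023-04-10 05:29:46.378 UTC 0778 INFO [orderer.consensus.etcdraft] becomeFollower -> "
    "1 became follower at term 1 \nchannel=ibnpublic node=1"
)

if __name__ == "__main__":
    for channel, seen in role_changes(SAMPLE).items():
        print(channel, seen)
    print("no leader seen on:", channels_without_leader(SAMPLE))
```

Run against the full attached log, this would show whether a "became candidate" or "became leader" line ever appears for either channel after the restart.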

Steps to reproduce

Bring up a standard HLF network (version 2.2 with a system channel) with a single orderer on a Kubernetes cluster.
Once the network is up, restart the orderer pod by deleting it or by restarting its deployment.
The orderer becomes a follower in all the channels and fails to start an election.

Attaching the orderer log after the restart.

orderer0-deployment-7b67b8d496-ncccs.log

Apr 11 '23 07:04 Vbhaskar125

Hyperledger Fabric version: 2.2

Why 2.2? Can you try if it works on 2.5?

Apr 11 '23 15:04 yacovm

When you re-issue a certificate of an orderer without a config update it needs to have the same public key, did you use the same public key?

Apr 11 '23 21:04 yacovm

Hyperledger Fabric version: 2.2

Why 2.2? Can you try if it works on 2.5?

Our network was created a year ago with version 2.2. When we restarted the orderer, it did not become a leader. We then tried orderer version 2.4. Please correct me if my understanding is wrong: the advantage of using 2.5 over 2.2 is that we can join an application channel without using a system channel.

What we tried (before the certificate expired):

Changed the orderer to version 2.4 (here too, it was a follower in all the channels).
Tried to add 2 more orderers (v2.4) to the application channels using channel join.
The new orderers were able to sync the data.

When you re-issue a certificate of an orderer without a config update it needs to have the same public key, did you use the same public key?

We were not able to reach this step, as we could not download the config block itself.

Apr 12 '23 06:04 Vbhaskar125

Please correct me if my understanding is wrong, the advantage of using 2.5 versus 2.2 is that we can join application channel without using a system channel.

But I don't think we backport bug fixes to 2.2 at this time, @denyeart am I wrong?

Apr 12 '23 07:04 yacovm

what we tried (before the certificate expired)

changed the orderer to version 2.4 (Here also it was a follower in all the channels)
we tried to add 2 more orderers (v2.4) to application channels using channel join
here the new orderers were able to sync the data

I don't understand. You had 1 orderer and you added 2 orderers and it worked and now you removed them again?

Apr 12 '23 07:04 yacovm

what we tried (before the certificate expired)
changed the orderer to version 2.4 (Here also it was a follower in all the channels)
we tried to add 2 more orderers (v2.4) to application channels using channel join
here the new orderers were able to sync the data
I don't understand. You had 1 orderer and you added 2 orderers and it worked and now you removed them again?

We had a 1-orderer network. After the restart of the orderer, it failed to become a leader.
So at this point we have a network with no Raft leader; hence, no system/config updates are possible.
So we tried bringing up 2 more orderers (orderer1 and orderer2, with the version 2.4 image) just to have a copy of the application chains, by joining them to the application channels. (At this point the new orderers are still followers and are not part of the consenters in the channel config.)
Then we also upgraded the orderer0 image to version 2.4 (hoping that at least in the application channels it would become a leader).

But the orderer failed to start an election.
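
Whether a given orderer is listed among the consenters can be checked once a channel config is available as JSON. The sketch below assumes the config has already been decoded (for example with configtxlator) and that the config object has been extracted from the block, with the Raft consenters under channel_group.groups.Orderer.values.ConsensusType.value.metadata.consenters. The function names, host names, and the trimmed-down EXAMPLE are hypothetical and only illustrate the lookup.

```python
import json


def consenters(config):
    """Return (host, port) pairs for the Raft consenters in a decoded config.

    Assumes `config` is the config object itself, not the whole block.
    """
    metadata = (
        config["channel_group"]["groups"]["Orderer"]["values"]
        ["ConsensusType"]["value"]["metadata"]
    )
    return [(c["host"], c["port"]) for c in metadata.get("consenters", [])]


def is_consenter(config, host, port):
    """True if host:port appears in the channel's consenter set."""
    return (host, port) in consenters(config)


# Hypothetical, heavily trimmed config with a single consenter.
EXAMPLE = json.loads("""
{
  "channel_group": {
    "groups": {
      "Orderer": {
        "values": {
          "ConsensusType": {
            "value": {
              "type": "etcdraft",
              "metadata": {
                "consenters": [
                  {"host": "orderer0.example.com", "port": 7050}
                ]
              }
            }
          }
        }
      }
    }
  }
}
""")

if __name__ == "__main__":
    print(consenters(EXAMPLE))
    print(is_consenter(EXAMPLE, "orderer1.example.com", 7050))
```

In the situation described here, orderer1 and orderer2 would be expected to be absent from that list, since they were joined without a config update.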

Apr 12 '23 07:04 Vbhaskar125

Please correct me if my understanding is wrong, the advantage of using 2.5 versus 2.2 is that we can join application channel without using a system channel.

But I don't think we backport bug fixes to 2.2 at this time, @denyeart am I wrong?

Critical fixes will be backported to v2.2 through end of 2023. Users should upgrade to v2.5 this year so that when maintenance of v2.2 ends they will be able to get the future updates and fixes that will be targeted for v2.5.x only.

Apr 12 '23 13:04 denyeart

When you re-issue a certificate of an orderer without a config update it needs to have the same public key, did you use the same public key?

@yacovm even if we use the same public key, we still need to do a configuration update for the orderer TLS certificate, right?

Apr 18 '23 17:04 adhavpavan

Of course not. That's the entire idea of using the same public key - no need for a config change!

https://github.com/hyperledger/fabric/pull/1771
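
One way to confirm that a re-issued certificate really carries the same public key as the old one is to extract and compare the two keys. This is a minimal sketch, assuming the openssl command-line tool is installed and both certificates are PEM files; the function names and file paths are hypothetical.

```python
import subprocess


def public_key_pem(cert_path):
    """Return the PEM public key embedded in an X.509 certificate file."""
    result = subprocess.run(
        ["openssl", "x509", "-in", cert_path, "-noout", "-pubkey"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def same_key(pem_a, pem_b):
    """Compare two PEM public keys, ignoring whitespace and line wrapping."""
    return "".join(pem_a.split()) == "".join(pem_b.split())


def reissued_with_same_key(old_cert_path, new_cert_path):
    """True if both certificate files contain the same public key."""
    return same_key(public_key_pem(old_cert_path), public_key_pem(new_cert_path))


if __name__ == "__main__":
    # Hypothetical paths; replace with the old and new orderer certificates.
    print(reissued_with_same_key("old/server.crt", "new/server.crt"))
```

If this prints False, the new certificate was issued for a different key pair, which is the case where a config update would be needed.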

Apr 18 '23 18:04 yacovm
