When a new master joins the cluster and becomes the leader, the Alluxio cluster stops working

Open ccy00808 opened this issue 3 years ago • 8 comments

Alluxio Version: ALL

Describe the bug Note: Ratis can automatically add new nodes to the group.

  1. When a master node is replaced and the replacement becomes the leader, the followers and workers stop working because they cannot identify the new leader.
  2. The master addresses are set statically in the configuration. After Ratis successfully elects a leader, "alluxio.master.embedded.journal.addresses" in the cluster configuration is not updated.
  3. Because the followers and workers cannot identify the leader, the following exception is reported periodically:

2022-01-14 10:19:42,151 WARN RetryHandlingMetaMasterMasterClient - GetId(address=xxxxx:19998) exits with exception [alluxio.exception.status.UnavailableException: Failed to determine address for MetaMasterMaster after 1 attempts] in 120001 ms (>=10000ms)
2022-01-14 10:19:42,151 ERROR MetaMasterSync - Failed to receive leader master heartbeat command.
alluxio.exception.status.UnavailableException: Failed to determine address for MetaMasterMaster after 1 attempts
    at alluxio.AbstractClient.connect(AbstractClient.java:264)
    at alluxio.AbstractClient.retryRPCInternal(AbstractClient.java:405)
    at alluxio.AbstractClient.retryRPC(AbstractClient.java:373)
    at alluxio.AbstractClient.retryRPC(AbstractClient.java:362)
    at alluxio.master.meta.RetryHandlingMetaMasterMasterClient.getId(RetryHandlingMetaMasterMasterClient.java:81)
    at alluxio.master.meta.MetaMasterSync.setIdAndRegister(MetaMasterSync.java:115)
    at alluxio.master.meta.MetaMasterSync.heartbeat(MetaMasterSync.java:71)
    at alluxio.heartbeat.HeartbeatThread.run(HeartbeatThread.java:119)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:979)
2022-01-14 10:19:42,152 WARN SleepingTimer - Meta Master Sync last execution took 120002 ms. Longer than the interval 120000

To Reproduce

  1. An Alluxio master is replaced.
  2. The new master becomes the leader.

Expected behavior Workers and followers can identify the leader, and Alluxio continues to provide service normally.

Urgency NA

Are you planning to fix it YES

Additional context NA

ccy00808 avatar Feb 24 '22 07:02 ccy00808

@jenoudet @tcrain can you take a look?

HelloHorizon avatar Feb 24 '22 19:02 HelloHorizon

ping @jenoudet

HelloHorizon avatar Feb 28 '22 18:02 HelloHorizon

This is an exception I have seen happen many times, specifically around failovers (between masters of a cluster, or when a new master is added to a cluster). In my experience, it does not affect functionality and is harmless. Have you noticed functionality problems due to this error or simply that the error is written in the logs?

jenoudet avatar Feb 28 '22 19:02 jenoudet

@jenoudet We have many machines that need to be replaced frequently.

  1. Every time a new master node is added, "alluxio.job.master.embedded.journal.addresses" is updated to the latest cluster configuration, so Ratis is able to take part in elections.
  2. The addresses the Alluxio client sends requests to are written into the configuration at startup. Therefore, the old master nodes cannot recognize the newly added master node (the address of the new master node is not in the old masters' configuration).
  3. When the new master node is elected leader, Ratis works normally, but the Alluxio client cannot find the leader and therefore cannot work.
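The failure described in the three points above can be modeled with a minimal sketch. It assumes a client that only ever polls the addresses it read at startup; `StaticPollingClient`, `findLeader`, and the `master-N:19998` addresses are hypothetical names for illustration, not Alluxio's actual classes.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

// Hypothetical model for illustration only; not Alluxio's actual client code.
class StaticPollingClient {
  // Addresses read once from configuration at startup and never refreshed.
  private final List<String> configuredAddresses;

  StaticPollingClient(List<String> configuredAddresses) {
    this.configuredAddresses = List.copyOf(configuredAddresses);
  }

  // Polls each configured address and returns the first one that reports
  // itself as leader. A leader outside the list is never asked at all.
  Optional<String> findLeader(Predicate<String> isLeader) {
    for (String address : configuredAddresses) {
      if (isLeader.test(address)) {
        return Optional.of(address);
      }
    }
    return Optional.empty();
  }
}
```

With a list of `master-1` to `master-3`, a leader at `master-2:19998` is found, while a newly added leader at `master-4:19998` yields an empty result, which corresponds to the "Failed to determine address" symptom in the report.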

ccy00808 avatar Mar 01 '22 01:03 ccy00808

The mismatch comes from the fact that we currently do not support dynamic configuration propagation for Alluxio master addresses. Ratis can and does take new masters into account, but this configuration change is currently not propagated to Alluxio. If you want this feature, you will have to implement dynamic configuration propagation.

jenoudet avatar Mar 01 '22 21:03 jenoudet

@jenoudet OK, thanks~ I have a few more questions:

  1. Do you have plans to improve this scenario in the future, or do you have any good ideas?
  2. Our plan:
    1. The master client updates its addresses by monitoring changes to the configuration hash (RaftJournalSystem.updateGroup updates ServerConfiguration.sConf, which can change the hash).
    2. Add an RPC heartbeat that fetches the master node addresses from the leader and updates the client's addresses.
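A minimal sketch of the first part of this plan, assuming the client can obtain both a configuration hash and the current address list from somewhere. The class `RefreshingAddressHolder` and its two suppliers are hypothetical stand-ins, not existing Alluxio APIs.

```java
import java.util.List;
import java.util.function.Supplier;

// Hypothetical sketch; the suppliers stand in for whatever exposes the
// configuration hash and the master address list in a real deployment.
class RefreshingAddressHolder {
  private final Supplier<String> hashSupplier;
  private final Supplier<List<String>> addressSupplier;
  private String lastHash;
  private List<String> addresses;

  RefreshingAddressHolder(Supplier<String> hashSupplier,
                          Supplier<List<String>> addressSupplier) {
    this.hashSupplier = hashSupplier;
    this.addressSupplier = addressSupplier;
    this.lastHash = hashSupplier.get();
    this.addresses = List.copyOf(addressSupplier.get());
  }

  // Intended to be called periodically, for example from a heartbeat.
  // Reloads the address list only when the hash differs from the last one seen.
  synchronized boolean refreshIfChanged() {
    String currentHash = hashSupplier.get();
    if (currentHash.equals(lastHash)) {
      return false;
    }
    lastHash = currentHash;
    addresses = List.copyOf(addressSupplier.get());
    return true;
  }

  synchronized List<String> getAddresses() {
    return addresses;
  }
}
```

Comparing hashes keeps the periodic check cheap; the full address list is fetched only after a change is detected.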

ccy00808 avatar Mar 02 '22 08:03 ccy00808

My suggestion would be to look at the MasterInquireClient. Currently, an Embedded Journal deployment uses the PollingMasterInquireClient to poll masters to see if they are the leader. A new RaftInquireClient could be created using a RaftClient to poll the quorum for the leader information.
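The idea can be sketched as follows, under the assumption that any reachable quorum member can report who the current leader is. `QuorumLeaderInquirer` and the `askForLeader` function are hypothetical; the sketch does not use or reproduce the real `MasterInquireClient` or Ratis `RaftClient` interfaces.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical sketch; not the actual MasterInquireClient or RaftClient API.
class QuorumLeaderInquirer {
  // Statically configured members, used only as entry points into the quorum.
  private final List<String> seedAddresses;
  // Given a member address, returns the leader address that member knows of,
  // or empty if the member is unreachable or does not know a leader.
  private final Function<String, Optional<String>> askForLeader;

  QuorumLeaderInquirer(List<String> seedAddresses,
                       Function<String, Optional<String>> askForLeader) {
    this.seedAddresses = List.copyOf(seedAddresses);
    this.askForLeader = askForLeader;
  }

  // Asks each seed in turn and returns the first leader address reported.
  // The result may be an address that is not in the seed list.
  Optional<String> findLeader() {
    for (String seed : seedAddresses) {
      Optional<String> leader = askForLeader.apply(seed);
      if (leader.isPresent()) {
        return leader;
      }
    }
    return Optional.empty();
  }
}
```

Unlike polling each configured address with "are you the leader?", this approach needs only one reachable, up-to-date member to discover a leader that was added after the client started.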

jenoudet avatar Mar 02 '22 18:03 jenoudet

@ccy00808 I would suggest putting the hostname in the configuration instead of a static IP address. In Kubernetes, although the IP of the master pod changes, the hostname does not. Other pods should be able to find the new pod by its hostname.
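As a rough illustration of why a hostname helps, the sketch below stores the address unresolved and performs the DNS lookup only when a connection is about to be made. `LazyMasterAddress` is a hypothetical helper, and the hostname and port used with it are placeholder values; only the `java.net.InetSocketAddress` calls are standard JDK API.

```java
import java.net.InetSocketAddress;

// Hypothetical helper for illustration; not part of Alluxio.
class LazyMasterAddress {
  private final InetSocketAddress unresolved;

  LazyMasterAddress(String hostname, int port) {
    // createUnresolved stores the hostname without performing a DNS lookup.
    this.unresolved = InetSocketAddress.createUnresolved(hostname, port);
  }

  String hostname() {
    return unresolved.getHostString();
  }

  boolean isStoredUnresolved() {
    return unresolved.isUnresolved();
  }

  // Builds a fresh address at connect time. This constructor attempts a DNS
  // lookup, so it reflects the IP the hostname currently maps to, subject to
  // JVM and OS DNS caching.
  InetSocketAddress resolveNow() {
    return new InetSocketAddress(unresolved.getHostString(), unresolved.getPort());
  }
}
```

For example, `new LazyMasterAddress("alluxio-master-0", 19200)` keeps only the name and port; calling `resolveNow()` before each connection attempt lets a replaced pod with a new IP be reached under the same hostname.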

ssz1997 avatar Aug 25 '22 18:08 ssz1997

@ccy00808 Any updates on this issue? Are you still encountering the problem?

ssz1997 avatar Oct 14 '22 19:10 ssz1997

Synced offline. It is no longer a problem.

ssz1997 avatar Oct 25 '22 01:10 ssz1997