
One mount point's UFS issue causes the whole cluster to get stuck

Open secfree opened this issue 2 years ago • 5 comments

Alluxio Version: 2.7.1

Describe the bug An issue with one mount point's UFS causes the whole cluster to get stuck.

To Reproduce

  1. Mount an HDFS path to Alluxio:
./bin/alluxio fs mount /projects/test_hdfs hdfs://test/projects/test_hdfs
  2. Transition the state of the HDFS namenode to standby:
hdfs haadmin -ns test -transitionToStandby nn --forcemanual
  3. Run listStatus for the HDFS path from Alluxio. It gets stuck, and the log shows that AlluxioMaster keeps sleeping and retrying (connecting to the namenode):
./bin/alluxio fs ls /projects/test_hdfs
  4. Run listStatus for the root path from Alluxio; it also gets stuck. The -f option is needed here, otherwise you have to wait for alluxio.user.file.metadata.sync.interval to take effect:
./bin/alluxio fs ls -f /
  5. Running listStatus for other paths also gets stuck:
./bin/alluxio fs ls -f /user/test

Expected behavior listStatus of / and other paths should not be affected by the mount point's UFS issue. Mount points should be isolated from each other.

Cause Analysis

  1. listStatus for the mount point with the issue keeps holding the READ lock of the path.
  2. listStatus for the / path keeps waiting for the WRITE_EDGE lock of /. The corresponding logic is here: https://github.com/Alluxio/alluxio/blob/master/core/server/master/src/main/java/alluxio/master/file/meta/LockingScheme.java#L83. It cannot acquire the lock until all READ locks are released.
  3. listStatus for other paths needs to acquire a READ lock, so it must wait for ./bin/alluxio fs ls / to finish first and release the WRITE_EDGE lock of /.
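The three-step chain above can be reproduced outside Alluxio with a plain ReentrantReadWriteLock standing in for the root inode/edge lock. This is a hedged sketch, not Alluxio code: the class name, sleeps, and timeouts are illustrative, and the stuck reader merely simulates the UFS retry loop.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class LockChainDemo {
    private static final ReentrantReadWriteLock rootLock = new ReentrantReadWriteLock();

    private static void sleep(long ms) {
        try { Thread.sleep(ms); } catch (InterruptedException ignored) { }
    }

    // Returns true if a fresh reader is blocked behind the queued writer.
    public static boolean simulate() {
        // 1. "ls /projects/test_hdfs": holds the READ lock and does not
        //    return, standing in for the UFS connect-and-retry loop.
        Thread stuckReader = new Thread(() -> {
            rootLock.readLock().lock();
            try { sleep(5_000); } finally { rootLock.readLock().unlock(); }
        });
        stuckReader.setDaemon(true);
        stuckReader.start();
        sleep(200); // let the reader acquire its lock

        // 2. "ls /": queues for the WRITE lock and parks behind the reader.
        Thread writer = new Thread(() -> rootLock.writeLock().lock());
        writer.setDaemon(true);
        writer.start();
        sleep(200); // let the writer enqueue

        // 3. "ls -f /user/test": a new timed READ attempt queues behind the
        //    waiting writer, so it times out instead of acquiring the lock.
        try {
            return !rootLock.readLock().tryLock(300, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("second reader blocked: " + simulate());
    }
}
```

Note that the timed tryLock honors the wait queue (the untimed tryLock() would barge past the writer), which mirrors how new READ lock requests end up waiting behind the pending WRITE_EDGE acquisition in step 2.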

Urgency High

Are you planning to fix it No

secfree avatar Aug 04 '22 05:08 secfree

More information is required:

  1. jstack of different processes
  2. log of master and related processes

yyongycy avatar Aug 04 '22 06:08 yyongycy

jstack of ./bin/alluxio fs ls /projects/test_hdfs in AlluxioMaster

"master-rpc-executor-TPE-thread-102" #7512 daemon prio=5 os_prio=0 tid=0x00007fe2d0023800 nid=0x526a waiting on condition [0x00007fe23e5e3000]
   java.lang.Thread.State: TIMED_WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x00000007512cc650> (a java.util.concurrent.FutureTask)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
        at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:426)
        at java.util.concurrent.FutureTask.get(FutureTask.java:204)
        at alluxio.master.file.InodeSyncStream.getFromUfs(InodeSyncStream.java:518)
        at alluxio.master.file.InodeSyncStream.syncExistingInodeMetadata(InodeSyncStream.java:591)
        at alluxio.master.file.InodeSyncStream.syncInodeMetadata(InodeSyncStream.java:510)
        at alluxio.master.file.InodeSyncStream.sync(InodeSyncStream.java:327)
        at alluxio.master.file.DefaultFileSystemMaster.syncMetadata(DefaultFileSystemMaster.java:3524)
        at alluxio.master.file.DefaultFileSystemMaster.listStatus(DefaultFileSystemMaster.java:1008)
        at alluxio.master.file.FileSystemMasterClientServiceHandler.lambda$listStatus$9(FileSystemMasterClientServiceHandler.java:243)
        at alluxio.master.file.FileSystemMasterClientServiceHandler$$Lambda$900/852315736.call(Unknown Source)
        at alluxio.RpcUtils.callAndReturn(RpcUtils.java:121)
        at alluxio.master.file.FileSystemMasterClientServiceHandler.listStatus(FileSystemMasterClientServiceHandler.java:241)
        ...
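The stack shows the sync thread parked in FutureTask.get() with no timeout while the path's READ lock stays held. As a hedged sketch (not the actual Alluxio code path; the class and method names are illustrative), a bounded get() would turn an unresponsive UFS into a catchable failure so the caller could release its locks instead of pinning them:

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BoundedUfsCallDemo {
    // Returns true if the slow "UFS call" was abandoned by the timeout.
    public static boolean fetchWithTimeout() {
        ExecutorService pool = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r);
            t.setDaemon(true);
            return t;
        });
        Future<String> status = pool.submit(() -> {
            Thread.sleep(10_000); // stands in for a NameNode stuck in standby
            return "fileInfo";
        });
        try {
            status.get(300, TimeUnit.MILLISECONDS); // bounded, unlike the trace
            return false;
        } catch (TimeoutException e) {
            status.cancel(true); // interrupt the stuck call
            return true;
        } catch (InterruptedException | ExecutionException e) {
            return false;
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("timed out instead of hanging: " + fetchWithTimeout());
    }
}
```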

and master.log

2022-08-04 14:28:25,308 INFO  RetryInvocationHandler - org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error
        at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:108)
        at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2088)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1428)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFileInfo(FSNamesystem.java:3051)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getFileInfo(NameNodeRpcServer.java:1232)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getFileInfo(ClientNamenodeProtocolServerSideTranslatorPB.java:894)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:529)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1087)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1109)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1030)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:2038)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:3039)
, while invoking ClientNamenodeProtocolTranslatorPB.getFileInfo over xxxx:8020 after 1 failover attempts. Trying to failover after sleeping for 729ms.
2022-08-04 14:28:26,046 INFO  RetryInvocationHandler - org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error
        at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:108)
        at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2076)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1422)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFileInfo(FSNamesystem.java:3001)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getFileInfo(NameNodeRpcServer.java:1228)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getFileInfo(ClientNamenodeProtocolServerSideTranslatorPB.java:894)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:529)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1087)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1109)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1030)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:2038)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:3039)
, while invoking ClientNamenodeProtocolTranslatorPB.getFileInfo over xxxx:8020 after 2 failover attempts. Trying to failover after sleeping for 1708ms.
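The retries above come from the HDFS client's HA failover loop. One way to make the sync fail fast instead of retrying indefinitely is to bound that loop in the UFS client configuration. This is a hedged sketch: the property names come from hdfs-default.xml, but the values are only examples and would need tuning for a real deployment.

```xml
<!-- Illustrative hdfs-site.xml fragment for the UFS client; values are examples. -->
<configuration>
  <!-- Cap how many times the client fails over between NameNodes. -->
  <property>
    <name>dfs.client.failover.max.attempts</name>
    <value>3</value>
  </property>
  <!-- Base and max sleep for the exponential backoff between attempts. -->
  <property>
    <name>dfs.client.failover.sleep.base.millis</name>
    <value>500</value>
  </property>
  <property>
    <name>dfs.client.failover.sleep.max.millis</name>
    <value>2000</value>
  </property>
</configuration>
```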

secfree avatar Aug 04 '22 06:08 secfree

jstack of ./bin/alluxio fs ls / in AlluxioMaster

   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x000000079af85890> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
        at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943)
        at alluxio.resource.LockResource.<init>(LockResource.java:99)
        at alluxio.resource.LockResource.<init>(LockResource.java:70)
        at alluxio.resource.RWLockResource.<init>(RWLockResource.java:39)
        at alluxio.resource.RefCountLockResource.<init>(RefCountLockResource.java:54)
        at alluxio.collections.LockPool.get(LockPool.java:210)
        at alluxio.master.file.meta.InodeLockManager.lockEdge(InodeLockManager.java:195)
        at alluxio.master.file.meta.SimpleInodeLockList.lockAndAddEdge(SimpleInodeLockList.java:226)
        at alluxio.master.file.meta.SimpleInodeLockList.lockRootEdge(SimpleInodeLockList.java:115)
        at alluxio.master.file.meta.LockedInodePath.bootstrapTraversal(LockedInodePath.java:469)
        at alluxio.master.file.meta.LockedInodePath.traverse(LockedInodePath.java:401)
        at alluxio.master.file.meta.InodeTree.lockInodePath(InodeTree.java:434)
        at alluxio.master.file.meta.InodeTree.lockInodePath(InodeTree.java:416)
        at alluxio.master.file.meta.InodeTree.lockInodePath(InodeTree.java:402)
        at alluxio.master.file.InodeSyncStream.sync(InodeSyncStream.java:313)
        at alluxio.master.file.DefaultFileSystemMaster.syncMetadata(DefaultFileSystemMaster.java:3524)
        at alluxio.master.file.DefaultFileSystemMaster.listStatus(DefaultFileSystemMaster.java:1008)

secfree avatar Aug 04 '22 06:08 secfree

@secfree According to your analysis, listStatus for / would be blocked, but not listStatus for any other subpath, right? However, it seems like you were getting blocked on ./bin/alluxio fs ls -f /user/test. Could you confirm that you were blocked on listing the root as well as on a parallel file path?

yuzhu avatar Aug 08 '22 23:08 yuzhu

@yuzhu Sorry, step 4 should be ./bin/alluxio fs ls /, matching the description "Do listStatus for the root path from alluxio, it also gets stuck". I have corrected it.

For listing other file paths: if a path does not need to wait for the READ lock of /projects/test_hdfs to be released, it will not block.

secfree avatar Aug 09 '22 00:08 secfree