alluxio
EdgeLockPool leaks slowly
[image]
I found a slow leak in EdgeLockPool and I don't know what caused it. There is no ERROR log in master.log.
InodeLockPool looks normal.
@uniqueZt do you mind providing the Alluxio version and the configuration of your cluster?
Version 2.7.
I have two clusters that use the same configuration, but one of them is normal and the other one shows the problem.
Can you point me to some PRs related to this so that I can analyze the problem?
I only found the PR https://github.com/Alluxio/alluxio/pull/14320, but I cannot find any ERROR messages in my master.log.
I tried several operations, such as create file, delete -R, create dir, list, getStatus, and rename, but could not reproduce the problem. I suspect that some exception may be causing it.
[image]
@uniqueZt
Hi, I looked into the code you pasted a bit. If an error is thrown, the reference to the lock won't be held by the lock list, so it will be garbage collected later. This try..catch does not cause a leak.
Second, if these edge locks are really leaked, it means that the locks are in the InodeList and have been acquired by some threads. If a leak really happens -> locks are acquired by threads which never release them, then you will very likely see a deadlock as soon as the edge is read or written by other threads. According to your description, the Alluxio cluster seems to be working normally so far.
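To make the reference-counting argument above concrete, here is a minimal sketch of a reference-counted lock pool. It is not Alluxio's actual `LockPool` code: the class `RefCountedLockPoolSketch`, its methods, and the `check` callback are all hypothetical names made up for illustration. It assumes a pool where an entry can only be evicted once its reference count is back at zero, and it shows one hypothetical way an entry could stay pinned without any thread holding the lock, so no deadlock would be observed.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical reference-counted lock pool; NOT Alluxio's actual LockPool. */
class RefCountedLockPoolSketch {
  /** A pooled lock plus the number of callers currently referencing it. */
  static final class Entry {
    final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    final AtomicInteger refCount = new AtomicInteger();
  }

  /** Handle returned to callers; closing it unlocks and drops the reference. */
  static final class Handle implements AutoCloseable {
    private final Entry entry;

    Handle(Entry entry) {
      this.entry = entry;
    }

    @Override
    public void close() {
      entry.lock.readLock().unlock();
      entry.refCount.decrementAndGet();
    }
  }

  private final ConcurrentHashMap<String, Entry> pool = new ConcurrentHashMap<>();

  /** Safe path: the reference is dropped again if anything fails before the handle exists. */
  Handle acquireRead(String key, Runnable check) {
    Entry e = pin(key);
    try {
      check.run(); // assumed pre-lock step that may throw
      e.lock.readLock().lock();
      return new Handle(e);
    } catch (RuntimeException | Error ex) {
      e.refCount.decrementAndGet();
      throw ex;
    }
  }

  /** Leaky path: an exception after pinning leaves refCount above zero forever. */
  Handle acquireReadLeaky(String key, Runnable check) {
    Entry e = pin(key);
    check.run(); // if this throws, nobody decrements refCount
    e.lock.readLock().lock();
    return new Handle(e);
  }

  /** Creates the entry if needed and increments its reference count atomically per key. */
  private Entry pin(String key) {
    return pool.compute(key, (k, v) -> {
      Entry e = (v == null) ? new Entry() : v;
      e.refCount.incrementAndGet();
      return e;
    });
  }

  /** Removes every entry with no references and returns the remaining pool size. */
  int evictUnreferenced() {
    for (String key : pool.keySet()) {
      pool.computeIfPresent(key, (k, e) -> e.refCount.get() == 0 ? null : e);
    }
    return pool.size();
  }
}
```

In this sketch, an entry pinned through `acquireReadLeaky` with a throwing `check` is never locked by anyone, yet `evictUnreferenced()` can no longer remove it. Whether something similar happens in the real pool would have to be confirmed against the actual code path.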
Third, according to the metric dashboard, the number of edge locks is mostly stable and grows very slowly. This makes me wonder whether it is due to organic traffic growth rather than a leak. You can check the QPS to the file system master to see whether it grows at a similar cadence.
This issue is very general, and we are not able to investigate further without more details. Happy to help dig in if you can share more details with us. Thanks!
I found a bug and can reproduce it.
[image]
Code in the lock pool.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed in two weeks if no further activity occurs. Thank you for your contributions.