Fix AlluxioFileSystem memory leak

Open qian0817 opened this issue 1 year ago • 4 comments

What changes are proposed in this pull request?

Fix #18479

Why are the changes needed?

Fix the memory leak that occurs when a client accesses Alluxio through the Hadoop FileContext API.

Does this PR introduce any user facing changes?

No.

qian0817 avatar Jan 02 '24 06:01 qian0817

Automated checks report:

  • PR title follows the conventions: FAIL
    • The title of the PR does not pass all the checks. Please fix the following issues:
      • First word must be capitalized
      • First word of title ("fix:") is not an imperative verb. Please use one of the valid words
  • Commits associated with Github account: FAIL
    • It looks like your commits can't be linked to a valid Github account. Your commits are made with the email [email protected], which does not allow your contribution to be tracked by Github. See this link for possible reasons this might be happening. To change the author email address that your most recent commit was made under, you can run:
      git -c user.name="Name" -c user.email="Email" commit --amend --reset-author
      
      See this answer for more details about how to change commit email addresses. Once the author email address has been updated, update the pull request by running:
      git push --force https://github.com/qian0817/alluxio.git fix-abstract-filesystem-leak
      

Some checks failed. Please fix the reported issues and reply 'alluxio-bot, check this please' to re-run checks.

alluxio-bot avatar Jan 02 '24 07:01 alluxio-bot

Automated checks report:

  • PR title follows the conventions: PASS
  • Commits associated with Github account: PASS

All checks passed!

alluxio-bot avatar Jan 02 '24 07:01 alluxio-bot

@qian0817 I'm not sure whether core/client/hdfs/src/main/java/alluxio/hadoop/AlluxioFileSystem.java is used only by YARN, or whether this change also affects other Hadoop-based apps like Spark/Presto/Impala. Could you confirm? Thanks!

jiacheliu3 avatar Feb 02 '24 03:02 jiacheliu3

I searched for the FileContext keyword in the Spark, Presto, and Impala repositories and found that Spark uses FileContext: https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/CheckpointFileManager.scala#L308-L312. Presto and Impala do not use it.

qian0817 avatar Feb 23 '24 02:02 qian0817

alluxio-bot, merge this please

jiacheliu3 avatar Mar 06 '24 07:03 jiacheliu3