
HIVE-26404: HMS memory leak when compaction cleaner fails to remove obsolete files

Open zabetak opened this issue 3 years ago • 0 comments

What changes were proposed in this pull request and why?

  • Add a unit test that triggers the memory leak (not meant to be committed to the repo because it is a bit hacky and slow)
  • (Main fix) Ensure FileSystem resources are closed correctly by wrapping the code in a try-finally block.
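
The try-finally idea can be sketched with a toy model. This is not the patch itself and does not use Hadoop classes: `FakeFs`, `CACHE`, `get`, `closeAllFor`, and `removeObsoleteFiles` are hypothetical stand-ins, and the sketch assumes each run uses a fresh identity that never matches an earlier cache key. Hadoop's `FileSystem.closeAllForUGI` is one real call that plays the role of `closeAllFor`; this description does not say which call the patch uses.

```java
import java.io.IOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class FsCacheSketch {
    /** Hypothetical stand-in for a cached, per-user file system handle. */
    static final class FakeFs implements AutoCloseable {
        // Stands in for the Configuration that a cached instance keeps alive.
        final byte[] config = new byte[1024];
        @Override public void close() { }
    }

    /** Hypothetical process-wide cache keyed by caller identity. */
    static final Map<Object, FakeFs> CACHE = new ConcurrentHashMap<>();

    static FakeFs get(Object identity) {
        return CACHE.computeIfAbsent(identity, k -> new FakeFs());
    }

    /** Evicts and closes the cached instance for one identity, if any. */
    static void closeAllFor(Object identity) {
        FakeFs fs = CACHE.remove(identity);
        if (fs != null) fs.close();
    }

    /** Simulates a cleaner run that always fails while removing files. */
    static void removeObsoleteFiles(FakeFs fs) throws IOException {
        throw new IOException("simulated failure removing obsolete files");
    }

    /** Cleanup sits after the risky call, so a failure skips it. */
    public static int leakyRuns(int n) {
        CACHE.clear();
        for (int i = 0; i < n; i++) {
            Object identity = new Object(); // assumption: new identity per run
            try {
                removeObsoleteFiles(get(identity));
                closeAllFor(identity); // never reached when the call above throws
            } catch (IOException e) {
                // failure is swallowed here; the cache entry stays behind
            }
        }
        return CACHE.size();
    }

    /** Cleanup sits in finally, so it runs whether or not removal fails. */
    public static int fixedRuns(int n) {
        CACHE.clear();
        for (int i = 0; i < n; i++) {
            Object identity = new Object(); // assumption: new identity per run
            try {
                removeObsoleteFiles(get(identity));
            } catch (IOException e) {
                // failure is swallowed here; cleanup still happens below
            } finally {
                closeAllFor(identity);
            }
        }
        return CACHE.size();
    }

    public static void main(String[] args) {
        System.out.println("entries left without finally: " + leakyRuns(100));
        System.out.println("entries left with finally: " + fixedRuns(100));
    }
}
```

In this model every failed run leaves one entry behind without the `finally` block, so the cache grows with the number of failures, while the `finally` variant leaves it empty.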

In order for the issue to appear the following requirements must be met:

  • Secure HMS/HDFS communication
  • Many compaction failures when removing files
  • Compactions run via doAs on behalf of many different users

More info under HIVE-26404.

Does this PR introduce any user-facing change?

Fixes the memory leak.

How was this patch tested?

Run the new test with and without the fix.

Without the fix, the test does one of two things:

  • it hangs, and an OutOfMemory error appears in target/surefire-reports/org.apache.hadoop.hive.ql.txn.compactor.TestCleanerWithSecureDFS-output.txt
  • it finishes after a long time, and an OutOfMemory error appears in the same file

To confirm that the memory leak is the same as the one observed in HIVE-26404, I took a heap dump while the test was running and verified that the cache contains many FileSystem objects, each holding Configuration objects.

zabetak, Aug 11 '22 17:08