HIVE-26404: HMS memory leak when compaction cleaner fails to remove obsolete files
What changes were proposed in this pull request and why?
- Add a unit test that triggers the memory leak (not meant to be committed to the repo because it is a bit hacky and slow).
- (Main fix) Ensure FileSystem resources are closed correctly by wrapping the code in a try-finally block.
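The effect of the try-finally change can be illustrated with a minimal, self-contained sketch. It does not use the Hadoop API: `CACHE`, `cleanLeaky`, `cleanSafe`, `removeObsoleteFiles`, and `runFailures` are hypothetical stand-ins, and the map only mimics a process-wide cache that gains one entry per caller identity. In real code the release step would presumably be a call such as Hadoop's `FileSystem.closeAllForUGI(ugi)`; whether the patch uses exactly that call is an assumption here.

```java
import java.util.HashMap;
import java.util.Map;

public class CleanerLeakSketch {
    // Hypothetical stand-in for a process-wide handle cache keyed by caller identity.
    static final Map<Object, Object> CACHE = new HashMap<>();

    // Without try-finally: the release step is skipped whenever removal throws.
    static void cleanLeaky(boolean removalFails) {
        Object identity = new Object();     // a fresh identity for every run
        CACHE.put(identity, new Object());  // acquiring a handle populates the cache
        removeObsoleteFiles(removalFails);
        CACHE.remove(identity);             // never reached on failure
    }

    // With try-finally: the release step runs on both success and failure.
    static void cleanSafe(boolean removalFails) {
        Object identity = new Object();
        try {
            CACHE.put(identity, new Object());
            removeObsoleteFiles(removalFails);
        } finally {
            CACHE.remove(identity);
        }
    }

    // Simulates the file removal step, optionally failing.
    static void removeObsoleteFiles(boolean fail) {
        if (fail) {
            throw new IllegalStateException("simulated removal failure");
        }
    }

    // Runs `runs` failing cleanups and returns how many cache entries remain.
    static int runFailures(boolean safe, int runs) {
        CACHE.clear();
        for (int i = 0; i < runs; i++) {
            try {
                if (safe) {
                    cleanSafe(true);
                } else {
                    cleanLeaky(true);
                }
            } catch (IllegalStateException expected) {
                // a real cleaner would log the failure and retry later
            }
        }
        return CACHE.size();
    }

    public static void main(String[] args) {
        System.out.println("leaky: " + runFailures(false, 1000)); // prints "leaky: 1000"
        System.out.println("safe:  " + runFailures(true, 1000));  // prints "safe:  0"
    }
}
```

In this sketch every failed run of the leaky variant leaves one entry behind, while the try-finally variant leaves none, which is the pattern the fix relies on.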
In order for the issue to appear the following requirements must be met:
- Secure HMS/HDFS communication
- Many compaction failures when removing files
- Compactions running as many different users (doAs)
More info under HIVE-26404.
Does this PR introduce any user-facing change?
Fixes the memory leak.
How was this patch tested?
Ran the new test with and without the fix.
Without the fix, the test either:
- hangs, and there is an OutOfMemory in target/surefire-reports/org.apache.hadoop.hive.ql.txn.compactor.TestCleanerWithSecureDFS-output.txt, or
- finishes after a long time, and there is an OutOfMemory in target/surefire-reports/org.apache.hadoop.hive.ql.txn.compactor.TestCleanerWithSecureDFS-output.txt
To confirm that the memory leak is the same as the one observed in HIVE-26404, I took a heap dump while the test was running and verified that the cache contains many FileSystem objects, each holding Configuration objects.