hive
HIVE-28051: LLAP: cleanup local folders on startup and periodically
What changes were proposed in this pull request?
Implement a LocalDirCleaner which can remove old files from LLAP local dirs.
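For illustration only, one cleanup pass could look roughly like the sketch below. It is not the code from this patch: the class name LocalDirCleanerSketch and the method deleteFilesOlderThan are hypothetical, and it assumes that "old" means "last modified before a cutoff instant" and that directories are left in place.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch; not the LocalDirCleaner class added by this PR.
final class LocalDirCleanerSketch {

  /**
   * Deletes regular files under root whose last-modified time is before cutoff.
   * Directories are not removed (an assumption of this sketch).
   * Returns the paths that were deleted.
   */
  static List<Path> deleteFilesOlderThan(Path root, Instant cutoff) throws IOException {
    // Collect candidates first so nothing is deleted while the walk is still open.
    List<Path> files;
    try (Stream<Path> paths = Files.walk(root)) {
      files = paths.filter(Files::isRegularFile).collect(Collectors.toList());
    }

    List<Path> deleted = new ArrayList<>();
    for (Path file : files) {
      Instant modified = Files.getLastModifiedTime(file).toInstant();
      if (modified.isBefore(cutoff)) {
        Files.delete(file);
        deleted.add(file);
      }
    }
    return deleted;
  }
}
```

A caller would pass an LLAP local dir such as /apps/llap/work as root and "now minus the configured threshold" as cutoff.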
Why are the changes needed?
LLAP cannot clean up old files in every failure scenario, as a customer issue showed. While investigating HIVE-24272 I found that local files for failed queries/DAGs are already cleaned up, so the leftovers clearly come from daemon crashes.
Does this PR introduce any user-facing change?
No.
Is the change a dependency upgrade?
No.
How was this patch tested?
A unit test was added. I also tested on an LLAP daemon by creating a file:
mkdir -p /apps/llap/work/usercache/hive/appcache/application_1707917402901_0001/3/output
touch /apps/llap/work/usercache/hive/appcache/application_1707917402901_0001/3/output/file.out
then set short intervals:
hive.llap.local.dir.cleaner.file.modify.time.threshold=60s
hive.llap.local.dir.cleaner.cleanup.interval=30s
and checked the logs:
query-executor <14>1 2024-02-15T07:37:59.351Z query-executor-0-0 query-executor 1 79184f83-fdb0-4982-9bd9-4e297438f8be [mdc@18060 class="impl.LocalDirCleaner" level="INFO" thread="pool-14-thread-1"] Cleaning up files older than 2024-02-15T07:36:59.351485Z from /apps/llap/work
query-executor <14>1 2024-02-15T07:38:29.351Z query-executor-0-0 query-executor 1 79184f83-fdb0-4982-9bd9-4e297438f8be [mdc@18060 class="impl.LocalDirCleaner" level="INFO" thread="pool-14-thread-1"] Cleaning up files older than 2024-02-15T07:37:29.351488Z from /apps/llap/work
query-executor <14>1 2024-02-15T07:38:59.351Z query-executor-0-0 query-executor 1 79184f83-fdb0-4982-9bd9-4e297438f8be [mdc@18060 class="impl.LocalDirCleaner" level="INFO" thread="pool-14-thread-1"] Cleaning up files older than 2024-02-15T07:37:59.351487Z from /apps/llap/work
query-executor <14>1 2024-02-15T07:38:59.352Z query-executor-0-0 query-executor 1 79184f83-fdb0-4982-9bd9-4e297438f8be [mdc@18060 class="impl.LocalDirCleaner" level="INFO" thread="pool-14-thread-1"] Delete old file: /apps/llap/work/usercache/hive/appcache/application_1707917402901_0001/3/output/file.out
query-executor <14>1 2024-02-15T07:39:29.351Z query-executor-0-0 query-executor 1 79184f83-fdb0-4982-9bd9-4e297438f8be [mdc@18060 class="impl.LocalDirCleaner" level="INFO" thread="pool-14-thread-1"] Cleaning up files older than 2024-02-15T07:38:29.351486Z from /apps/llap/work
query-executor <14>1 2024-02-15T07:39:59.351Z query-executor-0-0 query-executor 1 79184f83-fdb0-4982-9bd9-4e297438f8be [mdc@18060 class="impl.LocalDirCleaner" level="INFO" thread="pool-14-thread-1"] Cleaning up files older than 2024-02-15T07:38:59.351485Z from /apps/llap/work
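The timestamps in these log lines are consistent with a pass every 30s that treats anything modified before "now minus 60s" as old. In that reading, the test file would have been last modified between 07:37:29 and 07:37:59, which would explain why it was deleted on the 07:38:59 pass and not earlier. A minimal sketch of that timing follows; the class and method names are hypothetical, and both the fixed-rate scheduling and the zero initial delay are assumptions, not something the logs confirm.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of the timing only; not the scheduling code from this PR.
final class CleanerScheduleSketch {

  /** Files last modified before the returned instant count as old. */
  static Instant cutoff(Instant now, Duration threshold) {
    return now.minus(threshold);
  }

  /**
   * Runs the given pass immediately and then once per interval.
   * Fixed-rate scheduling and the zero initial delay are assumptions of this sketch.
   */
  static ScheduledExecutorService start(Runnable pass, Duration interval) {
    ScheduledExecutorService pool = Executors.newSingleThreadScheduledExecutor();
    pool.scheduleAtFixedRate(pass, 0, interval.toMillis(), TimeUnit.MILLISECONDS);
    return pool;
  }

  public static void main(String[] args) {
    // Threshold from hive.llap.local.dir.cleaner.file.modify.time.threshold=60s.
    Duration threshold = Duration.ofSeconds(60);
    // Timestamp of the first logged pass.
    Instant pass = Instant.parse("2024-02-15T07:37:59.351Z");
    // Prints 2024-02-15T07:36:59.351Z, matching the logged cutoff to the millisecond.
    System.out.println(cutoff(pass, threshold));
  }
}
```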
I think you missed updating the commit message with Hive jira
LOL, thanks, fixed
The test failure is unrelated; I can rerun it later. This is ready for review.
Quality Gate passed
Issues
120 New issues
Measures
0 Security Hotspots
No data about Coverage
No data about Duplication
@zhangbutao: thanks for your comments. Based on the last comment, I'm treating this as approved; let me know if that's not the case.
Yes, feel free to merge this change. :)