HIVE-27884: LLAP: Reuse FileSystem objects from cache across different tasks in the same LLAP daemon
What changes were proposed in this pull request?
Close the filesystem for a specific ugi once per query + daemon instead of every task. In particular this patch:
- moves UGI creation from task running logic (TaskRunnerCallable) to the factory
- introduces file system closure in the same factory
- removes unnecessary null checks
Why are the changes needed?
Performance. Creating a new filesystem is costly, and closing it too frequently completely disables hadoop's FileSystem CACHE
Does this PR introduce any user-facing change?
No.
Is the change a dependency upgrade?
No.
How was this patch tested?
Tested on LLAP cluster.
- ran multiple tpcds queries
- checked logs (only 1 ugi/fs creation happened for a query, every other tasks fell into reusing the same ugi)
- filesystem is closed after query
Kudos, SonarCloud Quality Gate passed! 
0 Bugs
0 Vulnerabilities
0 Security Hotspots
10 Code Smells
No Coverage information
No Duplication information
The version of Java (11.0.8) you have used to run this analysis is deprecated and we will stop accepting it soon. Please update to at least Java 17.
Read more here
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Feel free to reach out on the [email protected] list if the patch is in need of reviews.
Quality Gate passed
Issues
5 New issues
0 Accepted issues
Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code
Is the credentials fix - the one Kokila pointed out going in before further merge? Otherwise looks good
yes, tried to clarify this in code comments, but let me add links here for more clarity:
* Regarding vertex user: LlapTaskCommunicator has a single "user" field,
* which is passed into the SignableVertexSpec.
https://github.com/apache/hive/blob/e0bd9eac32851b27976a78cd83584cb867b83182/llap-tez/src/java/org/apache/hadoop/hive/llap/tezplugins/LlapTaskCommunicator.java#L125
https://github.com/apache/hive/blob/e0bd9eac32851b27976a78cd83584cb867b83182/llap-tez/src/java/org/apache/hadoop/hive/llap/tezplugins/LlapTaskCommunicator.java#L919
* Regarding credentials: LlapTaskCommunicator creates SubmitWorkRequestProto instances,
* into which dag-level credentials are passed.
https://github.com/apache/hive/blob/e0bd9eac32851b27976a78cd83584cb867b83182/llap-tez/src/java/org/apache/hadoop/hive/llap/tezplugins/LlapTaskCommunicator.java#L918
so assuming that the user is constant and credentials are DAG-level ones, we can make sure that the UserGroupInformation and the contained Credentials are not supposed to change withing a dag
@deniskuzZ: your comments were addressed in https://github.com/apache/hive/pull/4882/commits/c84f536c64cf6b298109473c363434d7acbf816e
LGTM, some minor stuff
thanks! addressed in https://github.com/apache/hive/pull/4882/commits/506c55d07c7e6f761c1295bfcdd6e6ec4d1c1f1e
Quality Gate passed
Issues
9 New issues
0 Accepted issues
Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code
ready to merge, I checked the refactored patch on cluster, everything works as expected, only 1 ugi was created for a query:
query-executor <14>1 2024-09-16T08:28:43.780Z query-executor-0-0 query-executor 1 12b073f6-5537-44d0-8aca-578096c88c8d [mdc@38374 class="llap.LlapUgiManager" dagId="dag_1726475155829_0000_2" fragmentId="1726475155829_0000_2_01_000000_0" level="INFO" queryId="hive_20240916082842_806d9e6f-0a82-43cc-95ba-30b947b7fe0f" thread="IPC Server handler 4 on 25000"] Created ugi hive (auth:SIMPLE) for queryIdentifier 'QueryIdentifier{appIdentifier='application_1726475155829_0000', dagIdentifier=2}', current ugis #0