HIVE-29116 : Create a DDL for setting hive default partition name at the table level
- Expose a DDL to set default partition name at table level
- Restrict set hive.exec.default.partition at runtime
- use table property to check the default partition along with hive.exec.default.partition at cluster level
For more details refer: https://issues.apache.org/jira/browse/HIVE-29116
What changes were proposed in this pull request?
Why are the changes needed?
Does this PR introduce any user-facing change?
How was this patch tested?
@deniskuzZ , @chinnaraolalam could you please take a look at this PR?
Thanks for reviewing this PR @ngsg @Indhumathi27 I will handle the review comments, push the changes soon.
@vikramahuja1001 , i think change is required in org.apache.hadoop.hive.ql.exec.DDLPlanUtils#checkIfDefaultPartition()
https://github.com/apache/hive/blob/73503291cbdd0f95bc33add0769234fed6683840/ql/src/java/org/apache/hadoop/hive/ql/exec/DDLPlanUtils.java#L307
Remove hardcoding from here as well + update comments as well accordingly org.apache.hadoop.hive.common.FileUtils#escapePathName(java.lang.String, java.lang.String) https://github.com/apache/hive/blob/413069ea0faeb893fe78faa215b08a70ece80595/common/src/java/org/apache/hadoop/hive/common/FileUtils.java#L298
https://github.com/apache/hive/blob/90820ac0a241468482073beb269bcaa3928fbc6f/standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/utils/FileUtils.java#L310
Remove hardcoding from following place:
https://github.com/apache/hive/blob/733d4f09011f27bea17a5301537a9c6174f28874/hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FileOutputCommitterContainer.java#L715
https://github.com/apache/hive/blob/ba0217ff17501fb849d8999e808d37579db7b4f1/hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/HCatFileUtil.java#L71
@Aggarwal-Raghav ,very good point, makes sense. Thanks for pointing this out. There are indeed a lot of places where HIVE_DEFAULT_PARTITION is hard coded. Have put a few commits to fix that.
@ngsg , @Aggarwal-Raghav , @Indhumathi27 i have reworked all the comments and pushed all the changes. In the build pipeline all the tests are passing but there seems to be javadoc issue with the thrift gen code, no idea what's causing that. Looking at that atm.
Could you guys please re-review it again?
@chinnaraolalam and @deniskuzZ , could you also please take a look at this PR and maybe review it?
Thanks a lot for the hard work on this PR. I haven't fully reviewed all the changes yet (the diff is quite large), but let me leave a few comments for now:
-
In some cases, handling default partition name isn't necessary. I would suggest introducing new methods with Preconditions for those cases to avoid passing {tableParam, conf} or a default partition name. For example, when calling
Warehouse#escapePathNameinWarehouse#makePartNameUtil:e.getKey()doesn't require a default partition name, ande.getValue()is guaranteed not to be empty or null. -
Extending the above point, I would suggest passing just a default partition name instead of both tableParam and configuration whenever possible. For example, since
FileUtils#escapePathNamesalready takesdefaultPath, it would be better to require callers to always pass a proper default partition name. This way, we propagate only the minimal required information and lower the risk of human error when dealing with a potentially mutable map instance. -
Regarding
InsertHiveLocksCommand, I would suggest considering alternatives. Do we really need to ship the entire content of tableParams overLockComponent? Could we instead just send the default partition name, or ensure that the partition name is not null or empty? (We might even assume there is no empty string or null based on the implementation ofcom.google.common.base.Splitter$MapSplitter#split, though this code is outside our control.) -
I haven't fully thought this through, but since the changes now affect the Thrift interface, I would like to hear others' opinions. What do you think about extending the definition of
TableandPartitionin the Thrift interface instead of using tableParams? In this way, we might be able to simplify default partition determining steps.
Also, regarding the error in Javadoc, it seems the issue comes from the javadoc comment in LockComponentBuilder#setTableParams: the parameter name should be tableParams, not tableName.
@ngsg , thanks for spending time to review this PR and providing your feedback. I will rework as per your above comment and push the changes in a few days. Thanks
Quality Gate passed
Issues
113 New issues
0 Accepted issues
Measures
0 Security Hotspots
0.0% Coverage on New Code
1.5% Duplication on New Code
@ngsg Apologies for the delayed response. Couldn't revert back/rework due to limited connectivity. I have reworked the complete PR wrt most of the points mentioned in your previous comment.
Please take a look at the PR once. Let me know if something else also needs to be handled.
Cc @chinnaraolalam , could you also take a look at this PR
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Feel free to reach out on the [email protected] list if the patch is in need of reviews.