HIVE-28595: Not able to insert data with hive.cbo.enable=false
What changes were proposed in this pull request?
Added code for hadnling INSERT INTO statement with target columns specifed in the case of CBO=false.
Why are the changes needed?
Does this PR introduce any user-facing change?
No
Is the change a dependency upgrade?
No
How was this patch tested?
mvn test -Dtest=TestMiniLlapCliDriver -Dqfile=insert_cbo_false.q
I've analysed how it's working in the case of CBO=true, and tried to replicate the same for CBO=false.
Sample case analysed:
CBO true
TableScanOperator TS[0]
SelectOperator SEL[1] (_col0: array<struct<col1:string,col2:int,col3:decimal(3,2)>>|{null}_c1) size = 1
UDTFOperator UDTF[2] (col1: string|{$hdt$_0}col1,col2: int|{$hdt$_0}col2,col3: decimal(3,2)|{$hdt$_0}col3) size = 3
SelectOperator SEL[3] (_col0: string|{null}_col0,_col1: int|{null}_col1,_col2: decimal(3,2)|{null}_col2) size = 3
FileSinkOperator FS[4]
CBO false
TableScanOperator TS[0]
SelectOperator SEL[1] (_col0: array<struct<col1:string,col2:int,col3:decimal(3,2)>>|{null}_c1) size = 1
UDTFOperator UDTF[2] (col1: string|{null}col1,col2: int|{null}col2,col3: decimal(3,2)|{null}col3) size = 3
FileSinkOperator FS[3]
CBO false (Updated in this PR)
TableScanOperator TS[0]
SelectOperator SEL[1] (_col0: array<struct<col1:string,col2:int,col3:decimal(3,2)>>|{null}_c1) size = 1
UDTFOperator UDTF[2] (col1: string|{null}col1,col2: int|{null}col2,col3: decimal(3,2)|{null}col3) size = 3
SelectOperator SEL[3] (_col0: string|{null}_col0,_col1: int|{null}_col1,_col2: decimal(3,2)|{null}_col2) size = 3
FileSinkOperator FS[4]
From the sonarqube scan, the issues are not new but from existing code. @okumin Could you please review? Thank you!
@zabetak @deniskuzZ @zhangbutao Could you please help to review? Thank you.
force-pushing again only for green build
Quality Gate passed
Issues
8 New issues
0 Accepted issues
Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code
Thanks for creating a PR. Do you still have a use case disabling CBO on the master branch, i.e., 4.1.0 or newer versions?
Thanks for creating a PR. Do you still have a use case disabling CBO on the master branch, i.e., 4.1.0 or newer versions?
No, according to the comment by @zabetak on this Jira which states disabling CBO is soon to be deprecated. But since we’re still accepting CBO=false patches to the master branch, I went ahead and created this PR.
Thanks. Would you mind if we prioritized other PRs? I agree with the goal of HIVE-28595. However, SemanticAnalyzer is complex and critical for data integrity. We might need to review it carefully
Sure, I completely understand and you may deprioritize this. Please review it later when you get time.
@okumin gentle reminder, if you could pick this one up.
@KiranVelumuri Thanks for the reminder. I'm delighted that you turned to me. But you may ask another person if you would like to get this merged.
For transparency, I am already involved with 13 pull requests as an author and reviewer, some spec discussions, and am writing some documents to reduce the roundtrip messages on reviews. So, I need more time to make me available for reviewing non-urgent but non-obvious changes.
@deniskuzZ @abstractdog @zhangbutao @wecharyu @dengzhhu653 Could you please help to review? Thank you.
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Feel free to reach out on the [email protected] list if the patch is in need of reviews.
@deniskuzZ I updated the commit to incorporate your comments. Please review.
Quality Gate passed
Issues
3 New issues
0 Accepted issues
Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code
hi @KiranVelumuri, changes LGTM, however, there seems to be a decline from the compiler team since non-CBO path should be disabled going forward. @kasakrisz, @zabetak, please confirm
Hey @deniskuzZ , I have nothing against merging the patch especially if you already did a review.
As I mentioned under the ticket, the non-CBO path is not recommended and the intention is to deprecate the respective option in the near future. We have limited resources so it would be better if contributors and reviewers focus our efforts on those options that we will continue to support in the long-term.
Personally, I treat anything that affects only the cbo=false path as very low priority.
Hey @deniskuzZ , I have nothing against merging the patch especially if you already did a review.
As I mentioned under the ticket, the non-CBO path is not recommended and the intention is to deprecate the respective option in the near future. We have limited resources so it would be better if contributors and reviewers focus our efforts on those options that we will continue to support in the long-term.
Personally, I treat anything that affects only the cbo=false path as very low priority.
In that case, I'm adding my +1 to ensure that the work and effort already invested aren't thrown away