HIVE-27995: Fix inconsistent behavior of LOAD DATA command for partitioned and non-partitioned tables

Open shivjha30 opened this issue 2 years ago • 5 comments

What changes were proposed in this pull request?

Previously, the code paths for partitioned and non-partitioned tables differed: partitioned tables skipped the constraint checks before the job was submitted for execution. This pull request ensures that both partitioned and non-partitioned tables go through the constraint validations in the applyConstraintsAndGetFiles function.
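As a rough illustration of the unified flow described above, the sketch below routes both table kinds through one shared validation step before any work is planned. It is not Hive's code: `LoadFlowSketch`, `CompileTimeLoadException`, `validateAndListFiles`, and `planLoad` are hypothetical names, and the local filesystem via `java.nio.file` stands in for HDFS.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

public class LoadFlowSketch {

    /** Hypothetical stand-in for an error raised during query compilation. */
    static class CompileTimeLoadException extends Exception {
        CompileTimeLoadException(String message) {
            super(message);
        }
    }

    /**
     * Hypothetical shared validation step: fails if the source path does not
     * exist or contains no regular files, otherwise returns the files found.
     */
    static List<Path> validateAndListFiles(Path source) throws CompileTimeLoadException {
        String notFound = "Invalid path '" + source + "': No files matching path " + source;
        if (!Files.exists(source)) {
            throw new CompileTimeLoadException(notFound);
        }
        List<Path> files = new ArrayList<>();
        if (Files.isRegularFile(source)) {
            files.add(source);
            return files;
        }
        // A directory: collect the regular files directly inside it.
        try (Stream<Path> children = Files.list(source)) {
            children.filter(Files::isRegularFile).forEach(files::add);
        } catch (IOException e) {
            throw new CompileTimeLoadException("Cannot list " + source + ": " + e.getMessage());
        }
        if (files.isEmpty()) {
            throw new CompileTimeLoadException(notFound);
        }
        return files;
    }

    /** Both table kinds call the same validation before a load is planned. */
    static String planLoad(Path source, boolean partitioned) throws CompileTimeLoadException {
        List<Path> files = validateAndListFiles(source);
        String kind = partitioned ? "partitioned" : "non-partitioned";
        return kind + " load of " + files.size() + " file(s)";
    }

    public static void main(String[] args) throws Exception {
        Path existing = Files.createTempFile("load_sketch", ".txt");
        Path missing = existing.resolveSibling("missing_" + System.nanoTime());
        System.out.println(planLoad(existing, true));
        System.out.println(planLoad(existing, false));
        for (boolean partitioned : new boolean[] {true, false}) {
            try {
                planLoad(missing, partitioned);
            } catch (CompileTimeLoadException e) {
                System.out.println("Rejected before execution: " + e.getMessage());
            }
        }
        Files.deleteIfExists(existing);
    }
}
```

In this sketch, the missing path is rejected with the same message whether or not the target is partitioned, which is the consistency the change aims for.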

Why are the changes needed?

For partitioned tables, the file-existence check is not executed on HiveServer2 when running LOAD DATA / LOAD DATA LOCAL commands, so a java.io.FileNotFoundException is thrown at runtime once the job is launched. This PR catches such cases at compile time.
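The difference between failing at runtime and failing at compile time can be sketched in plain Java. This is only an analogy under stated assumptions: `FailFastSketch`, `submitWithoutCheck`, and `submitWithCheck` are hypothetical names, a `Callable` stands in for the launched job, and an `IllegalArgumentException` stands in for the compile-time error that Hive would report.

```java
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.concurrent.Callable;

public class FailFastSketch {

    /** No up-front check: the missing file is only noticed when the job runs. */
    static Callable<Integer> submitWithoutCheck(Path source) {
        return () -> {
            // FileInputStream throws FileNotFoundException for a missing file.
            try (InputStream in = new FileInputStream(source.toFile())) {
                return in.read();
            }
        };
    }

    /** Up-front check: a missing file is rejected before any job is created. */
    static Callable<Integer> submitWithCheck(Path source) {
        if (!Files.exists(source)) {
            throw new IllegalArgumentException(
                "Invalid path '" + source + "': No files matching path " + source);
        }
        return submitWithoutCheck(source);
    }

    public static void main(String[] args) throws Exception {
        Path missing = Paths.get("no_such_file_" + System.nanoTime());

        // Submission succeeds; the failure surfaces later, during execution.
        Callable<Integer> job = submitWithoutCheck(missing);
        try {
            job.call();
        } catch (FileNotFoundException e) {
            System.out.println("Failed while running: " + e.getClass().getName());
        }

        // With the check, the failure surfaces before anything is launched.
        try {
            submitWithCheck(missing);
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected at submission: " + e.getMessage());
        }
    }
}
```

Failing at submission avoids launching a job that cannot succeed and gives the user one predictable error message.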

Does this PR introduce any user-facing change?

No

Is the change a dependency upgrade?

No

How was this patch tested?

The test cases already exist. The error messages returned to the user are now consistent when the file is not found at HiveServer2.

Load Data Error (Non-Partitioned Tables): [screenshot]

File Not Found Exception (Partitioned Tables): [screenshot]

Fixed: For partitioned tables: [screenshot]

shivjha30 avatar Jan 11 '24 13:01 shivjha30

@ayushtkn Thanks for the review. I have added the test case as mentioned above. Please review once more.

shivjha30 avatar Feb 01 '24 05:02 shivjha30

There is a test failure in Jenkins. I ran the test locally and it passes, as can be seen in the screenshot below: [screenshot]

shivjha30 avatar Feb 01 '24 05:02 shivjha30

Quality Gate passed

Issues
1 New issue

Measures
0 Security Hotspots
No data about Coverage
No data about Duplication

See analysis details on SonarCloud

sonarqubecloud[bot] avatar Feb 14 '24 11:02 sonarqubecloud[bot]

Before this fix, this scenario failed at job execution; now it is validated at HiveServer2, which throws a SemanticException to the client. What exception did the client receive before the fix? I think it will be different. Can you add the exception the client received before this issue was fixed?

chinnaraolalam avatar Feb 28 '24 05:02 chinnaraolalam

Quality Gate passed

Issues
1 New issue
0 Accepted issues

Measures
0 Security Hotspots
No data about Coverage
No data about Duplication

See analysis details on SonarCloud

sonarqubecloud[bot] avatar May 03 '24 19:05 sonarqubecloud[bot]

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Feel free to reach out on the [email protected] list if the patch is in need of reviews.

github-actions[bot] avatar Jul 04 '24 00:07 github-actions[bot]

Overall, the changes look good to me, but when I run the newly added test case locally, it fails:

[ERROR] TestNegativeLlapCliDriver.testCliDriver:59 Client Execution succeeded but contained differences (error code = 1) after executing load_data_partition.q
15,21c15
< FAILED: SemanticException Line 2:17 Invalid path ''hdfs://### HDFS PATH ###'': No files matching path hdfs://### HDFS PATH ###
< FAILED: AssertionError java.lang.AssertionError: Client Execution succeeded but contained differences (error code = 1) after executing load_data_partition.q
< 15c15
<
< FAILED: SemanticException Line 2:17 Invalid path ''hdfs://### HDFS PATH ###'': No files matching path hdfs://### HDFS PATH ###

chinnaraolalam avatar Jul 11 '24 05:07 chinnaraolalam

+1 Thanks for the review Ayush Saxena, Denys Kuzmenko

chinnaraolalam avatar Jul 19 '24 05:07 chinnaraolalam