[BugFix] Fix incorrect column mapping for Hive View when underlying table schema changes
Why I'm doing:
When a Hive View is queried (especially one defined as SELECT *) after the underlying Hive table's schema has changed (e.g., new columns added), the column positions recorded in the View definition can differ from those of the underlying table. StarRocks previously mapped columns by index, so filters (such as partition-pruning predicates) were applied to the wrong columns, producing empty results or errors.
What I'm doing:
In QueryAnalyzer, I introduced a name-based mapping mechanism for Hive Views:
- Check whether the relation is a Hive View.
- Verify that every column in the View's base schema exists, by name, in the underlying query output.
- If all columns match, map the fields by column name instead of by index; otherwise keep the existing index-based mapping. This keeps the column mapping correct even when the underlying table structure changes.
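The mapping steps above can be sketched roughly as follows. This is a minimal standalone illustration, not the actual QueryAnalyzer code: the class `HiveViewColumnMapping`, the method `mapViewColumns`, and the example column names are hypothetical, and plain column-name lists stand in for the analyzer's field objects.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration only; not part of StarRocks.
class HiveViewColumnMapping {

    // For each view column, returns the index of the query output column it maps to.
    static List<Integer> mapViewColumns(List<String> viewColumns, List<String> queryOutput) {
        // Case-insensitive lookup from output column name to its position.
        // On duplicate names the first occurrence wins (an assumption of this sketch).
        Map<String, Integer> byName = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        for (int i = 0; i < queryOutput.size(); i++) {
            byName.putIfAbsent(queryOutput.get(i), i);
        }

        // Name-based mapping is used only when every view column is found by name.
        boolean allMatch = viewColumns.stream().allMatch(byName::containsKey);

        List<Integer> mapping = new ArrayList<>();
        for (int i = 0; i < viewColumns.size(); i++) {
            // Fall back to positional (index-based) mapping when any name is missing.
            mapping.add(allMatch ? byName.get(viewColumns.get(i)) : i);
        }
        return mapping;
    }

    public static void main(String[] args) {
        // The view was created when the table had (id, name, dt);
        // column "age" was added to the table afterwards.
        List<String> view = List.of("id", "name", "dt");
        List<String> output = List.of("id", "name", "age", "DT");
        System.out.println(mapViewColumns(view, output)); // prints [0, 1, 3]
    }
}
```

In this example, index-based mapping would bind the view's `dt` column to position 2 (`age`), so a filter on `dt` would hit the wrong column; the name-based lookup binds it to position 3 instead.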
Fixes #66559
What type of PR is this:
- [x] BugFix
- [ ] Feature
- [ ] Enhancement
- [ ] Refactor
- [ ] UT
- [ ] Doc
- [ ] Tool
Does this PR entail a change in behavior?
- [x] Yes, this PR will result in a change in behavior.
- [ ] No, this PR will not result in a change in behavior.
If yes, please specify the type of change:
- [x] Interface/UI changes: syntax, type conversion, expression evaluation, display information
- [ ] Parameter changes: default values, similar parameters but with different default values
- [ ] Policy changes: use new policy to replace old one, functionality automatically enabled
- [ ] Feature removed
- [ ] Miscellaneous: upgrade & downgrade compatibility, etc.
Checklist:
- [ ] I have added test cases for my bug fix or my new feature
- [ ] This pr needs user documentation (for new or modified features or behaviors)
- [ ] I have added documentation for my new feature or new function
- [ ] This is a backport pr
Bugfix cherry-pick branch check:
- [x] I have checked the version labels which the pr will be auto-backported to the target branch
  - [x] 4.0
  - [x] 3.5
  - [x] 3.4
  - [x] 3.3
[!NOTE] Switch to name-based field mapping for Hive views (when all columns match) to align view schema with query output; fallback to index mapping otherwise.
- Analyzer (`fe/fe-core/src/main/java/com/starrocks/sql/analyzer/QueryAnalyzer.java`)
  - For `visitView` on Hive views, build a case-insensitive map of query output fields by name and, when all base-schema columns exist by name, map `Field` using names instead of indices.
  - Preserve original index-based mapping as fallback when name matching isn’t complete.
  - No changes to non-Hive views or other analysis paths.
Written by Cursor Bugbot for commit 03f6013193166bc5f69dbc752cfd1cd2e46cfaec. This will update automatically on new commits.
🧪 CI Insights
Here's what we observed from your CI run for 03f60131.
🟢 All jobs passed!
Quality Gate passed
Issues
- 6 New issues
- 0 Accepted issues
Measures
- 0 Security Hotspots
- 0.0% Coverage on New Code
- 0.0% Duplication on New Code
[Java-Extensions Incremental Coverage Report]
:white_check_mark: pass : 0 / 0 (0%)
[BE Incremental Coverage Report]
:white_check_mark: pass : 0 / 0 (0%)
[FE Incremental Coverage Report]
:white_check_mark: pass : 11 / 11 (100.00%)
file detail
| | path | covered_line | new_line | coverage | not_covered_line_detail |
|---|---|---|---|---|---|
| :large_blue_circle: | com/starrocks/sql/analyzer/QueryAnalyzer.java | 11 | 11 | 100.00% | [] |