[AUTHZ] count(*)/count(1) should check sub nodes' privileges
Spark Optimizer's ColumnPruning will replace count(*)/count(1) Aggregate plan's child to a Project node with empty projection list:
object ColumnPruning extends Rule[LogicalPlan] {
def apply(plan: LogicalPlan): LogicalPlan = removeProjectBeforeFilter(
plan.transformWithPruning(AlwaysProcess.fn, ruleId) {
...
// Prunes the unused columns from child of Aggregate/Expand/Generate/ScriptTransformation
case a @ Aggregate(_, _, child) if !child.outputSet.subsetOf(a.references) =>
a.copy(child = prunedChild(child, a.references))
...
}
but AuthZ plugin's PrivilegesBuilder.buildQuery method will ignore to check child node when plan's inputSet is empty, in this scenario, Aggregate node's child plan's privileges are ignored, which cause count(*)/ count(1) will ignored all privileges that should be checked.
Why are the changes needed?
this patch add a holder node ChildOutputHolder when Aggregate node's references and it's child node's outputSet hava no intersection, it will hold the child node's outputSet used for build privilege objects, and ChildOutputHolder node will be eliminated after RuleAuthorization rule work completed.
How was this patch tested?
updated old unit tests and added new unit tests.
Was this patch authored or co-authored using generative AI tooling?
No.
related issue: #7173, cc @bowenliang123
Codecov Report
:x: Patch coverage is 0% with 17 lines in your changes missing coverage. Please review.
:white_check_mark: Project coverage is 0.00%. Comparing base (8b56295) to head (952548f).
:warning: Report is 1 commits behind head on master.
Additional details and impacted files
@@ Coverage Diff @@
## master #7204 +/- ##
======================================
Coverage 0.00% 0.00%
======================================
Files 695 698 +3
Lines 43505 43495 -10
Branches 5888 5886 -2
======================================
+ Misses 43505 43495 -10
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
:rocket: New features to boost your workflow:
- :snowflake: Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
- :package: JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.
this approach lgtm, I suggest adding some comments.
cc @wForget will this affect lineage?
also cc @zhouyifan279 since you have worked closely on this part
cc @wForget will this affect lineage?
It seems not, ChildOutputHolder extends UnaryNode, so it will pass the child's lineage.