[SPARK-39217][SQL] Make DPP support a pruning side that contains Union

Open wangyum opened this issue 3 years ago • 4 comments

What changes were proposed in this pull request?

Makes DPP (dynamic partition pruning) work when the pruning side contains a Union. For example:

SELECT f.store_id,
       f.date_id,
       s.state_province
FROM (SELECT 4 AS store_id,
               date_id,
               product_id
      FROM   fact_sk
      WHERE  date_id >= 1300
      UNION ALL
      SELECT   store_id,
               date_id,
               product_id
      FROM   fact_stats
      WHERE  date_id <= 1000) f
JOIN dim_store s
ON f.store_id = s.store_id
WHERE s.country IN ('US', 'NL')
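
As a rough illustration of what pruning through the Union means for the query above, the sketch below runs it next to a hand-written variant in which the dim_store filter is also applied inside the fact_stats branch, and checks that both return the same rows. It uses Python's built-in sqlite3 rather than Spark, so it only demonstrates that the rewrite preserves results. It does not show Spark's actual plan or implementation. The table schemas, the sample rows, and the choice to push the filter only into the fact_stats branch are assumptions made for this example.

```python
import sqlite3

# Query from the PR description, unchanged apart from whitespace.
ORIGINAL = """
SELECT f.store_id, f.date_id, s.state_province
FROM (SELECT 4 AS store_id, date_id, product_id
      FROM   fact_sk
      WHERE  date_id >= 1300
      UNION ALL
      SELECT store_id, date_id, product_id
      FROM   fact_stats
      WHERE  date_id <= 1000) f
JOIN dim_store s
ON f.store_id = s.store_id
WHERE s.country IN ('US', 'NL')
"""

# Hand-written equivalent (an assumption for illustration): the filter on
# dim_store is repeated inside the fact_stats branch of the Union, so rows
# for non-matching stores are dropped before the join.
PRUNED = """
SELECT f.store_id, f.date_id, s.state_province
FROM (SELECT 4 AS store_id, date_id, product_id
      FROM   fact_sk
      WHERE  date_id >= 1300
      UNION ALL
      SELECT store_id, date_id, product_id
      FROM   fact_stats
      WHERE  date_id <= 1000
        AND  store_id IN (SELECT store_id
                          FROM   dim_store
                          WHERE  country IN ('US', 'NL'))) f
JOIN dim_store s
ON f.store_id = s.store_id
WHERE s.country IN ('US', 'NL')
"""


def build_db():
    """Create an in-memory database with hypothetical schemas and rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE fact_sk    (store_id INT, date_id INT, product_id INT);
        CREATE TABLE fact_stats (store_id INT, date_id INT, product_id INT);
        CREATE TABLE dim_store  (store_id INT, state_province TEXT, country TEXT);
        INSERT INTO fact_sk    VALUES (1, 1350, 10), (2, 1200, 11);
        INSERT INTO fact_stats VALUES (4, 900, 20), (5, 950, 21), (6, 1100, 22);
        INSERT INTO dim_store  VALUES (4, 'California', 'US'),
                                      (5, 'Bavaria', 'DE'),
                                      (6, 'Utrecht', 'NL');
    """)
    return conn


def run(conn, sql):
    """Return the query result as a sorted list of tuples."""
    return sorted(conn.execute(sql).fetchall())


if __name__ == "__main__":
    conn = build_db()
    print(run(conn, ORIGINAL))
    print(run(conn, PRUNED))
```

In Spark the pruning filter is built at runtime from the dim_store side of the join and targets partition columns. The IN subquery here is only a static stand-in for that filter.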

After this PR: image

Why are the changes needed?

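Improves query performance. Without this change, DPP is not applied when the pruning side is a Union, so the filter derived from dim_store cannot prune the fact table scans beneath it.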

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Unit test.

wangyum avatar May 18 '22 05:05 wangyum

cc @cloud-fan

wangyum avatar May 18 '22 13:05 wangyum

A case from production: image

wangyum avatar May 20 '22 06:05 wangyum

We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable. If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!

github-actions[bot] avatar Aug 29 '22 00:08 github-actions[bot]

Sorry for the late review. The change looks reasonable to me.

cloud-fan avatar Aug 30 '22 01:08 cloud-fan

@cloud-fan Do you have more comments?

wangyum avatar Sep 29 '22 08:09 wangyum

We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable. If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!

github-actions[bot] avatar Jan 08 '23 00:01 github-actions[bot]