[SPARK-39069][SQL] Enhance ConstantPropagation to replace constants in inequality predicates
What changes were proposed in this pull request?
ConstantPropagation currently only supports replacing constants in equality predicates. For example: i = 5 AND j = i + 3 -> i = 5 AND j = 8.
This PR enhances ConstantPropagation to also replace constants in inequality predicates. For example: i = 5 AND j > i + 3 -> i = 5 AND j > 8.
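To illustrate the rewrite described above, here is a minimal, self-contained sketch on a toy expression tree. It is not the Catalyst implementation: the case classes (Col, Lit, Add, Cmp, And) and the object ConstantPropagationSketch are hypothetical stand-ins, values are plain integers, and null semantics are ignored.

```scala
// Toy expression tree. These are hypothetical stand-ins, not Catalyst classes.
sealed trait Expr
case class Col(name: String) extends Expr
case class Lit(value: Int) extends Expr
case class Add(left: Expr, right: Expr) extends Expr
// op is one of "=", ">", ">=", "<", "<="
case class Cmp(op: String, left: Expr, right: Expr) extends Expr
case class And(left: Expr, right: Expr) extends Expr

object ConstantPropagationSketch {
  // Collect `column = literal` facts from the conjuncts of an AND chain.
  def equalities(e: Expr): Map[String, Int] = e match {
    case And(l, r)                => equalities(l) ++ equalities(r)
    case Cmp("=", Col(c), Lit(v)) => Map(c -> v)
    case Cmp("=", Lit(v), Col(c)) => Map(c -> v)
    case _                        => Map.empty
  }

  // Replace known columns with their constants and fold literal additions.
  def substitute(e: Expr, env: Map[String, Int]): Expr = e match {
    case Col(c) => env.get(c).map(v => Lit(v)).getOrElse(e)
    case Add(l, r) =>
      (substitute(l, env), substitute(r, env)) match {
        case (Lit(a), Lit(b)) => Lit(a + b)
        case (a, b)           => Add(a, b)
      }
    case other => other
  }

  // Rewrite every comparison except the defining `column = literal` equalities,
  // so inequality predicates are simplified as well as equality predicates.
  def propagate(e: Expr): Expr = {
    val env = equalities(e)
    def rewrite(x: Expr): Expr = x match {
      case And(l, r)                    => And(rewrite(l), rewrite(r))
      case c @ Cmp("=", Col(_), Lit(_)) => c
      case c @ Cmp("=", Lit(_), Col(_)) => c
      case Cmp(op, l, r)                => Cmp(op, substitute(l, env), substitute(r, env))
      case other                        => other
    }
    rewrite(e)
  }
}
```

With this sketch, propagating over And(Cmp("=", Col("i"), Lit(5)), Cmp(">", Col("j"), Add(Col("i"), Lit(3)))) yields And(Cmp("=", Col("i"), Lit(5)), Cmp(">", Col("j"), Lit(8))), mirroring the i = 5 AND j > i + 3 example.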
Why are the changes needed?
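Simplifying the filter condition improves query performance. In the example below, the predicates on the partition column cmpgn_run_dt become comparisons against literals: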
CREATE TABLE t1 (
id DECIMAL(18,0),
event_dt DATE,
cmpgn_run_dt DATE)
USING parquet
PARTITIONED BY (cmpgn_run_dt);
SELECT * FROM t1 WHERE CMPGN_RUN_DT >= date_sub(EVENT_DT, 2) AND CMPGN_RUN_DT <= EVENT_DT AND EVENT_DT = '2022-04-05';
After this PR:
== Optimized Logical Plan ==
Filter ((((isnotnull(CMPGN_RUN_DT#2) AND isnotnull(EVENT_DT#1)) AND (CMPGN_RUN_DT#2 >= 2022-04-03)) AND (CMPGN_RUN_DT#2 <= 2022-04-05)) AND (EVENT_DT#1 = 2022-04-05))
+- Relation default.t1[id#0,event_dt#1,cmpgn_run_dt#2] parquet
Does this PR introduce any user-facing change?
No.
How was this patch tested?
Unit test.
cc @sigmod @cloud-fan
cc @rkkorlapati-db @jchen5
We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable. If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!