[SPARK-39069][SQL] Enhance ConstantPropagation to replace constants in inequality predicates

Open wangyum opened this issue 2 years ago • 2 comments

What changes were proposed in this pull request?

ConstantPropagation currently only supports replacing constants in equality predicates. For example: i = 5 AND j = i + 3 -> i = 5 AND j = 8. This PR enhances ConstantPropagation to also replace constants in inequality predicates. For example: i = 5 AND j > i + 3 -> i = 5 AND j > 8.
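To make the rewrite concrete, here is a minimal Python sketch of the idea. It is a toy model, not Spark's Scala implementation: the tuple-based predicate representation and the helper names `substitute`, `fold`, and `propagate` are assumptions made for illustration. It also does substitution and folding in one pass for brevity, which is not necessarily how the optimizer structures the work.

```python
# Toy model: a predicate is (op, left, right); an expression is an int literal,
# an attribute name (str), or a nested ("+", left, right) tuple.
# All names here are hypothetical and do not mirror Spark's classes.

def substitute(expr, bindings):
    """Replace attribute names that have a known constant value."""
    if isinstance(expr, str):
        return bindings.get(expr, expr)
    if isinstance(expr, tuple):
        op, left, right = expr
        return (op, substitute(left, bindings), substitute(right, bindings))
    return expr


def fold(expr):
    """Evaluate '+' nodes whose operands are both integer literals."""
    if isinstance(expr, tuple):
        op, left, right = expr
        left, right = fold(left), fold(right)
        if op == "+" and isinstance(left, int) and isinstance(right, int):
            return left + right
        return (op, left, right)
    return expr


def is_binding(pred):
    """True for predicates of the form attribute = literal."""
    op, left, right = pred
    return op == "=" and isinstance(left, str) and isinstance(right, int)


def propagate(conjuncts):
    """Propagate attribute = literal bindings into the other conjuncts."""
    bindings = {left: right for (op, left, right) in conjuncts
                if is_binding((op, left, right))}
    result = []
    for pred in conjuncts:
        if is_binding(pred):
            result.append(pred)  # keep the defining equality unchanged
        else:
            op, left, right = pred
            result.append((op,
                           fold(substitute(left, bindings)),
                           fold(substitute(right, bindings))))
    return result


# i = 5 AND j > i + 3
print(propagate([("=", "i", 5), (">", "j", ("+", "i", 3))]))
```

The sketch treats `=` and `>` identically once the bindings are collected, which is the essence of the proposed enhancement: only equalities define constants, but any comparison in the same conjunction can consume them.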

Why are the changes needed?

Simplifying the filter condition can improve query performance. For example:

CREATE TABLE t1 (
  id DECIMAL(18,0),
  event_dt DATE,
  cmpgn_run_dt DATE)
USING parquet
PARTITIONED BY (cmpgn_run_dt);

SELECT *
FROM t1
WHERE CMPGN_RUN_DT >= date_sub(EVENT_DT, 2)
  AND CMPGN_RUN_DT <= EVENT_DT
  AND EVENT_DT = '2022-04-05';

After this PR, both bounds on CMPGN_RUN_DT are constants:

== Optimized Logical Plan ==
Filter ((((isnotnull(CMPGN_RUN_DT#2) AND isnotnull(EVENT_DT#1)) AND (CMPGN_RUN_DT#2 >= 2022-04-03)) AND (CMPGN_RUN_DT#2 <= 2022-04-05)) AND (EVENT_DT#1 = 2022-04-05))
+- Relation default.t1[id#0,event_dt#1,cmpgn_run_dt#2] parquet
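The constant bounds in the plan above follow from substituting EVENT_DT = 2022-04-05 into the two inequalities. The arithmetic can be checked with a short Python sketch; it only reproduces the date computation with the standard library and does not invoke Spark, and the variable names are chosen here for illustration.

```python
from datetime import date, timedelta

# EVENT_DT is pinned by the equality predicate in the query.
event_dt = date(2022, 4, 5)

# CMPGN_RUN_DT >= date_sub(EVENT_DT, 2) becomes a constant lower bound.
lower_bound = event_dt - timedelta(days=2)

# CMPGN_RUN_DT <= EVENT_DT becomes a constant upper bound.
upper_bound = event_dt

print(lower_bound.isoformat(), upper_bound.isoformat())
```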

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Unit test.

wangyum, Jun 12 '22

cc @sigmod @cloud-fan

wangyum, Jun 15 '22

cc @rkkorlapati-db @jchen5

sigmod, Sep 06 '22

We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable. If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!

github-actions[bot], Jan 09 '23