[SPARK-46741][SQL] Cache Table with CTE won't work
What changes were proposed in this pull request?
Reopens https://github.com/apache/spark/pull/44767. Cache Table with a CTE does not work, for two reasons:
- In the current code, the CTE in CacheTableAsSelect is inlined.
- CTERelationRef and CTERelationDef do not handle the CTE ID in doCanonicalize, so the cached plan cannot be matched in this case.
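To illustrate the second reason, here is a minimal toy model in Python. It is not Spark code: the classes CTEDef and CTERef, the helper analyze, and the renumbering scheme in canonicalize are all hypothetical stand-ins, and the actual patch may normalize IDs differently. The sketch only shows why two plans built from the same query fail to match while each carries its own freshly generated CTE ID, and why they match once canonicalization rewrites those IDs to stable values.

```python
from dataclasses import dataclass
from itertools import count

# Hypothetical stand-in for a global ID generator: every analysis gets a new CTE ID.
_fresh_ids = count()


@dataclass(frozen=True)
class CTEDef:
    """Toy CTE definition node (hypothetical, not Spark's CTERelationDef)."""
    cte_id: int
    body: str


@dataclass(frozen=True)
class CTERef:
    """Toy CTE reference node (hypothetical, not Spark's CTERelationRef)."""
    cte_id: int


def analyze(body):
    """Build a toy plan: one CTE definition and one reference sharing a fresh ID."""
    cte_id = next(_fresh_ids)
    return (CTEDef(cte_id, body), CTERef(cte_id))


def canonicalize(plan):
    """Renumber CTE IDs by order of first appearance so equivalent plans compare equal."""
    mapping = {}
    out = []
    for node in plan:
        new_id = mapping.setdefault(node.cte_id, len(mapping))
        if isinstance(node, CTEDef):
            out.append(CTEDef(new_id, node.body))
        else:
            out.append(CTERef(new_id))
    return tuple(out)


# The same query analyzed twice: once when caching, once when querying.
cached_plan = analyze("SELECT 1 AS c")
query_plan = analyze("SELECT 1 AS c")

raw_match = cached_plan == query_plan  # IDs differ, so no match
canonical_match = canonicalize(cached_plan) == canonicalize(query_plan)
print(raw_match, canonical_match)  # prints "False True"
```

In this model the cache lookup corresponds to the comparison of canonicalized plans; if canonicalization leaves the IDs untouched, the lookup behaves like raw_match and the cached data is never reused.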
Why are the changes needed?
Bug fix.
Does this PR introduce any user-facing change?
Yes. Cache Table with a CTE works after this PR.
For the final query added to cache.sql:
EXPLAIN EXTENDED SELECT * FROM cache_nested_cte_table;
Before this PR, the plan is as below and the cache is not used.
After this PR:
How was this patch tested?
Added unit tests.
Was this patch authored or co-authored using generative AI tooling?
No
@cloud-fan Could you take another look? You approved https://github.com/apache/spark/pull/44767 but it was never merged. This bug really impacts the performance of production jobs.
ping @cloud-fan
thanks, merging to master/4.1!