[SPARK-53733][SQL] Delay `resolveColsLastResort` until all previous `UnresolvedAlias`es are resolved
What changes were proposed in this pull request?
Delay `resolveColsLastResort` until all `UnresolvedAlias`es and top-level `UnresolvedAttribute`s that precede the column being resolved are themselves resolved.
Why are the changes needed?
For the following query:
DECLARE a = 'aa';
SELECT 'a', a;
Spark incorrectly resolves the second column, `a`, to the variable `a` instead of resolving it as a lateral column alias (LCA) reference to the implicit alias of the literal 'a'. This is inconsistent with the intended name resolution precedence in Spark, which the following queries illustrate:
DECLARE a = 'aa';
SELECT 'a' AS a, a; -- second column resolved as LCA
SELECT 'b', b; -- second column resolved to the implicit alias of literal 'b'
Similarly, the fix applies to the precedence of LCAs over outer references, as in this query:
SELECT col1
FROM VALUES(1)
WHERE EXISTS (SELECT 'col1', col1);
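The precedence described above can be pictured with a small toy model. This is only a sketch in Python, not Spark's analyzer code: the function `resolve_select_list`, its tuple format, and its labels are hypothetical, names are matched exactly, and the ordering (child columns, then LCAs, then outer references, then variables) is an assumption based on the behavior described in this PR. The point it shows is that the last-resort lookups run only after the aliases of all earlier select items, including implicit ones, are known.

```python
def resolve_select_list(items, child_columns=frozenset(),
                        outer_columns=frozenset(), variables=frozenset()):
    """Toy model of name resolution for one SELECT list (hypothetical helper).

    items: list of (kind, text, alias) tuples, where kind is "literal" or
    "name" and alias is an explicit alias or None.
    Returns one label per item saying how it was resolved.
    """
    aliases = set()  # aliases of earlier select items, explicit or implicit
    resolved = []
    for kind, text, alias in items:
        if kind == "literal":
            # Assumption: an unaliased literal gets an implicit alias
            # equal to its text, e.g. 'a' is implicitly aliased as a.
            aliases.add(alias or text)
            resolved.append("literal")
            continue

        if text in child_columns:
            source = "child column"
        elif text in aliases:
            source = "lateral column alias"
        elif text in outer_columns:
            source = "outer reference"   # last resort, step 1
        elif text in variables:
            source = "variable"          # last resort, step 2
        else:
            source = "unresolved"

        if alias is not None:
            aliases.add(alias)
        resolved.append(source)
    return resolved


# DECLARE a = 'aa'; SELECT 'a', a;
print(resolve_select_list([("literal", "a", None), ("name", "a", None)],
                          variables={"a"}))

# ... WHERE EXISTS (SELECT 'col1', col1);
print(resolve_select_list([("literal", "col1", None), ("name", "col1", None)],
                          outer_columns={"col1"}))
```

In this model both calls label the second item as a lateral column alias, which matches the intended behavior for the two queries above. A variable or outer reference is chosen only when no earlier alias has the same name.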
Does this PR introduce any user-facing change?
Yes. For the affected queries, the column is now resolved as a lateral column alias reference, so users see the correct result.
How was this patch tested?
Added golden file tests for affected queries.
Was this patch authored or co-authored using generative AI tooling?
No
Another example of a discrepancy between single-pass and fixed-point:
select session_var, (case when date_format(session_var,'unknown') in ('XXXXX','XXXXX','XXXXX','XXXXX') then 1 else 1 end) lca
Yeah, it's the same issue, except that here we get a top-level `UnresolvedAttribute` from the parser, which later becomes an implicit alias. I changed the code slightly to accommodate this.