Support ALL operator
Hello!
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
SELECT 1 = ALL(ARRAY[1,1,1,1]);
Describe the solution you'd like A clear and concise description of what you want to happen.
Describe alternatives you've considered A clear and concise description of any alternative solutions or features you've considered.
Additional context Add any other context or screenshots about the feature request here.
Thanks
Hi @alamb,
Is this feature already supported? ALL seems not to work right now:
SELECT 1 = ALL(ARRAY[1,1,1,1]);
This feature is not implemented: Unsupported ast node in sqltorel: AllOp
May I take this issue?
Feel free @NiwakaDev -- it might be worth checking / ensuring none of the existing fucntions do this: https://datafusion.apache.org/user-guide/sql/scalar_functions.html#array-functions
If you want to add support, I think the right hting to do is to add a function array_all and then add a rule in the sql planner to plan the ALL... syntax to be a function call to array_all
This is similar to how the ARRAY(..) syntax is handled (it is rewritten to a call to the make_array function)
take
Hey @NiwakaDev just checking in if you're still working on this; otherwise I'll remove the assignment to free it up
Hey @Jefffrey, I was actually working on this just now. If itβs not completed by the end of October, feel free to unassign me.
Some pointers:
Planner code here, need to wire in support for AllOp node:
https://github.com/apache/datafusion/blob/2a08013af3ccf703bee202c959b40bb0d35bdea1/datafusion/sql/src/expr/mod.rs#L632-L635
See how AnyOp is handled:
https://github.com/apache/datafusion/blob/2a08013af3ccf703bee202c959b40bb0d35bdea1/datafusion/sql/src/expr/mod.rs#L590-L622
See implementation for ArrayHasAll which can be used (at least for =) case here:
https://github.com/apache/datafusion/blob/2a08013af3ccf703bee202c959b40bb0d35bdea1/datafusion/functions-nested/src/array_has.rs
- Ideally add support for operators at a time; that is start with just
=then add support for<,<=, etc. in subsequent PRs
Add SLT tests here:
https://github.com/apache/datafusion/blob/2a08013af3ccf703bee202c959b40bb0d35bdea1/datafusion/sqllogictest/test_files/array.slt
take
Hey @Jefffrey ππ»
Thanks for listing down steps for this implementation! I was trying to implement and understand the same, I added AllOp in datafusion/datafusion/sql/src/expr/mod.rs and got to constructing a plan_all equivalent to plan_any. After this, you mention above, one would have to use ArrayHasAll. Documentation of array_has_all says:
returns true if each element of the second array appears in the first array; otherwise, it returns false.
This seems to be expecting two arrays, but let's say for query like given in parent message comment:
SELECT 1 = ALL(ARRAY[1,1,1,1]);
We don't have two arrays, instead one scalar and one array. We can't use ArrayHas and ArrayHasAny as they don't seem to satisfy the exact condition of all elements matching, or they expect both parameters as arrays.
I see two ways out of this:
- I expand a scalar like
1in the case above to an array of the same size as insideALL - I implement a different function for use in our case, something like
ArrayHasAllElementsEqlTofor scalar use- And for cases where we have an array on the left, we use
ArrayHasAlland make sure both arrays have the same length
- And for cases where we have an array on the left, we use
My understanding is derived from trying a few queries on these playgrounds:
- https://aiven.io/tools/pg-playground, and
- https://www.w3schools.com/sql/sql_any_all.asp
Do correct me if I am understanding this wrong π
Thanks for the detailed investigation @feniljain
You're right in that it seems ArrayHasAll doesn't seem to be usable out of the box as I initially thought. It seems ArrayHasAll is geared towards checking that array B is a subset of array A; so even if array B is a single element then as long as array A has one element of that array B then it will return true, which isn't what we want π€
We might need to create a new function entirely as you suggested. I might need some more time to think on this as I've only taken a quick look so far.
Actually thinking about it more, perhaps this is related to:
- https://github.com/apache/datafusion/issues/2548
In that these SQLs are equivalent:
SELECT 1 = ALL(ARRAY[1,1,1,1]);
SELECT NOT (1 <> ANY(ARRAY[1,2,3,4]))
So maybe we'd extend array_has to support !=/<> and then simplify ALL to be the equivalent of using <> ANY but placing a NOT on top of the final result? π€
I tried out some queries to make sense of it, like:
salary > ALL (SELECT salary FROM employees WHERE department = 'HR')
is equivalent to:
NOT (
salary <= ANY (SELECT salary FROM employees WHERE department = 'HR')
)
And yes it makes sense, we can definitely replace ALL with ANY + NOT + inverse operator. I see other issue is already being worked by @darshanime, maybe I will unassign myself here? If anything else is needed, would be glad to jump back in :)