datafusion icon indicating copy to clipboard operation
datafusion copied to clipboard

Support ALL operator

Open ovr opened this issue 3 years ago β€’ 6 comments

Hello!

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

SELECT 1 = ALL(ARRAY[1,1,1,1]);

Describe the solution you'd like A clear and concise description of what you want to happen.

Describe alternatives you've considered A clear and concise description of any alternative solutions or features you've considered.

Additional context Add any other context or screenshots about the feature request here.

Thanks

ovr avatar May 16 '22 16:05 ovr

Hi @alamb,

Is this feature already supported? ALL seems not to work right now:

SELECT 1 = ALL(ARRAY[1,1,1,1]);
This feature is not implemented: Unsupported ast node in sqltorel: AllOp 

May I take this issue?

NiwakaDev avatar Jan 09 '25 13:01 NiwakaDev

Feel free @NiwakaDev -- it might be worth checking / ensuring none of the existing fucntions do this: https://datafusion.apache.org/user-guide/sql/scalar_functions.html#array-functions

If you want to add support, I think the right hting to do is to add a function array_all and then add a rule in the sql planner to plan the ALL... syntax to be a function call to array_all

This is similar to how the ARRAY(..) syntax is handled (it is rewritten to a call to the make_array function)

alamb avatar Jan 13 '25 20:01 alamb

take

NiwakaDev avatar Jan 19 '25 02:01 NiwakaDev

Hey @NiwakaDev just checking in if you're still working on this; otherwise I'll remove the assignment to free it up

Jefffrey avatar Sep 20 '25 07:09 Jefffrey

Hey @Jefffrey, I was actually working on this just now. If it’s not completed by the end of October, feel free to unassign me.

NiwakaDev avatar Sep 21 '25 02:09 NiwakaDev

Some pointers:

Planner code here, need to wire in support for AllOp node:

https://github.com/apache/datafusion/blob/2a08013af3ccf703bee202c959b40bb0d35bdea1/datafusion/sql/src/expr/mod.rs#L632-L635

See how AnyOp is handled:

https://github.com/apache/datafusion/blob/2a08013af3ccf703bee202c959b40bb0d35bdea1/datafusion/sql/src/expr/mod.rs#L590-L622

See implementation for ArrayHasAll which can be used (at least for =) case here:

https://github.com/apache/datafusion/blob/2a08013af3ccf703bee202c959b40bb0d35bdea1/datafusion/functions-nested/src/array_has.rs

  • Ideally add support for operators at a time; that is start with just = then add support for <, <=, etc. in subsequent PRs

Add SLT tests here:

https://github.com/apache/datafusion/blob/2a08013af3ccf703bee202c959b40bb0d35bdea1/datafusion/sqllogictest/test_files/array.slt

Jefffrey avatar Dec 10 '25 13:12 Jefffrey

take

feniljain avatar Dec 12 '25 17:12 feniljain

Hey @Jefffrey πŸ‘‹πŸ»

Thanks for listing down steps for this implementation! I was trying to implement and understand the same, I added AllOp in datafusion/datafusion/sql/src/expr/mod.rs and got to constructing a plan_all equivalent to plan_any. After this, you mention above, one would have to use ArrayHasAll. Documentation of array_has_all says:

returns true if each element of the second array appears in the first array; otherwise, it returns false.

This seems to be expecting two arrays, but let's say for query like given in parent message comment:

SELECT 1 = ALL(ARRAY[1,1,1,1]);

We don't have two arrays, instead one scalar and one array. We can't use ArrayHas and ArrayHasAny as they don't seem to satisfy the exact condition of all elements matching, or they expect both parameters as arrays.

I see two ways out of this:

  • I expand a scalar like 1 in the case above to an array of the same size as inside ALL
  • I implement a different function for use in our case, something like ArrayHasAllElementsEqlTo for scalar use
    • And for cases where we have an array on the left, we use ArrayHasAll and make sure both arrays have the same length

My understanding is derived from trying a few queries on these playgrounds:

  • https://aiven.io/tools/pg-playground, and
  • https://www.w3schools.com/sql/sql_any_all.asp

Do correct me if I am understanding this wrong πŸ˜…

feniljain avatar Dec 12 '25 19:12 feniljain

Thanks for the detailed investigation @feniljain

You're right in that it seems ArrayHasAll doesn't seem to be usable out of the box as I initially thought. It seems ArrayHasAll is geared towards checking that array B is a subset of array A; so even if array B is a single element then as long as array A has one element of that array B then it will return true, which isn't what we want πŸ€”

We might need to create a new function entirely as you suggested. I might need some more time to think on this as I've only taken a quick look so far.

Jefffrey avatar Dec 13 '25 16:12 Jefffrey

Actually thinking about it more, perhaps this is related to:

  • https://github.com/apache/datafusion/issues/2548

In that these SQLs are equivalent:

SELECT 1 = ALL(ARRAY[1,1,1,1]);
SELECT NOT (1 <> ANY(ARRAY[1,2,3,4]))

So maybe we'd extend array_has to support !=/<> and then simplify ALL to be the equivalent of using <> ANY but placing a NOT on top of the final result? πŸ€”

Jefffrey avatar Dec 14 '25 04:12 Jefffrey

I tried out some queries to make sense of it, like:

salary > ALL (SELECT salary FROM employees WHERE department = 'HR')

is equivalent to:

NOT (
    salary <= ANY (SELECT salary FROM employees WHERE department = 'HR')
)

And yes it makes sense, we can definitely replace ALL with ANY + NOT + inverse operator. I see other issue is already being worked by @darshanime, maybe I will unassign myself here? If anything else is needed, would be glad to jump back in :)

feniljain avatar Dec 15 '25 17:12 feniljain