[FLINK-31663][table] Add-ARRAY_EXCEPT-function.

Open hanyuzheng7 opened this issue 1 year ago • 2 comments

What is the purpose of the change

This PR implements the ARRAY_EXCEPT function. ARRAY_EXCEPT(array1, array2) returns an array of the elements that are in array1 but not in array2, without duplicates.

Brief change log

ARRAY_EXCEPT for Table API and SQL

Syntax:

ARRAY_EXCEPT(array1, array2)

Arguments:

array1: The ARRAY whose elements are returned.
array2: The ARRAY whose elements are excluded from the result.

Returns:

Returns an array of the elements in array1 but not in array2, without duplicates.

Examples:

Flink SQL> SELECT array_except(array[1,2,2], array[2,3,4]);
[1]

Flink SQL> SELECT array_except(array[1,2,2], array[1]);
[2]

Flink SQL> SELECT array_except(array[1,2,2], array[42]);
[1, 2]

Flink SQL> SELECT array_except(array[1,2,2], cast(null as array<int>));
[1, 2]

Flink SQL> SELECT array_except(array[1,2,2], array[null,2]);
[1]

Flink SQL> SELECT array_except(cast(null as array<int>), array[1,2,3]);
<NULL>

Flink SQL> SELECT array_except(array[null,null,1], array[42]);
[NULL, 1]

Flink SQL> SELECT array_except(array[null,null,1], array[null, 42]);
[1]

Flink SQL> SELECT array_except(array[(TRUE, DATE '2022-04-20'), (TRUE, DATE '1990-10-14'), null], array[(TRUE, DATE '1990-10-14')]);
[(TRUE, 2022-04-20), NULL]

Flink SQL> SELECT array_except(array[(TRUE, DATE '2022-04-20'), (TRUE, DATE '1990-10-14'), null], cast(null as array<row<col1 boolean, col2 date>>));
[(TRUE, 2022-04-20), (TRUE, 1990-10-14), NULL]

Flink SQL> SELECT array_except(array[array[1,null,3], array[0], array[1]], array[array[0]]);
[[1, NULL, 3], [1]]

Flink SQL> SELECT array_except(array[map[1, 'a', 2, 'b'], map[3, 'c', 4, 'd']], array[map[3, 'c', 4, 'd']]);
[{1=a, 2=b}]

// With the CommonArrayInputStrategy, arrays with incompatible element types fail validation with the error below. Without the CommonArrayInputStrategy, the same query returns [1].
Flink SQL> SELECT array_except(array[1], array['this is a string']);
[ERROR] Could not execute SQL statement. Reason:
org.apache.flink.table.api.ValidationException: Could not find a common type for arguments: [ARRAY<INT NOT NULL> NOT NULL, ARRAY<CHAR(16) NOT NULL> NOT NULL]
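
The behavior shown in the examples above can be sketched in plain Java. This is only an illustration under stated assumptions, not the implementation in this PR: the class and method names (ArrayExceptSketch, arrayExcept) are hypothetical, java.util.List stands in for Flink's array type, and type validation is not modeled. It mirrors three things seen in the examples: a NULL array1 yields NULL, a NULL array2 excludes nothing, and NULL elements are compared like ordinary values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the ARRAY_EXCEPT semantics; not Flink code.
public class ArrayExceptSketch {

    public static <T> List<T> arrayExcept(List<T> array1, List<T> array2) {
        // A NULL first argument produces a NULL result.
        if (array1 == null) {
            return null;
        }
        // A NULL second argument is treated as "exclude nothing".
        Set<T> excluded = (array2 == null) ? new HashSet<>() : new HashSet<>(array2);

        // LinkedHashSet drops duplicates and keeps first-occurrence order.
        // Both set types accept null, so NULL elements behave like any other value.
        Set<T> kept = new LinkedHashSet<>();
        for (T element : array1) {
            if (!excluded.contains(element)) {
                kept.add(element);
            }
        }
        return new ArrayList<>(kept);
    }

    public static void main(String[] args) {
        // Mirrors: array_except(array[1,2,2], array[2,3,4])
        System.out.println(arrayExcept(Arrays.asList(1, 2, 2), Arrays.asList(2, 3, 4)));
        // Mirrors: array_except(array[1,2,2], cast(null as array<int>))
        System.out.println(arrayExcept(Arrays.asList(1, 2, 2), null));
        // Mirrors: array_except(array[null,null,1], array[null, 42])
        System.out.println(arrayExcept(Arrays.asList(null, null, 1), Arrays.asList(null, 42)));
    }
}
```

Because List.equals and List.hashCode compare element by element, the same sketch also covers the nested-array example when the elements are themselves lists.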

Verifying this change

CollectionFunctionsITCase

See also:

Spark: https://spark.apache.org/docs/latest/api/sql/index.html#array_except

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): (yes / no)
  • The public API, i.e., is any changed class annotated with @Public(Evolving): (yes / no)
  • The serializers: (yes / no / don't know)
  • The runtime per-record code paths (performance sensitive): (yes / no / don't know)
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / no / don't know)
  • The S3 file system connector: (yes / no / don't know)

Documentation

  • Does this pull request introduce a new feature? (yes / no)
  • If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)

hanyuzheng7, Aug 08 '23 21:08

CI report:

  • fadf1bb39b0a63e609937a3b63250b078b89b8c3 Azure: SUCCESS

Bot commands

The @flinkbot bot supports the following commands:

  • @flinkbot run azure: re-run the last Azure build

flinkbot, Aug 08 '23 21:08

@flinkbot run azure

hanyuzheng7, Feb 13 '24 00:02