datafusion icon indicating copy to clipboard operation
datafusion copied to clipboard

Signature::Equal for `NVL` and `NullIf`

Open jayzhan211 opened this issue 1 year ago • 3 comments

Is your feature request related to a problem or challenge?

NVL and Nullif has signature uniform, which iterates valid types and creates many possible types for them

TypeSignature::Uniform(number, valid_types) => valid_types
            .iter()
            .map(|valid_type| (0..*number).map(|_| valid_type.clone()).collect())
            .collect(),

I think they can have better signatures that take repeated first type as the valid types.

        TypeSignature::Equal(num) => {
            vec![vec![current_types[0]; num]]
        }

I think there are two benefits

  1. only one valid type vector and it is good enough
  2. no need to maintain supported types

Describe the solution you'd like

No response

Describe alternatives you've considered

No response

Additional context

No response

jayzhan211 avatar Apr 28 '24 12:04 jayzhan211

I think this would not need a lot of changes and I want to take it. :)

yyy1000 avatar Apr 28 '24 17:04 yyy1000

After a initial try, there is an issue shown below:

statement ok CREATE TABLE test( int_field INT, bool_field BOOLEAN, text_field TEXT, more_ints INT ) as VALUES (1, true, 'abc', 2), (2, false, 'def', 2), (3, NULL, 'ghij', 3), (NULL, NULL, NULL, 4), (4, false, 'zxc', 5), (NULL, true, NULL, 6) ;

query I rowsort SELECT NULLIF(int_field, 2) FROM test;

External error: query failed: DataFusion error: Error during planning: No function matches the given name and argument types 'nvl(Int32, Int64)'. You might need to add explicit type casts.

I think it's because the first type is INT(32), however datafusion will treat the const 2 as int64.🤔 I will also see whether there is a solution.

Edit: I think adding CAST would resolve this, would it be a good way? SELECT NULLIF(int_field, CAST(2 as INT)) FROM test;

yyy1000 avatar Apr 28 '24 18:04 yyy1000

I checked the behavior in Postgres and found it doing casting implicitly, maybe the idea of Signature::Equal is not suitable for nullif. 😢

create table t1 (a int);

postgres=# select pg_typeof(nullif(a, 3.1)) from t1;
 pg_typeof 
-----------
 numeric
 numeric
 numeric
 numeric
 numeric
(5 rows)

postgres=# select pg_typeof(nullif(a, '2')) from t1;
 pg_typeof 
-----------
 integer
 integer
 integer
 integer
 integer
(5 rows)

postgres=# select pg_typeof(nullif(a, 2)) from t1;
 pg_typeof 
-----------
 integer
 integer
 integer
 integer
 integer
(5 rows)

jayzhan211 avatar Apr 29 '24 00:04 jayzhan211

Got it, it seems that datafusion can't do casting implicitly now, so the idea may be not available now. 👀

yyy1000 avatar Apr 29 '24 05:04 yyy1000

I think the coercion logic here is similar to coalesce and array. https://github.com/apache/datafusion/issues/10261#issuecomment-2081465710

jayzhan211 avatar Apr 29 '24 06:04 jayzhan211