materialize icon indicating copy to clipboard operation
materialize copied to clipboard

Add more cases to `introduces_nulls`

Open ggevay opened this issue 3 years ago • 2 comments

BinaryFunc::introduces_nulls and VariadicFunc::introduces_nulls is presently conservative, i.e. may indicate that a function introduces nulls even when it does not. We should go through all binary and variadic functions and decide for each of them whether it can introduce nulls.

ggevay avatar Jul 28 '22 10:07 ggevay

Hi, I can take a look at this task.

OscarTHZhang avatar Aug 08 '22 15:08 OscarTHZhang

That would be great, thx!

ggevay avatar Aug 08 '22 15:08 ggevay

Hey @ggevay, could you give me an example of a function that may introduces nulls or a function that does not? Thanks

OscarTHZhang avatar Aug 22 '22 02:08 OscarTHZhang

Oh actually I saw that

        match self {
            Eq | NotEq | Lt | Lte | Gt | Gte | ArrayContains => {
                ScalarType::Bool.nullable(in_nullable)
            }

These are in-nullable functions so I should probably remove them from the introduces_nulls function?

OscarTHZhang avatar Aug 22 '22 02:08 OscarTHZhang

could you give me an example of a function that may introduces nulls or a function that does not?

The current implementations of BinaryFunc::introduces_nulls and VariadicFunc::introduces_nulls list some functions that cannot introduce nulls. For example, Lt (i.e., "less than") cannot introduce nulls: if we have an expression a < b, then it cannot happen that neither a nor b are nulls but somehow a < b ends up being null.

An example in the other direction would be ListIndex, which can introduce nulls: It can happen that lst[i] gives null even if neither lst nor i are nulls. Namely, if the index is invalid, then the result will be null, e.g.:

materialize=> SELECT LIST[1,2,3][-5] IS NULL;
 ?column? 
----------
 t
(1 row)

Oh actually I saw that

   match self {
       Eq | NotEq | Lt | Lte | Gt | Gte | ArrayContains => {
           ScalarType::Bool.nullable(in_nullable)
       }

These are in-nullable functions so I should probably remove them from the introduces_nulls function?

No, they shouldn't necessarily be removed. Nullable does not mean that it introduces nulls. Nullable just means that the result can be null. For example Lt cannot introduce null (see above), but it is nullable: for a < b, if a or b are null then the result will be null.

ggevay avatar Aug 22 '22 19:08 ggevay

Maybe a simpler way to explain "introduces null" is to say that "it might make a null result from non-null inputs".

ggevay avatar Aug 22 '22 19:08 ggevay

Addressing this in #17937 as a way to warm up to the codebase.

A quick question: MapGetValues will never introduce a null at the top level, but it may introduce subsidiary nulls deeper in the array returned (implementation in expr/src/scalar/func.rs). I listed this as introducing nulls, but wasn't confident that was the right choice...

mgree avatar Mar 03 '23 21:03 mgree

(I answered on the PR.)

ggevay avatar Mar 04 '23 13:03 ggevay

Mostly done by https://github.com/MaterializeInc/materialize/pull/17937

A remaining todo is tracked in https://github.com/MaterializeInc/materialize/issues/18219

ggevay avatar Mar 17 '23 16:03 ggevay