crate icon indicating copy to clipboard operation
crate copied to clipboard

Make error handling for missing keys in objects consistent (lazy vs. eager)

Open mfussenegger opened this issue 1 year ago • 4 comments

Problem Statement

Some statements always raise an error if an object key is missing:

select obj['x'] from tbl

Others raise an error at runtime, but only if the table isn't empty:

cr> select o['x'] from (select {y=1} o from generate_series(0, 1) t) t1;
ColumnUnknownException[The object `{y=1}` does not contain the key `x`]
cr> select o['x'] from (select {y=1} o from generate_series(0, -1) t) t1;
+--------+
| o['x'] |
+--------+
+--------+

Or:

select obj['x'] from (select obj from tbl) as t

There are probably more such cases.

See also: https://github.com/crate/crate/pull/14593#pullrequestreview-1666602000

Possible Solutions

Make the behavior consistent so that it always raises an error, also on empty tables.

Considered Alternatives

  • Keep it inconsistent: Might lead to users missing errors during dev

mfussenegger avatar Oct 10 '23 07:10 mfussenegger

It seems like while on can define explicit casts for objects including sub-columns this information is dropped:

Object Literal

SELECT pg_typeof({x=1}['x']);
-- undefined
SELECT pg_typeof(({x=1}::OBJECT AS (x LONG))['x']);
-- undefined

Function returning an Object with defined sub-columns

CREATE FUNCTION my_func()
    RETURNS OBJECT AS (x LONG)
    LANGUAGE JAVASCRIPT
    AS $$
    function my_func(){
      return {x:1};
    };$$;

SELECT my_func();
+-----------------+
| {"x"=1::bigint} |
+-----------------+
| {"x": 1}        |
+-----------------+

SELECT pg_typeof(my_func()['x']);
+-------------+
| 'undefined' |
+-------------+
| undefined   |
+-------------+

proddata avatar Jun 18 '24 08:06 proddata

A side-effect of this is that something like:

CREATE TABLE test1 AS
SELECT '{"field1":123}'::OBJECT (STRICT) AS (field1 BIGINT) ['field1'];

results on

UnsupportedFeatureException[Type `undefined` does not support storage]

hlcianfagna avatar Jun 18 '24 09:06 hlcianfagna

Another case is the definition of object literals in views, such as:

CREATE TABLE data (value_1 BOOLEAN, value_2 INTEGER);

CREATE OR REPLACE VIEW data_view AS
SELECT { value_1 = value_1, value_2 = value_2 } AS o
FROM data;

SELECT column_name, data_type FROM information_schema.columns WHERE table_name = 'data_view';
+-------------+-----------+
| column_name | data_type |
+-------------+-----------+
| o           | object    |
+-------------+-----------+

o is returned as object (which is correct), but metadata on the subcolumns value_1 and value_2 are missing.

hammerhead avatar Jun 19 '24 07:06 hammerhead

Lookup on object with cast to object(ignored) raises errors:

create table tbl (o object(ignored));
insert into tbl (o) values ({x=10, y=20}), ({z=30});
refresh table tbl;
select o['x'] from tbl; -- works
select o::object(ignored)['x'] from tbl; --raises ColumnUnknownException[The object `{z=30}` does not contain the key `x`]

This is due to o['x'] being a DynamicReference in the first case. In the second case because of the cast it results in a internal subscript(o, 'x'). Currently ObjectType doesn't carry the column policy, so we do not know it is ignored.

ColumnPolicy should probably move from being a property on Reference to being a property on ObjectType - if that doesn't cause other issues (e.g. with object arrays) (As attempted in https://github.com/crate/crate/pull/14593)

mfussenegger avatar Jul 01 '24 09:07 mfussenegger