postgrest
postgrest copied to clipboard
Confused on computed columns and relationships
Environment
- PostgreSQL version: not relevant
- PostgREST version: v10.0.0
- Operating system: not relevant
Description of issue
The current implementation allows the mixed usage of computed columns and relationships, i.e. both the queries of /premieres?select=*,film(*)
and /premieres?select=*,film
return something.
After some experiments, I got a bit confused since some of the results seem unexpected and unreasonable to me.
With the fixture
create schema api;
create table api.premieres (
id text primary key,
film_id text -- references api.films(id)
);
insert into api.premieres values
('P1', 'F1'),
('P2', 'F1'),
('P3', 'F2'),
('P4', 'F3')
;
create table api.films(id text primary key);
insert into api.films values('F1'), ('F2'), ('F4');
and the functions
create or replace function api.computed_films_m2o(api.premieres) returns setof api.films as $$
select * from api.films where id = $1.film_id
$$ stable language sql rows 1;
create or replace function api.computed_films_scalar(api.premieres) returns api.films as $$
select * from api.films where id = $1.film_id
$$ stable language sql;
create or replace function api.computed_films_m2m(api.premieres) returns setof api.films as $$
select * from api.films
$$ stable language sql;
create or replace function api.computed_films_m2m_marked_as_m2o(api.premieres) returns setof api.films as $$
select * from api.films
$$ stable language sql rows 1;
create or replace function api.computed_premiers_o2m(api.films) returns setof api.premieres as $$
select * from api.premieres where film_id = $1.id
$$ stable language sql;
create or replace function api.computed_premiers_scalar(api.films) returns api.premieres as $$
select * from api.premieres where film_id = $1.id
$$ stable language sql;
create or replace function api.computed_premiers_o2m_marked_as_o2o(api.films) returns setof api.premieres as $$
select * from api.premieres where film_id = $1.id
$$ stable language sql rows 1;
let's access the functions with both the column and embedding syntaxes.
Here are some test results:
GET /premieres?select=*,computed_films_m2o(*)
-- expected usage and expected results
[{"id":"P1","film_id":"F1","computed_films_m2o":{"id":"F1"}},
{"id":"P2","film_id":"F1","computed_films_m2o":{"id":"F1"}},
{"id":"P3","film_id":"F2","computed_films_m2o":{"id":"F2"}},
{"id":"P4","film_id":"F3","computed_films_m2o":null}]
GET /premieres?select=*,computed_films_m2o
-- questionable usage, unexpected results (inner join)
[{"id":"P1","film_id":"F1","computed_films_m2o":{"id":"F1"}},
{"id":"P2","film_id":"F1","computed_films_m2o":{"id":"F1"}},
{"id":"P3","film_id":"F2","computed_films_m2o":{"id":"F2"}}
GET /premieres?select=*,computed_films_m2o!inner(*)
-- expected usage and expected results
[{"id":"P1","film_id":"F1","computed_films_m2o":{"id":"F1"}},
{"id":"P2","film_id":"F1","computed_films_m2o":{"id":"F1"}},
{"id":"P3","film_id":"F2","computed_films_m2o":{"id":"F2"}}
GET /premieres?select=*,computed_films_m2o!inner
-- rogue usage
[{"id":"P1","film_id":"F1","computed_films_m2o":{"id":"F1"}},
{"id":"P2","film_id":"F1","computed_films_m2o":{"id":"F1"}},
{"id":"P3","film_id":"F2","computed_films_m2o":{"id":"F2"}}
GET /premieres?select=*,computed_films_scalar(*)
-- questionable usage, expected results (taken as m2m relationship)
[{"id":"P1","film_id":"F1","computed_films_scalar":[{"id":"F1"}]},
{"id":"P2","film_id":"F1","computed_films_scalar":[{"id":"F1"}]},
{"id":"P3","film_id":"F2","computed_films_scalar":[{"id":"F2"}]},
{"id":"P4","film_id":"F3","computed_films_scalar":[{"id":null}]}]
GET /premieres?select=*,computed_films_scalar
-- expected usage and expected results
[{"id":"P1","film_id":"F1","computed_films_scalar":{"id":"F1"}},
{"id":"P2","film_id":"F1","computed_films_scalar":{"id":"F1"}},
{"id":"P3","film_id":"F2","computed_films_scalar":{"id":"F2"}},
{"id":"P4","film_id":"F3","computed_films_scalar":null}]
GET /premieres?select=*,computed_films_scalar!inner(*)
-- questionable usage, unexpected results(inner join didn't work)
[{"id":"P1","film_id":"F1","computed_films_scalar":[{"id":"F1"}]},
{"id":"P2","film_id":"F1","computed_films_scalar":[{"id":"F1"}]},
{"id":"P3","film_id":"F2","computed_films_scalar":[{"id":"F2"}]},
{"id":"P4","film_id":"F3","computed_films_scalar":[{"id":null}]}]
GET /premieres?select=*,computed_films_scalar!inner
-- rogue usage
[{"id":"P1","film_id":"F1","computed_films_scalar":{"id":"F1"}},
{"id":"P2","film_id":"F1","computed_films_scalar":{"id":"F1"}},
{"id":"P3","film_id":"F2","computed_films_scalar":{"id":"F2"}},
{"id":"P4","film_id":"F3","computed_films_scalar":null}]
GET /premieres?select=*,computed_films_m2m(*)
-- expected usage and expected results
[{"id":"P1","film_id":"F1","computed_films_m2m":[{"id":"F1"}, {"id":"F2"}, {"id":"F4"}]},
{"id":"P2","film_id":"F1","computed_films_m2m":[{"id":"F1"}, {"id":"F2"}, {"id":"F4"}]},
{"id":"P3","film_id":"F2","computed_films_m2m":[{"id":"F1"}, {"id":"F2"}, {"id":"F4"}]},
{"id":"P4","film_id":"F3","computed_films_m2m":[{"id":"F1"}, {"id":"F2"}, {"id":"F4"}]}]
GET /premieres?select=*,computed_films_m2m
-- questionable usage, unexpected results (unnest inner join)
[{"id":"P1","film_id":"F1","computed_films_m2m":{"id":"F1"}},
{"id":"P1","film_id":"F1","computed_films_m2m":{"id":"F2"}},
{"id":"P1","film_id":"F1","computed_films_m2m":{"id":"F4"}},
{"id":"P2","film_id":"F1","computed_films_m2m":{"id":"F1"}},
{"id":"P2","film_id":"F1","computed_films_m2m":{"id":"F2"}},
{"id":"P2","film_id":"F1","computed_films_m2m":{"id":"F4"}},
{"id":"P3","film_id":"F2","computed_films_m2m":{"id":"F1"}},
{"id":"P3","film_id":"F2","computed_films_m2m":{"id":"F2"}},
{"id":"P3","film_id":"F2","computed_films_m2m":{"id":"F4"}},
{"id":"P4","film_id":"F3","computed_films_m2m":{"id":"F1"}},
{"id":"P4","film_id":"F3","computed_films_m2m":{"id":"F2"}},
{"id":"P4","film_id":"F3","computed_films_m2m":{"id":"F4"}}]
GET /premieres?select=*,computed_films_m2m_marked_as_m2o(*)
-- expect usage, unexpected results (unnested left join)
[{"id":"P1","film_id":"F1","computed_films_m2m_marked_as_m2o":{"id":"F1"}},
{"id":"P1","film_id":"F1","computed_films_m2m_marked_as_m2o":{"id":"F2"}},
{"id":"P1","film_id":"F1","computed_films_m2m_marked_as_m2o":{"id":"F4"}},
{"id":"P2","film_id":"F1","computed_films_m2m_marked_as_m2o":{"id":"F1"}},
{"id":"P2","film_id":"F1","computed_films_m2m_marked_as_m2o":{"id":"F2"}},
{"id":"P2","film_id":"F1","computed_films_m2m_marked_as_m2o":{"id":"F4"}},
{"id":"P3","film_id":"F2","computed_films_m2m_marked_as_m2o":{"id":"F1"}},
{"id":"P3","film_id":"F2","computed_films_m2m_marked_as_m2o":{"id":"F2"}},
{"id":"P3","film_id":"F2","computed_films_m2m_marked_as_m2o":{"id":"F4"}},
{"id":"P4","film_id":"F3","computed_films_m2m_marked_as_m2o":{"id":"F1"}},
{"id":"P4","film_id":"F3","computed_films_m2m_marked_as_m2o":{"id":"F2"}},
{"id":"P4","film_id":"F3","computed_films_m2m_marked_as_m2o":{"id":"F4"}}]
GET /premieres?select=*,computed_films_m2m_marked_as_m2o
-- questionable usage, unexpected results (unnested inner join)
[{"id":"P1","film_id":"F1","computed_films_m2m_marked_as_m2o":{"id":"F1"}},
{"id":"P1","film_id":"F1","computed_films_m2m_marked_as_m2o":{"id":"F2"}},
{"id":"P1","film_id":"F1","computed_films_m2m_marked_as_m2o":{"id":"F4"}},
{"id":"P2","film_id":"F1","computed_films_m2m_marked_as_m2o":{"id":"F1"}},
{"id":"P2","film_id":"F1","computed_films_m2m_marked_as_m2o":{"id":"F2"}},
{"id":"P2","film_id":"F1","computed_films_m2m_marked_as_m2o":{"id":"F4"}},
{"id":"P3","film_id":"F2","computed_films_m2m_marked_as_m2o":{"id":"F1"}},
{"id":"P3","film_id":"F2","computed_films_m2m_marked_as_m2o":{"id":"F2"}},
{"id":"P3","film_id":"F2","computed_films_m2m_marked_as_m2o":{"id":"F4"}},
{"id":"P4","film_id":"F3","computed_films_m2m_marked_as_m2o":{"id":"F1"}},
{"id":"P4","film_id":"F3","computed_films_m2m_marked_as_m2o":{"id":"F2"}},
{"id":"P4","film_id":"F3","computed_films_m2m_marked_as_m2o":{"id":"F4"}}]
GET /premieres?select=*,computed_films_m2m_marked_as_m2o!inner(*)
-- expected usage, unexpected results (unnested join, notice the order)
[{"id":"P1","film_id":"F1","computed_films_m2m_marked_as_m2o":{"id":"F1"}},
{"id":"P2","film_id":"F1","computed_films_m2m_marked_as_m2o":{"id":"F1"}},
{"id":"P3","film_id":"F2","computed_films_m2m_marked_as_m2o":{"id":"F1"}},
{"id":"P4","film_id":"F3","computed_films_m2m_marked_as_m2o":{"id":"F1"}},
{"id":"P1","film_id":"F1","computed_films_m2m_marked_as_m2o":{"id":"F2"}},
{"id":"P2","film_id":"F1","computed_films_m2m_marked_as_m2o":{"id":"F2"}},
{"id":"P3","film_id":"F2","computed_films_m2m_marked_as_m2o":{"id":"F2"}},
{"id":"P4","film_id":"F3","computed_films_m2m_marked_as_m2o":{"id":"F2"}},
{"id":"P1","film_id":"F1","computed_films_m2m_marked_as_m2o":{"id":"F4"}},
{"id":"P2","film_id":"F1","computed_films_m2m_marked_as_m2o":{"id":"F4"}},
{"id":"P3","film_id":"F2","computed_films_m2m_marked_as_m2o":{"id":"F4"}},
{"id":"P4","film_id":"F3","computed_films_m2m_marked_as_m2o":{"id":"F4"}}]
GET /premieres?select=*,computed_films_m2m_marked_as_m2o!inner
-- rogue usage
[{"id":"P1","film_id":"F1","computed_films_m2m_marked_as_m2o":{"id":"F1"}},
{"id":"P1","film_id":"F1","computed_films_m2m_marked_as_m2o":{"id":"F2"}},
{"id":"P1","film_id":"F1","computed_films_m2m_marked_as_m2o":{"id":"F4"}},
{"id":"P2","film_id":"F1","computed_films_m2m_marked_as_m2o":{"id":"F1"}},
{"id":"P2","film_id":"F1","computed_films_m2m_marked_as_m2o":{"id":"F2"}},
{"id":"P2","film_id":"F1","computed_films_m2m_marked_as_m2o":{"id":"F4"}},
{"id":"P3","film_id":"F2","computed_films_m2m_marked_as_m2o":{"id":"F1"}},
{"id":"P3","film_id":"F2","computed_films_m2m_marked_as_m2o":{"id":"F2"}},
{"id":"P3","film_id":"F2","computed_films_m2m_marked_as_m2o":{"id":"F4"}},
{"id":"P4","film_id":"F3","computed_films_m2m_marked_as_m2o":{"id":"F1"}},
{"id":"P4","film_id":"F3","computed_films_m2m_marked_as_m2o":{"id":"F2"}},
{"id":"P4","film_id":"F3","computed_films_m2m_marked_as_m2o":{"id":"F4"}}]
GET /films?select=*,computed_premiers_o2m(*)
-- expected usage and expected results
[{"id":"F1","computed_premiers_o2m":[{"id":"P1","film_id":"F1"}, {"id":"P2","film_id":"F1"}]},
{"id":"F2","computed_premiers_o2m":[{"id":"P3","film_id":"F2"}]},
{"id":"F4","computed_premiers_o2m":[]}]⏎
GET /films?select=*,computed_premiers_o2m
-- questionable usage, unexpected results (unnested inner join)
[{"id":"F1","computed_premiers_o2m":{"id":"P1","film_id":"F1"}},
{"id":"F1","computed_premiers_o2m":{"id":"P2","film_id":"F1"}},
{"id":"F2","computed_premiers_o2m":{"id":"P3","film_id":"F2"}}]
GET /films?select=*,computed_premiers_o2m!inner(*)
-- expected usage and expected results
[{"id":"F1","computed_premiers_o2m":[{"id":"P1","film_id":"F1"}, {"id":"P2","film_id":"F1"}]},
{"id":"F2","computed_premiers_o2m":[{"id":"P3","film_id":"F2"}]}]
GET /films?select=*,computed_premiers_o2m!inner
-- rogue usage
[{"id":"F1","computed_premiers_o2m":{"id":"P1","film_id":"F1"}},
{"id":"F1","computed_premiers_o2m":{"id":"P2","film_id":"F1"}},
{"id":"F2","computed_premiers_o2m":{"id":"P3","film_id":"F2"}}]
GET /films?select=*,computed_premiers_scalar(*)
-- questionable usage, unexpected results (scalar function always returns exactly one row even the result is empty or has multiple rows)
[{"id":"F1","computed_premiers_scalar":[{"id":"P1","film_id":"F1"}]},
{"id":"F2","computed_premiers_scalar":[{"id":"P3","film_id":"F2"}]},
{"id":"F4","computed_premiers_scalar":[{"id":null,"film_id":null}]}]
GET /films?select=*,computed_premiers_scalar
-- expected usage and expected results
[{"id":"F1","computed_premiers_scalar":{"id":"P1","film_id":"F1"}},
{"id":"F2","computed_premiers_scalar":{"id":"P3","film_id":"F2"}},
{"id":"F4","computed_premiers_scalar":null}]
GET /films?select=*,computed_premiers_o2m_marked_as_o2o(*)
-- expected usage, unexected results (unnested left join)
[{"id":"F1","computed_premiers_o2m_marked_as_o2o":{"id":"P1","film_id":"F1"}},
{"id":"F1","computed_premiers_o2m_marked_as_o2o":{"id":"P2","film_id":"F1"}},
{"id":"F2","computed_premiers_o2m_marked_as_o2o":{"id":"P3","film_id":"F2"}},
{"id":"F4","computed_premiers_o2m_marked_as_o2o":null}]
GET /films?select=*,computed_premiers_o2m_marked_as_o2o
-- questionable usage, unexpected results (unnested inner join)
[{"id":"F1","computed_premiers_o2m_marked_as_o2o":{"id":"P1","film_id":"F1"}},
{"id":"F1","computed_premiers_o2m_marked_as_o2o":{"id":"P2","film_id":"F1"}},
{"id":"F2","computed_premiers_o2m_marked_as_o2o":{"id":"P3","film_id":"F2"}}]
Among them, a lot of cases have either questionable usage (e.g., a column access of a relationship, /premieres?select=*,computed_films_m2o
) or unexpected results (e.g., /premieres?select=*,computed_films_m2o
produces inner-joined rows; /premieres?select=*,computed_films_m2m
and /premieres?select=*,computed_films_m2m_marked_as_m2o(*)
produce unnested rows).
The problem is probably a direct consequence of the flexible, powerful yet confusing handling of function expression in postgres (c.f., 7.2.1.4. Table Functions, 38.5.8. SQL Functions as Table Sources, 38.5.9. SQL Functions Returning Sets).
I think it requires thorough considerations to sort things out. To put my two cents,
- disallow accessing the set-returning functions (
returns setof
,returns table
) as computed columns; - disallow accessing the scalar functions (
returns <rowtype>
) as computed relationships; - explicitly document that a purposely or accidentally wrong row estimation (
rows 1
) could lead unexpected results; -
!inner
can appear after column names, this should be a bug (see below).
GET /premieres?select=*!inner,computed_films_m2o(*)
[{"id":"P1","film_id":"F1"},
{"id":"P2","film_id":"F1"},
{"id":"P3","film_id":"F2"},
{"id":"P4","film_id":"F3"}]
Wow, very nice and thorough investigation! Maybe this will turn too big for a single issue, I'll open new ones if required. First, let's address the issues you mentioned.
let's access the functions with both the column and embedding syntaxes.
The valid embedding syntax has this form:
GET /premieres?select=*,computed_films_m2o(*)
That is, it must have the parenthesis ()
or it will only be considered as a computed column instead (even though a similar result may be returned). That is why modifiers like !inner
, !target
and !hint
are ignored which explains some of the responses you got. So, we can say that the following results from your list are not unexpected:
GET /premieres?select=*,computed_films_m2o
-- questionable usage, unexpected results (inner join)
GET /premieres?select=*,computed_films_m2m
-- questionable usage, unexpected results (unnest inner join)
GET /premieres?select=*,computed_films_m2m_marked_as_m2o
-- questionable usage, unexpected results (unnested inner join)
GET /films?select=*,computed_premiers_o2m
-- questionable usage, unexpected results (unnested inner join)
GET /films?select=*,computed_premiers_o2m_marked_as_o2o
-- questionable usage, unexpected results (unnested inner join)
The obtained "scalar" results could be a bug. When returning a table it always assumes that more than one row is returned, that is, a to-many
embedding will always be detected, as seen in this line:
https://github.com/PostgREST/postgrest/blob/38ecaa44db0dfe04709a3ee277cd58abcb4a0b61/src/PostgREST/DbStructure.hs#L719
The relationships are determined with the ROWS
estimate, but table returns don't have that, I'm not sure if there's an alternative for these cases. Otherwise we could detect returned tables either as to-many
or to-one
(one of those) or avoid detecting tables altogether as you suggested. These are the affected examples:
GET /premieres?select=*,computed_films_scalar(*)
-- questionable usage, expected results (taken as m2m relationship)
GET /premieres?select=*,computed_films_scalar!inner(*)
-- questionable usage, unexpected results(inner join didn't work)
GET /films?select=*,computed_premiers_scalar(*)
-- questionable usage, unexpected results (scalar function always returns exactly one row even the result is empty or has multiple rows)
For the case of "unnested" results, the ROWS 1
estimation will make PostgREST to expect a single value to be returned. If that's not the case, the unnesting will occur. Still, this case may need to be documented as you suggested.
GET /films?select=*,computed_premiers_o2m_marked_as_o2o(*)
-- expected usage, unexected results (unnested left join)
GET /premieres?select=*,computed_films_m2m_marked_as_m2o(*)
-- expect usage, unexpected results (unnested left join)
GET /premieres?select=*,computed_films_m2m_marked_as_m2o!inner(*)
-- expected usage, unexpected results (unnested join, notice the order)
With all that said, let's address your suggestions:
- disallow accessing the set-returning functions (returns setof, returns table) as computed columns;
I don't agree here, taking into consideration that embedding functionality is only expected when ()
is present. The result is exactly as if a computed column is selected directly in the query, e.g.:
GET /premieres?select=*,computed_films_m2m
This is the expected result that you'd get from:
SET search_path TO api;
SELECT "api"."premieres".*, "api"."premieres"."computed_films_m2m"
FROM "api"."premieres";
I think we should give the user some freedom to use the computed column directly if they need to.
- disallow accessing the scalar functions (returns
) as computed relationships;
This is a valid suggestion as seen in the "scalar" examples. There are three alternatives for these functions with returns <rowtype>
:
- Find a way to detect if a single result is obtained (not sure if possible)
- Accept it as either a
to-many
orto-one
embedding (I prefer the first one) - Disallowing them altogether (as you mentioned)
3- explicitly document that a purposely or accidentally wrong row estimation (rows 1) could lead unexpected results;
I agree, it could be added to the documentation as a warning.
!inner
can appear after column names, this should be a bug (see below).GET /premieres?select=*!inner,computed_films_m2o(*) ...
This has to do with the leniency of the parser and not with embedding. That !inner
in the example is not modifying an embedding resource, but a table column. In this case, the parser will ignore everything after (and including) the !
. For instance, you'll get the same result if you request:
GET /premieres?select=*!everythingafterthiswillbeignored,computed_films_m2o(*)
It's related to the findings here https://github.com/PostgREST/postgrest/issues/2362#issuecomment-1242554835.
This explains the results you got here:
GET /premieres?select=*,computed_films_m2o!inner
-- rogue usage
GET /premieres?select=*,computed_films_scalar!inner
-- rogue usage
GET /premieres?select=*,computed_films_m2m_marked_as_m2o!inner
-- rogue usage
GET /films?select=*,computed_premiers_o2m!inner
-- rogue usage
I'll wait for your feedback before creating the new issues.
Wow, very nice and thorough investigation! Maybe this will turn too big for a single issue, I'll open new ones if required. First, let's address the issues you mentioned.
Indeed we have several issues here.
That is, it must have the parenthesis
()
or it will only be considered as a computed column instead (even though a similar result may be returned). That is why modifiers like!inner
,!target
and!hint
are ignored which explains some of the responses you got. So, we can say that the following results from your list are not unexpected:GET /premieres?select=*,computed_films_m2o -- questionable usage, unexpected results (inner join) [{"id":"P1","film_id":"F1","computed_films_m2o":{"id":"F1"}}, {"id":"P2","film_id":"F1","computed_films_m2o":{"id":"F1"}}, {"id":"P3","film_id":"F2","computed_films_m2o":{"id":"F2"}}
I see that taking a set-returning function as a computed column is acceptable, but the result is still unexpected (to me). Intuitively (and conventionally with PostgREST), a query with computed columns returns the same number of rows as the main query, as in a left join. That is, if we accept taking the computed_films_m2o
as a computed column, it should return
GET /premieres?select=*,computed_films_m2o
[{"id":"P1","film_id":"F1","computed_films_m2o":{"id":"F1"}},
{"id":"P2","film_id":"F1","computed_films_m2o":{"id":"F1"}},
{"id":"P3","film_id":"F2","computed_films_m2o":{"id":"F2"}},
{"id":"P4","film_id":"F3","computed_films_m2o":null}]
instead. Otherwise, we have to break the current behevior of that a computed column never change the shape of the main query.
But the presented inner join is not a bug, but a postgresql feature, because PostgreSQL's behavior for a set-returning function in a query's select list is almost exactly the same as if the set-returning function had been written in a LATERAL FROM-clause item instead, i.e.,
select *, computed_films_m2o(premieres) from premieres
is almost the same as
select premieres.*, films from premieres, lateral computed_films_m2o(premieres) as films
Regarding that, we could either exploit it as a postgrest feature that a set-returning function as computed column is for inner-join and a rowtype-returning scalar function as computed column is for left-join, or simply deny a set-returning function taken as a computed column for less headache.
Personally, I think the former option is very interesting and makes postgrest more flexible, although it requires explicit documentation and more tests and experiments.
The obtained "scalar" results could be a bug. When returning a table it always assumes that more than one row is returned, that is, a
to-many
embedding will always be detected, as seen in this line:https://github.com/PostgREST/postgrest/blob/38ecaa44db0dfe04709a3ee277cd58abcb4a0b61/src/PostgREST/DbStructure.hs#L719
The relationships are determined with the
ROWS
estimate, but table returns don't have that, I'm not sure if there's an alternative for these cases. Otherwise we could detect returned tables either asto-many
orto-one
(one of those) or avoid detecting tables altogether as you suggested. These are the affected examples:GET /premieres?select=*,computed_films_scalar(*) -- questionable usage, expected results (taken as m2m relationship)
If we want a rowtype-returning scalar function used as a computed relationship, my suggestion would be marking it as in rows 1
, because such a function always returns one row, no more, no less.
-- check the film existence
select f from api.films f where id = 'F1';
-- f
-- ------
-- (F1)
-- (1 row)
select f from api.films f where id = 'F3';
-- f
-- ---
-- (0 rows)
--- the differences between set-returning function and scalar function
-- scalar function returns one row even there are two matching rows
select * from api.computed_premiers_scalar((select f from api.films f where id = 'F1'));
-- id | film_id
-- ----+---------
-- P1 | F1
-- (1 row)
-- set-returning function returns all the matching rows
select * from api.computed_premiers_o2m((select f from api.films f where id = 'F1'));
-- id | film_id
-- ----+---------
-- P1 | F1
-- P2 | F1
-- (2 rows)
-- scalar function returns one row even there are no matching row
select * from api.computed_premiers_scalar((select f from api.films f where id = 'F3'));
-- id | film_id
-- ----+---------
-- ¤ | ¤
-- (1 row)
-- set-returning function returns no rows
select * from api.computed_premiers_o2m((select f from api.films f where id = 'F3'));
-- id | film_id
-- ----+---------
-- (0 rows)
With all that said, let's address your suggestions:
- disallow accessing the set-returning functions (returns setof, returns table) as computed columns;
I don't agree here, taking into consideration that embedding functionality is only expected when
()
is present. The result is exactly as if a computed column is selected directly in the query, e.g.:GET /premieres?select=*,computed_films_m2m
This is the expected result that you'd get from:
SET search_path TO api; SELECT "api"."premieres".*, "api"."premieres"."computed_films_m2m" FROM "api"."premieres";
I think we should give the user some freedom to use the computed column directly if they need to.
Yep, it is the expected result from postgresql. The unexpected part is that the result is executed as inner-join, which is not consistent with the current postgrest convention (computed column doesn't affect the shape of main result). AFAIA, only !inner
embedding and filters could alter the shape (the returning rows) of the main query.
As suggested in my previous comment, it could be an interesting abuse if we accept the column syntax to filter the main result. We will have another way to do inner join besides the !inner
modifier. Furthermore, with a to-many
relationship intentionally marked as rows 1
, we also have a way to have a flattened (unnested) to-many
embeddings.
- disallow accessing the scalar functions (returns ) as computed relationships;
This is a valid suggestion as seen in the "scalar" examples. There are three alternatives for these functions with
returns <rowtype>
:* Find a way to detect if a single result is obtained (not sure if possible) * Accept it as either a `to-many` or `to-one` embedding (I prefer the first one) * Disallowing them altogether (as you mentioned)
I prefer a to-one
embedding since scalar function always returns exactly 1 row.
Or, a dust-to-dust decision to disallow scalar function as computed relationship seems more clear to me.
!inner
can appear after column names, this should be a bug (see below).GET /premieres?select=*!inner,computed_films_m2o(*) ...
This one could be even ignored since it seems to have no real impact.
I'll wait for your feedback before creating the new issues.
Thanks for the comments!
I see that taking a set-returning function as a computed column is acceptable, but the result is still unexpected (to me). Intuitively (and conventionally with PostgREST), a query with computed columns returns the same number of rows as the main query, as in a left join.
Ah, I get it. From my POV I see it more using a PostgreSQL perspective. Also, this was allowed even before the computed relationships were implemented. I just tested GET /premieres?select=*,computed_films_m2o
on PostgREST v9.0.1 and got the same result, returning a different number of rows than the main query (it may work the same way for older versions), so changing this behavior could be considered a breaking change. Still, it needs consensus, I think.
If we want a rowtype-returning scalar function used as a computed relationship, my suggestion would be marking it as in
rows 1
, because such a function always returns one row, no more, no less.
Oh, I didn't notice that. Then yes, if the decision is to accept the scalar function then a to-one
would be the best choice.
Cool! with that feedback I'll open the new issues and continue the discussion for each one of them there.
I have created #2537 and #2540, which solve some of the issues here. I'll go through the non-"expected usage and expected results" cases - mostly for myself to see what is still missing:
GET /premieres?select=*,computed_films_m2o -- questionable usage, unexpected results (inner join) [{"id":"P1","film_id":"F1","computed_films_m2o":{"id":"F1"}}, {"id":"P2","film_id":"F1","computed_films_m2o":{"id":"F1"}}, {"id":"P3","film_id":"F2","computed_films_m2o":{"id":"F2"}}
This is discussed in #2480 and requires considerably more work. Not changed, so far.
GET /premieres?select=*,computed_films_m2o!inner -- rogue usage [{"id":"P1","film_id":"F1","computed_films_m2o":{"id":"F1"}}, {"id":"P2","film_id":"F1","computed_films_m2o":{"id":"F1"}}, {"id":"P3","film_id":"F2","computed_films_m2o":{"id":"F2"}}
Hints and join type specifications are disallowed on computed columns in #2537.
This would throw a parsing error now.
GET /premieres?select=*,computed_films_scalar(*) -- questionable usage, expected results (taken as m2m relationship) [{"id":"P1","film_id":"F1","computed_films_scalar":[{"id":"F1"}]}, {"id":"P2","film_id":"F1","computed_films_scalar":[{"id":"F1"}]}, {"id":"P3","film_id":"F2","computed_films_scalar":[{"id":"F2"}]}, {"id":"P4","film_id":"F3","computed_films_scalar":[{"id":null}]}]
This is taken as a M2O/O2M relationship with proper object (instead of array) return in #2540.
GET /premieres?select=*,computed_films_scalar!inner(*) -- questionable usage, unexpected results(inner join didn't work) [{"id":"P1","film_id":"F1","computed_films_scalar":[{"id":"F1"}]}, {"id":"P2","film_id":"F1","computed_films_scalar":[{"id":"F1"}]}, {"id":"P3","film_id":"F2","computed_films_scalar":[{"id":"F2"}]}, {"id":"P4","film_id":"F3","computed_films_scalar":[{"id":null}]}]
The inner join works in #2540.
GET /premieres?select=*,computed_films_scalar!inner -- rogue usage [{"id":"P1","film_id":"F1","computed_films_scalar":{"id":"F1"}}, {"id":"P2","film_id":"F1","computed_films_scalar":{"id":"F1"}}, {"id":"P3","film_id":"F2","computed_films_scalar":{"id":"F2"}}, {"id":"P4","film_id":"F3","computed_films_scalar":null}]
Disallowed in #2537 (see above).
GET /premieres?select=*,computed_films_m2m -- questionable usage, unexpected results (unnest inner join) [{"id":"P1","film_id":"F1","computed_films_m2m":{"id":"F1"}}, {"id":"P1","film_id":"F1","computed_films_m2m":{"id":"F2"}}, {"id":"P1","film_id":"F1","computed_films_m2m":{"id":"F4"}}, {"id":"P2","film_id":"F1","computed_films_m2m":{"id":"F1"}}, {"id":"P2","film_id":"F1","computed_films_m2m":{"id":"F2"}}, {"id":"P2","film_id":"F1","computed_films_m2m":{"id":"F4"}}, {"id":"P3","film_id":"F2","computed_films_m2m":{"id":"F1"}}, {"id":"P3","film_id":"F2","computed_films_m2m":{"id":"F2"}}, {"id":"P3","film_id":"F2","computed_films_m2m":{"id":"F4"}}, {"id":"P4","film_id":"F3","computed_films_m2m":{"id":"F1"}}, {"id":"P4","film_id":"F3","computed_films_m2m":{"id":"F2"}}, {"id":"P4","film_id":"F3","computed_films_m2m":{"id":"F4"}}]
Discussed in #2480 (see above).
GET /premieres?select=*,computed_films_m2m_marked_as_m2o(*) -- expect usage, unexpected results (unnested left join) [{"id":"P1","film_id":"F1","computed_films_m2m_marked_as_m2o":{"id":"F1"}}, {"id":"P1","film_id":"F1","computed_films_m2m_marked_as_m2o":{"id":"F2"}}, {"id":"P1","film_id":"F1","computed_films_m2m_marked_as_m2o":{"id":"F4"}}, {"id":"P2","film_id":"F1","computed_films_m2m_marked_as_m2o":{"id":"F1"}}, {"id":"P2","film_id":"F1","computed_films_m2m_marked_as_m2o":{"id":"F2"}}, {"id":"P2","film_id":"F1","computed_films_m2m_marked_as_m2o":{"id":"F4"}}, {"id":"P3","film_id":"F2","computed_films_m2m_marked_as_m2o":{"id":"F1"}}, {"id":"P3","film_id":"F2","computed_films_m2m_marked_as_m2o":{"id":"F2"}}, {"id":"P3","film_id":"F2","computed_films_m2m_marked_as_m2o":{"id":"F4"}}, {"id":"P4","film_id":"F3","computed_films_m2m_marked_as_m2o":{"id":"F1"}}, {"id":"P4","film_id":"F3","computed_films_m2m_marked_as_m2o":{"id":"F2"}}, {"id":"P4","film_id":"F3","computed_films_m2m_marked_as_m2o":{"id":"F4"}}]
I'm fine with that. I'd be fine with this throwing some kind of error, too. This is a bit similar to how we throw errors when requesting a resource with the single object header and still getting multiple rows back.
GET /premieres?select=*,computed_films_m2m_marked_as_m2o -- questionable usage, unexpected results (unnested inner join) [{"id":"P1","film_id":"F1","computed_films_m2m_marked_as_m2o":{"id":"F1"}}, {"id":"P1","film_id":"F1","computed_films_m2m_marked_as_m2o":{"id":"F2"}}, {"id":"P1","film_id":"F1","computed_films_m2m_marked_as_m2o":{"id":"F4"}}, {"id":"P2","film_id":"F1","computed_films_m2m_marked_as_m2o":{"id":"F1"}}, {"id":"P2","film_id":"F1","computed_films_m2m_marked_as_m2o":{"id":"F2"}}, {"id":"P2","film_id":"F1","computed_films_m2m_marked_as_m2o":{"id":"F4"}}, {"id":"P3","film_id":"F2","computed_films_m2m_marked_as_m2o":{"id":"F1"}}, {"id":"P3","film_id":"F2","computed_films_m2m_marked_as_m2o":{"id":"F2"}}, {"id":"P3","film_id":"F2","computed_films_m2m_marked_as_m2o":{"id":"F4"}}, {"id":"P4","film_id":"F3","computed_films_m2m_marked_as_m2o":{"id":"F1"}}, {"id":"P4","film_id":"F3","computed_films_m2m_marked_as_m2o":{"id":"F2"}}, {"id":"P4","film_id":"F3","computed_films_m2m_marked_as_m2o":{"id":"F4"}}]
Discussed in #2480 (see above).
GET /premieres?select=*,computed_films_m2m_marked_as_m2o!inner(*) -- expected usage, unexpected results (unnested join, notice the order) [{"id":"P1","film_id":"F1","computed_films_m2m_marked_as_m2o":{"id":"F1"}}, {"id":"P2","film_id":"F1","computed_films_m2m_marked_as_m2o":{"id":"F1"}}, {"id":"P3","film_id":"F2","computed_films_m2m_marked_as_m2o":{"id":"F1"}}, {"id":"P4","film_id":"F3","computed_films_m2m_marked_as_m2o":{"id":"F1"}}, {"id":"P1","film_id":"F1","computed_films_m2m_marked_as_m2o":{"id":"F2"}}, {"id":"P2","film_id":"F1","computed_films_m2m_marked_as_m2o":{"id":"F2"}}, {"id":"P3","film_id":"F2","computed_films_m2m_marked_as_m2o":{"id":"F2"}}, {"id":"P4","film_id":"F3","computed_films_m2m_marked_as_m2o":{"id":"F2"}}, {"id":"P1","film_id":"F1","computed_films_m2m_marked_as_m2o":{"id":"F4"}}, {"id":"P2","film_id":"F1","computed_films_m2m_marked_as_m2o":{"id":"F4"}}, {"id":"P3","film_id":"F2","computed_films_m2m_marked_as_m2o":{"id":"F4"}}, {"id":"P4","film_id":"F3","computed_films_m2m_marked_as_m2o":{"id":"F4"}}]
Without an order by, order is undefined per postgresql. Same as above.
GET /premieres?select=*,computed_films_m2m_marked_as_m2o!inner -- rogue usage [{"id":"P1","film_id":"F1","computed_films_m2m_marked_as_m2o":{"id":"F1"}}, {"id":"P1","film_id":"F1","computed_films_m2m_marked_as_m2o":{"id":"F2"}}, {"id":"P1","film_id":"F1","computed_films_m2m_marked_as_m2o":{"id":"F4"}}, {"id":"P2","film_id":"F1","computed_films_m2m_marked_as_m2o":{"id":"F1"}}, {"id":"P2","film_id":"F1","computed_films_m2m_marked_as_m2o":{"id":"F2"}}, {"id":"P2","film_id":"F1","computed_films_m2m_marked_as_m2o":{"id":"F4"}}, {"id":"P3","film_id":"F2","computed_films_m2m_marked_as_m2o":{"id":"F1"}}, {"id":"P3","film_id":"F2","computed_films_m2m_marked_as_m2o":{"id":"F2"}}, {"id":"P3","film_id":"F2","computed_films_m2m_marked_as_m2o":{"id":"F4"}}, {"id":"P4","film_id":"F3","computed_films_m2m_marked_as_m2o":{"id":"F1"}}, {"id":"P4","film_id":"F3","computed_films_m2m_marked_as_m2o":{"id":"F2"}}, {"id":"P4","film_id":"F3","computed_films_m2m_marked_as_m2o":{"id":"F4"}}]
Disallowed in #2537 (see above).
GET /films?select=*,computed_premiers_o2m -- questionable usage, unexpected results (unnested inner join) [{"id":"F1","computed_premiers_o2m":{"id":"P1","film_id":"F1"}}, {"id":"F1","computed_premiers_o2m":{"id":"P2","film_id":"F1"}}, {"id":"F2","computed_premiers_o2m":{"id":"P3","film_id":"F2"}}]
Discussed in #2480 (see above).
GET /films?select=*,computed_premiers_o2m!inner -- rogue usage [{"id":"F1","computed_premiers_o2m":{"id":"P1","film_id":"F1"}}, {"id":"F1","computed_premiers_o2m":{"id":"P2","film_id":"F1"}}, {"id":"F2","computed_premiers_o2m":{"id":"P3","film_id":"F2"}}]
Disallowed in #2537 (see above).
GET /films?select=*,computed_premiers_scalar(*) -- questionable usage, unexpected results (scalar function always returns exactly one row even the result is empty or has multiple rows) [{"id":"F1","computed_premiers_scalar":[{"id":"P1","film_id":"F1"}]}, {"id":"F2","computed_premiers_scalar":[{"id":"P3","film_id":"F2"}]}, {"id":"F4","computed_premiers_scalar":[{"id":null,"film_id":null}]}]
Changed in #2540 (see above).
GET /films?select=*,computed_premiers_o2m_marked_as_o2o(*) -- expected usage, unexected results (unnested left join) [{"id":"F1","computed_premiers_o2m_marked_as_o2o":{"id":"P1","film_id":"F1"}}, {"id":"F1","computed_premiers_o2m_marked_as_o2o":{"id":"P2","film_id":"F1"}}, {"id":"F2","computed_premiers_o2m_marked_as_o2o":{"id":"P3","film_id":"F2"}}, {"id":"F4","computed_premiers_o2m_marked_as_o2o":null}]
(see comment above about unnested left join)
GET /films?select=*,computed_premiers_o2m_marked_as_o2o -- questionable usage, unexpected results (unnested inner join) [{"id":"F1","computed_premiers_o2m_marked_as_o2o":{"id":"P1","film_id":"F1"}}, {"id":"F1","computed_premiers_o2m_marked_as_o2o":{"id":"P2","film_id":"F1"}}, {"id":"F2","computed_premiers_o2m_marked_as_o2o":{"id":"P3","film_id":"F2"}}]
Discussed in #2480 (see above).
And the conclusions:
1. disallow accessing the set-returning functions (`returns setof`, `returns table`) as computed columns;
#2480.
2. disallow accessing the scalar functions (`returns <rowtype>`) as computed relationships;
Done differently in #2540.
3. explicitly document that a purposely or accidentally wrong row estimation (`rows 1`) could lead unexpected results;
Yeah, that seems like the minimum! It's tracked in https://github.com/PostgREST/postgrest-docs/issues/570 already.
4. `!inner` can appear after column names, this should be a bug (see below).
Fixed in #2537.
My takeaway from this: All cases are tracked in other issues and/or addressed in PRs. Therefore, I'll close this one.