postgrest
postgrest copied to clipboard
Computed relationship changes the shape of main query when selecting it as a computed column
create schema api;
create table api.premieres (
id text primary key,
film_id text -- references api.films(id)
);
insert into api.premieres values
('P1', 'F1'),
('P2', 'F1'),
('P3', 'F2'),
('P4', 'F3')
;
create table api.films(id text primary key);
insert into api.films values('F1'), ('F2'), ('F4');
create or replace function api.computed_films_m2o(api.premieres) returns setof api.films as $$
select * from api.films where id = $1.film_id
$$ stable language sql rows 1;
I see that taking a set-returning function as a computed column is acceptable, but the result is still unexpected (to me). Intuitively (and conventionally with PostgREST), a query with computed columns returns the same number of rows as the main query, as in a left join. That is, if we accept taking the computed_films_m2o
as a computed column, it should return
GET /premieres?select=*,computed_films_m2o
[{"id":"P1","film_id":"F1","computed_films_m2o":{"id":"F1"}},
{"id":"P2","film_id":"F1","computed_films_m2o":{"id":"F1"}},
{"id":"P3","film_id":"F2","computed_films_m2o":{"id":"F2"}},
{"id":"P4","film_id":"F3","computed_films_m2o":null}]
instead. Otherwise, we have to break the current behevior of that a computed column never change the shape of the main query.
But the presented inner join is not a bug, but a postgresql feature, because PostgreSQL's behavior for a set-returning function in a query's select list is almost exactly the same as if the set-returning function had been written in a LATERAL FROM-clause item instead, i.e.,
select *, computed_films_m2o(premieres) from premieres
is almost the same as
select premieres.*, films from premieres, lateral computed_films_m2o(premieres) as films
Regarding that, we could either exploit it as a postgrest feature that a set-returning function as computed column is for inner-join and a rowtype-returning scalar function as computed column is for left-join, or simply deny a set-returning function taken as a computed column for less headache.
Personally, I think the former option is very interesting and makes postgrest more flexible, although it requires explicit documentation and more tests and experiments.
Originally posted by @Iced-Sun in https://github.com/PostgREST/postgrest/issues/2475#issuecomment-1251843074
This was my take of the issue:
From my POV I see it more using a PostgreSQL perspective. Also, this was allowed even before the computed relationships were implemented. I just tested
GET /premieres?select=*,computed_films_m2o
on PostgREST v9.0.1 and got the same result, returning a different number of rows than the main query (it may work the same way for older versions), so changing this behavior could be considered a breaking change. Still, it needs consensus, I think.
So since computed relationships are computed columns they can be used on select
and thus filter the same way as an !inner
.
I think we should keep the convention of select
not filtering the rows. !inner
broke this but it was a mistake that we can correct with the exists
filter(https://github.com/PostgREST/postgrest/issues/2340)
or simply deny a set-returning function taken as a computed column for less headache.
I would prefer that option. This would require considerable work though because we don't have computed columns in the schema cache.
or simply deny a set-returning function taken as a computed column for less headache.
I would prefer that option. This would require considerable work though because we don't have computed columns in the schema cache.
I think we should make sure to make the computed relationships features able to call all kinds of computed columns (i.e. scalar types with empty parens select=*,computed()
and inline support for all types) and then deprecate the "implicit" way of calling computed columns without brackets. A next step could be to check all column names against the real columns of the queried relation and to reject calling computed columns without parens - i.e. disabling the deprecated feature.
This would give us a way out of the "select affects the number of rows returned" mess we have right now.
Summarizing this and the other issue:
We currently have two problems:
- In some situations we call set-returning functions in the
SELECT
part, which modifies the number of rows through select. This is the case with the old computed column syntax. - In other cases we call non-set-returning functions in the
FROM
part, which makes the function not inlineable. This is the case with the new syntax.
We can fix both by making the new syntax cover all cases and then remove the old syntax.