how-to for filtering by embedded resource count
I spent a long time looking through docs and issues trying to find a native way to filter a query result based on the array_length of an embedded resource.
This use case can't filter on the client because it requires pagination across thousands of results.
Based on https://github.com/PostgREST/postgrest/issues/1075, seems like it's not possible yet. I was trying a syntax like:
GET /parent?select=*,children(*)&limit=10&order=children->>length.desc
but this gives an error:
column parent.children does not exist
Eventually found this earlier issue https://github.com/PostgREST/postgrest/issues/845 laying out a very similar use-case, and this solution by @ruslantalpa using a new index & view:
One way to do it would be (but i am not completely sure it's optimal) first, do not expose the schema where your tables live, expose a schema
apiwhere you have only view definitionsthen have the last function that you defined somewhere and a view like
create or replace view api.films as select id, name, data.film_count_nominations from data.films;then add a index like this
create index on data.films (data.film_count_nominations(data.films.*));hopefully this index will prevent running that count query on each request.
If the index trick does not work then change the definition of the
api.filmsview to be a join between films and nominations so that you can calculate that count columnit would be ideal to have the index thing work and not rely on a join because for cases when you do not need to filter by that count, the view is a simple select from the source table and not a join.
...
this thing is off the top of my head so take it with the grain of salt, but the main idea is correct, you have to bring up the "child property" up to the level of the parent so that you can filter based on it as if it's a property of the parent
Another potential solution, by @begriffs, using an SQL function to solve a similar problem: https://github.com/PostgREST/postgrest/issues/443#issuecomment-206162267
Now about adding a full_name column to your table, I think you can do it without making a whole view. If you define a function with the right name and argument type then postgresql will let you use it as a computed column. For instance:
CREATE FUNCTION full_name(users) RETURNS varchar LANGUAGE sql STABLE AS $_$ SELECT $1.first_name || ' ' || $1.last_name $_$;Once this function exists the client can use it as if it were a column like this
GET /users?select=full_name,*In fact the fake column is eligible for sorting too:
GET /users?select=full_name,*&order=full_name
Happy to submit a PR to better document this use case.
But before I do, is there any sense which of these two methods — 1) customizing a view, or 2) using a one-off function — is preferred?
Not sure if I understand you correctly. Do you want to filter the parents based on a criteria on the children? Or do you just want to filter the children?
In your first example, the syntax should be:
GET /parent?select=*,children(*)&children.order=length.desc&children.limit=10
More examples can be seen in embedded filters.
If you want to filter parents based on the children count, an option could be to create a computed column like:
-- projects=parent
-- tasks=children
CREATE or replace FUNCTION count_tasks(projects) RETURNS bigint AS $$
SELECT count(*) FROM tasks t where t.project_id = $1.id
$$ LANGUAGE SQL;
Then you can filter by this computed column:
GET /projects?select=name,count_tasks&count_tasks=eq.2
Yes— this was for filtering the parents, a la the computed column.
I'm going to close this issue since there is already documentation about filtering using a computed column, per your link.
Though computed columns are documented, this specific computed column usage for filtering the parents based on a criteria on the children isn't.
It'd be great if you could help us with adding this usage to the docs, or otherwise maybe just leave the issue open so I can have a look at it later :)