Support proper bulk update via PATCH
Environment
- PostgreSQL version: 11
- PostgREST version: 8.0.0
- Operating system: Linux
Description of issue
Schema:
```sql
drop table if exists example;

create table if not exists example (
  pk integer PRIMARY KEY GENERATED BY DEFAULT AS IDENTITY,
  name varchar(40) NOT NULL CHECK (name <> ''),
  value varchar(40) NOT NULL CHECK (value <> '')
);

insert into example (name, value) values ('one', '1');
insert into example (name, value) values ('two', '2');
insert into example (name, value) values ('three', '3');
```
PostgREST usage:
```bash
#!/bin/bash
set -e

data='
[
  {
    "pk": 1,
    "name": "example"
  },
  {
    "pk": 2,
    "name": "second"
  }
]
'

# returns all three
curl \
  -X GET "http://localhost:9000/example"

echo -e ""
echo -e "--"

# null value in column "value" violates not-null constraint
curl \
  -X POST "http://localhost:9000/example" \
  -H "Prefer: resolution=merge-duplicates" \
  -d "${data}"

echo -e ""
echo -e "--"

# null value in column "value" violates not-null constraint
curl \
  -X POST "http://localhost:9000/example?columns=name" \
  -H "Prefer: resolution=merge-duplicates" \
  -d "${data}"
```
A bulk update on a partial set of columns triggers the NOT NULL constraints on the remaining columns that are declared NOT NULL. For now we have relaxed the constraints in our database to work around this, and we looked at doing it via a stored procedure, but it would be nice to have an official suggestion / fix for this.
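For reference, a stored-procedure workaround like the one mentioned could be exposed through PostgREST's RPC interface. A minimal sketch (the function name `bulk_update_example` and its shape are made up for illustration, not an official recommendation):

```sql
-- Hypothetical RPC for a PATCH-like bulk update over the example table.
-- Callable as POST /rpc/bulk_update_example with a {"payload": [...]} body.
create or replace function bulk_update_example(payload json)
returns setof example
language sql
as $$
  update example
  set name = x.name
  from json_populate_recordset(null::example, payload) as x
  where example.pk = x.pk
  returning example.*;
$$;
```

Since the statement is a plain `UPDATE`, only the columns listed in `SET` are touched, so the NOT NULL constraints on the untouched columns are never evaluated.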
The reason is that you are not doing a bulk update at all. You are doing an `INSERT`. I guess you wanted to use `PUT` instead of `POST` to make it an `INSERT ... ON CONFLICT DO UPDATE ...`, but that would still yield the same error. The error happens at the `INSERT` stage, before PostgreSQL ever reaches the `UPDATE` part.

`PUT` is not a proper "bulk update".
Related: #1945.
Thanks for the clarification, makes perfect sense. Is `PATCH` the HTTP verb that should be used for this? Should I close this issue and move the discussion over at the Q&A?
> Is a PATCH the HTTP verb that should be used for this?
`PATCH` is the proper way to do a real `UPDATE`. However, it does not have bulk capabilities.
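For context, a real `UPDATE` through PostgREST today takes a single json object and applies it to every row matched by the querystring filters. A sketch of that usual shape, using the `example` table from the top of this thread:

```
PATCH /example?pk=eq.1
Content-Type: application/json

{ "name": "new name" }
```

There is no way to send per-row values in one request, which is what "bulk" would mean here.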
> Should I close this issue and move the discussion over at the Q&A?
Since we already have a Q&A item for that topic, I'd keep this issue. It highlights why `PUT` is not a full replacement for a bulk update, so we can track this feature request here.
This is currently an idea only. I can see the benefits of being able to do a bulk `UPDATE`. Maybe a query could somehow look like:

```sql
UPDATE table
SET column = pgrst_body.column
FROM (SELECT * FROM json_populate_recordset(NULL::table, $1)) AS pgrst_body
WHERE table.pk_column = pgrst_body.pk_column
```

but I don't see yet what the REST API interface would look like.
Actually... Do we currently support `PATCH`ing with an array body? I don't think so. If not, this could be the non-breaking interface:

- `PATCH` with a json object as body: current `UPDATE`, with the `WHERE` clause determined by querystring filters only.
- `PATCH` with a json array of objects as body: bulk `UPDATE`, with filters taken from PK columns + request body (as suggested above). Tbd: What happens with querystring filters in this case? They could be either ignored or added additionally.
> Actually... Do we currently support PATCHing with an array body? I don't think so.
An array is accepted, but only the first element is taken into account for the `UPDATE`.
There was also another option for the query, with an `UPDATE .. FROM`, mentioned on https://github.com/PostgREST/postgrest/issues/1816 (not sure which query would be best atm).
> There was also another option for the query with an UPDATE..FROM, mentioned on #1816 (Not sure which query would be best atm)
Ah, thanks for linking that other issue, too. I think the queries are basically the same, except for when the json body is parsed. The proposal in #1816 looks like this would be done in haskell, while the one here is doing it on the database side. I think we generally prefer the latter and try to avoid the former, right?
> Bulk UPDATE with filters taken from PK columns
This could work as a first iteration of this feature. If a different payload column is desired as the filter, perhaps we could reuse the `on_conflict` param. Or, like requested on https://github.com/PostgREST/postgrest/issues/2123, perhaps the clearer `on_constraint` param?
Our query already mostly works for this, it's only a matter of when to apply the payload fields as filters.
```sql
WITH
pgrst_payload AS (
  SELECT '[{"id": 4, "name": "xxx"}, {"id": 5, "name": "yyy"}]'::json AS json_data),
pgrst_body AS (
  SELECT CASE WHEN json_typeof(json_data) = 'array' THEN json_data ELSE json_build_array(json_data) END AS val
  FROM pgrst_payload)
UPDATE "test"."projects"
SET "name" = x."name"
FROM (SELECT * FROM json_populate_recordset(null::"test"."projects", (SELECT val FROM pgrst_body))) x
WHERE "test"."projects"."id" = x.id
RETURNING "test"."projects".*;
```

```
 id | name | client_id
----+------+-----------
  4 | xxx  |         2
  5 | yyy  |
```
> Tbd: What happens with querystring filters in this case? Could be either ignored or added additionally.
We could add filters (`id=gt.4`):

```sql
WITH
pgrst_payload AS (
  SELECT '[{"id": 4, "name": "xxx"}, {"id": 5, "name": "yyy"}]'::json AS json_data),
pgrst_body AS (
  SELECT CASE WHEN json_typeof(json_data) = 'array' THEN json_data ELSE json_build_array(json_data) END AS val
  FROM pgrst_payload)
UPDATE "test"."projects"
SET "name" = x."name"
FROM (SELECT * FROM json_populate_recordset(null::"test"."projects", (SELECT val FROM pgrst_body))) x
WHERE "test"."projects"."id" = x.id AND "test"."projects"."id" > 4
RETURNING "test"."projects".*;
```

```
 id | name | client_id
----+------+-----------
  5 | yyy  |
```
But it doesn't help with knowing when to apply the payload filters; we cannot always apply that condition. If we did, it could cause confusing situations like:
```sql
-- the payload body has no id
WITH
pgrst_payload AS (
  SELECT '[{"name": "xxx"}, {"name": "yyy"}]'::json AS json_data),
pgrst_body AS (
  SELECT CASE WHEN json_typeof(json_data) = 'array' THEN json_data ELSE json_build_array(json_data) END AS val
  FROM pgrst_payload)
UPDATE "test"."projects"
SET "name" = x."name"
FROM (SELECT * FROM json_populate_recordset(null::"test"."projects", (SELECT val FROM pgrst_body))) x
WHERE "test"."projects"."id" = x.id AND "test"."projects"."id" > 4
RETURNING "test"."projects".*;
```

```
 id | name | client_id
----+------+-----------
(0 rows)
```

Nothing gets updated.
That would also break existing PATCH requests.
I think we should apply the payload filter whenever there are no query param filters. So:
```
PATCH /projects

[
  {"id": 5, "name": "xxx"},
  {"id": 6, "name": "yyy"}
]
```
Would generate the query with the id filter applied.
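Assuming the query shape shown earlier in this thread, the generated statement for that request could look roughly like this (a sketch, not the exact SQL PostgREST would emit):

```sql
WITH
pgrst_payload AS (
  SELECT '[{"id": 5, "name": "xxx"}, {"id": 6, "name": "yyy"}]'::json AS json_data),
pgrst_body AS (
  SELECT CASE WHEN json_typeof(json_data) = 'array' THEN json_data ELSE json_build_array(json_data) END AS val
  FROM pgrst_payload)
UPDATE "test"."projects"
SET "name" = x."name"
FROM (SELECT * FROM json_populate_recordset(null::"test"."projects", (SELECT val FROM pgrst_body))) x
-- the PK filter comes from the payload, since no querystring filters were given
WHERE "test"."projects"."id" = x.id
RETURNING "test"."projects".*;
```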
This would break the ability to PATCH the whole table in one request (deliberately or accidentally); I'd argue that was always unwanted behavior (ref). So this:

```
PATCH /projects

{"name": "xxx"}
```

will always result in 0 updates done (with no error from the db), and a 404 will be returned, same as doing a `PATCH /projects?id=eq.non-existent` request.
To make that operation possible, we can require that a `limit` is present (taking advantage of https://github.com/PostgREST/postgrest/pull/2211):

```
PATCH /projects?limit=10&order=id

{"name": "xxx"}
```

Only 10 updates will happen.
Another good thing is that a full table UPDATE is an expensive operation (it can take a long time on big tables), so by requiring a `limit` we prevent it from happening.
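PostgreSQL's `UPDATE` has no `LIMIT` clause, so a limited update has to be expressed indirectly. One possible shape, restricting the affected rows through a PK subquery (a sketch only; #2211 defines what PostgREST actually generates):

```sql
UPDATE "test"."projects"
SET "name" = 'xxx'
WHERE "test"."projects"."id" IN (
  -- limit/order from the querystring applied here
  SELECT "id" FROM "test"."projects" ORDER BY "id" LIMIT 10
);
```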
Reopened, now that #2424 is merged.
So this was reopened because of the reasons mentioned on https://github.com/PostgREST/postgrest/pull/2414#issuecomment-1212647377. It basically broke existing functionality. There's an alternate implementation with underscore/dollar operators, but that's a more complicated feature.
~~How about making this simpler and we just do bulk update based on PKs when `Prefer: params=multiple-objects` (only used for RPC now) is specified?~~

~~Later on it could be extended to have `Prefer: params=multiple-objects; unique_keys=id`.~~

~~This syntax on the header also goes in line with the improvements proposed on https://github.com/PostgREST/postgrest/issues/2066#issue-1070332545~~
I believe a better syntax for this is now the one posted on https://github.com/PostgREST/postgrest/issues/465#issuecomment-1566188612.
Reusing the syntax from the discussion on https://github.com/PostgREST/postgrest/issues/465#issuecomment-1722382293, I think this feature should be implemented in the following way.
Having:
```
GET /items?id=lte.3

[
  { "id": 1, "name": "Item 1" }
, { "id": 2, "name": "Item 2" }
, { "id": 3, "name": "Item 3" }
]
```
Doing:
```
PATCH /items?id=lte.3&name=set.$body->name
Prefer: return=representation

[
  { "id": 1, "name": "Updated Item 1" }
, { "id": 2, "name": "Updated Item 2" }
, { "id": 3, "name": "Updated Item 3" }
]
```
Will result in:
```
[
  { "id": 1, "name": "Updated Item 1" }
, { "id": 2, "name": "Updated Item 2" }
, { "id": 3, "name": "Updated Item 3" }
]
```
After this, #465 should be easier to implement too.
@steve-chavez @laurenceisla
Now that #2926 is merged, `POST` returns `200` on an update. So maybe this can be done with `POST`, e.g.:

```
POST /items
Prefer: return=representation, resolution=merge-duplicates

[
  { "id": 1, "name": "Updated Item 1" }
, { "id": 2, "name": "Updated Item 2" }
, { "id": 3, "name": "Updated Item 3" }
]
```
```
HTTP/1.1 200 OK

[
  { "id": 1, "name": "Updated Item 1" }
, { "id": 2, "name": "Updated Item 2" }
, { "id": 3, "name": "Updated Item 3" }
]
```
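On the database side, `Prefer: resolution=merge-duplicates` maps to `INSERT ... ON CONFLICT DO UPDATE`, so for the `items` table the statement would be roughly this (a sketch, assuming `id` is the primary key):

```sql
INSERT INTO items (id, name)
SELECT id, name
FROM json_populate_recordset(null::items,
  '[{"id": 1, "name": "Updated Item 1"},
    {"id": 2, "name": "Updated Item 2"},
    {"id": 3, "name": "Updated Item 3"}]')
ON CONFLICT (id) DO UPDATE
SET name = EXCLUDED.name
RETURNING items.*;
```

Note that, as pointed out at the top of this thread, the `INSERT` stage still evaluates NOT NULL constraints on columns missing from the payload, so this only behaves like a bulk update when the payload (or `?columns=`) covers all required columns.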