pg_net
Change table queue to in-memory queue and add callbacks
Reasons
- For all the requests to finish, an INSERT on the _http_response table must be done, which reduces throughput. There are cases where the client doesn't care about the response, so it doesn't need to be persisted. For example, function hooks don't do anything with the response.
- As discussed in https://github.com/supabase/pg_net/pull/50#issuecomment-953796674, we don't have to ensure delivery and do retries since we're just an HTTP client, so we don't really need to persist the requests into http_request_queue.
- The http_request_queue can grow big with many requests and cause production issues (internal link).
- Users create FKs to our private tables (https://github.com/supabase/pg_net/issues/44), making the bg worker crash when trying to do the TTL.
Proposal
Drop the _http_response and http_request_queue tables and instead use an in-memory queue, plus add two callbacks:
create or replace function net.http_get(
url text,
-- ..,
on_success text default '',
on_error text default ''
)
returns void -- no id is returned now
as $$
-- ...
$$ language sql;
Which can be used like:
select net.http_get(
url := 'http://domain.com'
, on_success := $$ insert into my_table values ($1, $2, $3) $$ -- $1=status, $2=headers, $3=body
, on_error := $$ do $_$begin raise exception 'Failed request on %: "%"', $1, $2; end$_$; $$ -- $1=url, $2=error_message
);
Pros
- The callbacks are optional, so we don't have to start a transaction for each request when none are provided. Throughput should be greatly increased as a result.
- The error_cb can also be an insert/update on a table, so the request can be retried if needed; a sketch of that is below.
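For instance, assuming a placeholder failed_requests table that a retry job or trigger could consume later:

select net.http_get(
url := 'http://domain.com/webhook'
, on_error := $$ insert into failed_requests (url, error_message) values ($1, $2) $$ -- $1=url, $2=error_message
);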
@olirice @soedirgo WDYT?
Ohh, so that's what you meant by callbacks. Nice! Also like that this makes the impl a lot simpler and more akin to typical HTTP clients.
How would you go about implementing the in-memory queue? IIRC when I looked into it, passing info from the client process to the bgworker process was nontrivial.
that looks great!
Is it correct that all http status codes (e.g. including 500) would still go through the success_cb and the error_cb applies to stuff like timeouts?
returns void -- no id is returned now
I think it would still be good to return a uuid and pass that through to the callbacks, i.e.
select net.http_get(
...
, success_cb := $$ -- $1=status, $2=headers, $3=body, $4=request_id $$
, error_cb := $$ -- $1=url, $2=error_message, $3=request_id $$
); -- result is uuid (request_id)
so if people are firing off a large number of requests, or want to associate a request with a response, they'll have the tools they need.
I know pg_cron does a similar inline sql approach but afaik they don't have to deal with passing parameters. Are we sure there's a good way to do that?
If not, a backup solution might be:
create or replace function net.http_get(
-- ..,
success_cb regproc default null,
error_cb regproc default null
)
...as $$
-- ...
$$ language sql;
where the function signature matches the callback requirements
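As a rough sketch of how that variant could be used (the callback function and my_table are made up for illustration):

create function my_success_cb(status int, headers jsonb, body text)
returns void
language plpgsql
as $$
begin
  insert into my_table values (status, headers, body);
end;
$$;

select net.http_get(
url := 'http://domain.com'
, success_cb := 'my_success_cb'::regproc
);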
I know pg_cron does a similar inline sql approach but afaik they don't have to deal with passing parameters. Are we sure there's a good way to do that?
Should be possible with the USING clause, though we should probably document that the callbacks are limited to plpgsql command strings.
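Roughly, running a callback string with USING would look like the plpgsql below (the real worker would do the equivalent through SPI in C; my_table and the bound values are just placeholders):

do $_$
declare
  success_cb text := $$ insert into my_table values ($1, $2, $3) $$;
begin
  -- $1, $2, $3 inside the command string are bound positionally by USING
  execute success_cb using 200, '{"Content-Type": "text/html"}'::jsonb, '<html>ok</html>';
end
$_$;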
How would you go about implementing the in-memory queue?
There's an example in postgres/src/test/modules/test_shm_mq for how to use the shared memory queue.
Is it correct that all http status codes (e.g. including 500) would still go through the success_cb and the error_cb applies to stuff like timeouts?
Yes, correct. Things like "unable to resolve host" as well.
so if people are firing off a large number of requests, or want to associate a request with a response, they'll have the tools they need.
I think we should leave that to be done externally; users can pass an id on the callback like:
select net.http_get(
-- url, headers, body..
, success_cb := format($$insert into responses values(%s, $1, $2, $3)$$, id)
);
And this id can come from a trigger on a table or maybe from a sequence used in a wrapper function. With that they can still associate a request/response if they want.
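For example, a wrapper along those lines could look like this (the sequence, the responses table, and the function name are only illustrative, assuming the proposed void-returning net.http_get):

create sequence request_id_seq;

create function my_http_get(target_url text) returns bigint
language plpgsql
as $fn$
declare
  id bigint := nextval('request_id_seq');
begin
  perform net.http_get(
    url := target_url,
    success_cb := format($$ insert into responses values (%s, $1, $2, $3) $$, id)
  );
  return id;
end;
$fn$;

The caller gets the id back and can later join it against responses.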
I know pg_cron does a similar inline sql approach but afaik they don't have to deal with passing parameters. Are we sure there's a good way to do that?
Yes, SPI_execute_with_args should take care of that.
I've also thought about supporting something like insert into tbl($status, $headers, $body) for better DX, but that would require parsing and we'd lose perf. So I've just kept it simple at this stage.
Should be possible with the USING clause
cool! didn't know about that
I think we should leave that to be done externally; users can pass an id on the callback like:
select net.http_get(
-- url, headers, body..
, success_cb := format($$insert into responses values(%s, $1, $2, $3)$$, id)
);
And this id can come from a trigger on a table or maybe from a sequence used in a wrapper function. With that they can still associate a request/response if they want.
ah nice, that works
or maybe from a sequence used in a wrapper function. With that they can still associate a request/response if they want.
good point
I'm thinking that moving to an in-memory queue would allow for http requests during postgrest GET requests as well?
During GET requests, postgrest uses a read-only transaction, during which INSERTs and other modifications are not allowed. I'm currently having trouble making http requests as part of my policy for performing SELECTs on some tables due to this.
I'm thinking that moving to an in-memory queue would allow for http requests during postgrest GET requests as well?
Yes, since an http_get wouldn't need an INSERT on the net.http_request_queue anymore.
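So, for example, something like this currently fails because of the INSERT into net.http_request_queue, but would work once the queue is in memory (the URL is just a placeholder):

begin transaction read only;
select net.http_get(url := 'http://domain.com/notify');
commit;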
We would lose observability if we go this route; currently customers can do:
-- requests with fail/success HTTP status codes
select count(*), status_code from net._http_response group by status_code;
 count | status_code
-------+-------------
   448 |         429
   821 |         500
  1182 |         408
  2567 |         200
-- rate limiting error
select content from net._http_response where status_code = 429 limit 1;
content
------------------------------------------------------------
<html><body><h1>429 Too Many Requests</h1> +
You have sent too many requests in a given amount of time.+
</body></html>
Which is great for debugging and finding out why a webhook receiver might be failing.
To increase throughput, we could make the net tables UNLOGGED and remove indexes and PKs from them (this would also solve https://github.com/supabase/pg_net/issues/44).
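For instance (a sketch; the exact index/constraint names would have to be checked against the actual schema):

alter table net._http_response set unlogged;
alter table net.http_request_queue set unlogged;
-- plus dropping the PKs/indexes on both tables to avoid the extra index writes per request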
An alternative could be logging the failed http requests. Then the user would have to search the pg logs.
+1 for just logging errors; in general, making it as stateless as possible should help with stability.
Yeah, agree. Also forgot about https://github.com/supabase/pg_net/issues/49, observability would be better with logging.
Any plans to implement this new feature in the near future?