pg_graphql
pg_graphql copied to clipboard
Feature Request: Subscriptions / Live Queries
Feature request
Is your feature request related to a problem? Please describe.
real-time all the things!
Describe the solution you'd like
Describe alternatives you've considered
Additional context
Subscription support is likely a ways off since postgres has no websocket support to piggyback on, but we do plan to explore it
We have some ideas from supabase/realtime to try when the time comes
SSE would be better than websockets I think.
This issue is stale because it has been open for 30 days with no activity.
This issue was closed because it has been inactive for 14 days since being marked as stale.
This issue was closed because it has been inactive for 14 days since being marked as stale.
This issue was closed because it has been inactive for 14 days since being marked as stale.
Subscriptions would be a great feature. One use case would be to have an "easy and performant" offline support solution for people who use supabase. With the help of rxdb and its GraphQL replication subscriptions option, avoiding the more costly option of pull-replication. The extension would have to resolve possible conflicts by initially implementing a simple updatedAt date comparison. Maybe, at a latter date, with the option to use a user defined function to solve conflicts.
This can be kinda achieved by using custom rest server which polls stored persistent queries as mentioned in #189 every few seconds and generates server-side-events or fetch response everytime data changes. This is actually very efficient since single polling loop can satisfy n subscriptions.
Interactions between queries and role + row level security make subscriptions difficult to implement and cache securely
supabase/realtime uses supabase/walrus to solve those interactions efficiently but it is tightly coupled. I'm in the process of refactoring walrus
to be re-usable by pg_graphql
. Things are headed in the right direction but it'll still be a while before subscriptions are supported by the platform
Could you provide an update on the status of the work on walrus and adding it to pg_graphql? Some kind of time horizon when you might provide subscriptions for pg_graphql?
sure thing,
The supabase/walrus work is on the linked branch. Currently supabase/realtime does a kind-of smart polling thing to query Postgres for new WAL records. The walrus worker branch extracts the DB polling and security stuff out of realtime and implements a background worker that listens for new WAL records on a logical replication slot (much faster & lighter). The filtering logic and column security logic was ported to rust where possible
That logic will start getting rolled out in 2-6 weeks once the realtime team finishes their transition from single to multi-tenant.
The worker was designed be reusable by pg_graphql. The output of the walrus worker is generic and we can plug different transport layers on top. We'll either write another transport layer and/or web socket sever to expose the functionality depending on how thing shake out.
A major reworking of pg_graphql is also about to drop. We ported it to use the pgx framework (rust). There's one failing test remaining and then they'll be a PR.
Psyched to get ^ out so we can get back to features!
Once that lands the feature priorities loosely are:
- add a global Id of some kind (easy)
- add a select 1 record by global id interface for each table
accountById(globalId: ID)
(easy) - (maybe) composite support
- (maybe) admin interface i.e. create tables/columns through GraphQL which are immediately reflected
- (maybe) an integration or two e.g. OneGraph
- an optional webserver to remove the dependency on PostgREST for exposing over HTTP
- connect background worker to ^ and expose subscriptions through webserver
So basically there's a ton going on and things are starting to align but it isn't imminent. Sometime next year is my best guess but I'd struggle to narrow it down past that.
@olirice lmk if you want to chat about this. You could implement the same logic as Logflare Endpoints.
Basically, subscribe to a query with an interval and and idle timeout. Where the interval would be the time between pushing results to clients, and the idle timeout would be when the process dies after the last request to said query.
So a client would connect, and listen to results of a query which would get updated every, say 5 seconds, and then the process would continue polling and caching these results for, say 5 minutes, so that if someone reconnected they'd get results very quickly again.
any updates? curious how it's going.
@aaronbond its still a ways out for now
we're rapidly getting to the point where request/response oriented pg_graphql core functionality is complete. That doesn't mean expanding the schema will stop, just that everything needed for performant applications is available. After that, realtime will get more attention.
A lot of thought is going into how to get subscriptions right for pg_graphql. Here's a summary of the ideas that have been floating around:
There are 2 options for "listening" for changes
WAL (write ahead log)
The postgres write ahead log is a stream of changes to the database that you can listen to. It is intended creating replica databases, for reporting etc. This is how supabase/realtime works. At face value, it seems perfect but there are a bunch of challenges with this method.
-
Security pg_graphql leans on postgres to apply role and row level security. Data is passed over the WAL in full, so we don't have a convenient way to detect who has access to what. You can (mostly) solve that for insert/update by querying back into the database "as the subscriber", which is what supabase/walrus does, but its a serious performance bottleneck, doesn't work for deletes, and is not transactionally consistent.
-
Unchanged TOAST TOAST is a postgres internals thing where large values are not stored directly in line with the source tuple. Since values in TOAST are large and inefficient to transmit of the network, part of the WAL spec is to omit TOASTed value is their value has not changed. There is no good solution to retrieve these values without querying back into the database for each row
Triggers
Option 2 is to develop a highly generic rust trigger that can apply security in the same transaction as the insert/update/delete. That would work for TOASTed values, and could send a message to some external process.
The risks with this approach are
-
Security Since RLS would be applied within the same transaction as the insert/update/delete, it needs to be hella performant or your transaction throughput will decrease as the number of subscribers increases.
-
Consistency If we're operating inside a trigger, we need to release the transaction ASAP. If the external process we're communicating with is temporarily unavailable, the choices are to hold the current transaction until it becomes available again (a bad option), or drop the message and continue (another bad option)
I think a workaround to that issue might be to write messages to a table in the graphql
schema and have the external process read from that table. That would always be available (no networking issues) and have ACID guarantees. I also like that option because it keeps everything contained within postgres and feels aligned with pg_graphql philosophically. The potential downside is that its a source of write amplification.
I have a good understanding of the WAL approach, but I'd like to explore the trigger option (specifically around performance) before making a decision.
next up we have
Transport
Neither Postgres or PostgREST have a solution for web sockets or server push so there's no way around having a separate process running to hold persistent connections.
That process could be truly standalone, or it could be a postgres background worker . Background workers sound great in theory but they're fiddly to deploy/debug and require the database to restart to upgrade (no more transactional upgrades).
For supabase to adopt either of those approaches the memory footprint would have to be small enough to comfortably fit on free-tier hardware with a high degree of scalability (at least hundreds of concurrent subscribers).
Lots to explore and think about. We'll get there but it going to take a while to get it right
@olirice May be following articles will give you some insight
https://neon.tech/blog/serverless-driver-for-postgres https://neon.tech/blog/quicker-serverless-postgres
@olirice is there any progress on this or is this likely not to happen?
@rlee1990 if this is something you need I'd recommend checking out realtime or an externally hosted GraphQL option
There are several paths forward for this feature but all of them have trade-offs and/or are heavy lifts. Since there are other features we still need for pg_graphql, this one is probably continue to get put on the back burner until the easier wins are all complete
The issue with realtime is that we are pulling multiple tables in at once in graphql but can't in real-time that I know of. If we could listen to views in real-time or some that would help but thanks for the update.