[RFC] @graphile/lds2
Feature description
The new package @graphile/lds2 is in essence a copy of the current implementation with two changes (1 & 2) and one addition (3):
-
Minimal required version of wal2json is at this time of writing 2.3, due to requiring the
include-pkparameter to be available. This is required for@graphile/subscriptions-ldsto properly work in cases whereREPLICA IDENTITY FULLis enabled on a table. -
Dropping support for
version-format 1, and moving toversion-format 2which no longer reports changes on a transaction level, but on a tuple level. Additional bonus is that in my opinion the json object is structured more like what the industry standard is today. -
Adding watch capability much to what I believe
graphile-buildhas for building it schema. Currently@graphile/ldsis not able to actively listen to schema changes, adding this would allow us to circumvent the need for theinclude-pkparameter and as such also keep supportingversion-format 1and wal2json <= 2.3, yet this is not the main goal of this addition.I believe the most useful purpose of adding a watch capability is to be able to support subscriptions on native pg >=11 partitioned tables and timescale hypertables. These table structures are providing a
virtualtable that one can doinsert,update,delete,selectupon, but in actuality these operations are redirected to one or more sub tables (partitions). wal2json does NOT report changes on this (virtual) parent table level, but on the actual subtable that the change occurred. This makes it impossible to subscribe to these type of tables with the current implementation of@graphile/lds. I am aware of the fact that native pg >= 11 partitioned tables are NOT very well supported inpostgraphileatm (Only the partitions are showing up in the schema, not the virtual table), but due to the different technical implementation of timescale hypertables these are actually coming out in the generated schema just fine (No partitions are shown, just the parent table).Having this watch capability allows us to figure out that when a change occurs in one of these sub tables, we also need to fabricate a same change for the (
virtual) parent table.
Motivating example
Motivation for this alternative version is twofold, being that the the current implementation of @graphile/lds:
-
Assumes that the primary key information of a change is always included in the property
oldkeysif the propertyoldkeysis included. This is NOT always the case due to the possibilty of enablingREPLICA IDENTITY FULLon a table which changes the contents ofoldkeysfrom not only containing the primary key(s), but to containing the entire record as it was before said change (much like theOLDvalue in update/delete triggers). As such, enabling this feature on a table breaks the current implementation of@graphile/lds.Fixing the above issue requires the use of a parameter
include-pkwhich is only available in version >=2.3 of wal2json, and as of right now we haven't been able to find a method to detect which version of wal2json is installed. The only way seems to be that oftry {} catch {}the code with the parameter and falling back on one without if it fails with a parameter unknown error. Problem I have with this approach is that we'd have to resort to string matching to determine the cause, which feels a bit too iffy to me. -
Is built on
version-format 1of wal2json which produces a JSON object per transaction. All of the new/old tuples are available in the JSON object. Whereas the newerversion-format 2produces a JSON object per tuple, supports thetruncateoperation and provides in my opinion a more logical constructed object.Below are both formats as defined in
typescriptdefinitions, these are both the default format which can be augmented with additional metadata based on parameters. One of which is the much neededinclude-pkwhich is only available in wal2json >= 2.3format-version 1:export interface Change { kind: Kind; schema: string; table: string; columnnames?: string[]; columntypes?: string[]; columnvalues?: Columnvalue[]; oldkeys?: Oldkeys; } enum Kind { Insert = 'insert', Update = 'update', Delete = 'delete' } export type Columnvalue = any; export interface Oldkeys { keynames: string[]; keytypes: string[]; keyvalues: Keyvalue[]; } export type Keyvalue = string;format-version 2:export interface Change { action: Action; schema: string; table: string; columns?: Column[]; identity?: Column[]; } enum Action { Insert = 'I', Update = 'U', Delete = 'D', Truncate = 'T' } export interface Column { name: string; type: string; value: Value; } export type Value = any;
As we discussed in Discord fixing/adding the above 2 points would result in such breaking changes that are whole new package is warranted. This new package @graphile/lds2 would have more strict requirements on which version of wal2json is used.
Breaking changes
None, because it is a new package. Yet, in order to make this package useful an update to @graphile/subscription-lds or alternative package that does the same but using @graphile/lds2 is required.
Supporting development
I [tick all that apply]:
- [x] am interested in building this feature myself
- [x] am interested in collaborating on building this feature
- [x] am willing to help testing this feature before it's released
- [x] am willing to write a test-driven test suite for this feature (before it exists)
- [ ] am a Graphile sponsor ❤️
- [ ] have an active support or consultancy contract with Graphile
I guess I'll get this party starting with an immediate comment on my own proposal. Having read everything meticulously again, I think there might be a way to prevent breaking changes by implementing point nr. 3 (watch capability) and using the additional schema information to retrieve the pk in case of REPLICA IDENTITY FULL.
I do have to say that I am not familiar with the technical implementation of this watching, so I am not sure how much (duplicate) effort it would be to incorporate this in either @graphile/lds or @graphile/lds2
Excellent outline :raised_hands:
Most of this week (except Thursday) is dedicated to OSS from me, so ping me on Discord if you want to chat about this in real time so we can try and make some headway. I'm on UK time (UTC+1).
Questions
Q1: Do we need include-pk?
I think you're saying that when you have REPLICA IDENTITY FULL, the entire row (rather than just the PKs) comes through via oldkeys. We could presumably detect this by seeing that the length of the data is !== primaryKeys.length and === columns.length and automatically pick the relevant values for the PKs. If this isn't the case, please could you include copies of the relevant payloads to make the differences clearer.
If we can fix this particular issue in @graphile/lds via this means, we should (since it's a bug), so perhaps we can accomplish this via a standalone PR to reduce the difference LDS2 introduces?
Q2: What specifically do you think we need to monitor for in watch mode?
The list of columns/PK obviously, but not sure what else?
We could/should add watch mode support to @graphile/lds anyway, so perhaps we can accomplish this via a standalone PR to reduce the difference LDS2 introduces?
Q3: What's involved in adding virtual table support; can we add this to @graphile/lds?
If we can add support in @graphile/lds in a non-breaking way, this seems optimal. Would be nice to extend this to PostGraphile core too; I think the challenge here is that the introspection query will need different versions for different versions of PostgreSQL.
For PostgreSQL native partitions, I think step1 would be adding native partition support to PostGraphile.
Other comments
C1: Platform support
It would be nice to have a table of wal2json support across the various cloud vendors: Amazon RDS, Google Cloud SQL, etc - what versions of Postgres does it support, what versions of wal2json within those does it support, etc. I'm not keen to require a version of wal2json that few cloud providers support, but if support now exists (even in only the latest PostgreSQL versions), I'm definitely open to considering it.
C2: wal2json version detection
Someone should reach out and find if there's a better approach to detecting the wal2json version. This would be optimal, versus having two modules.
Separately, I'm also uncomfortable with matching error message strings, but if they have specific error codes then this hesitation goes away. Is this something you could research?
C3: testing
It'd be really handy if we could test different combinations of wal2json and PostgreSQL, to see what comes out. This seems like a perfect use case for Docker; are you aware of a docker PostgreSQL-wal2json matrix we can leverage here, or do we need to roll our own?
C4: new message format
I definitely see the appeal. I'd be keen to address as many issues as we can in @graphile/lds before creating an official lds2 module; in an ideal world there'd be just one module and we'd auto-switch; the idea is that the LDS module abstracts the PostgreSQL internals and only tells us what we care about so it should be the right abstraction point for doing this.
Link to the previous Discord discussion:
https://discord.com/channels/489127045289476126/489717163280826374/756222484172243054
Excellent outline 🙌
Most of this week (except Thursday) is dedicated to OSS from me, so ping me on Discord if you want to chat about this in real time so we can try and make some headway. I'm on UK time (UTC+1).
Thanks! I'll try and see if I can make some time this week.
C2: wal2json version detection Someone should reach out and find if there's a better approach to detecting the wal2json version. This would be optimal, versus having two modules.
Separately, I'm also uncomfortable with matching error message strings, but if they have specific error codes then this hesitation goes away. Is this something you could research?
I already opened up an issue in the wal2json repo, but unfortunately there's no mechanism in place to detect versions from SQL. See https://github.com/eulerto/wal2json/issues/181
@eulerto Just pinging you to this discussion, if you have any useful ideas on how to tackle detecting the version of wal2json that'd be very helpful!
Q1: Do we need include-pk? I think you're saying that when you have REPLICA IDENTITY FULL, the entire row (rather than just the PKs) comes through via oldkeys. We could presumably detect this by seeing that the length of the data is !== primaryKeys.length and === columns.length and automatically pick the relevant values for the PKs. If this isn't the case, please could you include copies of the relevant payloads to make the differences clearer.
If we can fix this particular issue in @graphile/lds via this means, we should (since it's a bug), so perhaps we can accomplish this via a standalone PR to reduce the difference LDS2 introduces?
I think we can easily solve this using said watch capability in combination with the logic you're describing to detect that there's more in oldkeys than just the primary keys.
Q2: What specifically do you think we need to monitor for in watch mode? The list of columns/PK obviously, but not sure what else?
We could/should add watch mode support to @graphile/lds anyway, so perhaps we can accomplish this via a standalone PR to reduce the difference LDS2 introduces?
I would also argue that we make sure that we watch the relations of any partitioned tables, or other constructs (timescale) that look like it. This will allow us to correctly announce for the parent table as well for any child ones. I built a small fix for timescale once to make sure that a change in a chunk of a table would also trigger the table itself:
- hypertable A
- chunk A
- chunk B <-- change occurs here, then announcement references table chunk A, we also need to announce for hypertable A.
- chunk C
Above fix was quite trivial, because I could just register another callback on the system table that was keeping track of this relation between hypertables and their chunks. So on startup of the plugin I would gather all links, and afterwards place a callback to listen to any changes on this system table so I would stay up to date:
this.subscriptions = {
'["_timescaledb_catalog", "hypertable"]': [
[updateHyperTableChunks, undefined],
],
};
Q3: What's involved in adding virtual table support; can we add this to @graphile/lds? If we can add support in @graphile/lds in a non-breaking way, this seems optimal. Would be nice to extend this to PostGraphile core too; I think the challenge here is that the introspection query will need different versions for different versions of PostgreSQL.
For PostgreSQL native partitions, I think step1 would be adding native partition support to PostGraphile.
I believe it could be as simple as this:
this.subscriptions = {
'["pg_catalog", "pg_inherits"]': [
[updateTablePartitions, undefined],
],
};
The table pg_catalog.pg_inherits contains all information that we need.inhrelid is the column that contains the actual partition table name, and inhparent contains the schema that you could use in the predicate to filter more strictly.
Regarding Q3; are you suggesting that pg_catalog and pg_inherits are streamed via logical decoding too, and that we could subscribe to this? That's pretty cool! What I meant, though, is we should support these partitioned/hyper tables via PgTablesPlugin and the like.
Regarding Q3; are you suggesting that pg_catalog and pg_inherits are streamed via logical decoding too, and that we could subscribe to this? That's pretty cool! What I meant, though, is we should support these partitioned/hyper tables via PgTablesPlugin and the like.
Hmm, I just tried this locally but I couldn't listen on any pg_catalog.* table. It probably only works because the way timescale does it isn't regarded as being a system table.
@wesselvdv To keep this moving forward; can I suggest that you raise a PR just to solve the problem of Question 1; i.e. look at the length of oldKeys and use that to determine which fields are the PK.
In general, I'm thinking that we could/should use a flag rather than a separate version in order to support different wal2json features. This is slightly complicated by LDS being independent of Graphile Engine, but I think we can still achieve it :+1:
@benjie I agree, what's the best approach for incorporating the watch capabilities? I have yet to take a look at graphile-build to see how it's incorporated.
The LDS module is completely standalone to graphile-build (you can use it without caring about GraphQL at all), so it would have to implement its own watch mode. Rather than having watch mode, I think I would do just-in-time introspection of the tables. So when the first request comes through for a table, quickly look up it's primary keys, and cache it. Then all the changeToPk calls would pass through the details of the table as a second argument. A very simple watch mode would simply not cache this data at all ;)
@benji Wouldn't it be more useful to register events for ddl changes, and store these changes in a central audit table. Said audit table can then be listened upon using the in place callback system.
I just tested the below, seems to work nicely. I'll try to incorporate timescale hypertable detection in this aswel.
create schema graphile_lds_watch;
CREATE TABLE graphile_lds_watch.tables
(
"table" regclass primary key NOT NULL,
primary_keys TEXT[] NOT NULL DEFAULT '{}'::TEXT[],
parent REGCLASS NULL,
last_updated TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE OR REPLACE FUNCTION graphile_lds_watch.track_primary_keys()
RETURNS event_trigger AS
$$
DECLARE
r RECORD;
primary_keys TEXT[];
parent regclass;
pg_version numeric = current_setting('server_version_num');
BEGIN
FOR r IN SELECT * FROM pg_event_trigger_ddl_commands()
LOOP
IF r.command_tag = 'ALTER TABLE' OR r.command_tag = 'CREATE TABLE' then
primary_keys := (
SELECT array_agg(a.attname)
FROM pg_index i
JOIN pg_attribute a ON a.attrelid = i.indrelid
AND a.attnum = ANY (i.indkey)
WHERE i.indisprimary
AND i.indrelid = r.objid
);
IF pg_version >= 10000 THEN
parent := (
SELECT inhparent::regclass
FROM pg_catalog.pg_inherits
WHERE inhrelid = r.objid
);
end if;
INSERT
INTO graphile_lds_watch.tables("table", primary_keys, parent, last_updated)
VALUES (r.object_identity, coalesce(primary_keys, '{}'::TEXT[]), parent,
statement_timestamp())
ON CONFLICT ("table")
DO UPDATE SET primary_keys = excluded.primary_keys,
parent = excluded.parent;
end if;
END LOOP;
END ;
$$ LANGUAGE plpgsql;
CREATE EVENT TRIGGER track_ddl ON ddl_command_end
EXECUTE PROCEDURE graphile_lds_watch.track_primary_keys();
CREATE EVENT TRIGGER track_ddl_drops ON sql_drop
EXECUTE PROCEDURE graphile_lds_watch.track_primary_keys();
DROP TABLE IF EXISTS attributes;
CREATE TABLE IF NOT EXISTS attributes
(
id serial NOT NULL,
date TIMESTAMPTZ DEFAULT 'infinity'::TIMESTAMPTZ NOT NULL,
PRIMARY KEY (id, date)
);
ALTER TABLE attributes
DROP CONSTRAINT attributes_pkey;
ALTER TABLE attributes
ADD CONSTRAINT attributes_pkey_two PRIMARY KEY (id);
DROP TABLE IF EXISTS measurement;
CREATE TABLE IF NOT EXISTS measurement
(
city_id int not null,
logdate date not null,
peaktemp int,
unitsales int
) PARTITION BY RANGE (logdate);
DROP TABLE IF EXISTS measurement_y2006m02;
CREATE TABLE IF NOT EXISTS measurement_y2006m02 PARTITION OF measurement
FOR VALUES FROM ('2006-02-01') TO ('2006-03-01');
DROP TABLE IF EXISTS measurement_y2006m03;
CREATE TABLE IF NOT EXISTS measurement_y2006m03 PARTITION OF measurement
FOR VALUES FROM ('2006-03-01') TO ('2006-04-01');
DROP TABLE IF EXISTS measurement_y2007m11;
CREATE TABLE IF NOT EXISTS measurement_y2007m11 PARTITION OF measurement
FOR VALUES FROM ('2007-11-01') TO ('2007-12-01');
SELECT * FROM graphile_lds_watch.tables;
I'm not sure I see the value of mutating the database on every mutation in the database to be able to track mutations in the database; could you expand? In PostGraphile the event triggers come through to Node which then checks the latest state of the DB; I'd expect something similar to work well here?
I'm not super comfortable with any tool mutating the database, there has to be a really good reason to do it. Event triggers in particular are frustrating because you can't easily omit them from pg_dumps so they pollute user's schema dumps.
@wesselvdv Are you still interested in working on this enhancement?
I'm looking to make some optimizations to the live-query system (as described here), and this enhancement appears to be a necessary step for entries 1 and 3 in that thread.
Specifically, they rely on having the full data of the changed row (for efficient matching against active live-queries), which requires REPLICA IDENTITY FULL to be enabled, which you've stated is incompatible with Postgraphile without this thread's proposed changes. (the main change being the upped version-requirement of wal2json, to support the include-pk parameter, which subscription-lds needs to be compatible with the full-replica flag, I gather)