
Table inheritance / interfaces

Open EyMaddis opened this issue 7 years ago • 58 comments

Is there a way to model interfaces using PostGraphQL?

I guess this would require mapping some metadata to a table, which could be a problem.


See potential solution: https://github.com/graphile/postgraphile/issues/387#issuecomment-567097800

EyMaddis avatar Mar 07 '17 10:03 EyMaddis

Not currently, but we could do something with Postgres table inheritance. What is your use case?

calebmer avatar Mar 07 '17 16:03 calebmer

I think the search example from this post is quite good: https://medium.com/the-graphqlhub/graphql-tour-interfaces-and-unions-7dd5be35de0d#.c4hzu9enw

As I understand it right now, I would need to either write a Postgres function that returns a specific table type, or declare a custom type that only contains the properties for that specific result set. The latter would lead to a lot of unnecessary types and does not allow the beautiful ... on MyType {... syntax in GraphQL, or am I wrong?

Table inheritance might be an option, but in some cases (like the search example) it would lead to tables that are created but never contain any data, yet still generate GraphQL queries/mutations.

Maybe the generation could be avoided by placing the "abstract" tables into a schema that is not accessible to PostGraphQL.

EyMaddis avatar Mar 07 '17 17:03 EyMaddis

For the tables you inherit from, you'd want to generate queries but not mutations. So just GRANT SELECT, but not INSERT, UPDATE, or DELETE 😉
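For illustration, a minimal sketch of that idea (the schema, table, and role names are made up, and whether mutations are actually omitted depends on the grants being respected):

-- read-only access to the "abstract" parent table
GRANT USAGE ON SCHEMA my_schema TO graphql_role;
GRANT SELECT ON my_schema.parent_table TO graphql_role;
-- no INSERT/UPDATE/DELETE grants, so no mutations for this table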

If we were to do this we would need to find a way to represent a union or an interface in Postgres. I’m not sure how that would be done.

The way I generally design my schema is I have a single table which has a type enum and then has some fields which are specific to that type. So if an option for type on a search table were person then I may also have a column called person_id.
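That pattern might look roughly like this (a sketch; the search table, the enum, and the referenced person/post tables are all illustrative):

create type search_result_type as enum ('person', 'post');

create table search (
  id serial primary key,
  type search_result_type not null,
  -- exactly one of these should be set, matching "type"
  person_id int references person (id),
  post_id int references post (id),
  check (
    (type = 'person' and person_id is not null and post_id is null) or
    (type = 'post' and post_id is not null and person_id is null)
  )
);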

calebmer avatar Mar 07 '17 22:03 calebmer

Maybe something like:

create type postgraphql_union__search_result as (
  "user" "user",
  post post,
  comment comment,
  book book,
  topic topic,
  ...
); -- Exactly one of these should be not-null at a time

create function search_results() returns setof postgraphql_union__search_result ...

?

benjie avatar Mar 08 '17 12:03 benjie

I think that with this approach the union should make clear which types it combines. Suppose you have two tables, Person and Organization; a union should look something like this: postgraphql_union__Person__Organization.

EyMaddis avatar Mar 08 '17 15:03 EyMaddis

I think @benjie’s proposal would be a fine thing to explore if anyone is interested in building out the implementation. That was what I imagined unions would probably have to look like.

calebmer avatar Mar 08 '17 15:03 calebmer

@EyMaddis The reason I didn't do that is that I wanted you to be able to extend the union later with the minimum of effort.

benjie avatar Mar 08 '17 15:03 benjie

But how can PostGraphQL know which types the union combines?

EyMaddis avatar Mar 08 '17 15:03 EyMaddis

@EyMaddis we can introspect the attributes on the type 😊

Every attribute represents a different type in the union. PostGraphQL would then only resolve the first non-null attribute it finds expecting all of the other attributes to be null.

calebmer avatar Mar 08 '17 16:03 calebmer

@calebmer ah, alright, got it. I did not pay close attention to the types of the attributes.

I think I like that, but it would be kind of awkward when writing the functions, wouldn't it? Probably something like a switch-case where the type determines which attribute to put the data in.

Something similar could be applied to interfaces, but PostGraphQL could let tables "inherit" an interface if the properties defined in the type match the columns. Not saying that this is the best idea, just a possibility.
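For what it's worth, writing such a function might look roughly like this (a sketch assuming a reduced two-member postgraphql_union__search_result containing only "user" and post, plus hypothetical name/title columns to search on):

create function search_results(term text)
returns setof postgraphql_union__search_result as $$
  -- fill exactly one attribute per row, leaving the other null
  select u as "user", null::post as post
    from "user" u
   where u.name ilike '%' || term || '%'
  union all
  select null::"user", p
    from post p
   where p.title ilike '%' || term || '%';
$$ language sql stable;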

EyMaddis avatar Mar 08 '17 16:03 EyMaddis

I can currently somehow achieve what I want through inheritance into another schema. There are two schemas, one for internal functions and my interface tables and my public schema that PGQL has access to.

I created a timestamped interface that has info like created_date, and a lot of tables inherit from that interface (there are also triggers for these tables). Because the timestamped table is in my internal schema, PGQL will not generate the default mutations/queries for it, yet all my tables still inherit its columns. The only missing part is that PGQL checks for inheritance and recognizes timestamped as an interface.

Edit: by the way, multi-inheritance is possible

EyMaddis avatar Apr 07 '17 12:04 EyMaddis

Could you put together a quick demo so we can see what this looks like in practice? This is an interesting idea, using table inheritance with an empty table.

calebmer avatar Apr 10 '17 18:04 calebmer

@calebmer what do you require exactly? Some SQL commands to set the tables as described above?

CREATE SCHEMA interfaces; -- no access for PostGraphQL
CREATE SCHEMA postgraphql; -- access granted to PostGraphQL


CREATE TABLE interfaces.timestamped (
    -- no id, because it would increase the serial with every document created by a child
    "created_date" date DEFAULT now()
);

CREATE TABLE interfaces.other_interface (
    "something_inherited" text
);

CREATE TABLE postgraphql.something (
    "id" serial,
    "something_specific" text,
    PRIMARY KEY ("id")
) INHERITS (interfaces.timestamped, interfaces.other_interface);

EyMaddis avatar Apr 10 '17 23:04 EyMaddis

Do we have any updates on this, or is anyone working on it? We're thinking about picking this up, and we're interested in what people have tried or run into.

justinmakaila avatar Jul 28 '17 18:07 justinmakaila

I would love to see this feature, so I'm willing to try to implement it in a plugin. It's a bit scary though; the plugin API is not easy to grasp. I would love some baby-step directions to start this right.

The first thing I want to do is add a hard-coded union type. I guess I have to use newWithHooks(), but looking at other plugins does not help.

jide avatar Jul 25 '18 10:07 jide

...or is this not even feasible in a plugin, because there is no GraphQLUnionType implemented here? https://github.com/graphile/graphile-build/blob/master/packages/graphile-build/src/SchemaBuilder.js#L159

Yes, I'm kind of lost :)

jide avatar Jul 25 '18 11:07 jide

I have plans to add this to core, but they're not far along. My plan is:

For every table foo that is inherited by another table (create table bar (...) inherits (foo);), create an interface type named after the table, FooInterface. Then the table type would implement this interface (type Foo implements FooInterface) and all children below there would implement the interface too (type Bar implements FooInterface). Effectively you'd need a plugin that looks through the classes and defines these interfaces. Then you need a plugin that hooks every time a table type is defined and adds an interfaces thunk (ref) returning the interface. That should be enough to get the schema to introspect as supporting the interfaces.
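A speculative sketch of what that second plugin might look like (not an actual implementation; it assumes graphile-build's "GraphQLObjectType:interfaces" hook and the isPgRowType/pgIntrospection scope added by graphile-build-pg, and it leans on two hypothetical pieces the first plugin would provide: inheritance data on the introspected class and a getInterfaceTypeForParentTable registry on build):

// Sketch only: append inherited interfaces to each table's object type.
module.exports = function PgTableInterfacesPlugin(builder) {
  builder.hook("GraphQLObjectType:interfaces", (interfaces, build, context) => {
    const {
      scope: { isPgRowType, pgIntrospection: table },
    } = context;
    if (!isPgRowType || !table) return interfaces;
    // Hypothetical: map each parent class this table inherits from to the
    // interface type the earlier plugin created for it.
    const inherited = (table.inheritsFrom || [])
      .map(parent => build.getInterfaceTypeForParentTable(parent))
      .filter(Boolean);
    return [...interfaces, ...inherited];
  });
};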

However, the complexity comes in the look-ahead logic. Currently it tries to resolve concretely which type will be returned ahead of time so it knows which SQL fields to request; this is going to need some thought. I've not got this far yet. I'm sure there are many other complexities I've not thought about yet as well.

Since I'm not really sure what shape this solution should take yet, I don't think I can advise you how best to implement it.

As for newWithHooks, it's really just an indirect way of calling the GraphQL.js constructors, documented here: https://graphql.org/graphql-js/type/

e.g. in graphql.js you'd do

const MyInterfaceType = new GraphQLInterfaceType({
  name: "MyInterface",
  fields: () => ({id: {type: GraphQLString } }),
  description: "My interface type",
});

In graphile you'd do:

const MyInterfaceType = newWithHooks(GraphQLInterfaceType, {
  name: "MyInterface",
  fields: () => ({id: {type: GraphQLString } }),
  description: "My interface type",
}, {
  // Optionally add some things here so other plugins know what this type
  // is all about and whether or not they should extend it.
});

The hooks themselves just take the object spec:

{
  name: "MyInterface",
  fields: () => ({id: {type: GraphQLString } }),
  description: "My interface type",
}

and return a new version of the object spec:

{
  ...oldObjectSpec,
  description: "My much better description",
}

benjie avatar Jul 25 '18 11:07 benjie

not even feasible in a plugin because there is no GraphQLUnionType implemented here

That means you can't use newWithHooks on GraphQLUnionType; but you can build a raw one from graphql.js and use that. However, the issue here is you can't register the fields with the look-ahead mechanic.
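For example, a raw union built directly with graphql.js might look like this (illustrative only; UserType and PostType are assumed to be existing object types, and the resolveType discriminator is made up):

const { GraphQLUnionType } = require("graphql");

const SearchResultType = new GraphQLUnionType({
  name: "SearchResult",
  types: () => [UserType, PostType],
  resolveType(value) {
    // made-up discriminator: decide which concrete type this row represents
    return value.username ? "User" : "Post";
  },
});

As noted above, a type built this way won't participate in the look-ahead machinery.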

A PR adding GraphQLUnionType support would be welcome; but it'd have to come with a lot of tests!

benjie avatar Jul 25 '18 11:07 benjie

Thank you for the detailed response @benjie.

To be honest, this seems like a task that needs deep knowledge of the API, and you seem to have thought this through a lot, so I guess it's hopeless for me to implement this the right way. Maybe I'll try to create simpler plugins first.

So... let us know when there is progress with this, if there is :) It's a feature every GraphQL-to-database tool lacks from what I've seen, and whatever lib I try, I'm stuck with this problem in my use case.

jide avatar Jul 25 '18 13:07 jide

Did not see your last comment. That could be a reasonable PR, but I'm not even sure. Maybe I'll give it a shot after having read the code a little more deeply... maybe not... I'll let you know how I feel about this.

jide avatar Jul 25 '18 13:07 jide

Yeah; it's definitely a non-trivial problem. I'll update this thread when progress is made, but this is not currently high on my TODO list.

benjie avatar Jul 25 '18 13:07 benjie

Having support for union types in general, as in this example, would be great. I'm not sure I have any further insights as to how you'd best express it in postgresql, but I'm just voicing my interest :-)

Munksgaard avatar Nov 07 '18 10:11 Munksgaard

Here's what I recently experienced:

I have the following schema:

create schema test;

create table test.user (
  id SERIAL PRIMARY KEY,
  name TEXT NOT NULL
);

create table test.file (
  id SERIAL PRIMARY KEY,
  filename TEXT NOT NULL
);

create table test.user_file (
  user_id INTEGER NOT NULL REFERENCES test.user(id)
) inherits (test.file);

I then try to run a query:

mutation {
  createUserFile(input: {userFile: {filename: "foo.txt", userId: 1}}) {
    userFile {
      id
      filename
      userByUserId {
        id
        name
      }
    }
  }
}

I get the following result:

{
  "errors": [
    {
      "message": "Cannot read property 'keyAttributeNums' of undefined",
      "locations": [
        {
          "line": 2,
          "column": 3
        }
      ],
      "path": [
        "createUserFile"
      ]
    }
  ],
  "data": {
    "createUserFile": null
  }
}

For this simple case, I sort of expected it to work.

Munksgaard avatar Nov 28 '18 12:11 Munksgaard

@Munksgaard Inheritance isn't integrated with unique constraints, including primary keys. So test.user_file has no primary key, which could be the underlying cause. I don't have this issue despite using a great deal of inheritance, but all my relations have primary keys.

micimize avatar Nov 28 '18 17:11 micimize

@micimize Good catch! I'll try it out, but I suppose you're right. Out of curiosity, how would you treat the case above? Would a simple create unique index on test.user_file (id) work, or does it need to be declared as a primary key?

Munksgaard avatar Nov 28 '18 17:11 Munksgaard

@Munksgaard I think this should work - it's roughly what I do:

create table test.user_file (
  user_id INTEGER NOT NULL REFERENCES test.user(id),
  primary key (id)
) inherits (test.file);

You might also consider just using a through table like user_file ( user_id, file_id )

micimize avatar Nov 29 '18 16:11 micimize

A different idea, next to table inheritance, could be typed tables.

To support multiple inheritance, I guess we would need an alternative approach. I guess a function that acts as a cast, e.g. bar_as_foo_interface :: bar -> foo, could do this? Or maybe even a view, to enable indexing some computed columns?
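For example, such a cast-like function might look like this (a sketch assuming a composite type foo_interface with (id, shared) attributes and a bar table with matching columns):

create function bar_as_foo_interface(b bar) returns foo_interface as $$
  select b.id, b.shared;  -- columns must line up with foo_interface's attributes
$$ language sql stable;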

ab-pm avatar Aug 22 '19 13:08 ab-pm

OK, I'm committed to implement this, as I need it for my current project. I think I figured out enough of the plugin system to be able to generate the schema in the shape I want, but I'm stuck on the lookahead logic as I don't really understand how the QueryBuilder works.

My plan is to support two different kinds of database schema, which seem common:

-- Schema style 1: Postgres table inheritance
CREATE TABLE theInterface (
    id SERIAL, -- the DEFAULT value is shared, a common sequence
    shared TEXT
); -- no values are expected to be stored in theInterface table itself
CREATE TABLE oneType (
    id INTEGER PRIMARY KEY,
    foo INTEGER
) INHERITS (theInterface);
CREATE TABLE anotherType (
    id INTEGER PRIMARY KEY,
    bar INTEGER
) INHERITS (theInterface);

-- Schema style 2: foreign keys to a shared table, marked with an @inherit smart comment
CREATE TABLE theInterface (
    id INTEGER GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
    shared TEXT
);
CREATE TABLE oneType (
    id INTEGER REFERENCES theInterface,
    foo INTEGER
);
COMMENT ON CONSTRAINT oneType_id_fkey ON oneType IS '@inherit';
CREATE TABLE anotherType (
    id INTEGER REFERENCES theInterface,
    bar INTEGER
);
COMMENT ON CONSTRAINT anotherType_id_fkey ON anotherType IS '@inherit';

This should be enough to generate a schema, and most queries on these types should work out of the box.

The problem appears when a field on an unrelated type wants to reference theInterface. This is two-fold:

  • We might want to generate a global allTheInterfaces or theInterfaceById query field. We would select from the theInterface table (which, in the first schema, views all the rows from the inherited tables, and in the second schema contains all the shared data of the subtypes directly), but then we would need to join(?) that with the tables from query fragments and conditionally select subfields. Or rather do subqueries? Or start with a union of the requested subtype tables right away? To do the condition, we could either simply join on the id (which I made unique across all subtype objects), or make a select depending on the type of the row. To get the type of the object one can select the tableoid system column, but it probably(?) is more efficient if the database stored the subtype in an extra (not schema-exposed) column - in the second schema, just on the theInterface table; in the first schema it would need an extra id->type relation, each of them updated appropriately with triggers.
  • We have another table that references an object of the interface type. This is basically a variant type/discriminated union/tagged union. (Some reading). In the second schema, this would become a column with a foreign key that REFERENCES theInterface (id). In the first schema, we would now definitely need the id->type relation containing all the object ids, as otherwise we cannot create a database constraint. (Alternatively, we could go with no database constraint and use a smart comment fake constraint instead, but shudder.) This referenced table would need an extra annotation @interface theInterface so that the introspection can "redirect" the type and use a special resolver. Again, we can select from the shared columns easily, but need a type resolver for everything else. I'm thinking of having the dba provide a computed-column-like function for that, given either through convention (theInterface_resolveType, theInterface_typeName?) or an annotation on the interface table. Postgraphile would just call this when it needs the type (always?), and be independent of the particular implementation. I do wonder though what the function should return: a custom enum (most efficient), a tableoid (most simple and compatible), a graphql type name (as text)?

This does in fact also provide exactly the same capability as union types would need. For them, we would just have the id->type relation, annotated with @union oneType,anotherType, and referenced by foreign key constraints. The only conceptual difference to the second schema would be that the table doesn't contain any shared columns.

If we follow this distinction of models, I can't think of anything better than the "single table inheritance" format for querying unknown types. Maybe another customisation (optimisation?) option would be to have a user-defined VIEW that follows this format, and simply query that (annotated as belonging to the interface, to be a drop-in type for foreign keys that reference the object id). @benjie How would this format work with nested queries / aggregated values from subqueries? Can you help me understand and/or show me how to use the QueryBuilder to achieve this? (I guess I should look at how queries are built in the forward/backward relation plugins).

ab-pm avatar Sep 13 '19 14:09 ab-pm

So I think I'd have to dedicate a fair amount of time to fully processing what you just wrote. I was planning to do Rails-style polymorphism, whereby you have type and id columns to describe which table it links to and what the record id is, but I have not had time to get around to it. If we can solve interfaces then unions like the Rails-style should be easy (since it's effectively an interface with no fields).

When it comes to actually implementing the lookahead, it comes down to getDataFromParsedResolveInfoFragment's second argument.

Here's an example of where we build a subquery for going backwards in a relation (e.g. given a user, find all the posts they've written: postsByAuthorId):

https://github.com/graphile/graphile-engine/blob/d723558b769777c4a9e572d502f1f54844207138/packages/graphile-build-pg/src/plugins/PgBackwardRelationPlugin.js#L276-L359

Now that's pretty intimidating, but if we snip out the queryFromResolveData body (which we don't actually care about for this) it comes down to this:

queryBuilder.select(() => {
  const resolveData = getDataFromParsedResolveInfoFragment(
    parsedResolveInfoFragment,
    isConnection ? ConnectionType : TableType
  );
  const query = queryFromResolveData(/*...*/);
  return sql.fragment`(${query})`;
}, getSafeAliasFromAlias(parsedResolveInfoFragment.alias));

Let me explain these parts a bit at a time; I'm going to go into more detail than necessary in a few places because I'd like to put this into the docs at some point.

queryBuilder.select(fragmentGenerator, alias)

This adds a field to the SELECT statement; we call fragmentGenerator(), get the result (a pg-sql2 fragment) and we insert that into our query. Now it's important to note that rather than doing SELECT fragment AS alias we actually build JSON in the database, so this ends up looking like:

SELECT
  json_build_object(
    ${sql.literal(alias)}, ${fragmentGenerator()},
    -- ...
  )
FROM
  ...

pg-sql2: https://www.npmjs.com/package/pg-sql2

getDataFromParsedResolveInfoFragment(parsedResolveInfoFragment, graphqlType)

This is defined in makeNewBuild; but the main thing we care about is that it calls simplifyParsedResolveInfoFragmentWithType. This is vital.

This comes from graphql-parse-resolve-info which is responsible for looking through the GraphQL AST and turning it into a list of fields/arguments that are required.

Now, since GraphQL supports interfaces/unions, this isn't always easy since the fields request can depend on the type that the data ends up being. In GraphQL you specify fragments like:

{
  node(id: ...) {
    ... on User { username }
    ... on Post { title }
  }
}

So what fields are requested on node? If it's a User then username; if it's a Post then title. graphql-parse-resolve-info solves this by doing a two-layer map, where the first layer is the possible types that the field might resolve to.
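For reference, the parsed fragment for the node field above would look roughly like this (shape trimmed and illustrative):

{
  name: "node",
  alias: "node",
  args: { id: ... },
  fieldsByTypeName: {
    User: { username: { name: "username", alias: "username", args: {}, fieldsByTypeName: {} } },
    Post: { title: { name: "title", alias: "title", args: {}, fieldsByTypeName: {} } },
  },
}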

Almost everywhere in PostGraphile, however, we know what the type will be ahead of time, so we can call simplifyParsedResolveInfoFragmentWithType which gets rid of that intermediate layer. This has been what we want, but isn't what we want any more... But let me continue

queryFromResolveData(...)

This will return a SQL query based on the look-ahead data we've extracted from the GraphQL tree using the above utilities. This is things like "if this field is requested, we need you to select this column" and "if this argument is provided then we need to order by this". We don't need to worry too much about these concerns, I think.

return sql.fragment`(${query})`;

This vital statement is what creates our subquery, so now we have something like:

SELECT
  json_build_object(
    'my_list', -- THIS IS `alias`
    ( -- THIS IS THE BRACKET
      -- INSIDE THE BRACKETS IS `${query}`
      SELECT 
        json_build_object(
          'a', a, 
          'b', b,
          'c', c
        )
      FROM bar
      WHERE foo_id = foo.id
      -- END OF `${query}`
    ) -- THIS IS THE CLOSE BRACKET
  )
FROM foo
WHERE foo.id = 1

This is fine when we know what the type is, and has got us this far, but now you want to go and make things more complex...

So, unions and interfaces

I think we want to do what we did before, EXCEPT we need to do it for each possible type and then combine them together with a CASE statement:

queryBuilder.select(() => {
  const possibleGraphQLTypes = [ ... ];
  const polymorphicTableAlias = Symbol(); // So things can reference us
  const fragments = possibleGraphQLTypes.map(gqlType => {
    const resolveData = getDataFromParsedResolveInfoFragment(
      parsedResolveInfoFragment,
      gqlType
    );
    const query = queryFromResolveData(/*... BUT referencing `polymorphicTableAlias`, and probably tweaking the options so it only generates the SELECT or something... dunno ...*/);
    const condition = /* How do we know, in the database, if the record is of type `gqlType`? */ sql.fragment`${sql.identifier(polymorphicTableAlias, 'type')} = ${sql.value(gqlType.name)}`;
    return sql.fragment`WHEN ${condition} THEN (${query})`;
  });
  return sql.fragment`(
    SELECT (
      CASE
      ${sql.join(fragments, ' ')}
      ELSE null -- Maybe?
      END
    ) FROM wherever AS ${sql.identifier(polymorphicTableAlias)}
  )`;
}, getSafeAliasFromAlias(parsedResolveInfoFragment.alias));

So our SQL would come out as something like:

SELECT
  json_build_object(
    'my_list',



    (
      SELECT (
        CASE
        WHEN bar.type = 'User'
        THEN (SELECT json_build_object('username', users.username) FROM users WHERE users.id = bar.id)
        WHEN bar.type = 'Post'
        THEN (SELECT json_build_object('title', posts.title) FROM posts WHERE posts.id = bar.id)
        ELSE null
        END
      )
      FROM bar
      WHERE foo_id = foo.id
      -- END OF `${query}`
    ) -- THIS IS THE CLOSE BRACKET



  )
FROM foo
WHERE foo.id = 1

And I've spent about an hour writing this post it seems, so that'll have to be it for today. Sorry for the abrupt ending and lack of editing! If I had more time I could definitely make this make more sense...

:sparkles:

benjie avatar Sep 13 '19 15:09 benjie

I want to take a minute and appreciate the work you put into this. Not only the effort for the library, but also in the amount and detail of your communication.

Thanks Benjie.

(Dear Reader, see Patreon and Github Sponsoring)

EyMaddis avatar Sep 13 '19 19:09 EyMaddis

Thanks @benjie for the prompt response, and the wonderfully appropriate level of detail! I spend hours on my posts as well, no worries. Here's another gush of ideas to process :-)

First of all, thanks for the SQL examples, I think I now understand better how the lookahead queries work: it's all correlated subqueries, nested in the SELECT list.

I was planning to do Rails-style polymorphism whereby you have a type and id column to describe which table it links to and what the record id is, but have not had time to get around to it.

I'm not quite certain how Rails polymorphism works on the database layer. Do you mean that the referencing table would have two columns type and id, i.e.

CREATE TABLE example (
    … -- whatever columns the example type has
    ref_type TEXT CHECK (ref_type IN ('a', 'b', 'c')),
    ref_id INTEGER,
    CHECK ((ref_id IS NULL) = (ref_type IS NULL)) -- or just make both NOT NULL
)

where the ref_id references the primary key of either the a, b or c table, depending on the ref_type? I don't quite like that, as it doesn't allow for a foreign key constraint. But if you say this is a common pattern, I guess we'd have to support it. (Quick idea: a "computed column" function example_ref_table that by annotation or convention belongs to ref_id and returns the TABLEOID of the referenced table, possibly using a DOMAIN for the possible values and to give the union a name?)
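That hypothetical "computed column" resolver might look something like this (a sketch; the example table and its ref_type values come from the snippet above, and returning regclass is just one way to surface a tableoid):

create function example_ref_table(e example) returns regclass as $$
  select case e.ref_type
    when 'a' then 'a'::regclass
    when 'b' then 'b'::regclass
    when 'c' then 'c'::regclass
  end;
$$ language sql stable;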

A workaround might be to have multiple columns where only one stores the reference:

CREATE TABLE example (
    … -- whatever columns the example type has
    ref_a_id INTEGER REFERENCES a (id),
    ref_b_id INTEGER REFERENCES b (id),
    ref_c_id INTEGER REFERENCES c (id),
    CHECK(num_nonnulls(ref_a_id, ref_b_id, ref_c_id) <= 1) -- or = 1
)

But how would Postgraphile know that it should generate

union ABC = A | B | C
type Example {
    …
    ref: ABC
}

Would need some fancy annotations I guess.

My idea, at least for interfaces, was to have a shared primary key (id) for all subtypes that uniquely identifies the row across all tables/types that implement the interface. (See my theInterface example above). That would need only a single

    ref_id INTEGER REFERENCES abc (id)

column in the example table, a @union or @interface annotation on the abc table, and database constraints to have each id of a, b or c exist in abc. Especially for interfaces (that share some columns in abc) this would be my preferred modelling, if not using table inheritance where the shared columns are repeated in each subtype.

If we can solve interfaces then unions like the Rails-style should be easy, since it's effectively an interface with no fields.

Ack. Except that a shared id pool makes more sense for interfaces than for unions.

Regardless what style we use for references to an interface or union type, for interfaces there's another case with necessary subtype resolution that I missed above: the backward relation of a shared field. The solution would be very similar to the allTheInterfaces problem though. (So, not an unsolvable problem, I just wanted to mention the case).

Query approaches

Now back to how the generated queries should look. I figured there are different ideas:

1. Repeating the query.

SELECT json_build_object(
  'foo_x', foo.x,
  'foo_bar', (CASE bar_resolveType(foo.bar_id)
    WHEN 'user'::regclass::oid THEN (
      SELECT json_build_object('__typename', 'User'::text, 'shared', bar.shared, 'username', users.username)
      FROM bar JOIN users ON bar.id = users.id
      WHERE bar.id = foo.bar_id)
    WHEN 'posts'::regclass::oid THEN (
      SELECT json_build_object('__typename', 'Blog'::text, 'shared', bar.shared, 'title', posts.title)
      FROM bar JOIN posts ON bar.id = posts.id
      WHERE bar.id = foo.bar_id)
    ELSE pg_raise('no concrete type found')
  END)
)
FROM foo

It's rather simplistic, and has the same nesting of subqueries as an ordinary field. The FROM bar and WHERE bar.id = foo.bar_id parts are repeated in each of them. I have no idea how the query optimiser will deal with this. Probably not a good idea when using bar.shared fields. On the other hand, this could work well enough with unions (where there are no shared fields on the bar table) or inherited tables (where all the shared fields are stored in the respective subtable), and we would not need to JOIN bar to the subtable at all but could completely omit bar. This leaves lots of freedom for bar_resolveType then.

For unions without shared ids, and without any bar table that's used for resolving the type, this would be the way to go.

2. "single table" format

SELECT json_build_object(
  'foo_x', foo.x,
  'foo_bar', (SELECT
    (CASE bar_resolveType(bar)
      WHEN 'user'::regclass::oid THEN json_build_object('__typename', 'User'::text, 'shared', bar.shared, 'username', users.username)
      WHEN 'posts'::regclass::oid THEN json_build_object('__typename', 'Blog'::text, 'shared', bar.shared, 'title', posts.title)
      ELSE pg_raise('no concrete type found')
    END)
  FROM bar LEFT JOIN users USING (id, shared) LEFT JOIN posts  USING (id, shared)
  WHERE bar.id = foo.bar_id)
)
FROM foo

The other extreme is to have only one SELECT statement that gets all the values at once, with null values where the subtype is not met. This can be achieved using a lot of LEFT JOINs against bar, but possibly also something with a UNION ALL (which might perform better since we know the ids are disjoint). By optionally not doing the merge ourselves but letting the user supply a view, this could give a lot of freedom:

  SELECT
    (CASE bar_view.__typename
      WHEN 'User' THEN json_build_object('__typename', 'User'::text, 'shared', bar_view.shared, 'username', bar_view.username)
      WHEN 'Blog' THEN json_build_object('__typename', 'Blog'::text, 'shared', bar_view.shared, 'title', bar_view.title)
      ELSE pg_raise('no concrete type found')
    END)
  FROM bar_view

For inherited tables, the parent type table already provides a view that's very similar to this, and we could easily use the system tableoid column for resolving the typename. For queries that only select shared fields on a table inheritance schema, this seems like the optimal approach.

3. nested subqueries

SELECT json_build_object(
  'foo_x', foo.x,
  'foo_bar', (SELECT
    (CASE bar_resolveType(bar)
      WHEN 'user'::regclass::oid THEN (
        SELECT json_build_object('__typename', 'User'::text, 'shared', bar.shared, 'username', users.username)
        FROM users WHERE users.id = bar.id)
      WHEN 'posts'::regclass::oid THEN (
        SELECT json_build_object('__typename', 'Blog'::text, 'shared', bar.shared, 'title', posts.title)
        FROM posts WHERE posts.id = bar.id)
      ELSE pg_raise('no concrete type found')
    END)
  FROM bar
  WHERE bar.id = foo.bar_id)
)
FROM foo

This appears to be basically the approach that you ended up with as well. The differences are minor: I used a bar_resolveType function supplied by the user instead of relying on a hardcoded bar.type column, I added a selection of a shared field (stored in the bar table), and I was presuming a forward relation in my example (from foo.bar_id to bar, not the other way round).

I believe this approach also is the best (easiest) for handling nested hierarchies, where bar_resolveType would return another interface instead of a concrete type - it would simply need to be resolved recursively in the inner SELECT clause, and the bar fields simply stay in scope. (GraphQL schema cannot represent this yet, but we should support that use case nonetheless).

One minor quibble I have though is that shared fields are repeated in the json_build_object calls. This might be fine for data directly generated from columns, but probably not for fields containing more complex stuff like subqueries. One attempt to fix this would be putting the shared fields in an extra object outside the CASE statement and merging it with the type-dependent fields:

  SELECT
    json_build_object('shared', bar.shared) || (CASE bar_resolveType(bar)
      WHEN 'user' THEN (
        SELECT json_build_object('__typename', 'User'::text, 'username', users.username)
        FROM users WHERE users.id = bar.id)
      WHEN 'blog' THEN (
        SELECT json_build_object('__typename', 'Blog'::text, 'title', posts.title)
        FROM posts WHERE posts.id = bar.id)
      ELSE pg_raise('no concrete type found')
    END)
  FROM bar
  WHERE bar.id = foo.bar_id

(unfortunately this requires a cast to jsonb to actually work). Another approach could be some kind of temporary variables, introduced e.g. by a lateral join:

  SELECT
    (CASE bar_resolveType(bar)
      WHEN 'user' THEN (
        SELECT json_build_object('__typename', 'User'::text, 'shared', bar_shared, 'username', users.username)
        FROM users WHERE users.id = bar.id)
      WHEN 'blog' THEN (
        SELECT json_build_object('__typename', 'Blog'::text, 'shared', bar_shared, 'title', posts.title)
        FROM posts WHERE posts.id = bar.id)
      ELSE pg_raise('no concrete type found')
    END)
  FROM bar, LATERAL (SELECT bar.shared AS bar_shared)
  WHERE bar.id = foo.bar_id

Regarding the pg_raise('no concrete type found') (using some pg_raise function), we would use this only when the case statement match is exhaustive on all possible types that implement the interface / are contained in the union. I'm not sure what graphql-parse-resolve-info does here, will it consider all concrete types or only those that are necessary for the particular query (resolveInfo)? Otherwise, we would have to return

    json_build_object('__typename', null, 'shared', bar.shared)

Only the shared fields of the interface are fetched, the exact type doesn't matter. Well, at least not for the result, but I guess graphql's __resolveType will care about it, so that it can use the appropriate resolvers for the shared field - which might be different on the distinct types (but aren't in our case, they're all trivial resolvers). We'd need to spell out the bar_resolveType(bar) instead of null.

I would also like to raise an exception if there was an entry for an id in the shared bar table, but no row in the respective subtype table - basically, if the implicit foreign key constraint was violated. I'm not certain how to tell the JOIN that it must find something though. I guess in the nested subqueries we could wrap some COALESCE(…, pg_raise('no concrete value found')) around the CASE statement (inside the SELECT … FROM bar), in the case of a join we could select COALESCE(users.tableoid, posts.tableoid, pg_raise('no concrete value found')) - maybe even as the default bar_resolveType.

ab-pm avatar Sep 16 '19 15:09 ab-pm

You're welcome :) I'm hoping you'll write up some learnings into the docs or a blog post or something that I can point future people at ;)

I'm not quite certain how Rails polymorphism works on the database layer. Do you mean that the referencing table would have two columns type and id, i.e. [...]

Yes

(Quick idea: a "computed column" function example_ref_table that by annotation or convention belongs to ref_id and returns a TABLEOID of the referenced table, possibly using a DOMAIN for the possible values and to give the union a name?)

Interesting idea.

A workaround might be to have multiple columns where only one stores the reference:

CREATE TABLE example (
    … -- whatever columns the example type has
    ref_a_id INTEGER REFERENCES a (id),
    ref_b_id INTEGER REFERENCES b (id),
    ref_c_id INTEGER REFERENCES c (id),
    CHECK(num_nonnulls(ref_a_id, ref_b_id, ref_c_id) <= 1) -- or = 1
)

Indeed, this is one that I've considered.

But how would Postgraphile know that it should generate

union ABC = A | B | C
type Example {
    …
    ref: ABC
}

I was thinking two ways: 1 would be a naming convention on the check constraint, the other would be a smart comment somewhere. Neither are ideal.

My idea, at least for interfaces, was to have a shared primary key (id) for all subtypes that uniquely identifies the row across all tables/types that implement the interface. (See my theInterface example above).

This is my fallback plan for interfaces; I'm hoping we might be able to achieve many of their features without requiring an additional table. It depends what the entry points are; I was thinking of just using custom types of the shape you describe above but where the custom type is tagged as an interface rather than a union. Interface tag would come with a list of shared fields, e.g. @interface id,name.

I definitely don't want to require table inheritance; it seems to have fallen out of vogue.

That would need only a single

    ref_id INTEGER REFERENCES abc (id)

column in the example table, a @union or @interface annotation on the abc table, and database constraints to have each id of a, b or c exist in abc.

I'm planning to support this; it would be treated very similarly to the custom type solution I'm planning, except it'll live alongside the rest of the fields (so it won't get "squashed").

Query approaches

Now back to how the generated queries should look. I figured there are different ideas:

1. Repeating the query.

SELECT json_build_object(
  'foo_x', foo.x,
  'foo_bar', (CASE bar_resolveType(foo.bar_id)
    WHEN 'user'::regclass::oid THEN (
      SELECT json_build_object('__typename', 'User'::text, 'shared', bar.shared, 'username', users.username)
      FROM bar JOIN users ON bar.id = users.id
      WHERE bar.id = foo.bar_id)
    WHEN 'posts'::regclass::oid THEN (
      SELECT json_build_object('__typename', 'Blog'::text, 'shared', bar.shared, 'title', posts.title)
      FROM bar JOIN posts ON bar.id = posts.id
      WHERE bar.id = foo.bar_id)
    ELSE pg_raise('no concrete type found')
  END)
)
FROM foo

It's rather simplistic, and has the same nesting of subqueries as an ordinary field. The FROM bar and WHERE bar.id = foo.bar_id parts are repeated in each of them. I have no idea how the query optimiser will deal with this. Probably not a good idea when using bar.shared fields.

Benchmarks required, don't trust your instincts here.

3. nested subqueries

SELECT json_build_object(
  'foo_x', foo.x,
  'foo_bar', (SELECT
    (CASE bar_resolveType(bar)
      WHEN 'user'::regclass::oid THEN (
        SELECT json_build_object('__typename', 'User'::text, 'shared', bar.shared, 'username', users.username)
        FROM users WHERE users.id = bar.id)
      WHEN 'posts'::regclass::oid THEN (
        SELECT json_build_object('__typename', 'Blog'::text, 'shared', bar.shared, 'title', posts.title)
        FROM posts WHERE posts.id = bar.id)
      ELSE pg_raise('no concrete type found')
    END)
  FROM bar
  WHERE bar.id = foo.bar_id)
)
FROM foo

This appears to be basically the approach that you ended up with as well.

This is what I'd do.

Regarding bar_resolveType / ::regclass I wonder if there's any performance overhead of this approach. I was planning with the custom type approach to just do case when my_custom_type.foo_id is not null then <foo query> when my_custom_type.bar_id is not null then <bar query> ...

I believe this approach also is the best (easiest) for handling nested hierarchies, where bar_resolveType would return another interface instead of a concrete type

In GraphQL resolveType would always return a concrete type; what are you envisioning as a use-case for not doing so in PG?

it would simply need to be resolved recursively in the inner SELECT clause, and the bar fields simply stay in scope. (GraphQL schema cannot represent this yet, but we should support that use case nonetheless).

I was at the WG last week that discussed this spec. resolveType will still return a concrete type. The "interfaces implement other interfaces" change is really a way of being more expressive.

One minor quibble I have though is that shared fields are repeated in the json_build_object calls.

:man_shrugging:

This might be fine for data directly generated from columns, but probably not for fields containing more complex stuff like subqueries. One attempt to fix this would be putting the shared fields in an extra object outside the CASE statement and merging it with the type-dependent fields:

I think this might be premature optimisation. I'd like to see benchmarks on this before we invested a lot of time into adding this level of complexity.

(unfortunately this requires a cast to jsonb to actually work).

jsonb is significantly slower (~4x IIRC) than json when used in PostGraphile, even if everything is made JSONB. Trust me, I did the benchmarks. I was rather disappointed.

Regarding the pg_raise('no concrete type found') (using some pg_raise function), we would use this only when the case statement match is exhaustive on all possible types that implement the interface / are contained in the union.

Raising an exception in the lookahead will make the entire query fail, not just this section. Instead, I'd return null. We embrace GraphQL's partial success features as much as we can.

I'm not sure what graphql-parse-resolve-info does here, will it consider all concrete types or only those that are necessary for the particular query (resolveInfo)? Otherwise, we would have to return

It only looks through what the user has written, but that has to have gone through GraphQL validation, so it will only contain possible values (otherwise you'd get a validation error).

Only the shared fields of the interface are fetched, the exact type doesn't matter. Well, at least not for the result, but I guess graphql's __resolveType will care about it, so that it can use the appropriate resolvers for the shared field - which might be different on the distinct types (but aren't in our case, they're all trivial resolvers). We'd need to spell out the bar_resolveType(bar) instead of null.

We definitely want to concretely indicate the type, otherwise it'll be a nightmare of complexity to handle.

I would also like to raise an exception if there was an entry for an id in the shared bar table, but no row in the respective subtype table - basically, if the implicit foreign key constraint was violated. I'm not certain how to tell the JOIN that it must find something though. I guess in the nested subqueries we could wrap some COALESCE(…, pg_raise('no concrete value found')) around the CASE statement (inside the SELECT … FROM bar), in the case of a join we could select COALESCE(users.tableoid, posts.tableoid, pg_raise('no concrete value found')) - maybe even as the default bar_resolveType.

Again, no PG exceptions. If the fields are nullable they'll all be null, if one is non-nullable then GraphQL will automatically throw an error for that entry in that list and replace it with null.

benjie avatar Sep 16 '19 19:09 benjie

I'm hoping you'll write up some learnings into the docs or a blog post or something that I can point future people at ;)

Yes, https://github.com/graphile/graphile-engine/issues/522 is the plan for today.

My idea, at least for interfaces, was to have a shared primary key (id) for all subtypes that uniquely identifies the row across all tables/types that implement the interface. (See my theInterface example above).

This is my fallback plan for interfaces; I'm hoping we might be able to achieve many of their features without requiring an additional table. It depends what the entry points are.

I see, I think that the "extra" table is actually the best way to model OOP inheritance where the subtypes have additional fields.

I was thinking of just using custom types of the shape you describe above but where the custom type is tagged as an interface rather than a union. Interface tag would come with a list of shared fields, e.g. @interface id,name.

You mean with CREATE TYPE for the shared columns? That seemed like a nice idea at first, but from what I can see you cannot put any constraints on the shared columns inside the composite type. Neither ADD PRIMARY KEY ((interface).id) nor ADD FOREIGN KEY ((interface).id) REFERENCES … works - you can only have the complete composite column be a key or reference something.

Benchmarks required, don't trust your instincts.

You're right of course, my concern was mostly about verbosity of the query (sure, that's not our primary goal) and the repetition of the shared field selections which cannot be avoided in the first approach.

Regarding bar_resolveType / ::regclass I wonder if there's any performance overhead of this approach. I was planning with the custom type approach to just do case when my_custom_type.foo_id is not null then <foo query> when my_custom_type.bar_id is not null then <bar query> ...

I see, this is probably the best/simplest approach when using that "one of multiple foreign key columns" structure for referencing an interface/union - I hadn't considered that before. Going for the tableoid would be my preferred approach for a) when the user has a "type" column somewhere (either in the referencing table itself, or an extra resolver table, or in the shared table of a hierarchy) that we'd want to use for the subtype lookup and b) when we have to do the union of all objects (e.g. the backreference of a shared column).

I believe this approach also is the best (easiest) for handling nested hierarchies, where bar_resolveType would return another interface instead of a concrete type

In GraphQL resolveType would always return a concrete type; what are you envisioning as a use-case for not doing so in PG?

I was thinking of a nested hierarchy

CREATE TABLE grandparent (
    id SERIAL PRIMARY KEY,
    type TEXT CHECK (type IN ('uncle', 'parent')),
    shared_grand TEXT
);
COMMENT ON TABLE grandparent IS '@interface';
CREATE TABLE uncle (
    id INTEGER PRIMARY KEY REFERENCES grandparent,
    …
);
CREATE TABLE parent (
    id INTEGER PRIMARY KEY REFERENCES grandparent,
    type TEXT CHECK (type IN ('childA', 'childB')),
    shared_par TEXT
);
COMMENT ON TABLE parent IS '@interface';
CREATE TABLE child_a (
    id INTEGER PRIMARY KEY REFERENCES parent,
    …
);
CREATE TABLE child_b (
    id INTEGER PRIMARY KEY REFERENCES parent,
    …
);

which should lead to

interface Grandparent {
    id: Int!
    sharedGrand: String
}
interface Parent { # implements Grandparent
    id: Int!
    sharedGrand: String
    sharedPar: String
}
type Uncle implements Grandparent {
     …
}
type ChildA implements Grandparent & Parent {
     …
}
type ChildB implements Grandparent & Parent {
     …
}

Here, the grandparent_resolveType might return the parent (tableoid), for further lookup (nested queries, but with common shared_par field lookup) in there. But you're absolutely right, for GraphQL we will always need to determine the fully concrete type somehow, even if the SQL query doesn't really care because only shared fields are queried.

I think this might be premature optimisation. I'd like to see benchmarks on this before we invested a lot of time into adding this level of complexity.

OK, I'll try without first. I guess this is related to https://github.com/graphile/graphile-engine/issues/2, and might require a refactoring of the QueryBuilder to allow select() calls with the other relation and a join predicate as extra arguments.

Raising an exception in the lookahead will make the entire query fail, not just this section. Instead, I'd return null. We embrace GraphQL's partial success features as much as we can. If the fields are nullable they'll all be null, if one is non-nullable then GraphQL will automatically throw an error for that entry in that list and replace it with null.

Oops, you're right. I guess we can still make the field resolver throw a descriptive exception when the concrete type or the subtype primary key is null, to get meaningful error messages.

ab-pm avatar Sep 17 '19 12:09 ab-pm

I was thinking of just using custom types of the shape you describe above but where the custom type is tagged as an interface rather than a union. Interface tag would come with a list of shared fields, e.g. @interface id,name.

You mean with CREATE TYPE for the shared columns? That seemed like a nice idea at first, but from what I can see you cannot put any constraints on the shared columns inside the composite type. Neither ADD PRIMARY KEY ((interface).id) nor ADD FOREIGN KEY ((interface).id) REFERENCES … works - you can only have the complete composite column be a key or reference something.

I was thinking more for returning from functions (e.g. a "search" function that searches multiple tables). I do not use composite columns.

Regarding bar_resolveType / ::regclass I wonder if there's any performance overhead of this approach. I was planning with the custom type approach to just do case when my_custom_type.foo_id is not null then <foo query> when my_custom_type.bar_id is not null then <bar query> ...

I see, this is probably the best/simplest approach when using that "one of multiple foreign key columns" structure for referencing an interface/union - I hadn't considered that before. Going for the tableoid would be my preferred approach for a) when the user has a "type" column somewhere (either in the referencing table itself, or an extra resolver table, or in the shared table of a hierarchy) that we'd want to use for the subtype lookup and b) when we have to do the union of all objects (e.g. the backreference of a shared column).

I am very wary of using OIDs directly; e.g. if you dump from one DB and import to another the OIDs won't necessarily match up. I don't think the actual OID should be written to a table anywhere.

benjie avatar Sep 17 '19 13:09 benjie

Ah, you mean when there is no shared table we could use a custom type to define the interface. Yes, that could work.

I am very wary of using OIDs directly

Yes, I didn't mean for storage. To store type references in database columns, I'd use TEXT (as in the examples above) or an ENUM. But inside a query or as the result of a "computed column" function they should be fine, since PostGraphile knows about the tableoids and the database knows about the tableoids I thought that would be the common denominator for identifying types (instead of using table names or inflected GraphQL type names - which require text comparison and might not be known in the database). I admit that TRUNCATE might pose a problem. (no, truncating can change file ids, not tableoids).

ab-pm avatar Sep 17 '19 14:09 ab-pm

Hi,

I'm not very well versed in the subject, but what do you think about this? :+1:

create schema private;

create table private.person (
  name text not null
);

-- declare interface
comment on table private.person is E'@Interface';

create table private.emailable (
  email text not null
);

-- declare and name interface
comment on table private.emailable is E'@Interface Emailable';

create table "user" (
  id serial not null primary key,
  name text not null,
  email text not null
);

create table team (
  id serial not null primary key,
  name text not null,
  email text not null
);

-- link User type to interface Person
create function person_from(u "user") returns private.person as $$
  select row(u.name)::private.person
$$ language sql;

-- link User type to interface Emailable
create function emailable_from(u "user") returns private.emailable as $$
  select row(u.email)::private.emailable
$$ language sql;

-- link Team type to interface Emailable
create function emailable_from(t "team") returns private.emailable as $$
  select row(t.email)::private.emailable
$$ language sql;

It's similar to how computed columns work, and the function could resolve the interface fields any way it wants, so it might support different existing inheritance schemas. I'm not sure what the perf would be like, or whether it would work with interfaces implementing interfaces.

mathroc avatar Sep 17 '19 15:09 mathroc

Hi @mathroc that seems like a nice idea. With CREATE TYPE person and CREATE TYPE emailable (instead of CREATE TABLE), I believe this is quite close to what @benjie was thinking of. I really like the declarativeness of the interface creation. Now the only thing that is missing is Postgres supporting CREATE TABLE (…) INHERITS (emailable, person) (where it currently complains "emailable is a composite type")…

While the functions bring great freedom into how a table implements an interface, the big downside is that there are now two ways to access a name field on a row:

SELECT json_build_object('name', "user".name) FROM "user";
SELECT json_build_object('name', person_from("user").name) FROM "user";

In PostGraphile, these probably would be a collision of two fields with the same name. This might require you to @omit the table fields and have virtual constraints on the interface type. I'd rather prefer PostGraphile to detect matching columns between the interface type and the table itself (given a COMMENT ON TABLE "user" IS '@implements person'), and require such a conversion function only iff they do not match.

ab-pm avatar Sep 17 '19 15:09 ab-pm

If I recall correctly, the issue with create type is that it makes all fields nullable.

mathroc avatar Sep 17 '19 15:09 mathroc

We can circumvent that either using domains or smart comments. I believe @nonNull already works.

We should also keep in mind user-defined interfaces, or interfaces where the columns may not match perfectly but are "fudged" via smart comments or plugins.

benjie avatar Sep 17 '19 17:09 benjie

Just for reference, here's how JoinMonster supports (or rather supported - the library is basically dead) unions & interfaces: https://join-monster.readthedocs.io/en/latest/unions/

ab-pm avatar Sep 18 '19 09:09 ab-pm

Interesting; that means members of a union cannot share a column name when the column types differ. E.g. if you had one table with id uuid primary key and one with id serial primary key, you couldn't do a union of them. Quite a limited solution.

benjie avatar Sep 19 '19 07:09 benjie

From discord:

@benjie: What I’m planning to add is polymorphism where a single table can represent multiple types. E.g. animal/cat/dog/fish - some share attributes, some unshared.

I thought of that idea briefly but dismissed it because I didn't see a good way to create multiple types with overlapping columns declaratively. Maybe declare them as composite types elsewhere and have a @stores X, Y, Z annotation on the table? If you could come up with an example that I can run tests against that would be dope.

ab-pm avatar Nov 20 '19 10:11 ab-pm

Think of the blocks in a Medium-style post:

  • Paragraph (text)
  • Title (text)
  • Subtitle (text)
  • Image (url, text (caption), alt_text)
  • Link (url, text)
  • Video (provider, identifier, text (caption))

Perhaps:

interface Block {
  id: Int!
  text: String
  author: User
  lastEdited: Datetime
}
type Image implements Block {
  id: Int!
  text: String
  author: User
  lastEdited: Datetime
  url: String
  altText: String
}
type Title implements Block ...

From:

create table blocks (
  id serial primary key,
  type text default 'Paragraph',
  text text,
  author_id int references users,
  last_edited timestamptz not null default now(),
  url text,
  alt_text text,
  provider text,
  identifier text
);

or maybe a central block table with 1-to-1 relationships for the extra properties.

create table block_images (
  block_id int primary key references blocks,
  url text,
  alt_text text
);

Maybe the relationship would be the other way around, so you can use an XOR check constraint to ensure there's exactly one related type - this may mean there's no need for a type column: if block_image_id is not null then it's an Image, etc.
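A sketch of that alternative layout (illustrative only; block_videos is hypothetical, the sub-tables are assumed to have their own primary keys instead of referencing blocks, num_nonnulls needs Postgres 9.6+, and <= 1 keeps plain text blocks with no sub-table valid):

create table blocks (
  id serial primary key,
  text text,
  author_id int references users,
  last_edited timestamptz not null default now(),
  -- at most one of these is set; which one tells us the concrete type
  block_image_id int references block_images,
  block_video_id int references block_videos,
  check (num_nonnulls(block_image_id, block_video_id) <= 1)
);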

I've not thought sufficiently deeply about this yet.

benjie avatar Nov 23 '19 14:11 benjie

If you have a concrete use case you should share it. We shouldn't build things for the sake of having them, we should build them because they're solving real problems people have.

We should also not be too prescriptive about people's database models - if we can make a particular solution work with multiple different layouts of database then that's better. Using patterns that are in the wild already is better because then it can benefit more people out of the box (and is not controversial).

benjie avatar Nov 23 '19 14:11 benjie

We implement a content management system for movies, series, and such. A movie, an episode, a series, etc. pretty much share all the fields except for some columns. Right now we actually use Postgres inheritance, but it is a pain because queries do not use FROM ONLY and foreign keys must be set on all the tables.

Also relevant for a global search function that can query multiple types, as suggested at the start of this topic.

EyMaddis avatar Nov 23 '19 16:11 EyMaddis

Hello! My real-world use cases involve usergroups that can contain both individual users and other usergroups (looking for a union here), and something very similar to this structure (where I want an interface for the treenode type) that I can't share in further detail.

I've already started hacking; I guess I should have a PR to look at this week (though probably nowhere near ready for merge). As @benjie said, PostGraphile will need to be able to deal with whatever database layouts people might have used. My current plan is to support the following patterns:

  • a selectable common table ("entity") with union/interface rows
    • "subclass" inheritance: each union member/interface implementing type subtable has a primary key that foreign-key-references the primary key of the union/interface table, i.e. sharing ids between all entities.
    • (any) unique foreign key references from the subtable to the common table, and some kind of "type resolver" in the common table to select the subtable
    • (nullable) unique foreign key references from the common table to the subtables, with one column per subtable and an XOR check constraint. Interfaces really are just treated the same as unions here regarding the 1:1 relation between common table and subtable, except they have some shared columns in the common table (and a different smart tag…)
  • a custom composite type declaring an interface, with each table having its own columns storing its own values. Possibly with a customisable mapping from each table to the composite type. This is very similar to table inheritance, where each inherited table just "copies" the column definitions. Here, most queries would work out of the box, it just needs that implements Interface declaration on the type.
    • Shared foreign key constraints (forward relations in the interface fields) would have an option to generate a single backward relation to the interface type, instead of multiple backward relations to each implementing type.
  • unions for backward relations: multiple referencing table generating the backward fields with the same name become a single backward field with a union type on the foreign table
  • unions for forward relations: multiple foreign-key columns with a XOR check constraint become only a single forward field with a union type
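
To make the "subclass" variant concrete, here's a minimal sketch; the entities/users/posts names and the type-resolver function are purely illustrative:

create table entities (
  id serial primary key,
  created_at timestamptz not null default now()
);

-- each concrete type shares its id with the common table
create table users (
  id int primary key references entities,
  username text not null
);

create table posts (
  id int primary key references entities,
  title text not null
);

-- a possible "type resolver": report which subtable holds a given entity
create function entity_type(e entities) returns regclass as $$
  select case
    when exists (select 1 from users u where u.id = e.id) then 'users'::regclass
    when exists (select 1 from posts p where p.id = e.id) then 'posts'::regclass
  end;
$$ language sql stable;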

Thanks for the examples you've come up with, I'll try to implement them in the test cases for demonstration.

ab-pm avatar Nov 25 '19 10:11 ab-pm

Having thought more about the queries for shared fields, I've come to the conclusion that it would be unnecessarily complicated to try to merge everything into a single object at the Postgres level. There's no reason to do that: we're still using resolve() methods anyway (to resolve safe aliases), and they give us the flexibility to work with nested objects just as easily. So it'll be more like

SELECT json_build_object(
  'foo_x', foo.x,
  'foo_bar', (SELECT json_build_object(
    'shared', bar.shared,
    -- bar_resolveType is a hypothetical "type resolver" returning the oid of the
    -- concrete subtable; pg_raise stands in for raising an exception
    '__concrete', (CASE bar_resolveType(bar)
      WHEN 'users'::regclass::oid THEN (
        SELECT json_build_object('__typename', 'User'::text, 'username', users.username)
        FROM users WHERE users.id = bar.id)
      WHEN 'posts'::regclass::oid THEN (
        SELECT json_build_object('__typename', 'Blog'::text, 'title', posts.title)
        FROM posts WHERE posts.id = bar.id)
      ELSE pg_raise('no concrete type found')
    END)
  )
  FROM bar
  WHERE bar.id = foo.bar_id)
)
FROM foo

Now I only need to adapt parseResolveInfo/simplifyParsedResolveInfoFragmentWithoutType to be able to support this structure :-)

ab-pm avatar Nov 26 '19 14:11 ab-pm

Ah, found a fun problem: assuming we have concrete object types X, Y and Z which implement the interface Bar with a shared field (stored in a common referenced table), a GraphQL query can be quite flexible and make subfield queries type-dependent:

query {
  allBars {
    __typename
    shared { a }
    ... on X {
      shared { b d }
    }
    ... on Y {
      shared { c d }
    }
  }
}

It could lead to a result such as

{ data: [
  { __typename: "X", shared: { a: 1, b: 2, d: 4 } }, // no c
  { __typename: "Y", shared: { a: 1, c: 3, d: 4 } }, // no b
  { __typename: "Z", shared: { a: 1 } }, // no b or c
] }

The fields a through d could of course stem from complicated subqueries rather than being simple scalars. What would the SQL query look like?

  • avoid repetition and just do overfetching: have a single select for the shared field that gets all of a, b, c and d on each shared object, no matter what concrete type it has
  • avoid repetition, and propagate the type conditions down the tree: have a single select for the shared field but then wrap the queries for b, c and d in CASE expressions depending on the parent's concrete type
  • repeat the queries, just like in the graphql code itself: do separate selects for the shared field on all objects, for the shared field on all objects satisfying X (which might be an interface not a concrete type!), and for the shared field on all objects satisfying Y, then do a recursive merge of the results (either in SQL using jsonb, or in JS) - it will be a hassle for list fields though.
  • repeat the queries for every possible concrete type/combination of type conditions, i.e. statically evaluating the possible merges beforehand - this could lead to combinatorial explosion though

I tend to favour the first approach (overfetching) because it's the simplest and requires the fewest code changes, and then implement the second approach (condition propagation) later if anyone deems it necessary.
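
Purely for illustration, a sketch of the shape the overfetching variant might produce for the shared field (the bars/shared tables and the a-d columns here are hypothetical, and the real generated SQL would look different):

select json_build_object(
  'shared', (
    -- fetch a, b, c and d for every row, whatever its concrete type;
    -- the GraphQL layer simply omits the keys a given type did not request
    select json_build_object('a', s.a, 'b', s.b, 'c', s.c, 'd', s.d)
    from shared s
    where s.id = bars.shared_id
  )
) as bar
from bars;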

ab-pm avatar Nov 27 '19 10:11 ab-pm

If you have a concrete use case you should share it

Every entity in my system inherits from a base table that is more or less:

create table if not exists internal.base (
  entity_id       uuid default uuid_generate_v1mc(),
  valid_from      timestamp with time zone not null default now(),
  valid_until     timestamp with time zone not null default 'infinity',

  -- this is not inherited, and has to be copied
  primary key (entity_id, valid_until),
  unique (entity_id, valid_from)
);

This is mostly for convenience - I would eventually like to move away from inheritance at the db level. However, dealing with the universally implemented interface at the postgraphile level would be a win for me even without table inheritance.
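
For what it's worth, a sketch of one way to keep those shared columns without db-level inheritance (the app.account table here is hypothetical): copy the column definitions with LIKE and restate the constraints explicitly.

create table app.account (
  -- copies the column definitions and defaults from internal.base,
  -- but the result is an ordinary standalone table, not an inherited one
  like internal.base including defaults,
  display_name text not null,
  primary key (entity_id, valid_until),
  unique (entity_id, valid_from)
);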

I think an ideal extension that satisfies my use case would just take an interface definition, along with a matcher function by which to apply that definition. Essentially a generalized version of the approach used in determining shouldHaveNodeId. I can't tell whether table constraints can be accessed in userland though. Maybe through context?

Imagined usage would be something like:

appendPlugins: [implementInterface({
  interface: interfaceDefinition,
  when: (type, table) => 
    hasMyInterfaceFields(type) &&
    hasMyInterfaceConstraints(table),
})]

micimize avatar Dec 08 '19 16:12 micimize

@micimize Ah, cool, thanks for the ideas! I guess this should already be quite possible by hooking onto GraphQLObjectType:interfaces. Indeed, the table constraints are available to all plugins through context.scope.pgIntrospection.constraints - there's no distinction made between builtin and "userland" ones. The only things missing here are creating the interface type itself and accessing it from inside the GraphQLObjectType:interfaces hook. The first could be achieved with a makeExtendSchemaPlugin or maybe even a raw new GraphQLInterfaceType() instantiation. (Creating it from Postgres' internal.base table takes a bit more effort, but I already have that part of the code working.)

hasMyInterfaceFields will be the most complicated part here, as inspecting the fields and their types might mess with the laziness of the field hooks (cf. #575). But if your tables already have all of these columns themselves, lookahead features will not be affected by the interface declaration and the object types already work for you, so this will work out of the box.

ab-pm avatar Dec 09 '19 08:12 ab-pm

I got my first (specific) union type with complete lookahead working, see https://github.com/graphile/postgraphile/issues/1210#issuecomment-567095994! So it definitely seems possible, and I hope I can soon make the PR to cut the rough edges in the graphile core that would make this even easier. Then the step to a generic support for generating these types for arbitrary annotated tables won't be large any more…

ab-pm avatar Dec 18 '19 16:12 ab-pm

Note that table inheritance is listed in the PostgreSQL wiki as a "Don't do this".

RedShift1 avatar Feb 29 '20 14:02 RedShift1

Even without table inheritance, the possibility to return GraphQL union types in a resolver would be really nice to have! I hope a generic solution based on the solution by @ab-pm will be available soon...

ulo avatar Mar 11 '20 08:03 ulo

I just spent a bit of time trying to figure this out, but I am sort of stuck trying to understand the graphile code base. I would solve it as follows, and would like to know if you think it's doable that way:

We want a schema like this

interface foobar 
{ 
    name: String
    lowerCase: String
}

type foo implements foobar 
{
    name: String!
    lowerCase: String
    id: Int!
}

type bar implements foobar 
{
    name: String!
    lowerCase: String
    id: Int!
}

Pure Interface Types

To implement inheritance, use the @implements & @interface smart comments.

What we want is a schema that looks like this:

interface foobar { name: String }

type foo implements foobar 
{
    name: String!
    id: Int! 
}

type bar implements foobar 
{
    name: String!
    id: Int! 
}

If we just want the interface type, i.e. one that cannot be queried via Query, then using a composite type like this should work:

create type p.foobar as(name text);
comment on type p.foobar is '@interface';

create table p.foo (id serial primary key, name text not null);
comment on table p.foo is '@implements foobar';

create table p.bar (id serial primary key, name text not null);
comment on table p.bar is '@implements foobar';

Another version would be to have a single table that can represent two different types, which I want to represent like this in the schema:

create table p.foobar (id serial primary key, name text not null, is_foo boolean);

comment on table p.foobar is '@interface';
comment on column p.foobar.is_foo is '@omit'; 

create view p.foo as select id, name from p.foobar where is_foo;
comment on view p.foo is '@implements foobar';

create view p.bar as select id, name from p.foobar where not is_foo;
comment on view p.bar is '@implements foobar';

And as a third option you could put two different tables behind a common view, like so:

create table p.foo (id serial primary key, name text not null);

create table p.bar (id serial primary key, name text not null);

--tableoid is required to determine the type
create view p.foobar as
    select id, name, tableoid from p.foo
union all
    select id, name, tableoid from p.bar;

comment on view p.foobar is '@interface';

comment on table p.foo is '@implements foobar';
comment on table p.bar is '@implements foobar';

Any thoughts on the design idea?

klemens-morgenstern avatar May 28 '20 03:05 klemens-morgenstern

@klemens-morgenstern it’s nice to see alternatives, I like the third option. The second one does not allow implementing multiple interfaces, so I think that would be a problem.

The first one could work but I like the third one better because it allows transforming data from the source table to match the interface properties. In your example, for Postgraphile to be able to generate the correct GraphQL interface, you would remove the id column from the p.foobar view, right?

I had a different suggestion that would look like this in your example:

create table p.foo (id serial primary key, name text not null);
create table p.bar (id serial primary key, name text not null);
create type p.foobar as(name text);
comment on type p.foobar is '@interface';

create function foobar_from(p.foo) returns p.foobar as $$
  select row($1.name)::p.foobar
$$ language sql;

create function foobar_from(p.bar) returns p.foobar as $$
  select row($1.name)::p.foobar
$$ language sql;

From a maintenance point of view I like having multiple functions instead of a global view, but I don't know if there are performance implications with either implementation.

mathroc avatar May 28 '20 12:05 mathroc

I would like to have both 2 & 3. If you break it down to something that is query-able (i.e. a view, materialized view, or table) and that has a tableoid, you would have the most flexibility. You could also use table inheritance, should you so desire.

In my example the id would not be removed, but graphile could use id & tableoid to generate a unique nodeId.
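
For instance, querying the two concrete tables from the third option directly shows both values such a generator would need (purely illustrative):

-- tableoid identifies the table each row physically lives in,
-- so (tableoid, id) is enough to derive a unique nodeId
select id, name, tableoid::regclass as source_table from p.foo
union all
select id, name, tableoid::regclass from p.bar;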

I also tried to implement a prototype myself, but already failed at changing a type to an interface with a hook. This whole engine is really complicated.

klemens-morgenstern avatar May 28 '20 16:05 klemens-morgenstern

I suggest you familiarise yourself with GraphQL.js before diving into the Graphile Engine codebase. We don’t have a way of changing the constructor of a type (GraphQLObjectType, GraphQLInterfaceType, etc); the engine is not designed to do that.

benjie avatar May 29 '20 07:05 benjie

I got my first (specific) union type with complete lookahead working, see #1210 (comment)! So it definitely seems possible, and I hope I can soon make the PR to cut the rough edges in the graphile core that would make this even easier. Then the step to a generic support for generating these types for arbitrary annotated tables won't be large any more…

Any updates on this PR?

heri16 avatar Sep 24 '20 11:09 heri16