crystal Documentation Improvement: subscriptions' effect on null nodes in connections

Summary

I encountered an issue (really more of a behavior quirk than an actual problem) described by the following:

On an automatically-generated-from-fk connection in postgraphile, I'm getting a null node returned when all the values I'm asking for are null. For example, if I run the following query

class(id=1) {
  students {
    nodes {
      name
    }
  }
}

I'm getting nodes: [ null ] in the response for the row where there are other values, but name is null. If I include some of those other values, then it returns an object like { "name": null, "age": 12 }

I asked about this behavior in discord, and Benjie mentioned that setting subscriptions: true would alter this to produce the desired behavior I described. This worked! But it wasn't documented anywhere that I could find, so at least on github issues, it's a more searchable forum than discord.

I'm also curious about why the subscriptions flag has an effect on this.

Additional context

For context, I read through https://www.graphile.org/postgraphile/subscriptions/, https://www.graphile.org/postgraphile/connections/, and https://www.graphile.org/postgraphile/why-nullable/ none of which contained this information that I could find.

Sep 23 '20 04:09 jcgsville

This is an interesting one.

In PostgreSQL, IS NULL on a composite type (e.g. a row type) checks that every column is null, OR the record is null. So

[test] # create type example as (id int, label text);
CREATE TYPE
Time: 13.734 ms
[test] # select (null, null)::example is null;
┌──────────┐
│ ?column? │
├──────────┤
│ t        │
└──────────┘
(1 row)

Time: 0.281 ms
[test] # select null::example is null;
┌──────────┐
│ ?column? │
├──────────┤
│ t        │
└──────────┘
(1 row)

Time: 0.209 ms

You can tell the difference between these cases though:

[test] # select (null, null)::example is distinct from null;
┌──────────┐
│ ?column? │
├──────────┤
│ t        │
└──────────┘
(1 row)

Time: 0.216 ms
[test] # select null::example is distinct from null;
┌──────────┐
│ ?column? │
├──────────┤
│ f        │
└──────────┘
(1 row)

Time: 0.201 ms

When calling PostgreSQL functions (and in some other situations), if you state that you're returning example or SETOF example and you actually return a NULL, then that sometimes gets cast into (null, null)::example and sometimes to null::example (often depending on the stored procedure language). So in general, PostGraphile cannot differentiate between these two because one often morphs into the other.

So, to determine if a record returns is an actual record, or is null, PostGraphile checks that every column is null. If it is, then the record is treated as if it were null, otherwise it doesn't.

This is where PostGraphile's efficiency bites us; because PostGraphile only requests what you've asked for, PostGraphile only requests the name column, and when we get the results back it's null so we think the row is null because every column is null.

This doesn't impact many people, because in GraphQL queries that clients use it is best practice to always fetch the id so that your GraphQL client can use it as the cache identifier along with __typename. And id is composed of the primary key, which of course is never null in a record in a table, so the problem goes away when you request the id.

When I added subscriptions support to PostGraphile, I needed the primary keys to always be present whether the user requested them or not; however I couldn't just always request them because there could be users out there who have column-level select grants and don't grant select on the primary key. It'd be bizarre... but the things I've seen... :see_no_evil:

So I thought; why not solve two problems with one flag. I made it so when you opt into subscriptions (a feature that didn't previously exist) you also opt into always requesting the primary keys. So subscriptions is now an overloaded flag that happens to solve this issue. And this is why subscriptions: true fixes this issue. You can also fix it by always requesting at least one of the primary keys for the record.

Help adapting this explanation into the docs in a less long-winded form would be welcome ;)

TL;DR: PostgreSQL NULLs are weird; we can't reliably in all situations tell the difference between a null record and a record made only of nulls, we (generally) only select what you ask us to, so make sure you always fetch something non-nullable (e.g. a primary key), or enable the subscriptions flag and we'll do it for you.

Sep 23 '20 10:09 benjie

Sorry that it's a year and a half later, but thanks for your thorough response! I'm happy to add some of this information to the documentation. Where might be a good place for it? https://www.graphile.org/postgraphile/why-nullable/? A new page?

Mar 12 '22 07:03 jcgsville

Maybe a new page called "quirks"?

Mar 12 '22 09:03 benjie

This issue is resolved in V5 and is not a problem in V4 if you use subscriptions: true; so I'm going to close it.

Oct 04 '23 17:10 benjie

This issue is resolved in V5 and is not a problem in V4 if you use subscriptions: true; so I'm going to close it.

Makes sense. Thanks for all the great work you put into v5 :)

Feb 25 '24 05:02 jcgsville

crystal crystal copied to clipboard

Documentation Improvement: subscriptions' effect on null nodes in connections

Summary

Additional context

crystal
crystal copied to clipboard