swapi-graphene icon indicating copy to clipboard operation
swapi-graphene copied to clipboard

Querying for allSpecies and homeworld executes 38 SQL queries

Open simonw opened this issue 7 years ago • 7 comments

This query:

{
  allSpecies {
    edges {
      node {
        name
        homeworld {
          name
          gravity
        }
      }
    }
  }
}

Executes 38 sql queries, looking something like this:

(0.001) SELECT COUNT(*) AS "__count" FROM "starwars_species"; args=()
(0.000) SELECT "starwars_species"."id", "starwars_species"."created", 
"starwars_species"."edited", "starwars_species"."name", 
"starwars_species"."classification", "starwars_species"."designation", 
"starwars_species"."average_height", "starwars_species"."skin_colors", 
"starwars_species"."hair_colors", "starwars_species"."eye_colors", 
"starwars_species"."average_lifespan", "starwars_species"."homeworld_id", 
"starwars_species"."language" FROM "starwars_species" LIMIT 37; args=()

(0.000) SELECT "starwars_planet"."id", "starwars_planet"."created", 
"starwars_planet"."edited", "starwars_planet"."name", 
"starwars_planet"."rotation_period", "starwars_planet"."orbital_period", 
"starwars_planet"."diameter", "starwars_planet"."climate", 
"starwars_planet"."gravity", "starwars_planet"."terrain", 
"starwars_planet"."surface_water", "starwars_planet"."population" FROM 
"starwars_planet" WHERE "starwars_planet"."id" = 9; args=(9,)

(0.000) SELECT "starwars_planet"."id", "starwars_planet"."created", 
"starwars_planet"."edited", "starwars_planet"."name", 
"starwars_planet"."rotation_period", "starwars_planet"."orbital_period", 
"starwars_planet"."diameter", "starwars_planet"."climate", 
"starwars_planet"."gravity", "starwars_planet"."terrain", 
"starwars_planet"."surface_water", "starwars_planet"."population" FROM 
"starwars_planet" WHERE "starwars_planet"."id" = 14; args=(14,)

... repeat another 36 times

Full list here: https://gist.github.com/simonw/762a98de85a2f113d1d7ec6026f73ee4

Is it possible to detect this case and then use select_related() or prefetch_related() here to join against the relevant table and get everything done in just one or two queries?

If so, an example showing how to do that would be a fantastic illustration of a slightly more complex use-case for graphene-django (I tried myself and couldn't figure out how to do it).

simonw avatar Jul 07 '17 22:07 simonw

I generated the list of SQL queries by adding the following configuration to swapi_graphene/settings.py

LOGGING = {
    'version': 1,
    'handlers': {
        'console': {
            'level': 'DEBUG',
            'class': 'logging.StreamHandler',
        },
    },
    'loggers': {
        'django.db.backends': {
            'level': 'DEBUG',
            'handlers': ['console'],
        },
    },
}

simonw avatar Jul 07 '17 22:07 simonw

Even better: is it possible to detect the fields being requested and update the SELECT fields in those statements, using the Django .only() QuerySet method?

My ideal SQL queries for this GraphQL query would be either this (using prefetch_related):

SELECT COUNT(*) AS "__count" FROM "starwars_species";

SELECT "starwars_species"."id", "starwars_species"."name", "starwars_species"."homeworld_id",
FROM "starwars_species" LIMIT 37;

SELECT "starwars_planet"."name", "starwars_planet"."gravity"
FROM "starwars_planet"
WHERE "starwars_planet"."id" in (
  9, 14, 23, 24, 28, 29, 31, 7, 33, 18, 8, 34,
  35, 37, 38, 39, 40, 41, 43, 44, 45, 46, 47,
  48, 49, 50, 11, 51, 54, 55, 10, 56, 57, 58, 59, 12
);

Or this (using a select_related() join):

SELECT COUNT(*) AS "__count" FROM "starwars_species";

SELECT "starwars_species"."id", "starwars_species"."name", "starwars_species"."homeworld_id",
"starwars_planet"."name", "starwars_planet"."gravity"
FROM "starwars_species"
JOIN "starwars_planet" ON 
  "starwars_species"."homeworld_id" = "starwars_planet"."id"
LIMIT 37;

simonw avatar Jul 07 '17 22:07 simonw

On closer inspection of the docs, it looks like the solution here is to use a DataLoader - any chance of some example code for that? http://docs.graphene-python.org/en/latest/execution/dataloader/

simonw avatar Jul 10 '17 17:07 simonw

I got this working using a dataloader: https://github.com/simonw/swapi-graphene/commit/cea04512abf01e929d50f050e3c690a2a7117565

Corresponding SQL:

(0.001) SELECT COUNT(*) AS "__count" FROM "starwars_species"; args=()
(0.001) SELECT "starwars_species"."id", "starwars_species"."created", 
"starwars_species"."edited", "starwars_species"."name", 
"starwars_species"."classification", "starwars_species"."designation", 
"starwars_species"."average_height", "starwars_species"."skin_colors", 
"starwars_species"."hair_colors", "starwars_species"."eye_colors", 
"starwars_species"."average_lifespan", "starwars_species"."homeworld_id", 
"starwars_species"."language" FROM "starwars_species" LIMIT 37; args=()
(0.002) SELECT "starwars_planet"."id", "starwars_planet"."created", 
"starwars_planet"."edited", "starwars_planet"."name", 
"starwars_planet"."rotation_period", "starwars_planet"."orbital_period", 
"starwars_planet"."diameter", "starwars_planet"."climate", 
"starwars_planet"."gravity", "starwars_planet"."terrain", 
"starwars_planet"."surface_water", "starwars_planet"."population" FROM 
"starwars_planet" WHERE "starwars_planet"."id" IN (9, 14, 23, 24, 28, 29,
31, 7, 33, 18, 8, 34, 35, 37, 38, 39, 40, 41, 43, 44, 45, 46, 47, 48, 49, 50,
11, 51, 54, 55, 10, 56, 57, 58, 59, 12); args=(9, 14, 23, 24, 28, 29, 31, 7,
33, 18, 8, 34, 35, 37, 38, 39, 40, 41, 43, 44, 45, 46, 47, 48, 49, 50, 11,
51, 54, 55, 10, 56, 57, 58, 59, 12)

simonw avatar Jul 10 '17 17:07 simonw

Oops, correct resolver on the Specie class looks like this:

    def resolve_homeworld(self, *args):
        if self.homeworld_id:
            return planet_loader.load(self.homeworld_id)
        else:
            return None

simonw avatar Jul 10 '17 18:07 simonw

I think that's wrong too. The PlanetLoader batch_load_fn(keys) method needs to return objects for the keys in the same order that the keys were provided.

simonw avatar Jul 10 '17 18:07 simonw

OK, I think this is the correct implementation of the DataLoader:

from promise import Promise
from promise.dataloader import DataLoader


class PlanetLoader(DataLoader):
    def batch_load_fn(self, keys):
        objects_by_key = {
            o.pk: o
            for o in models.Planet.objects.filter(pk__in=keys)
        }
        results = []
        for key in keys:
            results.append(objects_by_key.get(key, None))
        return Promise.resolve(results)

planet_loader = PlanetLoader()

simonw avatar Jul 10 '17 18:07 simonw