juniper icon indicating copy to clipboard operation
juniper copied to clipboard

Efficient generation of SQL queries

Open bostontrader opened this issue 4 years ago • 13 comments

Hello folks,

I would like to start a discussion about a general overview about how to connect GraphQL to SQL. I'm not looking for specific code for specific products (such as Rocket, Diesel, Postgress, MySQL) but am instead interested in the general overview of how we can approach the problem of generating efficient SQL queries from Juniper's GraphQL.

After quite a bit of search I have very surprisingly found almost no information about this general topic. The best I have found is this four part series: GraphQL/MySQL/Node. Here the author talks about working with the GraphQL AST that is generated by some other product as part of generating SQL. He also talks about dealing with joins. Unfortunately I don't see any way to hook into this with Juniper.

Another clue that I have is nom-sql. Although this is a parser for SQL it necessarily has all the objects needed for an SQL AST. So it should be rather straight-forward to programmatically build an SQL query from a GraphQL query. Again, alas, I don't see anyway to hook into this with Juniper w/o doing some serious hacking.

So I'm interested in hearing from those of you who understand this code much better than I, how to approach this issue.

bostontrader avatar Sep 10 '20 22:09 bostontrader

Hmmm. If you are talking about generating the correct SQL query in resolvers, there is dataloader-rs and juniper-eager-loading as two strategies to prevent inefficiencies.

If you want to translate an SQL query to a GraphQL query, you don't need juniper at all as it would just take in the translated GraphQL query.

If you want to translate the GraphQL schema to a SQL tables (similar to hasura), you can get the schema as a graphql-parser AST and translate from that.

What do you want to do?

LegNeato avatar Sep 11 '20 08:09 LegNeato

@LegNeato, thanks for the insight. I have what I guess is a very common issue. I have a bunch of MySQL tables that I want to access and I want to be able to provide a GraphQL interface. I'm using the Rocket framework.

Given the GraphQL query, I want to build a single SQL query that will do what I want. I think that given access to the GraphQL AST I can do this and thus graphql-parser looks like a promising lead. I haven't studied dataloader-rs in sufficient depth to conclusively rule it out but my initial impression is that it's too complicated to figure out for my application. (Not that I cringe at figuring out complicated things, but, you know...) At this time, producing the One SQL query strikes me as the easiest path forward.

Given all this new found learning, I'm going to guess that the best way to do this would be inside the http route handler for GraphQL. An example:

#[rocket::post("/graphql", data = "<request>")]
fn post_graphql_handler(
    context: State<M::JunDatabase>,
    request: juniper_rocket::GraphQLRequest,
    schema: State<Schema>,
) -> juniper_rocket::GraphQLResponse {

    request.execute_sync(&schema, &context)
}

At this point I have my db connection available in context. I will guess that the GraphQL AST should be available from GraphQLRequest and I could therefore make my SQL here, and feed it to the GraphQL resolver via the context. This strikes me as kinda hacky and contrary to the GraphQL Way but I think it would work.

Failing that, the next place to hook in might be in the resolver itself. An example snippet:

#[juniper::graphql_object(
    Context = JunDatabase
    Scalar = juniper::DefaultScalarValue,
)]

// The root query object of the schema
impl Query {

    #[graphql(arguments(category_ids(description = "A list of relevant category ids.")))]
    fn distributions(&self, database: &JunDatabase, category_ids: Vec<i32>) -> String {
        // blah, blah, blah...
    }
}

The only entre into GraphQL here is via &self. Is there any access to the AST from here?

Thanks again for the insight.

bostontrader avatar Sep 14 '20 22:09 bostontrader

I'm new to rust so this might be horrendous code, but i got a query only selecting the required fields working.

I hope this helps.

let client = ctx.db.get().await.unwrap();
let mut fields: Vec<&str> = Vec::new();
let mut field_map: HashMap<&str, usize> = HashMap::new();

let mut insert_field = |field| {
    if !field_map.contains_key(field) {
        field_map.insert(field, fields.len());
        fields.push(field);
    }
};

let look_ahead = executor.look_ahead();
for child in look_ahead.children() {
    let child_name = child.inner_name();
    if child_name != "__typename" {
        insert_field(child_name);
    }
}

let result = client
    .query(
        &format!(
            "SELECT {} FROM entry ORDER BY time DESC OFFSET $1 FETCH NEXT $2 ROWS ONLY",
            fields.join(", ")
        )[..],
        &[&((page * per_page) as i64), &(per_page as i64)],
    )
    .instrument(tracing::info_span!("query_entries"))
    .await;

I had to make a change to juniper, tho. https://github.com/graphql-rust/juniper/compare/master...tuco86:master

tuco86 avatar Sep 29 '20 23:09 tuco86

@bostontrader In your last comment, you have the full query in your post_graphql_handler, so you should have all the info needed to generate your SQL query, and then couldn't you just pass it to the resolver through (a somewhat more complicated) Context?

I'm a little confused about this thread -- are you also looking for tips on how to programmatically generate a SQL query for an arbitrary nested GraphQL query, or are you more interested in the nuts and bolts of how to hook into Juniper?

For reference, I've been doing some work around having my schema source of truth in Rust structs, and then automatically generating as much as possible (SQL schema, boilerplate parts of SQL and GraphQL queries, Apollo Client frontend code) from that for simple flat queries that return these structs. I don't yet have a need to do this for more complex nested graph queries, and the SQL I need there is esoteric enough that I write it by hand anyway.

trevyn avatar Sep 30 '20 00:09 trevyn

Generally this is called the "N+1" problem in GraphQL.

Juniper provides lookaheads and it is up to you to decide how best to implement your mitigation logic.

There are roughly two strategies folks have standardized on for GraphQL:

  1. Dataloaders, where each user gets their own mini cache that only lasts for the request.
  2. Eager loading, where all data for the whole query is loaded up front / at the top-level object and then passed down to the resolvers. The resolvers merely return data that has been already queried "above" them.

We have docs on dataloaders at https://graphql-rust.github.io/juniper/master/advanced/dataloaders.html and there is https://docs.rs/juniper-eager-loading/0.5.1/juniper_eager_loading/. Raw lookaheads don't have many docs right now so if you go that route please contribute some back!

LegNeato avatar Oct 06 '20 07:10 LegNeato

@bostontrader not sure if you are still interested, but this might help https://github.com/prisma/quaint

jmsegrev avatar Jan 13 '21 12:01 jmsegrev

@jmsegrev Thanks for the tip! This looks very interesting. I'll take a closer look.

bostontrader avatar Jan 13 '21 20:01 bostontrader

@LegNeato have you seen anything that would take a graphql query like in the example in juniper_eager_loading

query SomeQuery {
    allUsers {
        country {
            id
        }
    }
}

and turn it into

select c.id from users u
join countries c on u.country_id=c.id 

and I'm mainly talking about doing the join instead of using a caching solution like a dataloader or query created by eager loading ie.

select * from users
select * from countries where id in (?, ?, ?, ?)

This is something that I try to find a solution to every time I look into graphql but I can't seem to find it.

triptec avatar Sep 22 '21 21:09 triptec

Seems I'm looking for something like https://github.com/join-monster/join-monster for rust.

triptec avatar Sep 22 '21 22:09 triptec

So I've been digging through Prisma's code, and I found Prisma's core Rust engines, which includes:

  1. query-engine which parses a couple different formats, including GraphQL, into SQL code for all supported platforms,
  2. migration-engine-cli which takes Prisma's model files (which look suspiciously like a cross between SQL and GraphQL schema definitions) and translates them to SQL DDL
  3. introspection-engine which can query SQL databases to pull out their schemas

The only real reasons as to why this isn't already being used in Juniper and async-graphql servers as CRUD automation for databases is that it uses Prisma schema language (no actual GraphQL schemas) and that the query/mutation syntax is... Odd to say the least.

Still, this is really intriguing.

ronasuti avatar Nov 02 '21 16:11 ronasuti

To throw in another interesting project with a similar solution into the mix: https://github.com/graphprotocol/graph-node/tree/master/graphql%2Fsrc

"The Graph" indexes and persists Ethereum data, which is a slightly different use-case, but essentially offers a dynamic GraphQL API on top, translating requests to SQL.

adzialocha avatar Nov 25 '21 21:11 adzialocha