Enforce resource limits during query validation: pagination limits, query cost calculations
I would like the ability to enforce pagination limits at the router level, without delegating to subgraphs
Here's an example:
query Comments($cursor: String) { Comments(first: 99999999999999999, after: $cursor) { ...
In other words, I am looking to implement a basic version of https://docs.github.com/en/graphql/overview/resource-limitations at the router level
Related discussion in https://github.com/apollographql/router/discussions/1246
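As a rough illustration of the check requested above, the sketch below walks a query that has already been parsed into nested dictionaries and reports any `first`/`last` argument above a limit. The dictionary shape, `MAX_PAGE_SIZE`, and `oversized_pages` are hypothetical names made up for this example. They are not part of the router or of any plugin API, and the elided selection set of the original query is left empty.

```python
from typing import Iterator

MAX_PAGE_SIZE = 100  # hypothetical router-level limit
PAGINATION_ARGS = ("first", "last")


def oversized_pages(selections: list, path: str = "") -> Iterator[str]:
    """Yield one message per field whose first/last argument exceeds the limit."""
    for field in selections:
        field_path = f"{path}.{field['name']}" if path else field["name"]
        for arg in PAGINATION_ARGS:
            value = field.get("args", {}).get(arg)
            # Only literal integers are checked; variables would have to be
            # resolved against the request's variable values first.
            if isinstance(value, int) and value > MAX_PAGE_SIZE:
                yield f"{field_path}({arg}: {value}) exceeds the limit of {MAX_PAGE_SIZE}"
        # Recurse so nested connections are checked as well.
        yield from oversized_pages(field.get("selections", []), field_path)


# The query from the example above, in the assumed dictionary form.
query = [{"name": "Comments", "args": {"first": 99999999999999999, "after": "$cursor"}}]
errors = list(oversized_pages(query))
print(errors)
```

A request would be rejected during validation whenever `errors` is non-empty, before anything is sent to a subgraph.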
Hi and thanks for raising this issue. I don't think this is a feature which would be available by default in the router. It seems very specialised. I've put some details into the discussion you started.
Thank you @garypen for the detailed response in the discussion thread.
When you say it won't be available by default in the router, do you mean any sort of resource limiting such as points-based query cost, quotas or rate limiting won't be implemented at the router level by default?
Are router users then expected to delegate resource control to subgraphs or is there something else to try? (other than writing a non-trivial custom plugin)
I'm not saying the router won't support any kind of resource limiting, it was more a comment on this exact form of resource limiting. Best practice is still an emerging concept in this area (although, I note that https://ibm.github.io/graphql-specs/cost-spec.html looks interesting) and I think it will take a while to settle down yet.
I do believe there is a role for a resource limiting mechanism in the router at some point in the future.
Just out of curiosity, how do you expect this to be implemented without implementing a custom plugin in the future? This seems to be a very specialized thing. At my company we handle query validation at the router level to ensure that the query the user is sending is a "registered" query (which sorta defeats the purpose of GraphQL in a sense). Then, we do variable validation using GraphQL directives on the subgraphs.
For example, we have a pagination directive that specifies a maximum page size on specific fields, the validation happens during query analysis stage. For context, we use gqlgen.
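The comment above does not say how the "registered" query check is implemented. One common approach, assumed here purely for illustration, is an allowlist of query hashes. `REGISTERED`, `query_id`, and `is_registered` are hypothetical names.

```python
import hashlib


def query_id(query: str) -> str:
    """Hash a query after collapsing whitespace so formatting differences do not matter."""
    # Naive normalization: this also collapses whitespace inside string literals.
    normalized = " ".join(query.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


# Hypothetical allowlist, built ahead of time from the queries clients ship with.
REGISTERED = {query_id('query { product(id: "abc123") { id sku } }')}


def is_registered(query: str) -> bool:
    """Return True only for queries whose hash is on the allowlist."""
    return query_id(query) in REGISTERED


print(is_registered('query {\n  product(id: "abc123") { id sku }\n}'))
print(is_registered('query { product(id: "abc123") { id sku package } }'))
```

With a check like this in front, arbitrary queries never reach validation, which is why per-argument limits can then be left to directives on the subgraphs.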
@LockedThread wrote:
"we have a pagination directive that specifies a maximum page size on specific fields, the validation happens during query analysis stage."
Are you also analyzing in a way that's similar to the style that @andrew-kolesnikov used? (e.g., are you using the Connection specification or something custom?)
If you could provide an example, that would be great. This use case certainly sounds interesting!
Adding @lleadbet for visibility. Our current challenge has grown beyond pagination limits: we're looking to port query cost calculations from Node.js, and we want something like https://github.com/slicknode/graphql-query-complexity for the Rust router. I think that could benefit a lot of folks too, so any suggestions would be much appreciated.
Ping @lennyburdette I think he's currently experimenting with a related topic
I apologize for the lack of responsiveness to this, I guess it just slipped through my notifications. Anyways, here is a public example of this use case from a project I am a part of:
Schema Definition: https://github.com/KnightHacks/knighthacks_users/blob/12ce0bb9b608091f6517fe071e11a848908c28db/graph/schema.graphqls#L12
Example Usage: https://github.com/KnightHacks/knighthacks_users/blob/12ce0bb9b608091f6517fe071e11a848908c28db/graph/schema.graphqls#L130
For the actual pagination we use the Connection specification. On one of my previous projects I tweaked the Connection specification to allow for circular pagination along with using this strategy.
We use https://github.com/99designs/gqlgen for our graphql server library.
I just had an idea of how to implement this limit. Every field/query/mutation on the graph would have a cost value associated with it, and a max cost would be defined in the router config. Potentially it could be implemented with directives, though I am not sure whether exposing the cost values in the schema is a good idea, since prospective attackers could see them too.
I am sure there are other systems that accomplish this but it hasn't been done with any federated systems because these are all implemented in the graphql server libraries themselves.
Examples
Schema Definition (modified version of this)
directive @cost(
  value: Float,
  calculatedValue: String
) on FIELD_DEFINITION | ARGUMENT_DEFINITION | INPUT_FIELD_DEFINITION | ENUM_VALUE

type ProductVariation {
  id: ID!
}

type ProductDimension {
  size: String
  weight: Float
  # Does math so the cost is higher
  volume: Float @cost(value: 10.0)
}

type Product {
  id: ID!
  sku: String
  package: String
  # Variation and Dimensions are fields that require another
  # database call in the subgraph, therefore their cost is higher.
  variation: ProductVariation @cost(value: 2.0)
  dimensions: ProductDimension @cost(value: 2.0)
}

type ProductConnection {
  totalCount: Int
  pageInfo: PageInfo!
  products: [Product!]!
}

type PageInfo {
  startCursor: String!
  endCursor: String!
}

type Query {
  # Uses the calculatedValue directive field which supports math and string interpolation with variables
  products(first: Int!, after: ID): ProductConnection! @cost(calculatedValue: "1.2 * $first")
  product(id: ID!): Product @cost(value: 1.0)
}
Query (Cost: 1.0)
{
  product(id: "abc123") {
    id
    sku
    package
  }
}

Query (Cost: 5.0)
{
  product(id: "abc123") {
    id
    sku
    package
    variation { id }
    dimensions { size }
  }
}

Query (Cost: 264)
products cost: 1.2 * 20 = 24
dimensions cost: 2.0 * 20 = 40
volume cost: 10.0 * 20 = 200
{
  products(first: 20) {
    products {
      id
      sku
      package
      dimensions {
        volume
      }
    }
  }
}
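The arithmetic in these examples can be sketched in a few lines. This is only an illustration of the idea, not an implementation from the thread. It assumes the query has already been parsed and each selected field resolved to a `Type.field` coordinate, that `first` is a literal integer, and that everything selected under a paginated field is multiplied by `first`. `COSTS`, `MAX_COST`, `evaluate`, `query_cost`, and `f` are hypothetical names.

```python
import ast
import operator
import re

# Costs taken from the @cost directives in the schema above, keyed by
# "Type.field". A string stands in for a calculatedValue expression.
COSTS = {
    "Query.product": 1.0,
    "Query.products": "1.2 * $first",
    "Product.variation": 2.0,
    "Product.dimensions": 2.0,
    "ProductDimension.volume": 10.0,
}
MAX_COST = 100.0  # hypothetical limit that would live in the router config

_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv}


def evaluate(expression: str, args: dict) -> float:
    """Evaluate an expression such as "1.2 * $first" without calling eval()."""
    substituted = re.sub(r"\$(\w+)", lambda m: repr(args[m.group(1)]), expression)

    def walk(node: ast.AST) -> float:
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return float(node.value)
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        raise ValueError(f"unsupported expression: {expression!r}")

    return walk(ast.parse(substituted, mode="eval"))


def query_cost(selections: list, multiplier: float = 1.0) -> float:
    """Sum field costs, scaling everything under a paginated field by `first`."""
    total = 0.0
    for field in selections:
        args = field.get("args", {})
        spec = COSTS.get(field["field"], 0.0)  # fields without @cost are free
        own = evaluate(spec, args) if isinstance(spec, str) else spec
        total += own * multiplier
        total += query_cost(field.get("selections", []), multiplier * args.get("first", 1))
    return total


def f(field, args=None, selections=()):
    """Small helper for writing the pre-resolved selections by hand."""
    return {"field": field, "args": args or {}, "selections": list(selections)}


single = [f("Query.product", {"id": "abc123"},
            [f("Product.id"), f("Product.sku"), f("Product.package")])]
page = [f("Query.products", {"first": 20}, [
    f("ProductConnection.products", selections=[
        f("Product.id"), f("Product.sku"), f("Product.package"),
        f("Product.dimensions", selections=[f("ProductDimension.volume")]),
    ]),
])]

print(query_cost(single), query_cost(page), query_cost(page) > MAX_COST)
```

Under these assumptions the first and last queries should come out at the 1.0 and 264 figures worked out above, and the last one would be rejected against a limit of 100.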
I published https://github.com/apollosolutions/router-basic-operation-cost yesterday as a starting point for demonstrating cost and depth limiting. I’m not at all satisfied with the cost analysis (it doesn’t take lists or abstract types into account) but hopefully it’s somewhat helpful.
Does anyone in this thread have any feedback they want to share about @lennyburdette's solution?
I have feedback! It's not really a query cost analysis algorithm, it's really just a weighted field counter. Flaws include:
- it doesn't implement the field collection algorithm so it will count repeated fields (e.g. from different fragments) unnecessarily
- it doesn't take lists into account, so it will severely undercount cost when fields are resolved repeatedly
- it doesn't take abstract types into account
- it doesn't take @skip or @include into account
- and I'm still waiting for the improvement to apollo-compiler that lets us avoid re-parsing the entire schema on each request
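Two of the flaws listed above, repeated fields and `@skip`/`@include`, can be illustrated with a small sketch. It is not taken from router-basic-operation-cost. It assumes fragment spreads have already been inlined into plain selection lists, it merges fields by response key only (the field merging rules in the GraphQL spec also compare arguments), and it ignores type conditions on abstract types. All names are hypothetical.

```python
def included(field: dict, variables: dict) -> bool:
    """Apply @skip/@include; a directive value is a bool or a "$variable" reference."""
    def resolve(value):
        if isinstance(value, str) and value.startswith("$"):
            return variables[value[1:]]
        return value

    directives = field.get("directives", {})
    if resolve(directives.get("skip", False)):
        return False
    return bool(resolve(directives.get("include", True)))


def collect_fields(selections: list, variables: dict) -> dict:
    """Merge fields that share a response key so each one is counted once."""
    merged: dict = {}
    for field in selections:
        if not included(field, variables):
            continue
        key = field.get("alias", field["name"])
        entry = merged.setdefault(key, {"name": field["name"], "selections": []})
        entry["selections"].extend(field.get("selections", []))
    return merged


def count_fields(selections: list, variables: dict) -> int:
    """Count distinct response fields after merging and directive filtering."""
    total = 0
    for entry in collect_fields(selections, variables).values():
        total += 1 + count_fields(entry["selections"], variables)
    return total


# `product` selected twice, as if it came from two different fragments.
selections = [
    {"name": "product", "selections": [{"name": "id"}, {"name": "sku"}]},
    {"name": "product", "selections": [
        {"name": "sku"},
        {"name": "package", "directives": {"skip": "$brief"}},
    ]},
]
print(count_fields(selections, {"brief": True}))
print(count_fields(selections, {"brief": False}))
```

A counter that skips the merge step would see six fields in this selection regardless of `$brief`. A cost function could walk the merged result instead of the raw selections to avoid that overcount.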