lighthouse
lighthouse copied to clipboard
Select only necessary columns for optimized queries
What problem does this feature proposal attempt to solve? I am trying out GraphQL, testing various features in order to include it in some of the web apps I am developing with laravel. Being curious, I created a Schema and fired up all sorts of queries, followed by poking around in the SQL server to see what was going on.
What I discovered was this:
Suppose I have a Type defined which corresponds to a Model, and the database table has an ID column and 26 more columns (A, B C, ..., Z).
No matter what the Type definition is, which in my case only includes some of the columns ( say, ID and A, B, C), I can see that the SQL server is asked to fetch all 27 columns from the DB, using a SELECT * FROM [Table] statement, even though given the definition of the type at most 4 of them would be returned via the resolver.
This is an unnecessary data transfer between the DB and PHP, in my opinion.
Which possible solutions should be considered?
I propose that we construct a more refined SQL query, selecting only the columns that are requested in the GraphQL query.
I am not sure I am competent enough to try and implement this on my own, but I will give it a shot. In any case, I thought I'd share my thoughts.
Technically, this can be done by constraining the SELECT
to the requested fields. The fourth resolver argument ResolveInfo
can be used to know which fields a client selected.
There are some pitfalls to this which require discussion:
- Fields may not match database columns 1-1
- Laravel assumes
SELECT *
by default for things like relations, virtual properties, etc.
Technically, this can be done by constraining the
SELECT
to the requested fields. The fourth resolver argumentResolveInfo
can be used to know which fields a client selected.There are some pitfalls to this which require discussion:
* Fields may not match database columns 1-1 * Laravel assumes `SELECT *` by default for things like relations, virtual properties, etc.
Would you be so awesome as to provide an example? I've searched through the docs but I can't seem to find what I am looking for :-(
Let's say I have a model Nerd
with the fields glasses
and beard
,
type Nerd {
glasses: Int!
beard: String!
}
and I want to query them with
type Query {
nerds: [Nerd] @all
}
and I only want the SQL statement to be SELECT glasses, beard FROM nerds
. Thanks in advance!
public function __invoke($root, array $args, GraphQLContext $context, ResolveInfo $resolveInfo)
{
$resolveInfo->getFieldSelection();
$resolveInfo->lookAhead();
// TODO use the information from those methods to SELECT only the given fields
}
public function __invoke($root, array $args, GraphQLContext $context, ResolveInfo $resolveInfo) { $resolveInfo->getFieldSelection(); $resolveInfo->lookAhead(); // TODO use the information from those methods to SELECT only the given fields }
Thanks! Just one more stupid question: where do I put that method? On the Eloquent model?
https://lighthouse-php.com/master/the-basics/fields.html
Awesome, thanks!
This kind of feature would be a huge performance improvement for the package. I'm implementing an endpoint for multi million record table with at least 50+ columns. I'm using query builder to make search in a given bounding box (https://wiki.openstreetmap.org/wiki/Bounding_Box) in PostgreSQL also implemented complex where conditions to filter the results. Only issue is Laravel generates SELECT * queries which affects the performance. I could use custom resolvers but when I do that I have to deal with pagination.
Following is my current Query schema:
listings(bbox: BoundingBoxInput! @builder(method: "App\\GraphQL\\Builders\\BoundingBoxSearch@bbSearch"), where: _ @whereConditions(columnsEnum: "ListingColumn")) : [Listing!]! @middleware(checks: ["auth:api"]) @paginate(maxCount: 50 defaultCount: 10)
Is there any way I can modify select columns without writing custom resolver ? Because by this way with very little code I can do so much.
I am not planning to implement such functionality in Lighthouse myself, but am open to accepting PRs.
I am not planning to implement such functionality in Lighthouse myself, but am open to accepting PRs.
I understand the complexity and side affects like you mentioned above about relations and virtual properties. It would be nice to have some way to control select columns like in query builder @builder directive. Thanks, even like this Lighthouse is very useful.
Just noticed this is a duplicate of https://github.com/nuwave/lighthouse/issues/1214. An implementation attempt was also started https://github.com/nuwave/lighthouse/pull/1253
Can't quite see how this is duplicate.
Could this be solved by implementing a select
directive?
class User extends Model
{
public function getNameAttribute()
{
return $this->first_name . ' ' . $this->last_name;
}
}
type User {
id: ID!
first_name: String
last_name: String
name: String @select(fields: ["first_name", "last_name"])
}
@andershagbard right, there is some overlap between the issue, but the intent is different.
Could this be solved by implementing a select directive?
That would be one part of the puzzle, good thinking.
We might need a mechanism to make the optimized selects opt-in per field / type. Otherwise, it could be quite hard to migrate existing applications towards it, since now Lighthouse always selects all the fields.
We might need a mechanism to make the optimized selects opt-in per field / type. Otherwise, it could be quite hard to migrate existing applications towards it, since now Lighthouse always selects all the fields.
Trying to think of a good solution for this.
Could this be an opt-in in the config file? If enabled it would use the field(s) in the @select, if @select is absent, use the field name.
We might do that as an MVP. I am a bit worried about migrating big applications towards optimized selects, but maybe it is not a big issue in practice.
I do not plan to implement such a feature myself, but am happy to aid in reviewing and refining a PR.
@spawnia I am working on an implementation for this, and I am having trouble accessing child directives via $resolveInfo
. Accessing the directives property on any selection in a selectionSet returns an empty NodeList.
@spawnia I am working on an implementation for this, and I am having trouble accessing child directives via
$resolveInfo
. Accessing the directives property on any selection in a selectionSet returns an empty NodeList.
That would give you access to client directives.
This feature requires looking at schema directives. Look at other Lighthouse directives to see how that works.
Could you give me an example of a parent directive searching through child directives? I can't think of any off the top of my head.
I do not know what you mean by parent or child directives. Consider the FieldValue
class and its uses, maybe you will find what you need there.
Try posting a draft PR and we can discuss implementation details.
I got to the point that I have created at trait that get the fields that should get selected but I am facing a serious issue with relations I can't apply the select method to belongsTo and hasOne and HasMany directives for some reason is there's any easier way to make the select rule more generic and we can apply it to all directives cause I don't see a way in here
Hi all! other thing that I notice is about @count
and @paginate
:
if in the @count
directive we add the columns to include in the count()
method, you can choose some "indexed fields", this improve the performance
directive @count(
"""
.....
"""
"""
Columns
"""
columns: [String!]
) on FIELD_DEFINITION
GRAPHQL;
something similar for @paginate
might improve the performance, but in this case it's more complicated:
https://github.com/nuwave/lighthouse/blob/master/src/Pagination/PaginationArgs.php#L107
if we modify the below line to include the columns for the count process, the column
parameter only affects the return columns and doesn't affect the count that execute the paginator.
return $builder->{$methodName}($this->first, ['row_id'], 'page', $this->page);
This is because the Laravel paginate
method don't use the columns in the count execution. This is more a question for the Laravel issue queue, but maybe add another parameter in the paginate
method to modify the columns in the count process
https://github.com/laravel/framework/blob/8.x/src/Illuminate/Database/Eloquent/Builder.php#L808 https://github.com/laravel/framework/blob/8.x/src/Illuminate/Database/Query/Builder.php#L2445
Update: I opened a discussion about the paginate method https://github.com/laravel/framework/discussions/40114
@dmouse that's just a misconception count(*)
is not the same as SELECT *
, check this link for more details: https://www.sqlservercentral.com/articles/advice-on-using-count
public function __invoke($root, array $args, GraphQLContext $context, ResolveInfo $resolveInfo) { $resolveInfo->getFieldSelection(); $resolveInfo->lookAhead(); // TODO use the information from those methods to SELECT only the given fields }
Trying to do something similar and I found that brilliant piece you written here!! Though what is the lookahead for? Any issues with using the following?
$resolveInfo->getFieldSelection(100);
It gives the result in the most beautiful output I could have imagined to get, not sure about performance issues...
$resolveInfo->getFieldSelection(100);
That is absolutely fine, if it has the information you need. lookAhead()
gives you more detailed information about the selection, such as the types of the sub-fields and their arguments.
$resolveInfo->getFieldSelection(100);
That is absolutely fine, if it has the information you need.
lookAhead()
gives you more detailed information about the selection, such as the types of the sub-fields and their arguments.
Thank you so much for explaining that, so now I know how to access arguments of sub-fields thanks to that method! Just two questions to make my knowledge a little more about this,
- can we grab info from a parent-field in the tree from a sub-node?
$_, array $args, GraphQLContext $context, ResolveInfo $resolveInfo
$_ here gives the parent result, but can't access for instance parent args or args from parent of the parent...
- can we pass in data from upper-fields nodes to lower ones?
Let's say that I have an array which I got based on the info I was provided in an upper-field now in a sub node (in the same branch) I want to access those data, how could I pass data from upper node to lower nodes in the branch? I thought context was there for this but I was wrong!
I think those questions are off-topic, you may ask in StackOverflow or Slack.
Right now, two pull requests are trying to solve this issue:
- https://github.com/nuwave/lighthouse/pull/1626 is following a relatively simple approach but lacks tests
- https://github.com/nuwave/lighthouse/pull/2235 tries to do more magic, which I believe to be misguided
Going forward, I think the minimal viable solution for this issue is to implement the manual approach first. A marker directive needs to be added to fields to inform Lighthouse which columns need to be selected for it, e.g.:
"""
Defines which columns must be selected from the database to resolve the field.
If specified on at least one field of a type, Lighthouse will `SELECT` only the explicitly
mentioned columns matching the queried fields. For fields without this directive,
no columns will be selected, so make sure to specify this for all fields that need it.
"""
directive @select(columns: [String!]!) on FIELD_DEFINITION
type User {
id: ID! @select(columns: ["id"])
first_name: String! @select(columns: ["first_name"])
last_name: String! @select(columns: ["last_name"])
full_name: String! @select(columns: ["first_name", "last_name"])
rights: [Right!]! @hasMany
}
Only if the fields of a type are explicitly marked with @select
will Lighthouse replace the default SELECT *
with a constrained SELECT
, based upon the fields selected in the query.
{ users { id last_name full_name } }
# SELECT id, last_name, first_name
@spawnia I agree on your approach on it being an entirely opt-in feature through the @select
.
I've been following this for a while as I have one table that has a couple of large blob columns (not the best design but it's a legacy project), without constraining the SELECT
the performance is terrible. I've had to fix this through a builder that modifies the query, but some way of doing it as you've shown would be great.