graphql-platform
graphql-platform copied to clipboard
Overfetching problem when querying related entities and using intermediary projections
Following https://github.com/ChilliCream/hotchocolate/issues/2365#issuecomment-697727508
I have Book and Author domain classes + BookDto and AuthorDto classes, I want to expose an IQueryable<BookDto> for the books field on my query root type.
So here is the GetBooks method on the Query class: It simply projects Book into BookDto and returns an IQueryable<BookDto> so that Hot Chocolate could then do its magic on it.
[UseSelection]
public IQueryable<BookDto> GetBooks([Service]AppDbContext dbContext)
{
return dbContext.Books.Select(book => new BookDto
{
Name = book.Name,
...
Author = new AuthorDto
{
Id = book.Author.Id,
Name = book.Author.Name,
}
});
}
Now, when you run the following query:
{
books {
name
author {
name
}
}
}
You would probably expect the following SQL to ultimately get generated:
SELECT [b].[Name], [a].[Name]
FROM [Books] AS [b]
INNER JOIN [Authors] AS [a] ON [b].[AuthorId] = [a].[Id]
However, instead, Hot Chocolate causes this one to get generated by EF Core:
SELECT [b].[Name], [a].[Id], [a].[Name], [a].[Age]
FROM [Books] AS [b]
INNER JOIN [Authors] AS [a] ON [b].[AuthorId] = [a].[Id]
Which retrieves all the fields on the author (including Age and Id in this case), even though the query requested only the Name.
At first, I thought this must be an EF Core issue, but then I executed the following code:
var queryableDto = dbContext.Books.Select(book => new BookDto
{
Name = book.Name,
Author = new AuthorDto
{
Id = book.Author.Id,
Name = book.Author.Name,
Age = book.Author.Age
}
});
// The code that you would expect Hot Chocolate will generate for the above GraphQL query:
queryableDto.Select(bDto => new
{
Name = bDto.Name,
Author = new
{
Name = bDto.Author.Name,
}
}).ToList();
And I realized it works as expected, and generates the following SQL which only selects the requested column (author name in this case):
SELECT [b].[Name], [a].[Name]
FROM [Books] AS [b]
INNER JOIN [Authors] AS [a] ON [b].[AuthorId] = [a].[Id]
Therefore, this must be related to Hot Chocolate, I believe.
This is really a serious problem for situations like mine when we don't want to expose IQueryable<TheEntityFrameworkModelClass> but rather a different type.
I'm using Hot Chocolate version 10.5.2 And I've tried this with both EF Core version 3.1.8 and 5.0.0 RC.
i will track this down in the new implementation for 11.
It should just work, because we prjoect like this:
{
books {
name
author {
name
}
}
}
.Select(x=> new BookDto
{
Name = x.Name,
Author = new AuthorDto
{
Name = x.Author.Name,
}
})
@PascalSenn Ok, great to hear. Then we'll have to wait for v11 before we can deploy our app so that this issue is fixed until then. Thanks.
It should just work, because we prjoect like this:
Well, as I mentioned here, running the code that you said that Hot Chocolate would generate from the GraphQL query:
queryableDto.Select(bDto => new
{
Name = bDto.Name,
Author = new
{
Name = bDto.Author.Name,
}
}).ToList();
would NOT generate the SQL that Hot Chocolate causes to be generated. I talked about this in the original comment and also here. So I think that's probably not exactly how the projection works in Hot Chocolate. And I feel like this might be a little mistake or something somewhere, it's probably not that big of a deal on your side, I don't know, I'm just guessing! But it is, however, a very big deal for our side! Since all the data of a related entity gets fetched unnecessarily when you request even just one column. So, I'd appreciate it if you look into this in the coming days! Thank you so much.
Does this query generate the desired result?
queryableDto.Select(x=> new BookDto
{
Name = x.Name,
Author = new AuthorDto
{
Name = x.Author.Name,
}
}).ToList();
@PascalSenn Yes, that query would generate the following SQL:
SELECT [b].[Name], [a].[Name]
FROM [Books] AS [b]
INNER JOIN [Authors] AS [a] ON [b].[AuthorId] = [a].[Id]
which is perfect. It only retrieves the name of the book and the name of the author, as requested.
However, sending this GraphQL query to our Hot Chocolate server:
{
books {
name
author {
name
}
}
}
would generate this SQL instead:
SELECT [b].[Name], [a].[Id], [a].[Name], [a].[Age]
FROM [Books] AS [b]
INNER JOIN [Authors] AS [a] ON [b].[AuthorId] = [a].[Id]
which as you can see is unnecessarily fetching the Id and Age of the author, while they were not queried.
I can also create a public GitHub repo and push this project into it so you can clone it and look into it and into the generated queries yourself. A super-simple project with less than 15 classes in total.
Yes a repro would be great
Are there any custom resolvers on author?
@PascalSenn Sure! I'll do it and let you know. And no, nothing, it's a just class with a handful of properties. I've added nothing HotChocolate-specific to it at all. I'll create the repo shortly ;)
@PascalSenn Here you go, sir: https://github.com/AradAral/AspNetCoreGraphQL
Edit: I added some seed data for the database so if you want to clone the project you can just adjust the connection string and then run one initial migration, as you know, and then you can use the GraphQL endpoint and inspect the generated SQL queries. I also added a MyController, which manually does the querying; you can check the query that that generates too.
Thanks in advance.
@PascalSenn: :)

@AradAral cool, the repro is definitely helpful. Checked the configuration didn't see anything obvious wrong with it.
@PascalSenn I just noticed something else which I think might be important:
I removed the UseSelection middleware from the GetBooks method, but it still gave me the author data if I queried it, which normally wouldn't happen unless you use the UseSelection middleware.
Do you think that can tell us anything about the issue?
Did you also remove it from the resolver declared in the QueryType?
It is probably always requested in this case, because Select statement that maps from object to dto projects it already
Yes, I actually removed the whole QueryType class and it's now just the Query class, without the UseSelection attribute on top of the resolver.
Now that the UseSelection middleware is not present irrespective of what field I query everything (including the author data) is retrieved.
@PascalSenn
I think I just found out what's causing the problem:
By building an expression visitor and inspecting the expression that Hot Chocolate passes to the Select method, I just realized that it (the expression Hot Chocolate generates) looks like this:
e => new BookDto()
{
Name = e.Name,
Author = e.Author == null ? default(AuthorDto) : new AuthorDto() { Name = e.Author.Name }
}
The important bit is where it's checking whether Author is null or not (e.Author == null ? ...), that, as it turns out, is exactly what's causing the whole author object to be fetched from the database. This works properly with EF Core domain classes, in other words, if e.Author was an entity class, but not with DTO classes like this.
So, now, what you do think can be done about this? Does Hot Chocolate allow us to disable null checks so that it leaves it up to me to check for nulls, or perhaps do the null check differently so as to avoid this, or really anything that could solve this problem somehow? Thank you in advance!
Based on the analysis above, I'll close this issue.
For anyone curious, I eventually ended up writing a custom expression visitor that replaces any occurrence of e.Author == null that Hot Chocolate generates to e.Author.Id == null, so as to avoid the aforementioned over-fetching problem.
@PascalSenn Hi! I didn't know that, what change did you make to fix this? I thought there was nothing you could do about it.
Sorry i messed something up. This could still be an issue
we just fixed UseProjections to do this:
Author = e.Author == null ? default(AuthorDto) : new AuthorDto() { Name = e.Author.Name }
I need to check what it does with reprojection
@PascalSenn
But isn't Author = e.Author == null ? default(AuthorDto)... what HC already did in the previous versions? What has changed exactly?
Yes, but UseProjection did something different. It returned null values. I was thinking this issue was related to that, then i read more into it :) sorry for the messup
@PascalSenn Aha! Okay, got it! No problem ;)
It looks like there's still an issue on deeply nested entities.
[UseProjection]
public IQueryable<BookDto> GetBooks([Service]AppDbContext dbContext)
{
return dbContext.Books.Select(book => new BookDto
{
Name = book.Name,
...
Author = new AuthorDto
{
Id = book.Author.Id,
Name = book.Author.Name,
Address = new AddressDto
{
Street = book.Author.Address.Street
}
}
});
}
where the Address is still joined when we only query like so
{
books {
name
author {
name
}
}
}
@gojanpaolo Hi!
It's actually not just about deeply nested objects. Whenever you have something like [Something] = new [Something]Dto { } in your projection that you then return in your resolvers, even one field getting requested by the GraphQL client on that [Something]Dto causes the whole object to be retrieved from the database. I explained why that happens here.
The reason why in your case Address is being retrieved is because, again, you're querying Author.Name and consequently the whole Author object is getting retrieved, which means everything inside of it is getting retrieved, including every nested object, like Address in your case.
I was expecting the projection to behave similar to
var dto =
dbContext.Books.Select(book => new BookDto
{
Name = book.Name,
...
Author = new AuthorDto
{
Id = book.Author.Id,
Name = book.Author.Name,
Address = new AddressDto
{
Street = book.Author.Address.Street
}
}
});
var result =
dto.Select(bookDto => new
{
Name = bookDto.Name,
Author = new
{
Name = bookDto.Author.Name
}
})
.ToList();
which EF correctly translates to the SQL query I expected, i.e. no join on the Address.
I believe this is something we can fix either in hotchocolate/UseProjection itself, or in my graphql layer code.
Im also experiencing overfetching:
the query
users { id }
works fine and produces the following SQL
SELECT [u].[Id]
FROM [User] AS [u]
whereas this query
users {
id
author { name }
}
unecessary loads more data:
SELECT [u].[Id], [a].[Id], [a].[Name], [a].[UserId], [b].[Id], [b].[AuthorId], [b].[Title]
FROM [User] AS [u]
LEFT JOIN [Author] AS [a] ON [u].[Id] = [a].[UserId]
LEFT JOIN [Book] AS [b] ON [a].[Id] = [b].[AuthorId]
ORDER BY [u].[Id], [a].[Id], [b].[Id]
[UseProjection]
public IQueryable<User> Users([Service] DataContext ctx)
{
return ctx.User
.Include(v => v.Author)
.ThenInclude(v => v.Books);
}
Using efcore 5 and hot chocolate 11
I see this issue was closed alongside a load of AutoMapper tickets, but this doesn't seem related to AutoMapper? I'm having the same issue here using projections, where all fields are requested in the SQL query, despite only requesting one field of a sub-object
@brickscrap can you share your resolver and query?
Ignore me - despite trying to figure it out for hours before commenting here, 10 minutes after posting I found the problem. I had UseProjection enabled on the parent object, and this seemed to be causing issues. Removed it, and it's all working as expected now.
Sorry for bothering you
I am getting the same issue where it is fetching all the columns in child instead of what is selected.
query (im using serial execution)
public class Query
{
[UseProjection]
public IQueryable<PersonDto> GetPersons([Service] SomeDbContext dbContext)
{
return dbContext.Persons.Select(person => new PersonDto
{
Id = person.Id,
FirstName = person.FirstName,
LastName = person.LastName,
Guardian = person.Guardian != null ? new GuardianDto
{
Id = person.Guardian.Id,
FirstName = person.Guardian.FirstName,
LastName = person.Guardian.LastName
} : default
});
}
}
dtos (objects)
public class PersonDto
{
public int Id { get; set; }
public string FirstName { get; set; } = null!;
public string LastName { get; set; } = null!;
public GuardianDto? Guardian { get; set; } = null!;
}
public class GuardianDto
{
public int Id { get; set; }
public string FirstName { get; set; } = null!;
public string LastName { get; set; } = null!;
}
selections
query {
persons {
id
firstName
guardian {
id
firstName
}
}
}
sql
SELECT "p"."Id", "p"."FirstName", "g"."Id" IS NOT NULL, "g"."Id", "g"."FirstName", "g"."LastName"
FROM "Persons" AS "p"
LEFT JOIN "Guardians" AS "g" ON "p"."GuardianId" = "g"."Id"
here's a repo if you need the fullest context: https://github.com/nollidnosnhoj/testchocolate
EDIT: another thing interesting is when i did this with a collection of children dtos, it projected correctly.
For anyone still having this problem, I explained the reason why it's happening in this comment.
I think since this pattern of having "intermediary projections" (like DTOs) is rather common when using Hot Chocolate along with Entity Framework, it would make sense if Hot Chocolate provided a configuration setting or something on its projection engine to allow the developer to choose how the existence or non-existence of a single related entity is determined in the projection.
So let me reopen this issue: