efcore
Cosmos: add support for pagination
See https://docs.microsoft.com/en-us/dotnet/api/microsoft.azure.cosmos.queryrequestoptions.maxitemcount?view=azure-dotnet
Also consider exposing the continuation token
The RU charge of a query with OFFSET LIMIT will increase as the number of terms being offset increases. For queries that have multiple pages of results, we typically recommend using continuation tokens. Continuation tokens are a "bookmark" for the place where the query can later resume. If you use OFFSET LIMIT, there is no "bookmark". If you wanted to return the query's next page, you would have to start from the beginning.
/cc @roji
It seems like we should have both a per-query setting and a global context option default (like non-tracked).
Some theoretical thoughts:
- This is related to keyset pagination in the sense that this is also about avoiding OFFSET pagination.
- The rough relational feature comparable to this is cursors, i.e. execute a query via a cursor (DECLARE CURSOR FOR ...) and then fetching the next page each time (FETCH NEXT ...). In both cases we're executing a roundtrip every time to get the next page. It's interesting that this logic is the default in Cosmos but not in relational.
- We could provide a relational feature where the query is paginated: the application would continue interacting with a simple QueryingEnumerable, and EF would send FETCH NEXT as needed under the hood.
- However, I'm not sure how that's better than the application implementing explicit keyset navigation itself, i.e. using OrderBy and Where to precisely define the page they want to fetch.
- Explicit keyset pagination has the advantage of being able to move backwards (whereas a LINQ Enumerable can only move forward).
- Cursors would have to be closed when the enumerable is disposed. This means the consuming application (e.g. GraphQL) must hold state during the pagination, which doesn't fit well with the disconnected model. Explicit keyset navigation doesn't require this - you just need to provide data on the row fetched in the last page, and go from there.
- There may be some perf advantages in using cursors (same query being executed rather than multiple ones), but it isn't very clear at this point.
- Some differences between relational cursors and Cosmos continuation tokens:
- A Cosmos continuation token doesn't hold up any server-side resources, whereas a relational cursor must be closed.
- Cosmos is free to return fewer items than MaxItemCount, so if the goal is to definitely return a page of 10, the application (EF in our scenario) is responsible for refetching.
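To make the last point concrete, here is a minimal in-memory sketch of the refetch loop. FetchOnce and FetchFullPage are hypothetical helpers, not Cosmos SDK or EF Core APIs, and the token is simply the next index written as a string, whereas a real continuation token is opaque. FetchOnce caps every response at 4 items to imitate the server returning a short page.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var allItems = Enumerable.Range(1, 25).ToList();

// Hypothetical stand-in for a single round trip. It returns at most maxItemCount
// items, but never more than 4, to imitate a server that returns short pages.
(List<int> Items, string? Token) FetchOnce(string? token, int maxItemCount)
{
    int start = token is null ? 0 : int.Parse(token);
    var items = allItems.Skip(start).Take(Math.Min(maxItemCount, 4)).ToList();
    int next = start + items.Count;
    // A null token signals that there is nothing left to read.
    return (items, next < allItems.Count ? next.ToString() : null);
}

// Keeps fetching until the page is full or the results are exhausted.
(List<int> Items, string? Token) FetchFullPage(string? token, int pageSize)
{
    var page = new List<int>();
    do
    {
        var (items, nextToken) = FetchOnce(token, pageSize - page.Count);
        page.AddRange(items);
        token = nextToken;
    } while (page.Count < pageSize && token is not null);
    return (page, token);
}

var first = FetchFullPage(null, 10);
Console.WriteLine($"{first.Items.Count} items, resume token: {first.Token}");
```

Filling one page of 10 takes three round trips here (4 + 4 + 2 items), which is the cost the caller would otherwise have to handle itself.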
Very interested in this - currently using EFCore + Cosmos provider in a Blazor Server application and there isn't a very scalable way I can find to offer paging without being able to use the continuation token
Very interested in this as well. We are using the EF Core Cosmos provider at the moment and the only way to query a page is to use the 'skip and take' approach. But as has been mentioned, it's going to use more RUs for the later pages.
So we started thinking about using the .NET SDK and replacing EF queries with Cosmos continuation tokens.
Any updates? I am also having to shelve EF w/ Cosmos because of this limitation.
This feature hasn't been assigned to a particular release yet. The best way to indicate the importance of an issue is to vote (👍) for it. This data will then feed into the planning process for the next release.
Current design proposal:
public static Task<Page<TSource>> ToPageAsync<TSource>(
    this IQueryable<TSource> source,
    string? continuationToken = null,
    int? maxItemCount = null,
    int? continuationTokenLimitInKb = null,
    CancellationToken cancellationToken = default)

public static IQueryable<TSource> WithMaxItemCount<TSource>(
    this IQueryable<TSource> source,
    int maxItemCount)
See Page<T>. Also see Pagination with the Azure SDK.
Open questions:
- Naming
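For illustration only, this is roughly how a caller might drive an API of this shape. The ToPageAsync below is a hypothetical in-memory stand-in, not the EF Core method: it works on IEnumerable<T>, uses a tuple in place of Page<TSource>, omits continuationTokenLimitInKb and the cancellation token, assumes a default of 100 items when maxItemCount is null, and uses an index as the token, which a real provider would not do.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical stand-in that mirrors the proposed parameter names.
// The returned tuple plays the role of Page<TSource>.
Task<(IReadOnlyList<T> Values, string? ContinuationToken)> ToPageAsync<T>(
    IEnumerable<T> source, string? continuationToken = null, int? maxItemCount = null)
{
    int start = continuationToken is null ? 0 : int.Parse(continuationToken);
    var values = source.Skip(start).Take(maxItemCount ?? 100).ToList();
    bool more = source.Skip(start + values.Count).Any();
    string? next = more ? (start + values.Count).ToString() : null;
    return Task.FromResult<(IReadOnlyList<T>, string?)>((values, next));
}

var items = Enumerable.Range(1, 7).Select(i => $"item-{i}").ToList();

// Each iteration models a separate request: the only state carried over is the token.
string? resumeToken = null;
int pageCount = 0;
do
{
    var result = await ToPageAsync(items, resumeToken, maxItemCount: 3);
    Console.WriteLine(string.Join(", ", result.Values));
    resumeToken = result.ContinuationToken;
    pageCount++;
} while (resumeToken is not null);
```

The loop stops when the token comes back null, so the caller never needs to know the total count up front.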
Thanks @AndriySvyryd, was not aware of this API!
Note the conceptually somewhat similar LINQ Chunk API which was recently added. Of course, that has no notion of a continuation token, so is less useful.
BTW we could even consider providing some sort of similar "pageability" on relational databases via keyset pagination, where the sorting key(s) would be somehow encoded as the continuation token which the user can extract and pass back.
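A rough sketch of that idea, using LINQ to Objects in place of a database: the last row's sorting keys (Name, then Id) are packed into a Base64 string that the caller passes back to get the next page. Encode, Decode and GetPage are hypothetical names, and the token format ("Id|Name") is an assumption made for this example only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

// In-memory stand-in for a table; names repeat so Id is needed as a tie-breaker.
var rows = Enumerable.Range(1, 12).Select(i => (Name: $"user{i % 4}", Id: i)).ToList();

// Pack the sorting keys of the last row into an opaque-looking token.
string Encode((string Name, int Id) last) =>
    Convert.ToBase64String(Encoding.UTF8.GetBytes($"{last.Id}|{last.Name}"));

(string Name, int Id) Decode(string token)
{
    var parts = Encoding.UTF8.GetString(Convert.FromBase64String(token)).Split('|', 2);
    return (parts[1], int.Parse(parts[0]));
}

(List<(string Name, int Id)> Rows, string? Token) GetPage(string? token, int pageSize)
{
    var query = rows.OrderBy(r => r.Name, StringComparer.Ordinal).ThenBy(r => r.Id).AsEnumerable();
    if (token is not null)
    {
        var (name, id) = Decode(token);
        // Keyset predicate: everything strictly after (name, id) in sort order.
        query = query.Where(r => string.CompareOrdinal(r.Name, name) > 0
                              || (r.Name == name && r.Id > id));
    }
    var page = query.Take(pageSize).ToList();
    // Simplification: a full page always yields a token, even if no rows follow it.
    return (page, page.Count == pageSize ? Encode(page[^1]) : null);
}

var seen = new List<(string Name, int Id)>();
string? cursor = null;
do
{
    var result = GetPage(cursor, 5);
    seen.AddRange(result.Rows);
    cursor = result.Token;
} while (cursor is not null);

Console.WriteLine($"Fetched {seen.Count} rows");
```

Because the token carries only the keys of the last row, nothing has to be kept open on the server between requests, which is the property that makes this fit the disconnected model discussed earlier in the thread.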
Updated the proposal based on the discussion. If we implement pagination on other providers the API will be separate, even if the general shape is the same, to allow exposing provider-specific options.
BTW I think it does make sense to accept the maxItemCount directly in ToPageAsync, just to not force the user to use two different operators (it could override any previously-specified one). But we can figure all that out later.
It would be good if you used the same pattern, or even the same pagination API, as in the Azure SDK (or System.ClientModel).
@KrzysztofCwalina yeah, we haven't yet started designing for this, but these are on my radar. Are you referring specifically to these recently-merged paging abstractions, or anything else?
@roji, we are iterating to finalize those abstractions now in advance of the next GA of System.ClientModel. If those or the similar types in Azure.Core don't meet your requirements but you'd be open to using them if they did, it would be helpful to us to understand your requirements so we could consider including them in the SCM abstractions with a goal to minimize API surface for users.
Sounds good! I'll try to prioritize working on the pagination bits sooner rather than later, and will be in touch with the results. Who are the relevant people to ping on both System.ClientModel and the Azure SDKs, apart from you two?
> Who are the relevant people to ping on both System.ClientModel and the Azure SDKs, apart from you two?
@KrzysztofCwalina and I are a great place to start and we can loop in anyone else whose perspective we need. Many thanks!