Cosmos: add support for pagination

Open AndriySvyryd opened this issue 3 years ago • 18 comments

See https://docs.microsoft.com/en-us/dotnet/api/microsoft.azure.cosmos.queryrequestoptions.maxitemcount?view=azure-dotnet

AndriySvyryd avatar Mar 25 '21 19:03 AndriySvyryd

Also consider exposing the continuation token

The RU charge of a query with OFFSET LIMIT will increase as the number of terms being offset increases. For queries that have multiple pages of results, we typically recommend using continuation tokens. Continuation tokens are a "bookmark" for the place where the query can later resume. If you use OFFSET LIMIT, there is no "bookmark". If you wanted to return the query's next page, you would have to start from the beginning.
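To make the cost argument concrete, here is a toy in-memory model (not the Cosmos SDK). `ToyStore`, `ReadWithOffset`, and `ReadWithToken` are hypothetical names invented for this sketch, and the token is simply an index, whereas real Cosmos continuation tokens are opaque strings. The model only counts how many items each strategy walks over to serve the same page.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var store = new ToyStore(Enumerable.Range(1, 1000));

// OFFSET LIMIT: serving items 91..100 also walks over the 90 skipped items.
var offsetPage = store.ReadWithOffset(offset: 90, limit: 10);
Console.WriteLine($"OFFSET LIMIT touched {store.ItemsTouched} items");

// Continuation token: resume right after the last item already returned.
store.ItemsTouched = 0;
var (tokenPage, nextToken) = store.ReadWithToken(continuationToken: "90", limit: 10);
Console.WriteLine($"Continuation token touched {store.ItemsTouched} items, next token = {nextToken}");

// Hypothetical toy store; ItemsTouched stands in for the RU charge.
public sealed class ToyStore
{
    private readonly List<int> _items;
    public int ItemsTouched { get; set; }

    public ToyStore(IEnumerable<int> items) => _items = items.ToList();

    // Always starts from the beginning, so the work grows with the offset.
    public List<int> ReadWithOffset(int offset, int limit)
    {
        var page = new List<int>();
        for (var i = 0; i < _items.Count && page.Count < limit; i++)
        {
            ItemsTouched++;
            if (i >= offset) page.Add(_items[i]);
        }
        return page;
    }

    // Starts at the bookmarked position, so the work depends only on the page size.
    public (List<int> Page, string? NextToken) ReadWithToken(string? continuationToken, int limit)
    {
        var i = continuationToken is null ? 0 : int.Parse(continuationToken);
        var page = new List<int>();
        for (; i < _items.Count && page.Count < limit; i++)
        {
            ItemsTouched++;
            page.Add(_items[i]);
        }
        return (page, i < _items.Count ? i.ToString() : null);
    }
}
```

In this model the offset read touches every item up to the end of the requested page, while the token read touches only the page itself.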

AndriySvyryd avatar Jul 12 '21 18:07 AndriySvyryd

/cc @roji

ajcvickers avatar Oct 27 '21 18:10 ajcvickers

It seems like we should have both a per-query setting and a global default on the context options (as with no-tracking queries).

Some theoretical thoughts:

  • This is related to keyset pagination in the sense that this is also about avoiding OFFSET pagination.
  • The rough relational feature comparable to this is cursors, i.e. executing a query via a cursor (DECLARE CURSOR FOR ...) and then fetching the next page each time (FETCH NEXT ...). In both cases we're executing a roundtrip every time to get the next page. It's interesting that this logic is the default in Cosmos but not in relational.
  • We could provide a relational feature where the query is paginated: the application would continue interacting with a simple QueryingEnumerable, and EF would send FETCH NEXT as needed under the hood.
  • However, I'm not sure how that's better than the application implementing explicit keyset navigation itself, i.e. using OrderBy and Where to precisely define the page they want to fetch.
    • Explicit keyset pagination has the advantage of being able to move backwards (whereas a LINQ Enumerable can only move forward).
    • Cursors would have to be closed when the enumerable is disposed. This means the consuming application (e.g. GraphQL) must hold state during the pagination, which doesn't fit well with the disconnected model. Explicit keyset navigation doesn't require this - you just need to provide data on the row fetched in the last page, and go from there.
    • There may be some perf advantages in using cursors (same query being executed rather than multiple ones), but it isn't very clear at this point.
  • Some differences between relational cursors and Cosmos continuation tokens:
    • A Cosmos continuation token doesn't hold up any server-side resources, whereas a relational cursor must be closed.
    • Cosmos is free to return fewer than MaxItemCount items, so if the goal is to always return a full page of 10, the consumer (EF in our scenario) is responsible for fetching again until the page is full.
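The last point, a backend that may return short batches, could be handled with a refill loop along these lines. This is a minimal sketch under stated assumptions: `FetchPage<T>`, `PageFiller`, and `stingyFetch` are hypothetical names made up for illustration, the backend is an in-memory list, and the token is a plain index rather than a real Cosmos token.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var data = Enumerable.Range(1, 25).ToList();

// Hypothetical backend that returns at most 4 items per roundtrip,
// regardless of how many were requested.
FetchPage<int> stingyFetch = (token, max) =>
{
    var start = token is null ? 0 : int.Parse(token);
    IReadOnlyList<int> batch = data.Skip(start).Take(Math.Min(max, 4)).ToList();
    var end = start + batch.Count;
    return (batch, end < data.Count ? end.ToString() : null);
};

var (page, nextToken) = PageFiller.FillPage(stingyFetch, continuationToken: null, pageSize: 10);
Console.WriteLine($"{page.Count} items, next token = {nextToken}");

// One roundtrip: returns a batch plus the token to resume from (null when exhausted).
public delegate (IReadOnlyList<T> Items, string? ContinuationToken) FetchPage<T>(
    string? continuationToken, int maxItemCount);

public static class PageFiller
{
    // Keeps fetching until the page is full or the backend has no more results.
    public static (List<T> Items, string? ContinuationToken) FillPage<T>(
        FetchPage<T> fetch, string? continuationToken, int pageSize)
    {
        var items = new List<T>(pageSize);
        var token = continuationToken;
        do
        {
            // Ask only for what is still missing so the page never overshoots.
            var (batch, next) = fetch(token, pageSize - items.Count);
            items.AddRange(batch);
            token = next;
        }
        while (items.Count < pageSize && token is not null);

        return (items, token);
    }
}
```

The token returned alongside the filled page is the one from the last roundtrip, so the caller can hand it back unchanged to get the following page.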

roji avatar Oct 28 '21 10:10 roji

Very interested in this - currently using EFCore + Cosmos provider in a Blazor Server application and there isn't a very scalable way I can find to offer paging without being able to use the continuation token

jhulbertpmn avatar Jan 30 '23 17:01 jhulbertpmn

Very interested in this as well. We are using EF Core Cosmos provider at the moment and the only way to query a page is to use the 'skip and take' approach. But as it has been mentioned, it's going to use more RUs for the next pages.

So we started thinking about using the .NET SDK directly and replacing our EF queries with Cosmos continuation tokens.

vyarymovych avatar Feb 27 '23 15:02 vyarymovych

Any updates? I am also having to shelve EF w/ Cosmos because of this limitation.

brandonsmith86 avatar Oct 17 '23 21:10 brandonsmith86

This feature hasn't been assigned to a particular release yet. The best way to indicate the importance of an issue is to vote (👍) for it. This data will then feed into the planning process for the next release.

AndriySvyryd avatar Oct 17 '23 21:10 AndriySvyryd

Current design proposal:

    public static Task<Page<TSource>> ToPageAsync<TSource>(
        this IQueryable<TSource> source,
        string? continuationToken = null,
        int? maxItemCount = null,
        int? continuationTokenLimitInKb = null,
        CancellationToken cancellationToken = default)
 
    public static IQueryable<TSource> WithMaxItemCount<TSource>(
        this IQueryable<TSource> source,
        int maxItemCount)

See Page<T>. Also see Pagination with the Azure SDK.

Open questions:

  • Naming

AndriySvyryd avatar Dec 02 '23 00:12 AndriySvyryd

Thanks @AndriySvyryd, was not aware of this API!

roji avatar Dec 04 '23 11:12 roji

Note the conceptually somewhat similar LINQ Chunk API which was recently added. Of course, that has no notion of a continuation token, so is less useful.
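For reference, a small sketch of what `Enumerable.Chunk` (available since .NET 6) does with an in-memory sequence. It splits an already available sequence into fixed-size arrays, with nothing resembling a token to resume from later.

```csharp
using System;
using System.Linq;

// Split seven numbers into chunks of at most three elements each.
foreach (int[] chunk in Enumerable.Range(1, 7).Chunk(3))
{
    Console.WriteLine(string.Join(", ", chunk));
}
```

The final chunk simply holds whatever elements remain, so it may be shorter than the requested size.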

roji avatar Dec 04 '23 11:12 roji

BTW we could even consider providing some sort of similar "pageability" on relational databases via keyset pagination, where the sorting key(s) would be somehow encoded as the continuation token which the user can extract and pass back.
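One way this idea might look, as a rough sketch only: the last row's sort key is encoded into an opaque string that the caller passes back to get the next page. `Blog` and `KeysetPager` are hypothetical names, the query runs over an in-memory list through `AsQueryable()`, and Base64 is just one arbitrary encoding choice; nothing here is an existing or planned EF API.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

var blogs = Enumerable.Range(1, 7).Select(i => new Blog(i, $"Blog {i}")).ToList();

// Walk through all pages, passing each returned token back in.
string? token = null;
do
{
    var (items, next) = KeysetPager.GetPage(blogs.AsQueryable(), token, pageSize: 3);
    Console.WriteLine(string.Join(", ", items.Select(b => b.Name)));
    token = next;
}
while (token is not null);

public record Blog(int Id, string Name);

public static class KeysetPager
{
    public static (List<Blog> Items, string? ContinuationToken) GetPage(
        IQueryable<Blog> source, string? continuationToken, int pageSize)
    {
        var query = source;
        if (continuationToken is not null)
        {
            // Keyset step: only rows that sort after the last row already returned.
            var lastId = Decode(continuationToken);
            query = query.Where(b => b.Id > lastId);
        }

        // Fetch one extra row to find out whether another page exists.
        var rows = query.OrderBy(b => b.Id).Take(pageSize + 1).ToList();
        var items = rows.Take(pageSize).ToList();
        var hasMore = rows.Count > pageSize;

        return (items, hasMore ? Encode(items[^1].Id) : null);
    }

    // The sort key of the last row, wrapped so callers treat it as opaque.
    private static string Encode(int id) =>
        Convert.ToBase64String(Encoding.UTF8.GetBytes(id.ToString()));

    private static int Decode(string token) =>
        int.Parse(Encoding.UTF8.GetString(Convert.FromBase64String(token)));
}
```

Because the token carries only the sort key, the server holds no state between pages; a composite ordering would need every sort key encoded, not just one.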

roji avatar Dec 04 '23 11:12 roji

Updated the proposal based on the discussion. If we implement pagination on other providers the API will be separate, even if the general shape is the same, to allow exposing provider-specific options.

AndriySvyryd avatar Dec 04 '23 21:12 AndriySvyryd

BTW I think it does make sense to accept the maxItemCount directly in ToPageAsync, just to not force the user to use two different operators (it could override any previously specified one). But we can figure all that out later.

roji avatar Dec 04 '23 21:12 roji

It would be good if you used the same pattern, or even the same pagination API, as in the Azure SDK (or System.ClientModel).

KrzysztofCwalina avatar Jun 05 '24 16:06 KrzysztofCwalina

@KrzysztofCwalina yeah, we haven't yet started designing for this, but these are on my radar. Are you referring specifically to these recently-merged paging abstractions, or anything else?

roji avatar Jun 05 '24 17:06 roji

@roji, we are iterating to finalize those abstractions now in advance of the next GA of System.ClientModel. If those or the similar types in Azure.Core don't meet your requirements but you'd be open to using them if they did, it would be helpful to us to understand your requirements so we could consider including them in the SCM abstractions with a goal to minimize API surface for users.

annelo-msft avatar Jun 05 '24 19:06 annelo-msft

Sounds good! I'll try to prioritize working on the pagination bits sooner rather than later, and will be in touch with the results. Who are the relevant people to ping on both System.ClientModel and the Azure SDKs, apart from you two?

roji avatar Jun 05 '24 20:06 roji

> Who are the relevant people to ping on both System.ClientModel and the Azure SDKs, apart from you two?

@KrzysztofCwalina and I are a great place to start and we can loop in anyone else whose perspective we need. Many thanks!

annelo-msft avatar Jun 05 '24 21:06 annelo-msft