
Support cursor based paginated JSON APIs

Open lolgab opened this issue 1 year ago • 3 comments

Some paginated APIs use cursors to let users iterate over their results. The first page returns the data together with a cursor, which you can use to build the URL for the next page (or which is itself a URL). https://jsonapi.org/profiles/ethanresnick/cursor-pagination/

I imagine supporting something like this generically is very hard, though.

Ideally, after telling sparql-anything how pagination works, if an API has a page size of 100 I want to get the first 100 results and, only if they are not enough to satisfy the query, fetch another page by reading the cursor data and interpreting it according to the fx:properties. The best way I know to handle this today is to write a script that dumps all the pages to a JSON file first, and then query that JSON file instead of the real JSON API.

Example:

https://api.codacy.com/api/v3/tools/f8b29663-2cb2-498d-b923-a10c6a8c05cd/patterns

This API returns 100 results in the data field and a pagination field which contains a cursor string. If you now pass the cursor as a query parameter:

https://api.codacy.com/api/v3/tools/f8b29663-2cb2-498d-b923-a10c6a8c05cd/patterns?cursor=MTAw

you get another page and so on.

When the cursor is not defined, there are no more results to visit.
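To make the desired behaviour concrete, here is a minimal Python sketch of lazy cursor following, written outside sparql.anything. It assumes the response has the shape `{"data": [...], "pagination": {"cursor": "..."}}`; the exact nesting of the cursor inside the pagination field is a guess based on the description above. The names `fetch_json` and `iter_items` are hypothetical helpers, not part of any library.

```python
import json
from typing import Any, Callable, Dict, Iterator, Optional
from urllib.parse import urlencode
from urllib.request import urlopen

Page = Dict[str, Any]


def fetch_json(url: str) -> Page:
    """Download one page and decode it as JSON."""
    with urlopen(url) as response:
        return json.load(response)


def iter_items(base_url: str, fetch: Callable[[str], Page] = fetch_json) -> Iterator[Any]:
    """Yield items one by one, requesting the next page only when needed."""
    cursor: Optional[str] = None
    while True:
        # The first request has no cursor; later ones pass it as ?cursor=...
        url = base_url if cursor is None else f"{base_url}?{urlencode({'cursor': cursor})}"
        page = fetch(url)
        yield from page.get("data", [])
        # Assumed layout: the cursor string lives under pagination.cursor.
        cursor = (page.get("pagination") or {}).get("cursor")
        if not cursor:
            # No cursor means there are no more pages to visit.
            return
```

Because `iter_items` is a generator, a consumer that stops early (for example `itertools.islice(iter_items(url), 150)`) triggers only as many requests as it needs, while `list(iter_items(url))` reproduces the "dump every page first" workaround.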

I'm not sure it's possible to abstract over so many details, though.

lolgab avatar Oct 13 '22 10:10 lolgab

@lolgab I've handled pagination in a single SPARQL query before: https://github.com/justin2004/weblog/tree/master/dynamic_pagination_with_sparql_anything

That approach might be useful to you.

justin2004 avatar Oct 13 '22 19:10 justin2004

@lolgab I've handled pagination in the past on APIs that used a param with incremental numbers, e.g. pagesize + page. See this showcase for a full example: https://github.com/SPARQL-Anything/showcase-minter

However, the cursor pattern seems interesting. Do you have a sense of how widely it is used in other APIs? I ask because the behaviour required sounds a bit esoteric (get a value from the response and use it for a subsequent request), but if it's a common situation maybe we can support that.

enridaga avatar Oct 25 '22 08:10 enridaga

@justin2004 Thank you for the example. My example is different since the cursor is generated by the API provider and I can't calculate it without iterating over the pages.

@enridaga I haven't researched how widespread cursor pagination is. It certainly requires special handling to work with sparql.anything.

lolgab avatar Oct 25 '22 10:10 lolgab