grlc
fixing response headers of requests
Hi, I have a few questions regarding headers of requests. I am testing with this endpoint: http://grlc.io/api-git/rapw3k/cybele/#/json/get_allDatasets
- If you don't send a page number in the request, the response header returns the links to the next and last pages as follows:
  next: http://grlc.io,grlc.io/api-git/rapw3k/cybele/allDatasets?endpoint=https%3A%2F%2Fwww.foodie-cloud.org%2Fsparql%3Fdefault-graph-uri%3Dhttps%3A%2F%2Fw3id.org%2Fcybele%2Fdatasets%2F&page=1
  last: http://grlc.io,grlc.io/api-git/rapw3k/cybele/allDatasets?endpoint=https%3A%2F%2Fwww.foodie-cloud.org%2Fsparql%3Fdefault-graph-uri%3Dhttps%3A%2F%2Fw3id.org%2Fcybele%2Fdatasets%2F&page=10.0
  However, page=1 gives the same result as a request without a page number, so shouldn't the next page be 2 if I don't send a page number?
  Moreover, the links are actually incorrect: the domain contains grlc.io twice, i.e., http://grlc.io,grlc.io/. Also, why 10.0 instead of just 10?
- Why do the response headers say the last page is page 10 if only the first page returns data in my example? Pages 2, 3, ..., 10, for example, return no information: http://grlc.io/api-git/rapw3k/cybele/allDatasets?endpoint=https%3A%2F%2Fwww.foodie-cloud.org%2Fsparql%3Fdefault-graph-uri%3Dhttps%3A%2F%2Fw3id.org%2Fcybele%2Fdatasets%2F&page=2
- Page 10 does not give a link to a next page, which is OK, but you can actually ask for page 11, 12, 13, ...
- If each page returns 100 results, what will happen if I have more than 1000 results (100 per page x 10 pages)? Will this change the last page to page 20?
Thanks! Raul
Hi @rapw3k,
Thanks for your comments! I don't think this particular functionality has been extensively used, so it is not very polished: there is definitely room for improvement :-)
> shouldn't be next page = 2 if i dont send page number ?

Yes, you are right. Probably the "page" variable should be set to 1 if not present in the request.

> links are actually incorrect, they have two times grlc.io in domain, i.e., http://grlc.io,grlc.io/ , and also why 10.0 instead of just 10

Not sure why the links are being generated like that, but indeed they look incorrect.
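To make the intended behaviour concrete, here is a minimal sketch of how the links could be built. It is not grlc's actual code: `pagination_links` is a hypothetical helper, and it assumes the last page number is already known. It treats a missing `page` parameter as page 1, rebuilds the URL from the parsed request so the host appears only once, and forces the page number to an integer.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def pagination_links(request_url, last_page):
    """Build 'next' and 'last' links for a request (hypothetical helper)."""
    parts = urlsplit(request_url)
    params = dict(parse_qsl(parts.query))
    # A request without "page" is treated as page 1, so "next" becomes page 2.
    page = int(params.get("page", 1))
    # Force an integer so the link says page=10 rather than page=10.0.
    last_page = int(last_page)

    def with_page(number):
        # Reuse the scheme, host and path of the incoming request unchanged.
        query = urlencode({**params, "page": number})
        return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))

    links = {"last": with_page(last_page)}
    if page < last_page:
        # Only advertise a next page when there is one.
        links["next"] = with_page(page + 1)
    return links
```

With this sketch, a request without `page` gets a `next` link pointing to `page=2`, and a request for the last page gets no `next` link at all.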
> why the response headers says the last page is page 10
> page 10 does not give link to next page
> what will happen if i have more than 100 (x10 pages) =1000 results
These are all related -- the issue comes from the fact that, because counting results before executing the query is expensive, at the moment grlc just 'guesses' (or "Provides a dummy count for now") there will be 1000 results (ugly hack):
https://github.com/CLARIAH/grlc/blob/d4ddb1530cfef57464a8dc31edddc44c1387fe77/src/gquery.py#L68-L73
Until now, we didn't have a good use case to justify the additional load of querying to pre-calculate the number of results. But if this is functionality that would be useful to you, maybe we've finally got a reason to implement this properly.
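For illustration only, a rough sketch of what pre-calculating the count might involve. Both `build_count_query` and `last_page` are hypothetical helpers, not existing grlc functions. The sketch assumes the API query is a SPARQL SELECT without its own LIMIT/OFFSET, with any PREFIX/BASE declarations on their own lines before the SELECT, and it wraps the query in a SPARQL 1.1 subquery that counts its rows.

```python
import math


def build_count_query(select_query):
    """Wrap a SELECT query in a counting subquery (hypothetical helper)."""
    prologue, body = [], []
    for line in select_query.strip().splitlines():
        stripped = line.strip()
        is_prologue = (
            not stripped
            or stripped.startswith("#")  # leading comment lines
            or stripped.upper().startswith(("PREFIX", "BASE"))
        )
        if not body and is_prologue:
            # Declarations must stay at the top level, outside the subquery.
            if stripped:
                prologue.append(stripped)
        else:
            body.append("    " + stripped)
    wrapper_open = ["SELECT (COUNT(*) AS ?count) WHERE {", "  {"]
    wrapper_close = ["  }", "}"]
    return "\n".join(prologue + wrapper_open + body + wrapper_close)


def last_page(total_results, per_page=100):
    """Number of the last page, always an integer and at least 1."""
    return max(1, math.ceil(total_results / per_page))
```

Under these assumptions, 1000 results at 100 per page give a last page of 10, and 2000 results give 20. The count is an extra query against the endpoint on each request, which is the additional load mentioned above.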
Is the paging functionality something you would need for your use case? Are there any particular considerations you think should be taken into account?
@albertmeronyo -- what do you think? Do you know if there are other use cases which would benefit from this functionality?
Thanks for the reply, @c-martinez. Indeed, we have some cases where we return tens of results, and paging becomes quite relevant in order to get them.
Some problematic points I see, apart from the ones mentioned above:
- As far as I can see, one result may be spread across two pages.
- The number of results per page is a little random. Some pages return 1 result, others return 5, 6, or more than 10, e.g.:
https://grlc.io/api-git/cybele-project/metadata/allDatasets_testbed?testbed=https://w3id.org/cybele/datasets/PSNC&page=70
https://grlc.io/api-git/cybele-project/metadata/allDatasets_testbed?testbed=https://w3id.org/cybele/datasets/PSNC&page=72
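One possible explanation for both points, which is only an assumption and not something confirmed in this thread: if the page limit is applied to the raw query rows, and rows are grouped into results afterwards, then a result whose rows straddle a page boundary is split, and the number of results per page varies. The sketch below illustrates this with made-up data; the rows and both helper functions are hypothetical.

```python
from itertools import groupby

# Made-up rows: one row per (dataset, keyword) pair, ordered by dataset.
ROWS = [
    ("ds1", "a"), ("ds1", "b"), ("ds1", "c"),
    ("ds2", "a"),
    ("ds3", "a"), ("ds3", "b"),
]


def page_rows(rows, page, per_page):
    """Slice raw rows the way a LIMIT/OFFSET on the query would."""
    start = (page - 1) * per_page
    return rows[start:start + per_page]


def group_by_dataset(rows):
    """Group consecutive rows of the same dataset into one result."""
    return {
        dataset: [keyword for _, keyword in group]
        for dataset, group in groupby(rows, key=lambda row: row[0])
    }


for page in (1, 2, 3):
    print(page, group_by_dataset(page_rows(ROWS, page, per_page=2)))
```

With two rows per page, `ds1` appears on both page 1 and page 2, and the pages contain one, two, and one results respectively.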