octokit.net
Problems paging and querying milestones
I'm working on a simple app that will query all bugs for a given milestone using Octokit. The target repo is initially dotnet/roslyn which has a very high number of issues, even when sorted by milestone. As such I'm trying to use pagination in the API to get the issues in chunks and possibly break up the work.
I've looked through the available docs and dug through the code a bit, but I can't seem to get pagination working here. The basic code I'm running is the following:
var client = CreateGitHubClient(); // GitHubClient with authentication token
var request = new RepositoryIssueRequest()
{
    Milestone = "15"
};
var options = new ApiOptions()
{
    PageCount = 1,
    StartPage = 0,
    PageSize = 50
};
var issues = client.Issue;
var pageIssues = await issues.GetAllForRepository(repo.Id, request, options);
Given I'm asking for at most 50 issues, I would expect this to return fairly quickly. Yet after several minutes it hasn't returned, and I can see the GC churning at a pretty good rate. It seems to be trying to process all issues here (800+ open issues) rather than filtering on the options.
Hopefully I'm just making a simple mistake here that's easy enough to fix. Would appreciate help in figuring out what that mistake is :smile:
Experimentally I've found that I'm actually hurting myself here with ApiOptions. If I forgo pagination and use the overload without ApiOptions, the same call completes in just a few seconds.
Am I just using ApiOptions in some unsupported way here?
@jaredpar I think this behaviour is due to StartPage = 0. Did you try changing StartPage to 1?
@dampir good call. Switching StartPage to 1 fixed the issue.
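For reference, a minimal sketch of the options from the original post with the 1-based start page. Only StartPage changes; the other values are carried over as posted.

```csharp
using Octokit;

// Same options as in the original post, but with a 1-based page index.
var options = new ApiOptions
{
    PageCount = 1,  // fetch a single page
    StartPage = 1,  // pages are 1-based; StartPage = 0 caused the slow call above
    PageSize = 50   // at most 50 issues on that page
};
```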
In general though is this the correct way to approach queries where a large number of issues are going to be returned? Or if I'm going to process them all in batch anyways should I just query without any pagination?
@jaredpar yeah, silly 1-based pagination 😞 - let me know if there's any way we can make this clearer.
If you drop the pagination overload, the GitHub API will return 30 entities per response, and whenever Octokit finds Link headers in a response it'll continue to make new requests.
If you're not really looking for a specific page of results, and just want stuff faster than the defaults, I'd just set the PageSize:
// fetch all items, 100 at a time
var batchPagination = new ApiOptions
{
    PageSize = 100
};
@shiftkey once you know it's 1 based it's pretty easy to use.
I think the only way to make it clearer is to error in some way in the case of a 0 value. It seems like that would be possible given the structure of ApiOptions. But I've only used this API in a pretty limited fashion, so it's possible I'm not considering valid use cases.
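A rough sketch of what such a check could look like today as a caller-side helper. PagingGuard and SinglePage are hypothetical names made up for illustration and are not part of Octokit; only ApiOptions and its PageCount, StartPage and PageSize properties come from the library.

```csharp
using System;
using Octokit;

// Hypothetical helper (not part of Octokit): fails fast on a 0-based page index.
static class PagingGuard
{
    public static ApiOptions SinglePage(int page, int pageSize)
    {
        if (page < 1)
        {
            throw new ArgumentOutOfRangeException(
                nameof(page), page, "Pages are 1-based; use 1 for the first page.");
        }
        if (pageSize < 1)
        {
            throw new ArgumentOutOfRangeException(
                nameof(pageSize), pageSize, "Page size must be at least 1.");
        }

        // Exactly one page of at most pageSize items, starting at the given page.
        return new ApiOptions { PageCount = 1, StartPage = page, PageSize = pageSize };
    }
}
```

With this in place, PagingGuard.SinglePage(0, 50) throws immediately instead of starting a long-running request, while PagingGuard.SinglePage(1, 50) yields the options for the first page.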
> If you're not really looking for a specific page of results, and just want stuff faster than the defaults, I'd just set the PageSize:
I was mostly trying to be conscious of not downloading the entire dotnet/roslyn issue database in a single go 😄. Paging seemed to be the best way to achieve that.
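For anyone landing here with the same goal, one possible way to process results in chunks is to request a single page per call and stop when a short page comes back. This is only a sketch: Paging and ForEachPageAsync are hypothetical names, not Octokit APIs, and the fetch is passed in as a delegate so the loop itself does not depend on a particular client call.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using Octokit;

// Hypothetical helper (not part of Octokit): walks results one page at a time.
static class Paging
{
    public static async Task ForEachPageAsync<T>(
        Func<ApiOptions, Task<IReadOnlyList<T>>> fetchPage,
        int pageSize,
        Func<IReadOnlyList<T>, Task> handlePage)
    {
        // Page numbers start at 1, not 0.
        for (var page = 1; ; page++)
        {
            var options = new ApiOptions
            {
                PageCount = 1,     // one page per request
                StartPage = page,
                PageSize = pageSize
            };

            var batch = await fetchPage(options);
            if (batch.Count > 0)
            {
                await handlePage(batch);
            }

            // A page shorter than pageSize means there is nothing left to fetch.
            if (batch.Count < pageSize)
            {
                break;
            }
        }
    }
}
```

With the client, request and repo from the original post, a call might look like `await Paging.ForEachPageAsync(o => client.Issue.GetAllForRepository(repo.Id, request, o), 50, batch => ProcessAsync(batch));`, where ProcessAsync stands in for whatever per-batch work is needed. If the total is an exact multiple of the page size, the loop makes one extra request that returns an empty page before stopping.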