octokit.rb
octokit.rb copied to clipboard
Suggested Pagination Traversal Strategy results in infinite loop
The suggested pagination traversal strategy is not working. It results in an infinite loop.
@client = Octokit::Client.new(:access_token => ACCESS_TOKEN)
user_response = @client.user
username = user_response[:login]
events_response = @client.user_events(username, :per_page => 100)
page_number = 1
latest_event_created_at = events_response.last[:created_at]
puts "PAGE #{page_number} -- EVENTS SINCE #{latest_event_created_at}"
while @client.last_response.rels[:next] # while there is a next page to get ...
events_response = @client.last_response.rels[:next].get.data # get the next page
page_number += 1
latest_event_created_at = events_response.last[:created_at]
puts "PAGE #{page_number} -- EVENTS SINCE #{latest_event_created_at}"
end
=>
PAGE 1 -- EVENTS SINCE 2016-02-28 23:02:38 UTC
PAGE 2 -- EVENTS SINCE 2016-02-13 20:04:08 UTC
PAGE 3 -- EVENTS SINCE 2016-02-13 20:04:08 UTC
PAGE 4 -- EVENTS SINCE 2016-02-13 20:04:08 UTC
PAGE 5 -- EVENTS SINCE 2016-02-13 20:04:08 UTC
PAGE 6 -- EVENTS SINCE 2016-02-13 20:04:08 UTC
PAGE 7 -- EVENTS SINCE 2016-02-13 20:04:08 UTC
PAGE 8 -- EVENTS SINCE 2016-02-13 20:04:08 UTC
PAGE 9 -- EVENTS SINCE 2016-02-13 20:04:08 UTC
PAGE 10 -- EVENTS SINCE 2016-02-13 20:04:08 UTC
PAGE 11 -- EVENTS SINCE 2016-02-13 20:04:08 UTC
PAGE 12 -- EVENTS SINCE 2016-02-13 20:04:08 UTC
PAGE 13 -- EVENTS SINCE 2016-02-13 20:04:08 UTC
PAGE 14 -- EVENTS SINCE 2016-02-13 20:04:08 UTC
... etc
I think the issue is happening because the call to get the next page of data (@client.last_response.rels[:next].get.data
) does not count as a request and resulting response known to the @client.last_response
method. Let me know if there is something I'm missing or misunderstanding.
For googlers looking for a way to traverse the pagination, try the following:
@client = Octokit::Client.new(:access_token => ACCESS_TOKEN)
user_response = @client.user
username = user_response[:login]
page_number = 1
events_response = @client.user_events(username, :page => page_number, :per_page => 100)
latest_event_created_at = events_response.last[:created_at]
puts "PAGE #{page_number} -- EVENTS SINCE #{latest_event_created_at}"
while latest_event_created_at > 1.year.ago
page_number += 1
begin
events_response = @client.user_events(username, :page => page_number, :per_page => 100)
rescue Octokit::UnprocessableEntity => e
exit if e.message.include?("pagination is limited for this resource")
end
latest_event_created_at = events_response.last[:created_at]
puts "PAGE #{page_number} -- EVENTS SINCE #{latest_event_created_at}"
end
=>
PAGE 1 -- EVENTS SINCE 2016-02-28 23:02:38 UTC
PAGE 2 -- EVENTS SINCE 2016-02-13 20:04:08 UTC
PAGE 3 -- EVENTS SINCE 2016-01-29 03:31:03 UTC
... however, note there is still a restriction on the number of pages of events returned by the service.
Octokit::Connection.paginate()
accepts a block, which is yields data for each page:
def paginate(url, options = {}, &block)
(...)
yield(data, @last_response)
(...)
end
In order to pass a block when calling user_events
, it would need to be passed explicitly to paginate()
:
def user_events(user, options = {})
# original:
# paginate "#{User.path user}/events", options
paginate "#{User.path user}/events", options, &Proc.new
end
I guess adding &Proc.new
as an argument for paginate calls would make dealing with pagination much easier:
@client.user_events(...) do |data, last_response|
# 100 events here...
puts last_response.data
}
Please correct me if I am wrong.
Does this paginate
method sound like a good candidate for this section? http://octokit.github.io/octokit.rb/#Pagination
I ended up looking into the internal autopaginate code to figure out how pagination was happening, because the sample code never got past page 2. It turns out that Octokit.last_response is only set if you call a lib method directly, but not when following rels.
# From docs. This will work...
issues = Octokit.issues 'rails/rails', :per_page => 100
issues.concat Octokit.last_response.rels[:next].get.data
# ...but the following will give you the second page again
issues.concat Octokit.last_response.rels[:next].get.data
# Instead you have to keep track of the last response yourself
issues = Octokit.issues 'rails/rails', :per_page => 100
last_response = Octokit.last_response
while true do
# ... process the last_response.data ...
break unless last_response.rels[:next]
last_response = last_response.rels[:next].get
end
Also, if you set a particular header in the original request for pagination, e.g. Accept: application/vnd.github.v3.star+json
it doesn't seem to get respected when following the relation, and has to be passed in again, but this might be a separate issue.
In the end, if the paginate
method is the way to go, I'd be happy to help update the docs =)
Thanks, everyone for the detailed discussions. We'll work on getting this documented, and also for the next major release we have plans for improving the pagination user interface.
I stumbled across this while trying to work out why I couldn't get a list of all repositories even when I set per_page: 500
(is the limit of 100 documented?).
In the example above from @xevix I believe that the break
needs to be after the new version of last_response
is set, so it should be:
issues = Octokit.issues 'rails/rails', :per_page => 100
last_response = Octokit.last_response
while true do
# ... process the last_response.data ...
last_response = last_response.rels[:next].get
break unless last_response.rels[:next]
end
I am actually using github.organization_repositories
instead of Octokit.issues
, where github
is an instance of Octokit::Client
, but I wouldn't think that this should make a difference.
As the break is now at the end, the loop can be further simplified and my version looks like:
all_repos = github.organization_repositories 'rails'
last_response = github.last_response
while last_response.rels[:next] do
all_repos.concat last_response.rels[:next].get.data
last_response = last_response.rels[:next].get
end
1
👋 Hey Friends, this issue has been automatically marked as stale
because it has no recent activity. It will be closed if no further activity occurs. Please add the Status: Pinned
label if you feel that this issue needs to remain open/active. Thank you for your contributions and help in keeping things tidy!