0pdd
Octokit::BadGateway
In production a few hours ago
Octokit::BadGateway: GET https://api.github.com/user/repository_invitations: 502 - Server Error
from octokit/response/raise_error.rb:16:in `on_complete'
from faraday/response.rb:9:in `block in call'
from faraday/response.rb:61:in `on_complete'
from faraday/response.rb:8:in `call'
from octokit/middleware/follow_redirects.rb:73:in `perform_with_redirection'
from octokit/middleware/follow_redirects.rb:61:in `call'
from faraday/rack_builder.rb:141:in `build_response'
from faraday/connection.rb:386:in `run_request'
from faraday/connection.rb:149:in `get'
from sawyer/agent.rb:94:in `call'
from octokit/connection.rb:156:in `request'
from octokit/connection.rb:84:in `paginate'
from octokit/client/repository_invitations.rb:72:in `user_repository_invitations'
from 0pdd.rb:173:in `block in <top (required)>'
from sinatra/base.rb:1611:in `call'
from sinatra/base.rb:1611:in `block in compile!'
from sinatra/base.rb:975:in `block (3 levels) in route!'
from sinatra/base.rb:994:in `route_eval'
from sinatra/base.rb:975:in `block (2 levels) in route!'
from sinatra/base.rb:1015:in `block in process_route'
from sinatra/base.rb:1013:in `catch'
from sinatra/base.rb:1013:in `process_route'
from sinatra/base.rb:973:in `block in route!'
from sinatra/base.rb:972:in `each'
from sinatra/base.rb:972:in `route!'
from sinatra/base.rb:1085:in `block in dispatch!'
from sinatra/base.rb:1067:in `block in invoke'
from sinatra/base.rb:1067:in `catch'
from sinatra/base.rb:1067:in `invoke'
from sinatra/base.rb:1082:in `dispatch!'
from sinatra/base.rb:907:in `block in call!'
from sinatra/base.rb:1067:in `block in invoke'
from sinatra/base.rb:1067:in `catch'
from sinatra/base.rb:1067:in `invoke'
from sinatra/base.rb:907:in `call!'
from sinatra/base.rb:895:in `call'
from rack/protection/xss_header.rb:18:in `call'
from rack/protection/path_traversal.rb:16:in `call'
from rack/protection/json_csrf.rb:18:in `call'
from rack/protection/base.rb:49:in `call'
from rack/protection/base.rb:49:in `call'
from rack/protection/frame_options.rb:31:in `call'
from rack/logger.rb:15:in `call'
from rack/commonlogger.rb:33:in `call'
from sinatra/base.rb:219:in `call'
from sinatra/base.rb:212:in `call'
from rack/head.rb:13:in `call'
from rack/methodoverride.rb:22:in `call'
from sinatra/base.rb:182:in `call'
from sinatra/base.rb:2013:in `call'
from sinatra/base.rb:1487:in `block in call'
from sinatra/base.rb:1787:in `synchronize'
from sinatra/base.rb:1487:in `call'
from rack/handler/webrick.rb:88:in `service'
from webrick/httpserver.rb:140:in `service'
from webrick/httpserver.rb:96:in `run'
from webrick/server.rb:296:in `block in start_thread'
I think we need to retry in this case. Or let's think about something smarter.
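For illustration, a blunt retry around the failing call could look roughly like this (a sketch only; the helper name, attempt count, and pause are made up and this is not how 0pdd is currently structured):

```ruby
require 'octokit'

# Retry a block a few times when GitHub answers with a 5xx.
# Octokit::ServerError is the parent of Octokit::BadGateway (502),
# so a plain 502 like the one above would be caught here.
def with_retries(attempts: 3, pause: 15)
  tries = 0
  begin
    yield
  rescue Octokit::ServerError => e
    tries += 1
    raise e if tries >= attempts
    sleep(pause)
    retry
  end
end

client = Octokit::Client.new(access_token: ENV['GITHUB_TOKEN'])
invitations = with_retries { client.user_repository_invitations }
```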
@0crat assign @tedtoer
@0crat assign @tedtoer (here)
@yegor256 Job gh:yegor256/0pdd#105 assigned to @tedtoer, please go ahead (policy).
Bug was reported: +15 points just awarded to @yegor256, total is +4040.
@tedtoer this job was assigned to you 8 days ago. It will be taken away from you after 10 days from start (this is our policy).
@tedtoer resigned from gh:yegor256/0pdd#105, please stop working.
Oops! Job gh:yegor256/0pdd#105 is not assigned to anyone.
Oops! Job gh:yegor256/0pdd#105 is not assigned to anyone.
Oops! Job gh:yegor256/0pdd#105 is not assigned to anyone.
Oops! Job gh:yegor256/0pdd#105 is not assigned to anyone.
Job gh:yegor256/0pdd#105 assigned to @tedtoer. The budget is fixed and it is 30 minutes. Please, read the Policy and go ahead.
@tedtoer resigned from gh:yegor256/0pdd#105, please stop working.
Job gh:yegor256/0pdd#105 assigned to @tedtoer (profile). The budget is fixed and it is 30 minutes. Please, read the Policy and go ahead.
An HTTP 502 is a transient server error (https://tools.ietf.org/html/rfc7231#section-6.6.3). This is not a problem with 0pdd. The bug should be closed.
@yegor256 Close this unless you have reproduction steps and this is a regular recurring error.
@yegor256 This ticket needs your attention. Probably should be closed.
@silasreinagel/z this job was assigned to you 5 days ago. It will be taken away from you soon unless you close it, see §8. Read this and this, please.
@yegor256 This ticket needs your attention. Close this unless you have reproduction steps and this is a regular recurring error.
@SilasReinagel don't you like the idea of retrying in case of 50x?
@0crat refuse
@0crat refuse (here)
@SilasReinagel The user @silasreinagel/z resigned from #105, please stop working. Reason for job resignation: Order was cancelled
@yegor256 Building a retry fixture would take more work than this ticket's scope allows. If retries are wanted, maybe open a ticket with some retry requirements so that infrastructure can be built?
The job #105 assigned to @izrik/z, here is why; the budget is 30 minutes, see §4; please, read §8 and §9; if the task is not clear, read this and this
@izrik/z this job was assigned to you 5 days ago. It will be taken away from you soon unless you close it, see §8. Read this and this, please.
@0crat status
The job #105 assigned to @dmydlarz/z, here is why; the budget is 30 minutes, see §4; please, read §8 and §9; if the task is not clear, read this and this
@0crat status
@dmydlarz we are experiencing this issue now. What is the status?
@0crat status (here)
@izrik This is what I know about this job, as in §32:
- The job #105 is in scope for 8 months
- The role is DEV
- The job is not assigned to anyone
- The budget is not set yet
- These users are banned and won't be assigned:
  - @yegor256/z: This user reported the ticket
  - @tedtoer/z: User was resigned from the ticket
  - @valentjedi/z: User was resigned from the ticket
  - @prondzyn/z: User was resigned from the ticket
  - @vl3/z: User was resigned from the ticket
  - @silasreinagel/z: User was resigned from the ticket
  - @izrik/z: User was resigned from the ticket
  - @dmydlarz/z: User was resigned from the ticket
- Job footprint (restricted area)
@0crat status (here)
@yegor256 This is what I know about this job, as in §32:
- The job #105 is in scope for 8 months
- The role is DEV
- The job is not assigned to anyone
- The budget is not set yet
- These users are banned and won't be assigned:
  - @yegor256/z: This user reported the ticket
  - @tedtoer/z: User was resigned from the ticket
  - @valentjedi/z: User was resigned from the ticket
  - @prondzyn/z: User was resigned from the ticket
  - @vl3/z: User was resigned from the ticket
  - @silasreinagel/z: User was resigned from the ticket
  - @izrik/z: User was resigned from the ticket
  - @dmydlarz/z: User was resigned from the ticket
- Job footprint (restricted area)
The job #105 assigned to @golyalpha/z, here is why; the budget is 30 minutes, see §4; please, read §8 and §9; if the task is not clear, read this and this; there will be a monetary reward for this job
@yegor256 A 502 usually means the proxy had no endpoint to forward your request to at that moment because the server was under high load, that the backend was down for some reason, or that a configuration error left the gateway/proxy unable to reach the upstream server that would normally handle the request and build the dynamic page.
While retrying a few times in succession might help under certain circumstances, in most cases it won't.
I need a proposal on how to handle this if I'm going to write some code for this issue.
@golyalpha what would you propose?
@yegor256 Well, if creating an issue with the puzzle on a GitHub repo is a problem, I'd probably put any failed issue-creation requests into some kind of queue (maybe as entries in the DB), and then, once an issue-creation request succeeds, flush the queue onto GitHub as well. Maybe sprinkle in some longer-term retries too: for example, if no new issue-creation request has been made in the past hour, retry one of the requests in the queue.
Or, if this is happening in multiple places (as it likely is), then some kind of generalized queue of failed requests that gets flushed once a request succeeds.
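A rough sketch of that queue idea, just to make it concrete (the class and method names are invented here; this is not 0pdd's actual code):

```ruby
require 'octokit'

# Failed GitHub calls are remembered and replayed after the next successful call.
class RetryQueue
  def initialize
    @pending = [] # in-memory; could be a DynamoDB table instead
    @mutex = Mutex.new
  end

  # Run the block; on a 5xx, queue it for later instead of failing hard.
  def run(description, &block)
    result = block.call
    flush # a request just succeeded, so replay anything still queued
    result
  rescue Octokit::ServerError
    @mutex.synchronize do
      @pending << { desc: description, job: block, queued_at: Time.now }
    end
    nil
  end

  # Replay queued requests; drop the ones that succeed, keep the ones that still fail.
  def flush
    @mutex.synchronize do
      @pending.delete_if do |entry|
        begin
          entry[:job].call
          true
        rescue Octokit::ServerError
          false
        end
      end
    end
  end
end
```

The flush-on-success trigger keeps it simple; the longer-term retry mentioned above could just be a periodic timer that calls flush on its own.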
@golyalpha let's go with a simple in-memory queue, which will simply help us retry requests in, say, 15 minutes.
@yegor256 0pdd is hosted on Heroku, isn't it? If it is, it means that the service gets restarted roughly every 24 hours. Some requests may end up getting unlucky and lost completely if the service restarts while they are in the queue. Of course, if you insist on having the queue in memory (which, admittedly, is the simplest and fastest to access), then I'll do it that way.
@golyalpha I don't insist. You think it's better to keep it in DynamoDB?
@yegor256 I'm not entirely sure. How often do these errors happen? (If not very often, then RAM might be the better choice.)
@golyalpha not very often, maybe once a week or so
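For reference, if the DynamoDB route were chosen instead of RAM, a queued entry could be persisted with something along these lines (the table name and attributes are purely illustrative, not part of 0pdd today):

```ruby
require 'aws-sdk-dynamodb'
require 'securerandom'

dynamo = Aws::DynamoDB::Client.new(region: 'us-east-1')

# Store a description of the failed request so it survives dyno restarts.
dynamo.put_item(
  table_name: '0pdd-failed-requests', # hypothetical table
  item: {
    'request_id' => SecureRandom.uuid,
    'method'     => 'user_repository_invitations',
    'queued_at'  => Time.now.to_i
  }
)
```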
@golyalpha/z this job was assigned to you 5 days ago. It will be taken away from you soon unless you close it, see §8. Read this and this, please.
@0crat resign Won't be able to finish in time.
@0crat resign Won't be able to finish in time. (here)
@golyalpha The user @golyalpha/z resigned from #105, please stop working. Reason for job resignation: Order was cancelled
The job #105 assigned to @timeracers/z, here is why; the budget is 30 minutes, see §4; please, read §8 and §9; if the task is not clear, read this and this; there will be no monetary reward for this job
@timeracers/z this job was assigned to you 5 days ago. It will be taken away from you soon unless you close it, see §8. Read this and this, please.
@0crat wait for #228