Octokit::BadGateway

Open yegor256 opened this issue 7 years ago • 48 comments

In production a few hours ago

Octokit::BadGateway: GET https://api.github.com/user/repository_invitations: 502 - Server Error
  from octokit/response/raise_error.rb:16:in `on_complete'
  from faraday/response.rb:9:in `block in call'
  from faraday/response.rb:61:in `on_complete'
  from faraday/response.rb:8:in `call'
  from octokit/middleware/follow_redirects.rb:73:in `perform_with_redirection'
  from octokit/middleware/follow_redirects.rb:61:in `call'
  from faraday/rack_builder.rb:141:in `build_response'
  from faraday/connection.rb:386:in `run_request'
  from faraday/connection.rb:149:in `get'
  from sawyer/agent.rb:94:in `call'
  from octokit/connection.rb:156:in `request'
  from octokit/connection.rb:84:in `paginate'
  from octokit/client/repository_invitations.rb:72:in `user_repository_invitations'
  from 0pdd.rb:173:in `block in <top (required)>'
  from sinatra/base.rb:1611:in `call'
  from sinatra/base.rb:1611:in `block in compile!'
  from sinatra/base.rb:975:in `block (3 levels) in route!'
  from sinatra/base.rb:994:in `route_eval'
  from sinatra/base.rb:975:in `block (2 levels) in route!'
  from sinatra/base.rb:1015:in `block in process_route'
  from sinatra/base.rb:1013:in `catch'
  from sinatra/base.rb:1013:in `process_route'
  from sinatra/base.rb:973:in `block in route!'
  from sinatra/base.rb:972:in `each'
  from sinatra/base.rb:972:in `route!'
  from sinatra/base.rb:1085:in `block in dispatch!'
  from sinatra/base.rb:1067:in `block in invoke'
  from sinatra/base.rb:1067:in `catch'
  from sinatra/base.rb:1067:in `invoke'
  from sinatra/base.rb:1082:in `dispatch!'
  from sinatra/base.rb:907:in `block in call!'
  from sinatra/base.rb:1067:in `block in invoke'
  from sinatra/base.rb:1067:in `catch'
  from sinatra/base.rb:1067:in `invoke'
  from sinatra/base.rb:907:in `call!'
  from sinatra/base.rb:895:in `call'
  from rack/protection/xss_header.rb:18:in `call'
  from rack/protection/path_traversal.rb:16:in `call'
  from rack/protection/json_csrf.rb:18:in `call'
  from rack/protection/base.rb:49:in `call'
  from rack/protection/base.rb:49:in `call'
  from rack/protection/frame_options.rb:31:in `call'
  from rack/logger.rb:15:in `call'
  from rack/commonlogger.rb:33:in `call'
  from sinatra/base.rb:219:in `call'
  from sinatra/base.rb:212:in `call'
  from rack/head.rb:13:in `call'
  from rack/methodoverride.rb:22:in `call'
  from sinatra/base.rb:182:in `call'
  from sinatra/base.rb:2013:in `call'
  from sinatra/base.rb:1487:in `block in call'
  from sinatra/base.rb:1787:in `synchronize'
  from sinatra/base.rb:1487:in `call'
  from rack/handler/webrick.rb:88:in `service'
  from webrick/httpserver.rb:140:in `service'
  from webrick/httpserver.rb:96:in `run'
  from webrick/server.rb:296:in `block in start_thread'

I think we need to retry in this case. Or let's think about something smarter.
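
A minimal sketch of the retry idea, assuming a hypothetical `with_retries` helper (not part of Octokit or 0pdd) and a stand-in `BadGateway` class in place of `Octokit::BadGateway`, so that it runs without the gem. The attempt count and pause are illustrative assumptions:

```ruby
# Hypothetical stand-in for Octokit::BadGateway, so the sketch runs without the gem.
class BadGateway < StandardError; end

# Runs the block and returns its value. If the block raises one of `errors`,
# it is run again, up to `attempts` runs in total; the last failure is re-raised.
def with_retries(attempts: 3, pause: 0, errors: [BadGateway])
  tries = 0
  begin
    tries += 1
    yield
  rescue *errors
    raise if tries >= attempts
    sleep(pause) # seconds to wait before the next attempt
    retry
  end
end
```

In 0pdd this would presumably wrap the call that failed in the trace, for example `with_retries { client.user_repository_invitations }` with `errors: [Octokit::BadGateway]`, where `client` is an assumed Octokit client instance.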

yegor256 avatar Aug 11 '17 16:08 yegor256

@0crat assign @tedtoer

yegor256 avatar Oct 16 '17 07:10 yegor256

@0crat assign @tedtoer (here)

@yegor256 Job gh:yegor256/0pdd#105 assigned to @tedtoer, please go ahead (policy).

0crat avatar Oct 16 '17 07:10 0crat

Bug was reported: +15 points just awarded to @yegor256, total is +4040.

0crat avatar Oct 16 '17 07:10 0crat

@tedtoer this job was assigned to you 8 days ago. It will be taken away from you after 10 days from start (this is our policy).

0crat avatar Nov 14 '17 14:11 0crat

@tedtoer resigned from gh:yegor256/0pdd#105, please stop working.

0crat avatar Nov 14 '17 15:11 0crat

Oops! Job gh:yegor256/0pdd#105 is not assigned to anyone.

0crat avatar Nov 14 '17 15:11 0crat

Job gh:yegor256/0pdd#105 assigned to @tedtoer. The budget is fixed and it is 30 minutes. Please, read the Policy and go ahead.

0crat avatar Dec 18 '17 04:12 0crat

@tedtoer resigned from gh:yegor256/0pdd#105, please stop working.

0crat avatar Dec 28 '17 00:12 0crat

Job gh:yegor256/0pdd#105 assigned to @tedtoer (profile). The budget is fixed and it is 30 minutes. Please, read the Policy and go ahead.

0crat avatar Dec 28 '17 00:12 0crat

An HTTP 502 error is a transient error (see https://tools.ietf.org/html/rfc7231#section-6.6.3). This is not a problem with 0pdd. The bug should be closed.

@yegor256 Close this unless you have reproduction steps and this is a regular recurring error.

SilasReinagel avatar Apr 03 '18 04:04 SilasReinagel

@yegor256 This ticket needs your attention. Probably should be closed.

SilasReinagel avatar Apr 05 '18 05:04 SilasReinagel

@silasreinagel/z this job was assigned to you 5 days ago. It will be taken away from you soon, unless you close it, see §8. Read this and this, please.

0crat avatar Apr 07 '18 17:04 0crat

@yegor256 This ticket needs your attention. Close this unless you have reproduction steps and this is a regular recurring error.

SilasReinagel avatar Apr 07 '18 18:04 SilasReinagel

@SilasReinagel don't you like the idea of retrying in case of 50x?

yegor256 avatar Apr 08 '18 05:04 yegor256

@0crat refuse

SilasReinagel avatar Apr 11 '18 04:04 SilasReinagel

@0crat refuse (here)

@SilasReinagel The user @silasreinagel/z resigned from #105, please stop working. Reason for job resignation: Order was cancelled

0crat avatar Apr 11 '18 04:04 0crat

Tasks refusal is discouraged, see §6: -15 point(s) just awarded to @silasreinagel/z

0crat avatar Apr 11 '18 04:04 0crat

@yegor256 Building a retry fixture would take more work than the scope of this ticket. If retries are wanted, maybe open a ticket with some retry requirements and that infrastructure can be built?

SilasReinagel avatar Apr 11 '18 04:04 SilasReinagel

The job #105 assigned to @izrik/z, here is why; the budget is 30 minutes, see §4; please, read §8 and §9; if the task is not clear, read this and this

0crat avatar Apr 12 '18 14:04 0crat

@izrik/z this job was assigned to you 5 days ago. It will be taken away from you soon, unless you close it, see §8. Read this and this, please.

0crat avatar Apr 17 '18 18:04 0crat

@0crat status

izrik avatar Apr 19 '18 04:04 izrik

The job #105 assigned to @dmydlarz/z, here is why; the budget is 30 minutes, see §4; please, read §8 and §9; if the task is not clear, read this and this

0crat avatar May 10 '18 11:05 0crat

@0crat status

yegor256 avatar May 30 '18 11:05 yegor256

@dmydlarz we are experiencing this issue now. What is the status?

yegor256 avatar May 30 '18 11:05 yegor256

@0crat status (here)

@izrik This is what I know about this job, as in §32:

  • The job #105 is in scope for 8mon
  • The role is DEV
  • The job is not assigned to anyone
  • The budget is not set yet
  • These users are banned and won't be assigned:
    • @yegor256/z: This user reported the ticket
    • @tedtoer/z: User was resigned from the ticket
    • @valentjedi/z: User was resigned from the ticket
    • @prondzyn/z: User was resigned from the ticket
    • @vl3/z: User was resigned from the ticket
    • @silasreinagel/z: User was resigned from the ticket
    • @izrik/z: User was resigned from the ticket
    • @dmydlarz/z: User was resigned from the ticket
  • Job footprint (restricted area)

0crat avatar May 30 '18 11:05 0crat

@0crat status (here)

@yegor256 This is what I know about this job, as in §32:

  • The job #105 is in scope for 8mon
  • The role is DEV
  • The job is not assigned to anyone
  • The budget is not set yet
  • These users are banned and won't be assigned:
    • @yegor256/z: This user reported the ticket
    • @tedtoer/z: User was resigned from the ticket
    • @valentjedi/z: User was resigned from the ticket
    • @prondzyn/z: User was resigned from the ticket
    • @vl3/z: User was resigned from the ticket
    • @silasreinagel/z: User was resigned from the ticket
    • @izrik/z: User was resigned from the ticket
    • @dmydlarz/z: User was resigned from the ticket
  • Job footprint (restricted area)

0crat avatar May 30 '18 11:05 0crat

The job #105 assigned to @golyalpha/z, here is why; the budget is 30 minutes, see §4; please, read §8 and §9; if the task is not clear, read this and this; there will be a monetary reward for this job

0crat avatar Jun 24 '18 08:06 0crat

@yegor256 A 502 usually means one of three things: the server was under high load and the proxy had no endpoint to forward your request to at that moment; the backend was down for some reason; or a configuration error left the gateway/proxy server unable to reach the upstream server that would normally handle the request and build the dynamic page.

Under certain circumstances, retrying x times in succession might help, but in most cases it won't.

Need a proposal on how to handle this if I'm going to write some code for this issue.

golyalpha avatar Jun 24 '18 08:06 golyalpha

@golyalpha what would you propose?

yegor256 avatar Jun 24 '18 09:06 yegor256

@yegor256 Well, if creating an issue with the puzzle on a GitHub repo is a problem, I'd probably put any failed issue creation requests into some kind of a queue (maybe as an entry in the DB), and then, once an issue creation request succeeds, I'd flush the queue onto GitHub as well. Maybe sprinkle in some longer-term retries, for example, if a new issue creation request wasn't made in the past hour, retry one of the requests in the queue.

Or, if we're having it happen in multiple places (as we likely are), then some kind of a generalized queue of failed requests, that would get flushed once a request succeeds.
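
A rough sketch of that generalized queue, under stated assumptions: `FailedRequests` and its methods are hypothetical names, `BadGateway` is a stand-in for `Octokit::BadGateway`, and the queue lives in memory rather than in the DB to keep the example self-contained:

```ruby
# Hypothetical stand-in for Octokit::BadGateway, so the sketch runs without the gem.
class BadGateway < StandardError; end

# Parks requests that fail with a 502 and replays them after the next success.
class FailedRequests
  def initialize
    @pending = []
  end

  # Number of requests currently waiting to be replayed.
  def size
    @pending.size
  end

  # Runs the request. On BadGateway it is parked and nil is returned;
  # on success every parked request is replayed and the result is returned.
  def submit(&request)
    begin
      result = request.call
    rescue BadGateway
      @pending << request
      return nil
    end
    flush
    result
  end

  private

  # Replays parked requests once; those that fail again stay parked.
  def flush
    waiting = @pending
    @pending = []
    waiting.each do |request|
      begin
        request.call
      rescue BadGateway
        @pending << request
      end
    end
  end
end
```

A request is passed as a block, e.g. `queue.submit { ... }`, so the same object can cover issue creation and any other GitHub call that may hit a 502.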

golyalpha avatar Jun 24 '18 09:06 golyalpha

@golyalpha let's go with a simple in-memory queue, which will simply help us retry requests in, say, 15 minutes.
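
One way the in-memory variant could look, as a sketch only: `RetryQueue`, `push`, and `tick` are hypothetical names, `BadGateway` stands in for `Octokit::BadGateway`, and the injectable `clock` is an assumption made so the delay can be exercised without waiting:

```ruby
# Hypothetical stand-in for Octokit::BadGateway, so the sketch runs without the gem.
class BadGateway < StandardError; end

# In-memory queue that re-runs failed requests once a delay has elapsed.
class RetryQueue
  DELAY = 15 * 60 # seconds, i.e. the 15 minutes suggested above

  def initialize(delay: DELAY, clock: -> { Time.now })
    @delay = delay
    @clock = clock
    @entries = [] # pairs of [due_time, request]
    @mutex = Mutex.new
  end

  # Schedules the block to run `delay` seconds from now.
  def push(&request)
    @mutex.synchronize { @entries << [@clock.call + @delay, request] }
  end

  def size
    @mutex.synchronize { @entries.size }
  end

  # Runs every request that is due and re-queues those that fail again.
  # Returns the number of requests attempted.
  def tick
    now = @clock.call
    due = @mutex.synchronize do
      ready, waiting = @entries.partition { |at, _| at <= now }
      @entries = waiting
      ready
    end
    due.each do |_, request|
      begin
        request.call
      rescue BadGateway
        push(&request)
      end
    end
    due.size
  end
end
```

Something would have to call `tick` periodically (a background thread, for instance). Everything in the queue is lost when the process restarts.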

yegor256 avatar Jun 25 '18 03:06 yegor256

@yegor256 0pdd is hosted on Heroku, isn't it? If it is, it means that the service gets restarted roughly every 24 hours. Some requests may end up getting unlucky and lost completely if the service restarts while they are in the queue. Of course, if you insist on having the queue in memory (which, admittedly, is the simplest and fastest to access), then I'll do it that way.

golyalpha avatar Jun 25 '18 04:06 golyalpha

@golyalpha I don't insist. You think it's better to keep it in DynamoDB?

yegor256 avatar Jun 25 '18 06:06 yegor256

@yegor256 I'm not entirely sure. How often do these errors happen? (If not very often, then RAM might be a better choice.)

golyalpha avatar Jun 26 '18 16:06 golyalpha

@golyalpha not very often, maybe once a week or so

yegor256 avatar Jun 26 '18 16:06 yegor256

@golyalpha/z this job was assigned to you 5 days ago. It will be taken away from you soon, unless you close it, see §8. Read this and this, please.

0crat avatar Jun 29 '18 09:06 0crat

@0crat resign Won't be able to finish in time.

golyalpha avatar Jul 03 '18 07:07 golyalpha

@0crat resign Won't be able to finish in time. (here)

@golyalpha The user @golyalpha/z resigned from #105, please stop working. Reason for job resignation: Order was cancelled

0crat avatar Jul 03 '18 07:07 0crat

Tasks refusal is discouraged, see §6: -15 point(s) just awarded to @golyalpha/z

0crat avatar Jul 03 '18 07:07 0crat

The job #105 assigned to @timeracers/z, here is why; the budget is 30 minutes, see §4; please, read §8 and §9; if the task is not clear, read this and this; there will be no monetary reward for this job

0crat avatar Jul 06 '18 11:07 0crat

@timeracers/z this job was assigned to you 5 days ago. It will be taken away from you soon, unless you close it, see §8. Read this and this, please.

0crat avatar Jul 11 '18 11:07 0crat

@0crat wait for #228

timeracers avatar Jul 12 '18 05:07 timeracers

@0crat wait for #228

timeracers avatar Jul 12 '18 05:07 timeracers

@0crat wait for #228 (here)

@timeracers The impediment for #105 was registered successfully by @timeracers/z

0crat avatar Jul 12 '18 05:07 0crat

@0crat wait for #228 (here)

@timeracers Job #105 is already on hold

0crat avatar Jul 12 '18 05:07 0crat