0pdd
Octokit::BadGateway
In production a few hours ago
Octokit::BadGateway: GET https://api.github.com/user/repository_invitations: 502 - Server Error
from octokit/response/raise_error.rb:16:in `on_complete'
from faraday/response.rb:9:in `block in call'
from faraday/response.rb:61:in `on_complete'
from faraday/response.rb:8:in `call'
from octokit/middleware/follow_redirects.rb:73:in `perform_with_redirection'
from octokit/middleware/follow_redirects.rb:61:in `call'
from faraday/rack_builder.rb:141:in `build_response'
from faraday/connection.rb:386:in `run_request'
from faraday/connection.rb:149:in `get'
from sawyer/agent.rb:94:in `call'
from octokit/connection.rb:156:in `request'
from octokit/connection.rb:84:in `paginate'
from octokit/client/repository_invitations.rb:72:in `user_repository_invitations'
from 0pdd.rb:173:in `block in <top (required)>'
from sinatra/base.rb:1611:in `call'
from sinatra/base.rb:1611:in `block in compile!'
from sinatra/base.rb:975:in `block (3 levels) in route!'
from sinatra/base.rb:994:in `route_eval'
from sinatra/base.rb:975:in `block (2 levels) in route!'
from sinatra/base.rb:1015:in `block in process_route'
from sinatra/base.rb:1013:in `catch'
from sinatra/base.rb:1013:in `process_route'
from sinatra/base.rb:973:in `block in route!'
from sinatra/base.rb:972:in `each'
from sinatra/base.rb:972:in `route!'
from sinatra/base.rb:1085:in `block in dispatch!'
from sinatra/base.rb:1067:in `block in invoke'
from sinatra/base.rb:1067:in `catch'
from sinatra/base.rb:1067:in `invoke'
from sinatra/base.rb:1082:in `dispatch!'
from sinatra/base.rb:907:in `block in call!'
from sinatra/base.rb:1067:in `block in invoke'
from sinatra/base.rb:1067:in `catch'
from sinatra/base.rb:1067:in `invoke'
from sinatra/base.rb:907:in `call!'
from sinatra/base.rb:895:in `call'
from rack/protection/xss_header.rb:18:in `call'
from rack/protection/path_traversal.rb:16:in `call'
from rack/protection/json_csrf.rb:18:in `call'
from rack/protection/base.rb:49:in `call'
from rack/protection/base.rb:49:in `call'
from rack/protection/frame_options.rb:31:in `call'
from rack/logger.rb:15:in `call'
from rack/commonlogger.rb:33:in `call'
from sinatra/base.rb:219:in `call'
from sinatra/base.rb:212:in `call'
from rack/head.rb:13:in `call'
from rack/methodoverride.rb:22:in `call'
from sinatra/base.rb:182:in `call'
from sinatra/base.rb:2013:in `call'
from sinatra/base.rb:1487:in `block in call'
from sinatra/base.rb:1787:in `synchronize'
from sinatra/base.rb:1487:in `call'
from rack/handler/webrick.rb:88:in `service'
from webrick/httpserver.rb:140:in `service'
from webrick/httpserver.rb:96:in `run'
from webrick/server.rb:296:in `block in start_thread'
I think we need to retry in this case. Or let's think about something smarter.
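For illustration, a blunt retry around the failing call could look roughly like this (a sketch only; the helper name, attempt count, and pause are made up and this is not how 0pdd is currently structured):

```ruby
require 'octokit'

# Retry a block a few times when GitHub answers with a 5xx.
# Octokit::ServerError is the parent of Octokit::BadGateway (502),
# so a plain 502 like the one above would be caught here.
def with_retries(attempts: 3, pause: 15)
  tries = 0
  begin
    yield
  rescue Octokit::ServerError => e
    tries += 1
    raise e if tries >= attempts
    sleep(pause)
    retry
  end
end

client = Octokit::Client.new(access_token: ENV['GITHUB_TOKEN'])
invitations = with_retries { client.user_repository_invitations }
```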
@0crat assign @tedtoer
@0crat assign @tedtoer (here)
@yegor256 Job gh:yegor256/0pdd#105 assigned to @tedtoer, please go ahead (policy).
Bug was reported: +15 points just awarded to @yegor256, total is +4040.
@tedtoer this job was assigned to you 8 days ago. It will be taken away from you after 10 days from start (this is our policy).
@tedtoer resigned from gh:yegor256/0pdd#105, please stop working.
Oops! Job gh:yegor256/0pdd#105 is not assigned to anyone.
Oops! Job gh:yegor256/0pdd#105 is not assigned to anyone.
Oops! Job gh:yegor256/0pdd#105 is not assigned to anyone.
Oops! Job gh:yegor256/0pdd#105 is not assigned to anyone.
Job gh:yegor256/0pdd#105 assigned to @tedtoer. The budget is fixed and it is 30 minutes. Please, read the Policy and go ahead.
@tedtoer resigned from gh:yegor256/0pdd#105, please stop working.
Job gh:yegor256/0pdd#105 assigned to @tedtoer (profile). The budget is fixed and it is 30 minutes. Please, read the Policy and go ahead.
An HTTP 502 is a transient server error (https://tools.ietf.org/html/rfc7231#section-6.6.3). This is not a problem with 0pdd. The bug should be closed.
@yegor256 Close this unless you have reproduction steps and this is a regular recurring error.
@yegor256 This ticket needs your attention. Probably should be closed.
@silasreinagel/z this job was assigned to you 5 days ago. It will be taken away from you soon unless you close it, see §8. Read this and this, please.
@yegor256 This ticket needs your attention. Close this unless you have reproduction steps and this is a regular recurring error.
@SilasReinagel don't you like the idea of retrying in case of 50x?
@0crat refuse
@0crat refuse (here)
@SilasReinagel The user @silasreinagel/z resigned from #105, please stop working. Reason for job resignation: Order was cancelled
@yegor256 Building a retry fixture would take more work than this ticket's scope allows. If retries are wanted, maybe open a ticket with some retry requirements so that infrastructure can be built?
The job #105 assigned to @izrik/z, here is why; the budget is 30 minutes, see §4; please, read §8 and §9; if the task is not clear, read this and this
@izrik/z this job was assigned to you 5 days ago. It will be taken away from you soon unless you close it, see §8. Read this and this, please.
@0crat status
The job #105 assigned to @dmydlarz/z, here is why; the budget is 30 minutes, see §4; please, read §8 and §9; if the task is not clear, read this and this
@0crat status
@dmydlarz we are experiencing this issue now. What is the status?
@0crat status (here)
@izrik This is what I know about this job, as in §32:
- The job #105 is in scope for 8 months
- The role is DEV
- The job is not assigned to anyone
- The budget is not set yet
- These users are banned and won't be assigned:
  - @yegor256/z: This user reported the ticket
  - @tedtoer/z: User was resigned from the ticket
  - @valentjedi/z: User was resigned from the ticket
  - @prondzyn/z: User was resigned from the ticket
  - @vl3/z: User was resigned from the ticket
  - @silasreinagel/z: User was resigned from the ticket
  - @izrik/z: User was resigned from the ticket
  - @dmydlarz/z: User was resigned from the ticket
- Job footprint (restricted area)
@0crat status (here)
@yegor256 This is what I know about this job, as in §32:
- The job #105 is in scope for 8 months
- The role is DEV
- The job is not assigned to anyone
- The budget is not set yet
- These users are banned and won't be assigned:
  - @yegor256/z: This user reported the ticket
  - @tedtoer/z: User was resigned from the ticket
  - @valentjedi/z: User was resigned from the ticket
  - @prondzyn/z: User was resigned from the ticket
  - @vl3/z: User was resigned from the ticket
  - @silasreinagel/z: User was resigned from the ticket
  - @izrik/z: User was resigned from the ticket
  - @dmydlarz/z: User was resigned from the ticket
- Job footprint (restricted area)
The job #105 assigned to @golyalpha/z, here is why; the budget is 30 minutes, see §4; please, read §8 and §9; if the task is not clear, read this and this; there will be a monetary reward for this job
@yegor256 A 502 usually means the proxy had no endpoint to forward your request to at that moment because the server was under high load, that the backend was down for some reason, or that a configuration error left the gateway/proxy unable to reach the upstream server that would normally handle the request and build the dynamic page.
While retrying a few times in succession might help under certain circumstances, in most cases it won't.
I need a proposal on how to handle this if I'm going to write some code for this issue.
@golyalpha what would you propose?
@yegor256 Well, if creating an issue with the puzzle on a GitHub repo is a problem, I'd probably put any failed issue-creation requests into some kind of queue (maybe as entries in the DB), and then, once an issue-creation request succeeds, flush the queue onto GitHub as well. Maybe sprinkle in some longer-term retries too: for example, if no new issue-creation request has been made in the past hour, retry one of the requests in the queue.
Or, if this is happening in multiple places (as it likely is), then some kind of generalized queue of failed requests that gets flushed once a request succeeds.
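A rough sketch of that queue idea, just to make it concrete (the class and method names are invented here; this is not 0pdd's actual code):

```ruby
require 'octokit'

# Failed GitHub calls are remembered and replayed after the next successful call.
class RetryQueue
  def initialize
    @pending = [] # in-memory; could be a DynamoDB table instead
    @mutex = Mutex.new
  end

  # Run the block; on a 5xx, queue it for later instead of failing hard.
  def run(description, &block)
    result = block.call
    flush # a request just succeeded, so replay anything still queued
    result
  rescue Octokit::ServerError
    @mutex.synchronize do
      @pending << { desc: description, job: block, queued_at: Time.now }
    end
    nil
  end

  # Replay queued requests; drop the ones that succeed, keep the ones that still fail.
  def flush
    @mutex.synchronize do
      @pending.delete_if do |entry|
        begin
          entry[:job].call
          true
        rescue Octokit::ServerError
          false
        end
      end
    end
  end
end
```

The flush-on-success trigger keeps it simple; the longer-term retry mentioned above could just be a periodic timer that calls flush on its own.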
@golyalpha let's go with a simple in-memory queue, which will simply help us retry requests in, say, 15 minutes.
@yegor256 0pdd is hosted on Heroku, isn't it? If it is, it means that the service gets restarted roughly every 24 hours. Some requests may end up getting unlucky and lost completely if the service restarts while they are in the queue. Of course, if you insist on having the queue in memory (which, admittedly, is the simplest and fastest to access), then I'll do it that way.
@golyalpha I don't insist. You think it's better to keep it in DynamoDB?
@yegor256 I'm not entirely sure. How often do these errors happen? (If not very often, then RAM might be the better choice.)
@golyalpha not very often, maybe once a week or so
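For reference, if the DynamoDB route were chosen instead of RAM, a queued entry could be persisted with something along these lines (the table name and attributes are purely illustrative, not part of 0pdd today):

```ruby
require 'aws-sdk-dynamodb'
require 'securerandom'

dynamo = Aws::DynamoDB::Client.new(region: 'us-east-1')

# Store a description of the failed request so it survives dyno restarts.
dynamo.put_item(
  table_name: '0pdd-failed-requests', # hypothetical table
  item: {
    'request_id' => SecureRandom.uuid,
    'method'     => 'user_repository_invitations',
    'queued_at'  => Time.now.to_i
  }
)
```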
@golyalpha/z this job was assigned to you 5 days ago. It will be taken away from you soon unless you close it, see §8. Read this and this, please.
@0crat resign Won't be able to finish in time.
@0crat resign Won't be able to finish in time. (here)
@golyalpha The user @golyalpha/z resigned from #105, please stop working. Reason for job resignation: Order was cancelled
The job #105 assigned to @timeracers/z, here is why; the budget is 30 minutes, see §4; please, read §8 and §9; if the task is not clear, read this and this; there will be no monetary reward for this job
@timeracers/z this job was assigned to you 5 days ago. It will be taken away from you soon unless you close it, see §8. Read this and this, please.
@0crat wait for #228