core-workflow

Cannot merge backported PRs because "bedevere/maintenance-branch-pr" not run

Open ned-deily opened this issue 4 years ago • 15 comments

See, for example, python/cpython#20305 and python/cpython#20306. For some reason, it seems that the bedevere/maintenance-branch-pr check did not run (yet); the status is: "Expected - Waiting for status to be reported". Because it is a required check, the PRs cannot be merged either manually or automatically.

ned-deily avatar May 21 '20 23:05 ned-deily

I noticed the bot reacts to edits, so I edited the titles of those two PRs slightly to get them merged. I don't know why the bot didn't wake up at first :(

encukou avatar May 21 '20 23:05 encukou

Oh, thanks, @encukou ! Still, somebody should look at this. I don't recall ever seeing this behavior before.

ned-deily avatar May 21 '20 23:05 ned-deily

I could see several timeout errors in the webhook deliveries, so the bot could not perform the task. Those with administrative power can go to CPython's repo > Settings tab > Webhooks and click on the webhook for bedevere. Normally, redelivering the webhook would fix it.

Honestly I don't know what we can do from our side to fix the problem. But I have seen this happening a few times.

[Screenshot: Screen Shot 2020-05-21 at 4 48 59 PM]

Mariatta avatar May 21 '20 23:05 Mariatta
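The redelivery step described above can also be scripted rather than clicked through in the Settings UI. The following is a minimal sketch, not something the thread's participants used: it assumes GitHub's REST endpoints for repository webhook deliveries, and it assumes that a "failed" delivery is one whose `status_code` is not 2xx. The helper names `failed_deliveries` and `redelivery_url` are hypothetical, and no network call is made here.

```python
from typing import Iterable, List

API_ROOT = "https://api.github.com"


def failed_deliveries(deliveries: Iterable[dict]) -> List[dict]:
    """Return deliveries whose HTTP status code is not 2xx.

    Each item is assumed to look like an entry from
    GET /repos/{owner}/{repo}/hooks/{hook_id}/deliveries,
    i.e. a dict with at least "id" and "status_code".
    """
    return [d for d in deliveries if not 200 <= d.get("status_code", 0) < 300]


def redelivery_url(owner: str, repo: str, hook_id: int, delivery_id: int) -> str:
    """Build the URL that an admin-authorized client would POST to."""
    return (
        f"{API_ROOT}/repos/{owner}/{repo}/hooks/{hook_id}"
        f"/deliveries/{delivery_id}/attempts"
    )
```

An admin-scoped token would still be required to list deliveries and POST to the resulting URL, so this does not remove the need for someone with administrative power.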

The vast majority of bot issues are due to a failure to deliver the webhook event, and that usually lies somewhere between GitHub and Heroku. As such, typically the easiest thing to do is to do something which triggers another event (e.g. editing, closing and reopening, etc.). Otherwise you can also check the commit history of the bots to see if something recently changed (which is usually not the case).

brettcannon avatar May 22 '20 19:05 brettcannon

Re: timeouts in Heroku, I see now that some requests to bedevere are taking more than 30 seconds, which caused the timeout. But I still don't actually know what to do to avoid the timeout.

I wonder if it's a matter of upgrading the dyno?

Mariatta avatar May 25 '20 17:05 Mariatta

[Screenshot from the Heroku dashboard: Screen Shot 2020-05-25 at 10 44 47 AM]

Mariatta avatar May 25 '20 17:05 Mariatta

I wonder if it's from a dyno spinning up? Do we have permanently running dynos?

brettcannon avatar May 25 '20 18:05 brettcannon

We are using the hobby dyno. Perhaps we need one of the Professional Dynos? https://www.heroku.com/pricing

We can probably take a closer look at the webhook events that have been timing out and see how we can improve the performance.

Mariatta avatar Jun 21 '20 19:06 Mariatta
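One common way to keep a webhook endpoint under a request timeout is to acknowledge the delivery right away and do the slow work afterwards. The sketch below illustrates that pattern with only the standard library. It is an assumption that this would help here: the thread does not establish why the requests were slow, and the names `handle_webhook`, `worker`, and `demo` are hypothetical rather than taken from bedevere's code.

```python
import asyncio


async def handle_webhook(event: dict, queue: asyncio.Queue) -> int:
    """Acknowledge the delivery immediately and defer the real work."""
    queue.put_nowait(event)
    return 202  # HTTP 202 Accepted: received, not yet processed


async def worker(queue: asyncio.Queue, processed: list) -> None:
    """Drain queued events; slow API calls would happen here."""
    while True:
        event = await queue.get()
        try:
            await asyncio.sleep(0)  # placeholder for slow work
            processed.append(event["action"])
        finally:
            queue.task_done()


async def demo() -> list:
    """Run one event through the handler and the background worker."""
    queue: asyncio.Queue = asyncio.Queue()
    processed: list = []
    task = asyncio.create_task(worker(queue, processed))
    status = await handle_webhook({"action": "opened"}, queue)
    assert status == 202
    await queue.join()  # wait until the worker has handled the event
    task.cancel()
    await asyncio.gather(task, return_exceptions=True)
    return processed
```

The trade-off is that an in-memory queue loses pending events if the process restarts, so a delivery that was acknowledged might never be processed.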

If it's a matter of hosting a webhook, have you considered using something like https://workers.cloudflare.com/ or AWS Lambda? I bet it would be much cheaper, or might even fit in the free tier. Let me know if I can help with it; I would be happy to.

Thanks, Smit

smitthakkar96 avatar Jun 21 '20 20:06 smitthakkar96

The PSF receives Heroku credits, and most of our web services are on Heroku whenever possible. Perhaps the PSF infrastructure team can weigh in. cc @ewdurbin

Mariatta avatar Jun 21 '20 21:06 Mariatta

Yes, we utilize donated services (~$750 service credit per month) from Heroku for all of our bots at the moment. Hobby dynos should not sleep.

If we think it's an issue with resources and not some other factor (code, db, or blocking I/O), we can upgrade the dynos... but ultimately that gets very expensive ($7/mo in credits to $25/mo in credits). If we think it would help... it might require consolidating our fleet of bots into a single app instance somehow.

ewdurbin avatar Jun 22 '20 12:06 ewdurbin

Shouldn’t we consider contacting the AWS or GCP folks to see if we can get some free credits from them? I am sure they would be glad to support the PSF. I believe running the webhooks on a serverless service like Lambda or Cloud Functions would be considerably cheaper, and we might not need to consolidate the bots into a single app instance. Thoughts?

smitthakkar96 avatar Jun 26 '20 12:06 smitthakkar96

We do also have service credits with other cloud vendors. If there is a volunteer that is willing to port and maintain the bots on a serverless platform, that could work.

At the moment, Heroku provides a common platform and easy ACL for contributors. PSF Infrastructure can support these additional needs if the current maintainers are on board.

ewdurbin avatar Jun 26 '20 14:06 ewdurbin

I'm personally not interested in migrating to another platform unless it's Azure (since I already know it and can get help internally), or moving the check to GitHub Actions (since it's even more accessible to people and I'm also familiar with it).

brettcannon avatar Jun 26 '20 17:06 brettcannon
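If the check were moved into a GitHub Actions job, its core could be a small script. The sketch below is a guess at the kind of logic involved, not bedevere's actual implementation: it assumes that maintenance branches are named like `3.8` and that the check passes when the PR title starts with a matching `[3.8]` prefix. The function name `maintenance_branch_status` and both regular expressions are hypothetical.

```python
import re

# Assumed convention: maintenance branches are named like "3.8",
# and backport PR titles start with a matching "[3.8]" prefix.
MAINTENANCE_BRANCH_RE = re.compile(r"\d+\.\d+")
TITLE_PREFIX_RE = re.compile(r"\s*\[(?P<branch>\d+\.\d+)\]\s*\S")


def maintenance_branch_status(base_branch: str, title: str):
    """Return "success", "failure", or None if the check does not apply."""
    if not MAINTENANCE_BRANCH_RE.fullmatch(base_branch):
        return None  # not a maintenance branch, nothing to report
    match = TITLE_PREFIX_RE.match(title)
    if match and match.group("branch") == base_branch:
        return "success"
    return "failure"
```

In an Actions job, the base branch and title could be read from the `pull_request.base.ref` and `pull_request.title` fields of the event payload file named by `GITHUB_EVENT_PATH`, and a "failure" result could be turned into a non-zero exit code.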

Moving the check to GitHub Actions sounds like an amazing idea. If there is something that GitHub Actions cannot do, I guess we can consider using the serverless offerings of Microsoft Azure, if that’s something the team wants. I can help with it (any porting required plus setup on Azure); we can use Terraform or Pulumi to provision the resources on Azure. There is support for IAM on Azure, plus there are a lot of open source projects that can help integrate GitHub teams with Azure IAM.

Thanks, Smit

smitthakkar96 avatar Jun 27 '20 20:06 smitthakkar96