With defined gh-team-allowlist Atlantis randomly stops working with 401 Unauthorized body
Overview of the Issue
With gh-team-allowlist defined, Atlantis randomly stops working and logs the following error when running plan:
{"level":"error","ts":"2022-04-05T15:33:04.300Z","caller":"events/command_runner.go:219","msg":"Unable to check user permissions: non-200 OK status code: 401 Unauthorized body: \"{\\\"message\\\":\\\"Bad credentials\\\",\\\"documentation_url\\\":\\\"https://docs.github.com/graphql\\\"}\"","json":{},"stacktrace":"github.com/runatlantis/atlantis/server/events.(*DefaultCommandRunner).RunCommentCommand\n\tgithub.com/runatlantis/atlantis/server/events/command_runner.go:219"}
A restart of the pod fixes it, but it breaks again after a few hours.
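The error above is hard to read because GitHub's response body is escaped twice inside the log entry. As a reading aid, here is a minimal Python sketch that unwraps it. It assumes the body was quoted in a JSON-compatible way (which holds for this particular line), the helper name decode_github_body is made up for illustration, and the log line is trimmed to the fields used here.

```python
import json

# Log entry from the report above, trimmed to "level" and "msg".
# A raw string keeps the backslashes exactly as they appear in the log.
LOG_LINE = r'''{"level":"error","msg":"Unable to check user permissions: non-200 OK status code: 401 Unauthorized body: \"{\\\"message\\\":\\\"Bad credentials\\\",\\\"documentation_url\\\":\\\"https://docs.github.com/graphql\\\"}\""}'''


def decode_github_body(log_line):
    """Return (error prefix, decoded GitHub response body) for one log line.

    Hypothetical helper, not part of Atlantis.
    """
    entry = json.loads(log_line)                      # 1st layer: the log entry itself
    prefix, _, quoted_body = entry["msg"].partition(" body: ")
    body_text = json.loads(quoted_body)               # 2nd layer: the quoted body string
    return prefix, json.loads(body_text)              # 3rd layer: GitHub's JSON payload


prefix, body = decode_github_body(LOG_LINE)
print(prefix)
print(body["message"], "->", body["documentation_url"])
```

Unwrapped, the payload is just GitHub's "Bad credentials" message pointing at the GraphQL docs, so the 401 comes from the GraphQL endpoint that the team allowlist check calls.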
Atlantis version: v0.19.2
Config:
disable-apply-all: true
enable-diff-markdown-format: true
enable-regexp-cmd: true
gh-app-id: <ID>
gh-app-key-file: /atlantis/gh-app-key-file.pem
gh-app-slug: atlantis-faire
gh-org: Faire
gh-team-allowlist: "*:plan,*:unlock,backend-platform:*,data-infra:*"
gh-webhook-secret: <SECRET>
hide-prev-plan-comments: true
write-git-creds: true
I also tried v0.19.1, but it failed with the following error:
"Unable to check user permissions: struct field for \"__schema\" doesn't exist in any of 1 places to unmarshal
However, this is expected according to the release notes for the latest version.
The struct issue you are reporting was fixed in https://github.com/runatlantis/atlantis/pull/2128
I'm not reporting that issue in this one. This one is the non-200 OK status code: 401 Unauthorized body error with v0.19.2.
I just mentioned that I tried v0.19.1 as well and got the issue that is already fixed, but that is OK and expected.
understood
@komljen I found this article. Might be related to rate limit and the misleading error message. Could you check your rate limit when it happens again?
Interesting, will check that and report on the findings.
This is an interesting finding https://github.com/runatlantis/atlantis/issues/2285#issuecomment-1152365866 So, it works with token auth but doesn't with GH App.
@komljen Yeah, we've now been able to run for multiple days with 0.19.3 and the user+token authentication instead of GH App. With the GH App authentication, we could only go a few hours at most.
With 0.17.5, the GH App route worked perfectly fine.
interesting:
if you switch right now to GH app and 0.17.5 with gh-team-allowlist does it work for you?
I'm trying to understand why this could be.
@jamengual I'm not using the GH team allowlist feature, I was just confirming @komljen's comment, so I can't test that out (also, the GH team allowlist feature was added in 0.18 and moved to GraphQL in 0.18.3).
I do think that #2285 and this issue could be the same root cause, but I created that issue specifically because our errors arose without using any new features; just as a pure version upgrade.
I'm starting to believe the API call throttling issue is what is causing this and the error message does not help much.
I'm hoping the GitHub API will be more descriptive about the real issue behind it and hopefully expose metrics around API calls.
Could Atlantis request and log rate limit info in its query per https://docs.github.com/en/graphql/overview/resource-limitations ?
It's hard to imagine throttling as the source when our problem in #2285 disappeared by switching to token auth instead of GH App, but GitHub's API response on its own is nearly useless.
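For anyone who wants to check the rate limit by hand while the error is happening, here is a rough Python sketch of the rateLimit query described on that GitHub docs page. This is not something Atlantis does today. The function names are made up for illustration, and the token is whatever credential you want to inspect (for a GH App that would be an installation token, which you have to obtain yourself).

```python
import json
import urllib.request

GRAPHQL_URL = "https://api.github.com/graphql"

# rateLimit is a top-level field of GitHub's GraphQL API.
RATE_LIMIT_QUERY = "query { rateLimit { limit cost remaining resetAt } }"


def build_request(token):
    """Build the POST request for the rate limit query (hypothetical helper)."""
    payload = json.dumps({"query": RATE_LIMIT_QUERY}).encode("utf-8")
    return urllib.request.Request(
        GRAPHQL_URL,
        data=payload,
        headers={
            "Authorization": f"bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def format_rate_limit(response):
    """Turn the decoded GraphQL response into a single log-friendly line."""
    rl = response["data"]["rateLimit"]
    return (
        f"limit={rl['limit']} cost={rl['cost']} "
        f"remaining={rl['remaining']} resetAt={rl['resetAt']}"
    )


def fetch_rate_limit(token):
    """Send the query; urlopen raises HTTPError on a non-2xx status such as 401."""
    with urllib.request.urlopen(build_request(token), timeout=10) as resp:
        return format_rate_limit(json.load(resp))


# Usage (needs network access and a valid token):
#   print(fetch_rate_limit("<TOKEN>"))
```

If the same token gets a 401 here as well, the credential itself is being rejected. If the query succeeds and remaining is near zero, that would point toward throttling instead.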
> Could Atlantis request and log rate limit info in its query per https://docs.github.com/en/graphql/overview/resource-limitations ?
PRs are welcome.
> It's hard to imagine throttling as the source when our problem in #2285 disappeared by switching to token auth instead of GH App, but GitHub's API response on its own is nearly useless.
Exactly, how do we know if the response is so cryptic?
Is this still happening in v0.19.8?
I haven't tried that version yet, but I will wait for this PR: https://github.com/runatlantis/atlantis/pull/2479. It seems like a proper fix for this issue.
that is correct, I think that is going to be the fix.
It should be available today in the pre-release
+1
This has already been fixed; please test the new version.
Yes, forgot to update here, but no issues with the latest version.
We are still hitting it with the latest 0.19.8.
Edit: https://github.com/runatlantis/atlantis/commit/a4a49bf46fb2ea83804d7b8fa2dae3e4c5646a01 I see this is in 0.19.9 🤞