The PostCommit Python Arm job is flaky
The PostCommit Python Arm job is failing over 50% of the time. Please visit https://github.com/apache/beam/actions/workflows/beam_PostCommit_Python_Arm.yml?query=is%3Afailure+branch%3Amaster to see the logs.
@tvalentyn do we have a good owner for this?
I actually can't find a single green run since this test suite was created (back in September)
You may be right, thanks for the correction, @ahmedabu98
2024-04-24T12:03:53.0963029Z Please verify that you have permissions to write to the parent directory..
2024-04-24T12:03:53.0964903Z The configuration directory may not be writable. To learn more, see https://cloud.google.com/sdk/docs/configurations#creating_a_configuration
2024-04-24T12:03:53.0968080Z ERROR: (gcloud.auth.docker-helper) Could not create directory [/var/lib/kubelet/pods/573a1844-124b-4e12-bb0f-0325d0f3c3aa/volumes/kubernetes.io~empty-dir/gcloud]: Permission denied.
2024-04-24T12:03:53.0969612Z
2024-04-24T12:03:53.0970063Z Please verify that you have permissions to write to the parent directory.
2024-04-24T12:03:53.3953756Z #29 pushing layers 1.4s done
2024-04-24T12:03:53.3956208Z #29 ERROR: failed to push us.gcr.io/apache-beam-testing/github-actions/beam_python3.8_sdk:2.57.0-SNAPSHOT: error getting credentials - err: exit status 1, out: ``
2024-04-24T12:03:53.8953735Z ------
cc: @damccorm - do you remember if this suite never worked or the above error is an artifact of GHA migration?
We can reclassify this as part of the ARM backlog work.
This was working last month - https://github.com/apache/beam/actions/workflows/beam_PostCommit_Python_Arm.yml?query=is%3Asuccess+branch%3Amaster+event%3Aschedule
Looks like it went flaky, then permared (permanently red), around then
Ahh my apologies, I was looking at it through an is:failure filter
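The filter mix-up is easy to reproduce, since the run list only shows what the `query` parameter allows. As a rough sketch, the URLs used in this thread can be built from their filters with the standard library; `runs_url` is a hypothetical helper name, not something from the Beam repo.

```python
from urllib.parse import urlencode

# Workflow page referenced throughout this thread.
WORKFLOW_URL = (
    "https://github.com/apache/beam/actions/workflows/"
    "beam_PostCommit_Python_Arm.yml"
)


def runs_url(*filters):
    """Build a run-list URL from filters such as "is:failure" (hypothetical helper)."""
    if not filters:
        return WORKFLOW_URL
    # urlencode escapes ":" as %3A and joins the filters with "+".
    return WORKFLOW_URL + "?" + urlencode({"query": " ".join(filters)})


# Only failed runs on master: green runs are hidden by construction.
failures_only = runs_url("is:failure", "branch:master")

# Every scheduled run on master, green or red.
all_scheduled = runs_url("branch:master", "event:schedule")
```

Checking `all_scheduled` rather than `failures_only` avoids concluding that the suite was never green.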
So by removing https://github.com/apache/beam/blob/master/.github/workflows/beam_PostCommit_Python_Arm.yml#L113
I get the test to move along, but it's still failing on my fork due to some permission issue with the Healthcare API. The OAuth scope is wrong or something: https://github.com/volatilemolotov/beam/actions/runs/8820257015/job/24213449686#step:13:13113
@volatilemolotov could you put up a PR to make that change? Definitely seems like it is getting further.
@svetakvsundhar do you know what scope is missing? Given the normal postcommit python isn't failing, it might just be an issue with your service account specifically?
Sure, here it is https://github.com/apache/beam/pull/31102
Thanks - merged, let's see what the result on master is
+1, it could be a service account specific issue. I'd want to see a couple more runs of this to see if it's actually an issue. If so, a thought might be to add ["https://www.googleapis.com/auth/cloud-platform"] as a scope manually in the test.
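For reference, a minimal sketch of what adding the scope manually could look like, assuming the test obtains application default credentials through the `google-auth` package. That is an assumption; the thread does not show how the Healthcare test builds its client, and both helper names below are hypothetical.

```python
CLOUD_PLATFORM_SCOPE = "https://www.googleapis.com/auth/cloud-platform"


def with_cloud_platform_scope(scopes=None):
    """Return the scopes with cloud-platform included once, order preserved."""
    result = list(dict.fromkeys(scopes or []))  # drop duplicates, keep order
    if CLOUD_PLATFORM_SCOPE not in result:
        result.append(CLOUD_PLATFORM_SCOPE)
    return result


def load_scoped_credentials(scopes=None):
    """Request application default credentials with the scope added (hypothetical)."""
    # Imported lazily so the helper above works without google-auth installed.
    import google.auth

    # google.auth.default returns a (credentials, project_id) tuple.
    return google.auth.default(scopes=with_cloud_platform_scope(scopes))
```

Whether this helps depends on the service account: some credential types ignore requested scopes, so a missing IAM permission would still fail.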
https://github.com/apache/beam/actions/runs/8840477636
it works now
Great, thanks @volatilemolotov
Looks like we're still flaky - https://github.com/apache/beam/actions/runs/8843342204/job/24283441647 - but that's an improvement and it looks like a test flake instead of infra
Permared now
I think that's wrong - https://github.com/apache/beam/actions/workflows/beam_PostCommit_Python_Arm.yml?query=branch%3Amaster+event%3Aschedule
Reopening since the workflow is still flaky
Fixed by https://github.com/apache/beam/pull/32530