The PostCommit TransformService Direct job is flaky
The PostCommit TransformService Direct job is failing over 50% of the time. Please visit https://github.com/apache/beam/actions/workflows/beam_PostCommit_TransformService_Direct.yml?query=is%3Afailure+branch%3Amaster to see the logs.
https://github.com/apache/beam/pull/30816 breaks the build here.
#10 43.81 INFO: pip is looking at multiple versions of apache-beam[dataframe,gcp] to determine which version is compatible with other requirements. This could take a while.
#10 43.81 ERROR: Cannot install apache-beam[dataframe,gcp]==2.56.0.dev0 because these package versions have conflicting dependencies.
#10 43.81
#10 43.81 The conflict is caused by:
#10 43.81 apache-beam[dataframe,gcp] 2.56.0.dev0 depends on google-auth-httplib2<0.2.0 and >=0.1.0; extra == "gcp"
#10 43.81 The user requested (constraint) google-auth-httplib2==0.2.0
We have 'google-auth-httplib2>=0.1.0,<0.2.0' in https://github.com/apache/beam/blob/master/sdks/python/setup.py#L445, which conflicts with the requested constraint google-auth-httplib2==0.2.0.
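To make the conflict concrete, here is a minimal stdlib-only sketch that checks the constrained version against the range from setup.py. The helper names `parse` and `in_range` are hypothetical, and the comparison only handles plain dotted numeric versions, not the full PEP 440 rules that pip applies.

```python
def parse(version):
    """Turn a plain dotted version such as '0.2.0' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def in_range(version, lower_inclusive, upper_exclusive):
    """Return True if lower_inclusive <= version < upper_exclusive."""
    return parse(lower_inclusive) <= parse(version) < parse(upper_exclusive)


# The constraint file pins 0.2.0, but setup.py allows only >=0.1.0,<0.2.0.
print(in_range("0.2.0", "0.1.0", "0.2.0"))
# A version inside the allowed range, for comparison.
print(in_range("0.1.1", "0.1.0", "0.2.0"))
```

The upper bound is exclusive, so the pinned 0.2.0 falls just outside the allowed range, which is why pip reports the two requirements as incompatible.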
Reopening since the workflow is still flaky
I am a beginner and do not know much, and I want to learn.
On Wednesday, 21 August 2024, at 13:04, github-actions[bot] < @.***> wrote:
Reopened #30960 https://github.com/apache/beam/issues/30960.
One random bigtableio_it_test test fails at a time. It looks like the tests have a race condition when running in parallel on the same machine (port occupied?).
Reopening since the workflow is still flaky
Stabilized
Reopening since the workflow is still flaky
Python 3.13 tests failing due to _InactiveRpcError:
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
status = StatusCode.UNKNOWN
details = "Application error processing RPC"
debug_error_string = "UNKNOWN:Error received from peer {grpc_message:"Application error processing RPC", grpc_status:2}"
>
self = <apache_beam.io.gcp.bigtableio_it_test.TestWriteToBigtableXlangIT testMethod=test_set_mutation>
...
> self.run_pipeline([row1, row2])
> raise _InactiveRpcError(state) # pytype: disable=not-instantiable
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
E grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
E status = StatusCode.UNKNOWN
E details = "Application error processing RPC"
E debug_error_string = "UNKNOWN:Error received from peer {grpc_message:"Application error processing RPC", grpc_status:2}"
E >
../../build/gradleenv/1922375555/lib/python3.13/site-packages/grpc/_channel.py:996: _InactiveRpcError
https://github.com/apache/beam/runs/52803653779
Breaking since Oct 3, likely after switching to Python 3.13 (#35056). cc: @jrmccluskey @tvalentyn What could be causing the grpc error in Python 3.13 alone?
There's a bug in grpc 1.66.0+ that can cause timeouts. Only Python 3.13 uses a version beyond this, out of necessity (prior releases do not support 3.13). #36525 includes experiments that are supposed to mitigate this problem and drops them into our Dockerfile, which will hopefully make the 3.13 tests more stable.
I tested locally (command: pytest -v -s apache_beam/io/gcp/bigtableio_it_test.py::TestWriteToBigtableXlangIT::test_set_mutation --test-pipeline-options="--runner=TestDirectRunner --project=apache-beam-testing"). I get the same error when my machine is missing Docker. Most likely the transform service simply did not start up successfully.
There is no hint of what happened in the GitHub Actions run because the output of the transform service has been redirected:
https://github.com/apache/beam/blob/d687f4fe8170b6eb4c82e02419702d5a20eb456e/sdks/python/scripts/run_transform_service.sh#L79
This is most likely an infra issue (tests passed on the 3.9 variant but not on 3.13), so I am moving it off release blocker. However, we still need to fix the test as a PostCommit. To investigate, one may need to remove the stdout/stderr redirect, or upload the log file at the end of the workflow, to see what happened to the transform service. @Amar3tto @aIbrahiim
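As a lighter-weight variant of uploading the log, a cleanup step could print the last lines of the redirected output so they show up in the job log. This is only a sketch: `tail_log` is a hypothetical helper, and the file name `transform_service.log` is an assumption, not the path the script actually writes to.

```python
from collections import deque
from pathlib import Path


def tail_log(path, max_lines=50):
    """Return the last max_lines lines of a log file, or [] if it is missing."""
    log_file = Path(path)
    if not log_file.exists():
        return []
    with log_file.open("r", errors="replace") as handle:
        # deque with maxlen keeps only the most recent lines while streaming.
        return [line.rstrip("\n") for line in deque(handle, maxlen=max_lines)]


# Hypothetical log location; replace with the real redirect target.
for line in tail_log("transform_service.log"):
    print(line)
```

Printing only the tail keeps the workflow output small while still surfacing a startup failure, which usually appears near the end of the file.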
I haven't reproduced it 100% locally, but a local run shows the following error:
RuntimeError: The grpc package installed is at version 1.65.5, but the generated code in org/apache/beam/model/pipeline/v1/standard_window_fns_pb2_grpc.py depends on grpcio>=1.71.0. Please upgrade your grpc module to grpcio>=1.71.0 or downgrade your generated code using grpcio-tools<=1.65.5.
Resolve mutations for :sdks:python:test-suites:direct:xlang:fnApiJobServerCleanup (Thread[#194,Execution worker Thread 21,5,main]) started.
:sdks:python:test-suites:direct:xlang:fnApiJobServerCleanup (Thread[#194,Execution worker Thread 21,5,main]) started.
What I suspect is that the test compiles the grpc code under Python 3.9 but then runs Beam on Python 3.13, causing a similar conflict. This explains why Python 3.12 used to work but the test started failing with Python 3.13.
@aIbrahiim @jrmccluskey can we partly revert the TransformService test change in #35056 to make it run on Python 3.9 and 3.12, until all Python versions can share a working grpc?
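A quick way to check for this kind of mismatch in a given environment is to compare the installed grpcio version with the minimum the generated code asks for. The sketch below uses only the standard library; `version_tuple`, `meets_minimum`, and `installed_grpcio_version` are hypothetical helpers, and the 1.71.0 minimum is taken from the RuntimeError above rather than from any Beam configuration.

```python
from importlib import metadata


def version_tuple(version):
    """Parse the leading numeric components of a version into a 3-tuple."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    # Pad so that '1.71' compares equal to '1.71.0'.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts[:3])


def meets_minimum(installed, required):
    """Return True if the installed version is at least the required one."""
    return version_tuple(installed) >= version_tuple(required)


def installed_grpcio_version():
    """Return the installed grpcio version string, or None if not installed."""
    try:
        return metadata.version("grpcio")
    except metadata.PackageNotFoundError:
        return None


# Minimum reported by the generated *_pb2_grpc.py code in the error above.
REQUIRED = "1.71.0"
found = installed_grpcio_version()
if found is None:
    print("grpcio is not installed in this interpreter")
else:
    print("grpcio", found, "meets minimum:", meets_minimum(found, REQUIRED))
```

Running this under each interpreter the test uses (the one that generates the code and the one that runs Beam) would show whether the two environments disagree on the grpcio version.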
I see that as a fault with the test. We can't necessarily control when different python versions will require different dependency versions (and unless the python foundation changes the release cadence for versions we will be trying to support 4-5 python versions simultaneously for the foreseeable future) but we can make sure that our tests match what users get when they install beam at a specific version.
In short, I'd prefer that the 3.13 version of the test rebuild the grpc code with Python 3.13.