[Failing Test]: beam_PostCommit_XVR_Direct perma-red.

Open chamikaramj opened this issue 2 years ago • 1 comments

What happened?

Seems like Go x-lang integration tests (for example, DebeziumIO, JDBCIO) are flaky.

For example,

https://ci-beam.apache.org/job/beam_PostCommit_XVR_Direct/4702/consoleFull

10:26:17 2023/10/10 17:25:26 🐳 Terminating container: 82fa13ab9230
10:26:17 containers.go:100: error terminating container: Error response from daemon: No such container: 82fa13ab92303fc01cc1db52f2cf67f1c9a7666cf3611cdd0bce58936fc8d232
10:26:17 --- FAIL: TestDebeziumIO_BasicRead (144.27s)
10:26:17 FAIL

We should probably convert these tests to use the Prism runner to make them stable.

Issue Failure

Failure: Test is flaky

Issue Priority

Priority: 2 (backlog / disabled test but we think the product is healthy)

Issue Components

  • [ ] Component: Python SDK
  • [ ] Component: Java SDK
  • [ ] Component: Go SDK
  • [ ] Component: Typescript SDK
  • [ ] Component: IO connector
  • [ ] Component: Beam YAML
  • [ ] Component: Beam examples
  • [ ] Component: Beam playground
  • [ ] Component: Beam katas
  • [ ] Component: Website
  • [ ] Component: Spark Runner
  • [ ] Component: Flink Runner
  • [ ] Component: Samza Runner
  • [ ] Component: Twister2 Runner
  • [ ] Component: Hazelcast Jet Runner
  • [ ] Component: Google Cloud Dataflow Runner

chamikaramj, Oct 12 '23 23:10

At this point it appears to be hard-failing because of a few Python 3.11-related issues.

https://github.com/apache/beam/actions/runs/7803720204/job/21284093581

https://github.com/apache/beam/actions/workflows/beam_PostCommit_XVR_Direct.yml?query=is%3Afailure

:sdks:python:test-suites:direct:xlang:validatesCrossLanguageRunnerPythonUsingSql

 File "/runner/_work/beam/beam/build/gradleenv/1922375555/lib/python3.11/site-packages/apache_beam/transforms/trigger.py", line 1376, in process_elements
    if input_watermark > window.end + self.allowed_lateness:
                         ^^^^^^^^^^
AttributeError: 'bytes' object has no attribute 'end'
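
A minimal sketch of the failure mode, not the actual Beam code: the lateness check reads `window.end`, so it only works when the window has been decoded into an object. `Window` and `is_late` below are hypothetical stand-ins that mirror the comparison in the traceback.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    """Hypothetical stand-in for a decoded window object."""
    end: int


def is_late(window, input_watermark, allowed_lateness=0):
    # Same shape as the comparison shown in the traceback.
    return input_watermark > window.end + allowed_lateness


def describe(window, input_watermark):
    """Return the check result, or the error text if the window is undecoded."""
    try:
        return str(is_late(window, input_watermark))
    except AttributeError as exc:
        return str(exc)


print(describe(Window(end=10_000), 12_000))  # a decoded window works
print(describe(b"\x80\x00", 12_000))         # raw bytes raise AttributeError
```

So the check itself looks fine; the question is why it receives raw bytes in the first place.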

lostluck, Feb 06 '24 18:02

This is a regression in Beam 2.53.0. Unfortunately, GHA logs expire after 3 months, so for now we only know that the regression was introduced between Nov 8, 2023 (last successful run and https://github.com/apache/beam/actions/runs/7018120946) and Dec 8, 2023, #2111 (the first run that shows this issue and still has logs).

Abacn, Mar 06 '24 19:03

CC: @robertwb

Abacn, Mar 06 '24 20:03

For some reason the window in WindowedValue decoded here

https://github.com/apache/beam/blob/1a05f39883fca49f8b8068a68a358dfe973055c0/sdks/python/apache_beam/runners/portability/fn_api_runner/execution.py#L238

is not a tuple of window objects but a tuple of bytes, e.g. (b"\x80\x00\x00\x00\x00\x00'\x10\x90N",), (b'\x80\x00\x00\x00\x00\x00N \x90N',), etc.
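
For what it's worth, those byte strings look like still-encoded interval windows. The sketch below decodes them by hand, assuming the standard interval-window layout (8-byte big-endian end timestamp in millis, shifted by 2^63, followed by the span in millis as a varint). That layout is an assumption about these particular bytes, and the helper names are made up for illustration.

```python
import struct
from typing import Tuple

_TIME_SHIFT = 1 << 63  # offset applied so that byte order matches time order


def read_varint(data: bytes, pos: int) -> Tuple[int, int]:
    """Decode a base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear means this was the last byte
            return result, pos
        shift += 7


def decode_interval_window(data: bytes) -> Tuple[int, int]:
    """Return (start_millis, end_millis), assuming interval-window encoding."""
    (shifted_end,) = struct.unpack(">Q", data[:8])
    end_millis = shifted_end - _TIME_SHIFT
    span_millis, pos = read_varint(data, 8)
    if pos != len(data):
        raise ValueError("unexpected trailing bytes")
    return end_millis - span_millis, end_millis


print(decode_interval_window(b"\x80\x00\x00\x00\x00\x00'\x10\x90N"))
print(decode_interval_window(b"\x80\x00\x00\x00\x00\x00N \x90N"))
```

Under that assumption the two examples decode to consecutive 10-second windows, [0 ms, 10000 ms) and [10000 ms, 20000 ms), which would be consistent with the window payload being passed through undecoded rather than being corrupted.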


The WindowedValueCoder with BytesCoder as its window coder is constructed here: https://github.com/apache/beam/blob/1a05f39883fca49f8b8068a68a358dfe973055c0/sdks/python/apache_beam/coders/coders.py#L392

so it's using the information from proto to construct the coder.

Abacn, Mar 06 '24 20:03