[BEAM-13004] DebeziumIO Load Test

Open roger-mike opened this issue 2 years ago • 28 comments


Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:

  • [ ] Choose reviewer(s) and mention them in a comment (R: @username).
  • [ ] Mention the appropriate issue in your description (for example: addresses #123), if applicable. This will automatically add a link to the pull request in the issue. If you would like the issue to automatically close on merging the pull request, comment fixes #<ISSUE NUMBER> instead.
  • [ ] Update CHANGES.md with noteworthy changes.
  • [ ] If this contribution is large, please file an Apache Individual Contributor License Agreement.

See the Contributor Guide for more tips on how to make review process smoother.

To check the build health, please visit https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md

roger-mike avatar Jul 19 '22 16:07 roger-mike

Codecov Report

Merging #22344 (e651b30) into master (dfa5ec5) will increase coverage by 0.31%. The diff coverage is n/a.

@@            Coverage Diff             @@
##           master   #22344      +/-   ##
==========================================
+ Coverage   74.09%   74.40%   +0.31%     
==========================================
  Files         712      717       +5     
  Lines       93832    97173    +3341     
==========================================
+ Hits        69526    72306    +2780     
- Misses      23026    23587     +561     
  Partials     1280     1280              
Flag      Coverage Δ
python    83.51% <ø> (-0.02%) ↓

Flags with carried forward coverage won't be shown.

Impacted Files                                            Coverage Δ
sdks/python/apache_beam/typehints/__init__.py             77.77% <0.00%> (-22.23%) ↓
...ks/python/apache_beam/runners/worker/statecache.py     91.95% <0.00%> (-4.20%) ↓
sdks/python/apache_beam/internal/pickler.py               92.85% <0.00%> (-2.60%) ↓
...ks/python/apache_beam/runners/worker/data_plane.py     87.57% <0.00%> (-1.70%) ↓
...examples/inference/sklearn_mnist_classification.py     42.30% <0.00%> (-1.45%) ↓
sdks/python/apache_beam/runners/direct/executor.py        96.46% <0.00%> (-0.55%) ↓
...python/apache_beam/runners/worker/worker_status.py     79.32% <0.00%> (-0.39%) ↓
sdks/python/apache_beam/typehints/schemas.py              94.06% <0.00%> (-0.09%) ↓
sdks/python/apache_beam/portability/common_urns.py        100.00% <0.00%> (ø)
...thon/apache_beam/ml/inference/pytorch_inference.py     0.00% <0.00%> (ø)
... and 40 more


codecov[bot] avatar Jul 19 '22 17:07 codecov[bot]

Assigning reviewers. If you would like to opt out of this review, comment assign to next reviewer:

R: @AnandInguva for label python.

Available commands:

  • stop reviewer notifications - opt out of the automated review tooling
  • remind me after tests pass - tag the comment author after tests pass
  • waiting on author - shift the attention set back to the author (any comment or push by the author will return the attention set to the reviewers)

The PR bot will only process comments in the main thread (not review comments).

github-actions[bot] avatar Jul 19 '22 19:07 github-actions[bot]

Run Seed Job

damccorm avatar Jul 21 '22 15:07 damccorm

@roger-mike the seed job is failing because of an issue that has been fixed in the main repo (#22223). Could you please update your branch with the latest changes from master?

damccorm avatar Jul 21 '22 16:07 damccorm

@damccorm done 👍 , could you run the seed job again?

roger-mike avatar Jul 22 '22 15:07 roger-mike

Done - https://ci-beam.apache.org/job/beam_SeedJob/10017/console

damccorm avatar Jul 22 '22 16:07 damccorm

Hi @damccorm, the job appears to be disabled. Could you enable it and run the seed job again? Thank you. https://ci-beam.apache.org/job/beam_PerformanceTests_Debezium/

roger-mike avatar Jul 25 '22 16:07 roger-mike

@roger-mike Your seed job might have been overwritten by the periodically running seed job (it runs every 6 hours).

AnandInguva avatar Jul 25 '22 16:07 AnandInguva

Run Python Debezium Performance Test

damccorm avatar Jul 25 '22 17:07 damccorm

I reran the seed job and started https://ci-beam.apache.org/job/beam_PerformanceTests_Debezium/1/. The seed job will run automatically again soon, overwriting this.

damccorm avatar Jul 25 '22 17:07 damccorm

Run Python Debezium Performance Test

roger-mike avatar Jul 25 '22 18:07 roger-mike

Hi @damccorm, could you run the seed job again? Thank you

roger-mike avatar Jul 27 '22 16:07 roger-mike

Hi, @damccorm any idea why this error happens when running the job? https://ci-beam.apache.org/job/beam_PerformanceTests_Debezium/2/console

13:02:05 Traceback (most recent call last):
13:02:05   File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
13:02:05     return _run_code(code, main_globals, None,
13:02:05   File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
13:02:05     exec(code, run_globals)
13:02:05   File "/home/jenkins/jenkins-slave/workspace/beam_PerformanceTests_Debezium/src/sdks/python/apache_beam/testing/load_tests/debezium_performance.py", line 190, in <module>
13:02:05     debeziumTest.createPipeline()
13:02:05   File "/home/jenkins/jenkins-slave/workspace/beam_PerformanceTests_Debezium/src/sdks/python/apache_beam/testing/load_tests/debezium_performance.py", line 156, in createPipeline
13:02:05     pipeline | 'Read from debezium' >> ReadFromDebezium(
13:02:05   File "/home/jenkins/jenkins-slave/workspace/beam_PerformanceTests_Debezium/src/sdks/python/apache_beam/io/debezium.py", line 168, in __init__
13:02:05     self.expansion_service = expansion_service or default_io_expansion_service()
13:02:05   File "/home/jenkins/jenkins-slave/workspace/beam_PerformanceTests_Debezium/src/sdks/python/apache_beam/io/debezium.py", line 98, in default_io_expansion_service
13:02:05     return BeamJarExpansionService(
13:02:05   File "/home/jenkins/jenkins-slave/workspace/beam_PerformanceTests_Debezium/src/sdks/python/apache_beam/transforms/external.py", line 820, in __init__
13:02:05     path_to_jar = subprocess_server.JavaJarServer.path_to_beam_jar(
13:02:05   File "/home/jenkins/jenkins-slave/workspace/beam_PerformanceTests_Debezium/src/sdks/python/apache_beam/utils/subprocess_server.py", line 243, in path_to_beam_jar
13:02:05     raise RuntimeError(
13:02:05 RuntimeError: /home/jenkins/jenkins-slave/workspace/beam_PerformanceTests_Debezium/src/sdks/java/io/debezium/expansion-service/build/libs/beam-sdks-java-io-debezium-expansion-service-2.41.0-SNAPSHOT.jar not found. Please build the server with 
13:02:05   cd /home/jenkins/jenkins-slave/workspace/beam_PerformanceTests_Debezium/src; ./gradlew sdks:java:io:debezium:expansion-service:shadowJar
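
A minimal sketch of the check behind this error, assuming only what the message itself shows: the jar's location relative to the Beam checkout and the Gradle target that builds it. The helper names `expected_jar` and `jar_status` are hypothetical and are not part of Beam.

```python
from pathlib import Path

# Relative jar directory and Gradle target, both copied from the error message.
JAR_DIR = Path("sdks/java/io/debezium/expansion-service/build/libs")
GRADLE_TARGET = "sdks:java:io:debezium:expansion-service:shadowJar"


def expected_jar(beam_root, version):
    """Return where the Debezium expansion-service jar is expected to live."""
    name = f"beam-sdks-java-io-debezium-expansion-service-{version}.jar"
    return Path(beam_root) / JAR_DIR / name


def jar_status(beam_root, version):
    """Return (jar_path, hint); hint is None when the jar is already built."""
    jar = expected_jar(beam_root, version)
    if jar.is_file():
        return jar, None
    # Same build command the RuntimeError suggests.
    return jar, f"cd {beam_root}; ./gradlew {GRADLE_TARGET}"
```

Calling `jar_status(checkout_dir, "2.41.0-SNAPSHOT")` before the pipeline is constructed would surface the missing build step early, rather than at the point where `ReadFromDebezium` looks for its default expansion service.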

roger-mike avatar Jul 29 '22 20:07 roger-mike

I'm not immediately sure - I don't have a ton of area expertise here, it certainly looks like a dependency issue though.

Unfortunately, I'm going on vacation in <1 hour, so I probably won't be able to work through this one with you - @TheNeuralBit would you mind helping out here with any seed jobs that need to be run (and maybe some ideas on what might be happening)?

damccorm avatar Jul 29 '22 20:07 damccorm

Sure I can help here. Unfortunately I'm also going to have limited availability this afternoon (PDT), but I'll be around next week.

This looks like a situation where the test depends on a Java dependency, so we need to add an annotation for that in the Gradle file. Let me see if I can find an example of that...

TheNeuralBit avatar Jul 29 '22 20:07 TheNeuralBit

Here is one example, although it doesn't translate directly. When making the XVR (Cross-language validates runner) suites we add a dependency on the necessary expansion service: https://github.com/apache/beam/blob/e5e3cb25ca4fc2e31c10eb3dbda8289c6bfc7140/buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy#L2292

TheNeuralBit avatar Jul 29 '22 20:07 TheNeuralBit

Hi @TheNeuralBit could you run the seed job again? Thanks

roger-mike avatar Aug 03 '22 20:08 roger-mike

Run Seed Job

TheNeuralBit avatar Aug 03 '22 21:08 TheNeuralBit

Sorry I forgot I had to run the job manually for non-committers. It's finished now: https://ci-beam.apache.org/job/beam_SeedJob/10079/

TheNeuralBit avatar Aug 03 '22 23:08 TheNeuralBit

Run Python Debezium Performance Test

TheNeuralBit avatar Aug 03 '22 23:08 TheNeuralBit

Hi @TheNeuralBit could you run the seed job again? Thanks

roger-mike avatar Aug 05 '22 21:08 roger-mike

Seed job failed with:

ERROR: startup failed:
job_PerformanceTests_Debezium.groovy: 22: unable to resolve class LoadTestBuilder
 @ line 22, column 1.
   import LoadTestBuilder

https://ci-beam.apache.org/job/beam_SeedJob/10108/console

I saw that there's also https://github.com/apache/beam/pull/22135 for this task, do we need both?

TheNeuralBit avatar Aug 08 '22 19:08 TheNeuralBit

Sorry, just a typo. We'll close #22135; it's not needed anymore. Could you run the seed job again? Thank you.

roger-mike avatar Aug 09 '22 14:08 roger-mike

Another failure:

Processing DSL script .test-infra/jenkins/job_PerformanceTests_Debezium.groovy
ERROR: (job_PerformanceTests_Debezium.groovy, line 63) No such property: LoadTestBuilder for class: javaposse.jobdsl.dsl.helpers.step.GradleContext
Sending e-mails to: [email protected]
Finished: FAILURE

https://ci-beam.apache.org/job/beam_SeedJob/10112

TheNeuralBit avatar Aug 09 '22 16:08 TheNeuralBit

Fixed, can you run it again?

roger-mike avatar Aug 09 '22 16:08 roger-mike

Run Python Debezium Performance Test

TheNeuralBit avatar Aug 09 '22 17:08 TheNeuralBit

Done!

TheNeuralBit avatar Aug 09 '22 17:08 TheNeuralBit

@TheNeuralBit There was an error when running the Kubernetes part, maybe because the path had a \n. Could you run it again?
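
A hypothetical illustration of that kind of failure: the thread does not say where the newline came from, so this sketch only assumes the path arrives as a raw string (for example, read from command output or a file) and normalizes it before use. `clean_path` is an illustrative helper, not something in the PR.

```python
def clean_path(raw):
    """Strip surrounding whitespace from a path and reject embedded line breaks."""
    path = raw.strip()
    if "\n" in path or "\r" in path:
        # A line break in the middle cannot be fixed safely, so fail loudly.
        raise ValueError(f"path contains a line break: {path!r}")
    return path
```

Stripping only the ends keeps legitimate spaces inside a path intact, while a line break in the middle is treated as an error rather than silently repaired.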

roger-mike avatar Aug 09 '22 17:08 roger-mike

Run Python Debezium Performance Test

TheNeuralBit avatar Aug 11 '22 19:08 TheNeuralBit

If we're at the point where you're iterating on Python code and not the job definition, we should probably find another way to iterate, either:

  • Can you run the test locally? Maybe we need to get you access to some GCP resources? or
  • We could go ahead and merge the new job with it failing, then you can iterate yourself

TheNeuralBit avatar Aug 11 '22 19:08 TheNeuralBit