Create PubSubIO Load test
This PR adds Load tests for PubSubIO. There are several configurations depending on the amount of data and the runner where the tests will be executed:
- local: Data volume is 0.2 MB. The test runs on the local machine using the Direct Runner.
- medium: Data volume is 10 GB. The test runs on the Dataflow runner.
- large: Data volume is 100 GB. The test runs on the Dataflow runner.
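For reference, the three configurations above can be summarized as a small lookup table. This is only an illustrative sketch: the `LoadTestConfigs` and `Config` names are hypothetical and are not the classes added in this PR, and the byte counts assume decimal units (1 MB = 10^6 bytes, 1 GB = 10^9 bytes).

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LoadTestConfigs {
  /** Hypothetical holder for one configuration; not the PR's actual class. */
  static final class Config {
    final long dataVolumeBytes;
    final String runner;

    Config(long dataVolumeBytes, String runner) {
      this.dataVolumeBytes = dataVolumeBytes;
      this.runner = runner;
    }
  }

  /** Returns the configurations from the description, keyed by name. */
  static Map<String, Config> configs() {
    Map<String, Config> configs = new LinkedHashMap<>();
    // Assumption: decimal units, so 0.2 MB is 200,000 bytes.
    configs.put("local", new Config(200_000L, "DirectRunner"));
    configs.put("medium", new Config(10_000_000_000L, "DataflowRunner"));
    configs.put("large", new Config(100_000_000_000L, "DataflowRunner"));
    return configs;
  }

  public static void main(String[] args) {
    // Print each configuration name with its data volume and runner.
    configs().forEach((name, c) ->
        System.out.printf("%s: %d bytes on %s%n", name, c.dataVolumeBytes, c.runner));
  }
}
```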
How to verify the results:
- Install Gradle on your local machine.
- Run the command `gradle :it:google-cloud-platform:PubSubPerformanceTest --tests "org.apache.beam.it.gcp.pubsub.PubSubIOLT" -d` from the project root to execute the tests.
- The tests will pass successfully if all data is processed within a certain amount of time and the processed data matches the expected amount.
- Additionally, upon running the test, a link for monitoring the job will be displayed in the terminal, providing more detailed information on how much data has been processed and the progress. An example link is: https://console.cloud.google.com/dataflow/jobs/us-central1/2024-02-13_00_10_16-5567586527775732420?project=apache-beam-testing.
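The pass condition described in the steps above can be sketched as a single predicate. This is a minimal sketch under stated assumptions: the `LoadTestVerdict` class, the `passed` method, and its parameters are hypothetical names chosen for illustration, and the actual test may measure "processed data" differently (for example by bytes rather than by message count).

```java
import java.time.Duration;

public class LoadTestVerdict {
  /**
   * Hypothetical check: the run passes only if it finished within the
   * timeout and the processed amount equals the expected amount.
   */
  static boolean passed(long expected, long processed, Duration elapsed, Duration timeout) {
    boolean finishedInTime = elapsed.compareTo(timeout) <= 0;
    boolean amountMatches = processed == expected;
    return finishedInTime && amountMatches;
  }

  public static void main(String[] args) {
    // Illustrative values only; they are not taken from a real run.
    System.out.println(passed(1_000L, 1_000L, Duration.ofMinutes(5), Duration.ofMinutes(10)));
    System.out.println(passed(1_000L, 990L, Duration.ofMinutes(5), Duration.ofMinutes(10)));
  }
}
```

Both conditions are required: a run that finishes in time but drops messages fails, as does a run that processes everything but exceeds the time limit.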
Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:
- [ ] Mention the appropriate issue in your description (for example: `addresses #123`), if applicable. This will automatically add a link to the pull request in the issue. If you would like the issue to automatically close on merging the pull request, comment `fixes #<ISSUE NUMBER>` instead.
- [ ] Update `CHANGES.md` with noteworthy changes.
- [ ] If this contribution is large, please file an Apache Individual Contributor License Agreement.
See the Contributor Guide for more tips on how to make review process smoother.
To check the build health, please visit https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md
See CI.md for more information about GitHub Actions CI or the workflows README to see a list of phrases to trigger workflows.
R: @damccorm @Abacn
Stopping reviewer notifications for this pull request: review requested by someone other than the bot, ceding control
Thanks. Had a couple of initial comments. Is there a job id available for demonstration?
Here are the links to the jobs from one of the test runs:
- Write job: https://console.cloud.google.com/dataflow/jobs/us-central1/2024-02-21_02_10_37-12171205265624652554;step=Read%20from%20source;mainTab=JOB_GRAPH;bottomTab=JOB_ERROR_REPORTING;bottomStepTab=DATA_SAMPLING;logsSeverity=INFO;graphView=0?project=apache-beam-testing&pageState=(%22dfTime%22:(%22l%22:%22dfJobMaxTime%22))
- Read job: https://console.cloud.google.com/dataflow/jobs/us-central1/2024-02-21_02_11_03-1874843895021090331;step=Read%20from%20PubSub;mainTab=JOB_GRAPH;bottomTab=DATA_SAMPLING;logsSeverity=INFO;graphView=0?project=apache-beam-testing