beam
Apache Beam is a unified programming model for Batch and Streaming data processing.
Currently, when using a file-based source implementation to read data from files, we have 2 output options: - read only the content of each line of each file...
Change classes that explicitly inject a `MetricTrackingWindmillServerStub` to take in a Function. This will give flexibility in later refactoring as we apply different ways to fetch the data without MetricTrackingWindmillServerStub....
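The idea of depending on a `Function` rather than a concrete stub can be illustrated with a minimal sketch. It is not Beam's actual code: `ConcreteStubSketch`, `StateReaderSketch`, and `FunctionInjectionDemo` are hypothetical names, and the `String`-to-`String` signature is an assumption chosen only to keep the example small.

```java
import java.util.function.Function;

// Hypothetical stand-in for a concrete server stub; not Beam's actual class.
final class ConcreteStubSketch {
  String getData(String key) {
    return "data-for-" + key;
  }
}

// The consumer depends only on a Function, not on the concrete stub type.
final class StateReaderSketch {
  private final Function<String, String> fetchData;

  StateReaderSketch(Function<String, String> fetchData) {
    this.fetchData = fetchData;
  }

  String read(String key) {
    return fetchData.apply(key);
  }
}

final class FunctionInjectionDemo {
  // Production-style wiring: pass a method reference to the stub.
  static String readViaStub(String key) {
    ConcreteStubSketch stub = new ConcreteStubSketch();
    return new StateReaderSketch(stub::getData).read(key);
  }

  // Alternative wiring: any other fetch strategy (or a test fake) fits too.
  static String readViaFake(String key) {
    return new StateReaderSketch(k -> "fake-" + k).read(key);
  }
}
```

Because the reader only sees a `Function`, the fetch mechanism can be swapped later without touching the consuming class, which is the flexibility the PR description refers to.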
Removed flaky logic around waiting in tests. Removed `Thread.sleep` and replaced it with triggers via `CountDownLatch`. R: @scwhittle @Abacn
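As a rough illustration of the pattern described in this PR, the sketch below waits on a `CountDownLatch` instead of sleeping for a fixed time. It is not taken from the Beam test suite: `LatchInsteadOfSleepDemo`, the worker thread, and the 5-second timeout are all assumptions made for the example.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example class; the names are illustrative only.
final class LatchInsteadOfSleepDemo {
  static int runWorkerAndWait() {
    AtomicInteger processed = new AtomicInteger();
    CountDownLatch done = new CountDownLatch(1);

    Thread worker = new Thread(() -> {
      processed.incrementAndGet(); // the work under test
      done.countDown();            // signal completion explicitly
    });
    worker.start();

    try {
      // Wait for the signal rather than guessing a duration with Thread.sleep.
      // The timeout (an arbitrary 5 seconds here) only bounds a hung test.
      if (!done.await(5, TimeUnit.SECONDS)) {
        throw new IllegalStateException("worker did not finish in time");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting", e);
    }
    return processed.get();
  }
}
```

The test proceeds as soon as the worker signals, so it is neither slowed by an overly long sleep nor made flaky by one that is too short.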
Support Flink 1.17. Closes #29939.
Saves the submission environment dependencies and stages them. Logs them along with the runtime dependencies. Fixes #28563
Handling DataStream, windowing, and more complex types will come in a future PR.
This PR closes #28930 with a PTransform implementation that throttles a PCollection without using any external resources, i.e. an external database, queue, etc. Please see #28930 for further details on...
No need to submit or review; this will be thrown away once testing is complete. There will be smaller PRs with the changes.
This pull request introduces stress tests for BigQueryIO, designed to assess the performance under various conditions. The stress tests simulate dynamic load increases and evaluate the behavior of BigQueryIO for...
### What happened? Sometimes, a global window side input takes too long to update on a Dataflow job. The automatic model refresh feature of RunInference uses a pattern `WatchFilePattern` which...