beam
Apache Beam is a unified programming model for Batch and Streaming data processing.
Until this is better supported, we should warn folks away from it. See https://github.com/apache/beam/issues/31078 for a collection of issues.
This PR expands the refcount lease on the underlying Bigtable client from Start/StopBundle to the first StartBundle until Teardown. The previous behavior had a lot of client & connection churn when...
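To illustrate the idea of holding a refcounted lease from the first StartBundle until Teardown, here is a minimal Python sketch. It is not the PR's code and does not use Beam or the Bigtable client: `SharedClientLease`, `Worker`, and `run_bundles` are hypothetical names, and the "client" is a plain placeholder object. It only shows why keeping the lease across bundles avoids re-creating the client for every bundle.

```python
import threading


class SharedClientLease:
    """Hypothetical refcounted holder for one expensive shared client."""

    def __init__(self, factory):
        self._factory = factory      # callable that builds the client
        self._lock = threading.Lock()
        self._refs = 0
        self._client = None
        self.creations = 0           # counts how often the client was built

    def acquire(self):
        with self._lock:
            if self._refs == 0:
                # First holder: build the client.
                self._client = self._factory()
                self.creations += 1
            self._refs += 1
            return self._client

    def release(self):
        with self._lock:
            if self._refs == 0:
                raise RuntimeError("release() without matching acquire()")
            self._refs -= 1
            if self._refs == 0:
                # Last holder gone; a real client would be closed here.
                self._client = None


class Worker:
    """Hypothetical DoFn-like object with bundle and teardown hooks."""

    def __init__(self, lease):
        self._lease = lease
        self._client = None

    def start_bundle(self):
        # Acquire only on the first bundle, then keep the lease.
        if self._client is None:
            self._client = self._lease.acquire()

    def finish_bundle(self):
        # Intentionally does not release: the lease outlives the bundle.
        pass

    def teardown(self):
        if self._client is not None:
            self._lease.release()
            self._client = None


def run_bundles(n):
    """Run n bundles on one worker and report how many clients were built."""
    lease = SharedClientLease(factory=object)
    worker = Worker(lease)
    for _ in range(n):
        worker.start_bundle()
        worker.finish_bundle()
    worker.teardown()
    return lease.creations
```

If `finish_bundle` released the lease instead, a single worker would drop the refcount to zero after every bundle and rebuild the client on the next one, which is the kind of churn the summary describes.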
Adding an example for creating SchemaTransforms and using them with the ExternalTransformProvider API
Improve the error message for DirectRunner.
This is a 2nd PR in the series of PRs (based on the now closed #31648, with the first PR being #31785) whose ultimate goal is to add support for...
This PR contains several optimizations for the DataStream API when used in batch mode. In its current state, using DataStream for batch is much slower than DataSet; this is an...
This is a follow-up PR to #31953, and part of the issue #31905. This PR adds the actual writer functionality, and some additional testing, including integration testing. This should be...
The PostCommit TransformService Direct is failing over 50% of the time. Please visit https://github.com/apache/beam/actions/workflows/beam_PostCommit_TransformService_Direct.yml?query=is%3Afailure+branch%3Amaster to see the logs.
The argument check that determines whether redistribute can be enabled was incorrect. There are two ways to enable commits: 1) explicitly via commitOffsetsInFinalize(), or 2) via the consumer config ENABLE_AUTO_COMMIT_CONFIG=true. If the first is true,...