beam
Apache Beam is a unified programming model for Batch and Streaming data processing.
Add a DirectCommitWorkStream that has a self-contained commit queue and commit thread. Callers will call `queueCommit` on the stream, and commits will be batched and sent to windmill...
### What needs to happen? This is not a trivial dependency upgrade. SLF4J 2.x uses a different binding mechanism, which makes it incompatible with SLF4J 1.x. Arrow has a compile dependency...
[Task]: Improve how to handle the Dataflow-specific option `impersonateServiceAccount` for Beam Java
### What needs to happen? `impersonateServiceAccount` should be kept when submitting Dataflow jobs but should be removed when creating Dataflow workers per [the design](https://docs.google.com/document/d/13KRYiq5JAcs-leznzXI_knvqp7ud0u3YASVqK-yMeQw/edit#heading=h.18gu8586i6j1). To fix this, #30283 put a...
Adding 'golden' prompts for Duet AI training: AI/ML data processing prompts
Add unit tests for yaml_provider.py. This PR also introduces non-breaking name changes to functions and variables to suppress IDE warnings. There is also a change to the `WithSchema` built-in transform...
Adds the ability to omit the language parameter in the Filter transform config. Replaces #29751
### What needs to happen? Currently the MLTransform tests are included in a benchmark suite, which might reduce the visibility of test failures in MLTransform functionality. ### Issue Priority Priority: 3 (nice-to-have...
### What would you like to happen? Analogous to `Comparator.comparing()`. ### Issue Priority Priority: 3 (nice-to-have improvement) ### Issue Components - [ ] Component: Python SDK - [X] Component: Java...
The underlying issue with attempt timeout is fixed in the Bigtable client. We don't need to pass around operation and attempt timeout in the reader anymore. This PR cleans up...