beam
Apache Beam is a unified programming model for Batch and Streaming data processing.
### What happened? It seems that JAR file resolution against the Maven SNAPSHOT repository is incorrect. Successful run: https://github.com/apache/beam/actions/runs/11307102323/job/31448448902 ``` [INFO] Archetype repository not defined. Using the one from [org.apache.beam:beam-sdks-java-maven-archetypes-examples:2.24.0] found...
Users were not able to create a custom BasicAuthSempClient because some fields were package-private. This PR fixes that and adds an example of how to create and use a custom client....
### What needs to happen? Currently we support five versions of Flink, while the policy is three. Obviously we don't have to be rigid about policy. But we can probably...
### What happened? Tests are [failing](https://github.com/apache/beam/actions/workflows/beam_PreCommit_Yaml_Xlang_Direct.yml?query=branch%3Amaster+is%3Acompleted) after #32757 was merged. Failing tests are: - `apache_beam/yaml/integration_tests.py::MapTest::test_Filter-generic_InlineProvider_MapToFields-generic_ExternalJavaProvider_1` - `apache_beam/yaml/integration_tests.py::MapTest::test_Filter-generic_ExternalJavaProvider_MapToFields-generic_ExternalJavaProvider_3` Error (reformatted for readability): ``` FAILED apache_beam/yaml/integration_tests.py::MapTest::test_Filter-generic_InlineProvider_MapToFields-generic_ExternalJavaProvider_1 - apache_beam.testing.util.BeamAssertException: Failed assert: [Row(element=100, named_field=100,...
`Combine.perKeyWithBucketing(childCombiner, numBuckets)` applies the child combiner to the PCollection using `numBuckets` intermediate keys. This is a POC; sending it now to share and get early feedback. TODO: Add...
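The bucketing idea described above can be illustrated outside of Beam with a minimal plain-Python sketch. It is not the PR's implementation: the function name `combine_per_key_with_bucketing` is hypothetical, the round-robin bucket assignment is an assumption (the excerpt does not say how buckets are chosen), and the combiner is assumed to be associative and commutative so that it can be re-applied to partial results.

```python
from collections import defaultdict


def combine_per_key_with_bucketing(pairs, combine, num_buckets):
    """Two-stage per-key combine using intermediate (key, bucket) keys.

    Hypothetical helper for illustration only; not part of the Beam API.
    `combine` takes a list of values and returns one value, and must be
    safe to apply again to its own partial results (e.g. sum, max).
    """
    if num_buckets < 1:
        raise ValueError("num_buckets must be at least 1")

    # Stage 1: spread each key's values over up to num_buckets intermediate keys.
    # Round-robin assignment by element index is an assumption of this sketch.
    bucketed = defaultdict(list)
    for index, (key, value) in enumerate(pairs):
        bucketed[(key, index % num_buckets)].append(value)
    partials = {ikey: combine(values) for ikey, values in bucketed.items()}

    # Stage 2: drop the bucket component and combine the partial results per key.
    by_key = defaultdict(list)
    for (key, _bucket), partial in partials.items():
        by_key[key].append(partial)
    return {key: combine(parts) for key, parts in by_key.items()}


totals = combine_per_key_with_bucketing(
    [("a", 1), ("a", 2), ("a", 3), ("b", 10)], sum, num_buckets=2
)
print(totals)
```

Splitting a key across several intermediate keys means no single worker has to combine all values of a hot key in the first stage; the second stage only sees at most `num_buckets` partial results per key.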
BigQueryIO: control StorageWrite parallelism in batch by reshuffling before the write on the number of streams set for BigQueryIO.write() using .withNumStorageWriteApiStreams(numStorageWriteApiStreams) * BigQueryIO.java - add documentation on how withNumStorageWriteApiStreams...
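As a rough illustration of why reshuffling on the number of streams bounds write parallelism, the plain-Python sketch below groups rows under a fixed number of shard keys, so at most that many groups exist to be written concurrently. This is not the BigQueryIO code: `shard_rows_for_streams` is a hypothetical name, and the round-robin key assignment is an assumption made for the example.

```python
from collections import defaultdict


def shard_rows_for_streams(rows, num_streams):
    """Group rows under at most num_streams shard keys.

    Hypothetical helper for illustration only; the real connector's
    sharding strategy is not shown in the excerpt above.
    """
    if num_streams < 1:
        raise ValueError("num_streams must be at least 1")

    shards = defaultdict(list)
    for index, row in enumerate(rows):
        # Round-robin assignment (an assumption): each row lands in exactly one shard.
        shards[index % num_streams].append(row)
    return dict(shards)


rows = [{"id": i} for i in range(5)]
shards = shard_rows_for_streams(rows, num_streams=2)
for shard_key, shard_rows in sorted(shards.items()):
    # Each shard would be handed to a single writer in this sketch.
    print(shard_key, [r["id"] for r in shard_rows])
```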
https://github.com/apache/beam/pull/32566 for reference. Will also update 2.58 and 2.90 accordingly.
The current YAML API reference docs look like the following: This PR adds formatting to make the docs look more like the Sphinx-themed PyDocs used by the Python SDK:...
# Summary Add [distroless](https://github.com/GoogleContainerTools/distroless) variants to existing Java SDK container images: - beam-sdk/beam_java8_sdk - beam-sdk/beam_java11_sdk - beam-sdk/beam_java17_sdk - beam-sdk/beam_java21_sdk # Description The [Publish Beam SDK Snapshots](https://github.com/apache/beam/blob/master/.github/workflows/beam_Publish_Beam_SDK_Snapshots.yml), and [build_release_candidate](https://github.com/apache/beam/blob/master/.github/workflows/build_release_candidate.yml) GitHub workflows...
# Summary Add [distroless](https://github.com/GoogleContainerTools/distroless) variants to existing Python SDK container images: - beam-sdk/beam_python3.9_sdk - beam-sdk/beam_python3.10_sdk - beam-sdk/beam_python3.11_sdk - beam-sdk/beam_python3.12_sdk # Description The [Publish Beam SDK Snapshots](https://github.com/apache/beam/blob/master/.github/workflows/beam_Publish_Beam_SDK_Snapshots.yml), [PreCommit Python Docker](https://github.com/apache/beam/blob/master/.github/workflows/beam_PreCommit_PythonDocker.yml), and...