beam
Apache Beam is a unified programming model for Batch and Streaming data processing.
As part of the migration of Precommit and Postcommit jobs from Jenkins to GitHub Actions (GA) on self-hosted runners, this PR contains: - Migrated and sharded workflow [job-postcommit-go-vr-samza.yml]() - Migrated and sharded...
Addresses #21022. The "Using Schema Transforms" section of the Python programming guide is missing examples. I've written examples for top-level fields, nested fields, and wildcards. **Please** add...
Improved pipeline translation in `SparkStructuredStreamingRunner` (closes #22445, #22382): - Make use of Spark `Encoder`s to leverage structural information in translation (and potentially benefit from Catalyst optimizer). Though note, the possible...
### What needs to happen? Investigate and address the following issues related to infrastructure and backend deployment scripts: 1. Naming of git workflows should be consistent. Currently we have: build_playground_frontend.yaml build_playground_backend.yaml...
### What would you like to happen? Hi, folks, I'm using a SQLTransform to read & write a Date field to parquet files. However, AvroUtils.toAvroSchema can't convert org.apache.beam.sdk.schemas.logicaltypes.Date to a...
Described basic usage scenario and deployment
We developed a new IO named DataLakeIO, which lets Beam read data from data lakes (delta, iceberg, hudi) and write data to data lakes (delta, iceberg, hudi). Because delta,...
### What needs to happen? Investigate any gaps in the Playground codebase, infrastructure, and UI regarding cross-language pipelines. Need to prepare a document with changes required in frontend, backend and...
### What needs to happen? We need to replace hardcoded values for content tree and learning materials with calls to real backend endpoints. ### Issue Priority Priority: 3
Make separate Playground CI/CD steps for Playground and Tour of Beam examples. For that, we introduce two command-line parameters for ci_cd.py: * --dirs: a list of dirs to search...
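A minimal sketch of how a `--dirs` parameter like the one described above might be parsed, assuming it takes one or more space-separated directories. This is not the actual ci_cd.py code: `build_parser` and `parse_dirs` are hypothetical names, and the second parameter is left out because its description is truncated in the summary.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for a hypothetical CI/CD step (not the real ci_cd.py)."""
    parser = argparse.ArgumentParser(
        description="Illustrative sketch: choose which example directories to process."
    )
    # Assumption: --dirs accepts one or more space-separated directory paths.
    parser.add_argument(
        "--dirs",
        nargs="+",
        default=[],
        help="directories to search",
    )
    return parser


def parse_dirs(argv: Optional[List[str]] = None) -> List[str]:
    """Return the directories passed via --dirs (empty list if the flag is absent)."""
    return build_parser().parse_args(argv).dirs
```

With this shape, separate CI/CD steps could each invoke the script with a different directory list, for example `parse_dirs(["--dirs", "playground", "tour-of-beam"])`; both directory names are placeholders.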