beam
Apache Beam is a unified programming model for Batch and Streaming data processing.
ReadFromCsv with an explicit dtype produced graphs that had quadratic traversal (though the computed results, sets, were always correct). This fixes https://github.com/apache/beam/issues/31152 and should help other deep expressions with common...
### What happened? Teardown is never called when running locally with prism or on Dataflow. Tested signature: Teardown(ctx context.Context) error ### Issue Priority Priority: 3 (minor) ### Issue Components -...
Develop Histogram combiner and a transform that efficiently constructs linear, exponential or explicit histograms from large datasets of input data within an Apache Beam pipeline. Also, another objective is that...
### What happened? We are attempting to use the STORAGE_WRITE_API with exactly-once guarantees in our pipelines running on Runner V2. Our configuration uses dynamic destinations and auto sharding, as detailed...
### What happened? In https://github.com/apache/beam/blob/611676d108b26ee378a2b0c128c855017d162772/sdks/python/apache_beam/io/gcp/bigquery.py#L286, the documentation states that the chain operation for file loads is ```return (result.load_jobid_pairs, result.copy_jobid_pairs) | beam.Flatten()``` In fact, according to https://github.com/apache/beam/blob/611676d108b26ee378a2b0c128c855017d162772/sdks/python/apache_beam/io/gcp/bigquery.py#L2230 and https://github.com/apache/beam/blob/611676d108b26ee378a2b0c128c855017d162772/sdks/python/apache_beam/io/gcp/bigquery.py#L2234, it should...
### What would you like to happen? Known breaking change - https://github.com/FasterXML/jackson-core/issues/863 causing
```
com.fasterxml.jackson.core.exc.StreamConstraintsException: String length (5046272) exceeds the maximum length (5000000)
    at com.fasterxml.jackson.core.StreamReadConstraints.validateStringLength(StreamReadConstraints.java:290)
    at com.fasterxml.jackson.core.util.ReadConstrainedTextBuffer.validateStringLength(ReadConstrainedTextBuffer.java:27)
    at com.fasterxml.jackson.core.util.TextBuffer.finishCurrentSegment(TextBuffer.java:931)
```
...
### What would you like to happen? Spanner provides a set of built-in statistics tables to help you gain insight into your queries, reads, and transactions. To correlate statistics with...
Performance change found in the test: `pytorch_image_classification_benchmarks-resnet152-GPU-mean_load_model_latency_milli_secs` for the metric: `mean_load_model_latency_milli_secs`. For more information on how to triage the alerts, please look at `Triage performance alert issues` section of the...
Performance change found in the test: `pytorch_image_classification_benchmarks-resnet152-GPU-mean_inference_batch_latency_micro_secs` for the metric: `mean_inference_batch_latency_micro_secs`. For more information on how to triage the alerts, please look at `Triage performance alert issues` section of the...
### What happened? Hello! Sorry if this is a duplicate. To be honest, I don't know much about Apache Beam and Dataflow, so I am still learning and might be...