scio

A Scala API for Apache Beam and Google Cloud Dataflow.

Results 213 scio issues

Can someone please clarify how the init function behaves when loading external data on workers using [Distributed cache](https://spotify.github.io/scio/examples/DistCacheExample.scala.html)? On multi-core machines, Google Dataflow normally launches 1 thread per...

question ❓

`Memcached` is widely used for a lot of use cases. It would be nice if we could support it more seamlessly.

enhancement
good first issue
streaming

Something to execute after `sc.run()` with the `ScioContext` and `PipelineResult`, for integration hooks, e.g. submitting counters to a dashboard, lineage, or updating a Bigtable cluster.

enhancement

Continuation of https://github.com/spotify/scio/issues/3944 ;) The current implementation supplies top-level fields to the BQ Storage API even if the selected field is a record with only a small subset of nested fields (quite...

To make way for the Scala 3 migration.

P2
refactoring

We are only leveraging Zoltar for model loading. Since we are not leveraging the other features, maybe we can live without it. That said, let's see if we can remove...

enhancement
deprecation

enhancement
good first issue
streaming

The current ScioIO scaladoc does not include read/write methods (https://spotify.github.io/scio/api/com/spotify/scio/io/ScioIO.html). Besides the fact that they do not have any documentation, it would be nice at least to include them in Scaladocs as they...