scio

A Scala API for Apache Beam and Google Cloud Dataflow.

Results 213 scio issues

To run successfully, https://spotify.github.io/scio//extras/Scio-REPL.html#bigquery-example requires setting `tempLocation` to a GCS path: `sc.options.setTempLocation("gs://...")`
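A minimal sketch of the same setting outside the REPL, assuming only Beam's `PipelineOptions` API. The object name, helper name, and bucket path are hypothetical placeholders, not part of the original report.

```scala
import org.apache.beam.sdk.options.{PipelineOptions, PipelineOptionsFactory}

object TempLocationSketch {
  // Hypothetical placeholder; substitute a GCS path you can write to.
  val placeholderPath: String = "gs://my-hypothetical-bucket/tmp"

  // Builds default pipeline options and sets the temp location on them.
  // In the Scio REPL, `sc.options` exposes the same setter.
  def optionsWithTempLocation(path: String): PipelineOptions = {
    val options = PipelineOptionsFactory.create()
    options.setTempLocation(path)
    options
  }
}
```

The same value can usually be passed on the command line as `--tempLocation=gs://...` instead of being set in code.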

documentation

From the Beam [documentation](https://beam.apache.org/documentation/io/built-in/google-bigquery/#setting-the-insertion-method): > When you specify load jobs as the insertion method using BigQueryIO.write().withMethod(FILE_LOADS). Scio should also let users load data into BigQuery from files.
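Until Scio exposes this directly, one possible workaround is to pass Beam's own `BigQueryIO` write to `saveAsCustomOutput`. The sketch below is an assumption about how that could look, not Scio's API for this feature: the object and method names are hypothetical, and it assumes the destination table already exists (hence `CREATE_NEVER`, which avoids having to supply a schema).

```scala
import com.google.api.services.bigquery.model.TableRow
import com.spotify.scio.values.SCollection
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO.Write.{CreateDisposition, Method}

object FileLoadsSketch {
  // Beam write transform that uses load jobs (FILE_LOADS) as the insertion method.
  // tableSpec is expected in the "project:dataset.table" form.
  def fileLoadsWrite(tableSpec: String): BigQueryIO.Write[TableRow] =
    BigQueryIO
      .writeTableRows()
      .to(tableSpec)
      .withCreateDisposition(CreateDisposition.CREATE_NEVER) // assumes the table exists
      .withMethod(Method.FILE_LOADS)

  // Applies the Beam transform to a Scio collection as a custom output.
  def save(rows: SCollection[TableRow], tableSpec: String): Unit = {
    rows.saveAsCustomOutput("WriteWithFileLoads", fileLoadsWrite(tableSpec))
    ()
  }
}
```

Load jobs stage files under the pipeline's temp location, so a GCS `tempLocation` is typically needed here as well.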

From the Beam BigQuery [documentation](https://beam.apache.org/documentation/io/built-in/google-bigquery/#writing-to-bigquery): > Starting with version 2.36.0 of the Beam SDK for Java, you can use the [BigQuery Storage Write API](https://cloud.google.com/bigquery/docs/write-api) from the BigQueryIO connector. Scio should also...

`bytes` is one of Avro's primitive types (https://avro.apache.org/docs/1.8.2/spec.html#schema_primitive); however, it is currently not covered by tests.
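As an illustration of what a `bytes` check could exercise, here is a minimal round-trip sketch using only the plain Avro generic API. It is not taken from Scio's test suite; the record name `BytesRecord`, the field name `payload`, and the object name are hypothetical.

```scala
import java.io.ByteArrayOutputStream
import java.nio.ByteBuffer

import org.apache.avro.{Schema, SchemaBuilder}
import org.apache.avro.generic.{GenericData, GenericDatumReader, GenericDatumWriter, GenericRecord}
import org.apache.avro.io.{DecoderFactory, EncoderFactory}

object AvroBytesRoundTrip {
  // Record schema with a single required field of the primitive type `bytes`.
  val schema: Schema =
    SchemaBuilder.record("BytesRecord").fields().requiredBytes("payload").endRecord()

  // Encodes a record holding `payload` to Avro binary, decodes it, and returns the bytes read back.
  def roundTrip(payload: Array[Byte]): Array[Byte] = {
    val record = new GenericData.Record(schema)
    record.put("payload", ByteBuffer.wrap(payload))

    val out = new ByteArrayOutputStream()
    val encoder = EncoderFactory.get().binaryEncoder(out, null)
    new GenericDatumWriter[GenericRecord](schema).write(record, encoder)
    encoder.flush()

    val decoder = DecoderFactory.get().binaryDecoder(out.toByteArray, null)
    val decoded = new GenericDatumReader[GenericRecord](schema).read(null, decoder)

    // The generic reader returns `bytes` values as a ByteBuffer.
    val buffer = decoded.get("payload").asInstanceOf[ByteBuffer]
    val result = new Array[Byte](buffer.remaining())
    buffer.duplicate().get(result)
    result
  }
}
```

A test built on this would compare the returned array with the input, including the empty-array case.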

Depending on `scio-smb` transitively pulls in all storage implementation dependencies for:
- parquet
- json
- avro
- tensorflow

TensorFlow dependencies alone are ~200 MB. Users should only have the desired storage...

Scio does run with Avro models that define logical types as defined in the [avro specification](https://avro.apache.org/docs/1.9.2/spec.html#Logical+Types). Avro 1.8 generates broken Java code, which does not allow full support of the feature...

saveAsElasticsearch() currently reshuffles all data into a fixed number of shards. What do you think about adding an option to skip the AssignToShard transform? For our use case, reshuffle is adding unnecessary...

We should validate updates to the site by running `sbt mdoc` on relevant PRs.

good first issue
build

https://spotify.github.io/scio/examples/TemplateExample.scala.html I referred to the above template as a guide for how to read/write to PubSub. We should change this template to use the Scio version of PubSubIO if that's...

documentation

We are starting to use scio-redis and have made some modifications to the RedisDoFn for our needs. We would like to have these added to scio-redis. - Support for...

enhancement