DataflowJavaSDK
Additional Pipeline Example - Backing up Datastore -> GCS
Example code:
https://github.com/cobookman/DatastoreToGCS/blob/master/src/main/java/com/google/datastorebackup/Main.java
Had a customer ask how to do this. Found that it's not as trivial as it should be, so it might make a good example. The pipeline could also write to BigQuery. This is a fairly common use case.
Let me know if you'd like me to submit a pull request.
I see that you're converting to JSON using a custom converter. Is there a reason why you're not just using the canonical proto JSON format converter?
Didn't know that was a possibility, and couldn't find the specific method. How would one convert a Datastore Entity to JSON through the proto JSON converter?
I think you're looking for: https://github.com/google/protobuf/blob/master/java/util/src/main/java/com/google/protobuf/util/JsonFormat.java
Got JsonFormat working :D.
It made both Datastore->GCS and GCS->Datastore dead simple.
I couldn't find any online examples of JsonFormat being used, so I think adding this example would go a long way. This also seems like a common use case.
Here's an example of JsonFormat being used to print the Datastore Entities: https://github.com/cobookman/DatastoreToGCS/blob/master/src/main/java/com/google/datastorebackup/Backup.java#L91
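For anyone skimming the thread, here is a minimal sketch of the round trip, not the code from the linked repo. It assumes `protobuf-java-util` (3.1 or later, for `omittingInsignificantWhitespace()`) and the Datastore v1 proto classes (`com.google.datastore.v1`) are on the classpath. The class name, kind, key name, and property are made up for illustration.

```java
import com.google.datastore.v1.Entity;
import com.google.datastore.v1.Key;
import com.google.datastore.v1.Value;
import com.google.protobuf.InvalidProtocolBufferException;
import com.google.protobuf.util.JsonFormat;

public class EntityJsonRoundTrip {

  // Hypothetical sample entity; the kind, name, and property are illustrative only.
  static Entity sampleEntity() {
    Key key = Key.newBuilder()
        .addPath(Key.PathElement.newBuilder().setKind("Task").setName("sample-task"))
        .build();
    return Entity.newBuilder()
        .setKey(key)
        .putProperties("description", Value.newBuilder().setStringValue("Buy milk").build())
        .build();
  }

  // Entity -> JSON in the canonical proto3 JSON mapping.
  // Whitespace is omitted so each entity fits on one line of a text file.
  static String toJson(Entity entity) throws InvalidProtocolBufferException {
    return JsonFormat.printer().omittingInsignificantWhitespace().print(entity);
  }

  // JSON -> Entity, by merging the parsed fields into a fresh builder.
  static Entity fromJson(String json) throws InvalidProtocolBufferException {
    Entity.Builder builder = Entity.newBuilder();
    JsonFormat.parser().merge(json, builder);
    return builder.build();
  }

  public static void main(String[] args) throws InvalidProtocolBufferException {
    Entity original = sampleEntity();
    String json = toJson(original);
    System.out.println(json);

    // Parsing the JSON back should reproduce an equal Entity.
    Entity restored = fromJson(json);
    System.out.println("Round trip equal: " + original.equals(restored));
  }
}
```

In a pipeline, `toJson` would presumably run once per entity in the backup direction and `fromJson` once per line in the restore direction. The Dataflow wiring is left out because it depends on the SDK version.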
@cobookman This is a good candidate for Beam templates. I am working on adding ValueProvider options for DatastoreIO. Once that is in, I can help you translate your example into a template and get it merged into Beam.
@vikkyrk I'd be more than happy to help get this example merged into Beam.