Add support for GCS

Open alunarbeach opened this issue 8 years ago • 3 comments

Add support for Google Cloud Storage.

alunarbeach, Jan 29 '17 03:01

Hi @alunarbeach. This is on our roadmap and will be worked on soon. Meanwhile, it's not hard to get it working if you need to hack something together quickly (it would be great if you could contribute it too). You will need these steps:

  • Add GCS Maven dependencies
  • Google has their own FileSystem impl for GCS. Look at https://cloud.google.com/hadoop/google-cloud-storage-connector
  • You need to change hdfs-site.xml to include GCS-specific properties and authentication details.

With the above steps, you should be able to get StreamX to write to GCS.
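
As a rough sketch of the hdfs-site.xml change, the fragment below registers Google's GCS connector for the `gs://` scheme and points it at a service account key. The property names are those used by the 1.x releases of the connector and may differ in other versions, so check them against the connector documentation. The project ID and key file path are placeholders, not values from this thread.

```xml
<configuration>
  <!-- Map the gs:// scheme to Google's Hadoop FileSystem implementation -->
  <property>
    <name>fs.gs.impl</name>
    <value>com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem</value>
  </property>
  <property>
    <name>fs.AbstractFileSystem.gs.impl</name>
    <value>com.google.cloud.hadoop.fs.gcs.GoogleHadoopFS</value>
  </property>
  <!-- Placeholder: your own GCP project ID -->
  <property>
    <name>fs.gs.project.id</name>
    <value>my-gcp-project</value>
  </property>
  <!-- Authentication via a service account JSON key (placeholder path) -->
  <property>
    <name>google.cloud.auth.service.account.enable</name>
    <value>true</value>
  </property>
  <property>
    <name>google.cloud.auth.service.account.json.keyfile</name>
    <value>/path/to/service-account-key.json</value>
  </property>
</configuration>
```

The connector jar itself also has to be on the Connect worker's classpath, which is what the Maven dependency step covers.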

PraveenSeluka, Feb 01 '17 18:02

@alunarbeach I have added GCS support, tested it, and pushed the changes. See this commit: https://github.com/qubole/streamx/commit/de065fee48ff1a9cabd8e268318c8f4d99d47718. Please try it out and let us know if you see any issues.

Provide the GS destination in the "s3.url" config itself; I will refactor StreamX later. (For now, use S3SinkConnector exactly as you would for S3 and just put the GS location in s3.url.)
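
For illustration, a connector properties file following this approach might look like the sketch below. The `connector.class` and `s3.url` keys come from the comment above. The other keys follow the usual StreamX S3 quickstart layout and are assumptions here, and the connector name, topic, bucket, and paths are placeholders.

```properties
# Hypothetical connector name; any unique name works
name=gcs-sink
# Reuse the S3 sink connector, as described above
connector.class=com.qubole.streamx.s3.S3SinkConnector
# Assumed from the StreamX S3 quickstart; adjust to your format
format.class=com.qubole.streamx.SourceFormat
tasks.max=1
# Placeholder topic name
topics=my_topic
flush.size=3
# A gs:// location goes in s3.url (placeholder bucket and prefix)
s3.url=gs://my-bucket/streamx-output
# Directory containing the hdfs-site.xml with the GCS properties (placeholder path)
hadoop.conf.dir=/path/to/streamx/config/hadoop-conf
```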

See https://github.com/qubole/streamx/blob/master/config/hadoop-conf/hdfs-site.xml for a sample config file.

Thanks

PraveenSeluka, Feb 08 '17 21:02

Thanks @PraveenSeluka. Will try this out soon.

alunarbeach, Feb 10 '17 04:02