
Add AWS S3 Connector with Streaming support

Open mfelsche opened this issue 3 years ago • 6 comments

Describe the problem you are trying to solve

It is very common in event processing to stream data to some kind of persistent storage engine for later processing or archiving purposes. One very prominent storage engine is AWS S3.

A common practice is to stream data into files that aggregate across a time window (e.g. 1 hour) or that accumulate a certain number of events or grow to a certain size. An AWS S3 connector should support this style of streaming.
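To make the roll-over idea concrete, here is a minimal sketch of a buffer that finishes an object once a time window, an event count, or a size limit is reached. It uses only the Rust standard library. `RollPolicy`, `ObjectBuffer`, and all field names are hypothetical and not part of tremor or the AWS SDK. Uploading the finished body to S3 is deliberately left out.

```rust
use std::time::{Duration, Instant};

/// Hypothetical roll-over limits (illustrative names, not a tremor API).
#[derive(Debug, Clone, Copy)]
pub struct RollPolicy {
    pub max_events: usize,
    pub max_bytes: usize,
    pub max_age: Duration,
}

/// Accumulates newline-delimited events for one object until a limit is hit.
#[derive(Debug)]
pub struct ObjectBuffer {
    policy: RollPolicy,
    data: Vec<u8>,
    events: usize,
    opened_at: Instant,
}

impl ObjectBuffer {
    pub fn new(policy: RollPolicy, now: Instant) -> Self {
        Self { policy, data: Vec::new(), events: 0, opened_at: now }
    }

    /// Appends one event. Returns the finished object body when any limit
    /// is reached, and starts a fresh, empty object in that case.
    pub fn push(&mut self, event: &[u8], now: Instant) -> Option<Vec<u8>> {
        self.data.extend_from_slice(event);
        self.data.push(b'\n');
        self.events += 1;
        if self.should_roll(now) {
            Some(self.roll(now))
        } else {
            None
        }
    }

    /// True if the event count, byte size, or age limit has been reached.
    fn should_roll(&self, now: Instant) -> bool {
        self.events >= self.policy.max_events
            || self.data.len() >= self.policy.max_bytes
            || now.duration_since(self.opened_at) >= self.policy.max_age
    }

    /// Resets the counters and hands back the accumulated bytes.
    fn roll(&mut self, now: Instant) -> Vec<u8> {
        self.events = 0;
        self.opened_at = now;
        std::mem::take(&mut self.data)
    }
}
```

Passing `now` in explicitly keeps the policy testable without sleeping. Note that in this sketch the age limit is only checked when an event arrives, so a real sink would presumably also need a timer to close idle objects.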

Describe the solution you'd like

We would like to have an AWS S3 Connector that enables tremor to read S3 objects in a streaming fashion (source-part of the connector) and to write data to S3 objects also in a streaming fashion (sink-part of the connector).

It should support all the common ways of authentication to AWS and maintain authentication across the whole lifetime of the connector (e.g. through token refresh etc.).

It should use the official Rust SDK: https://github.com/awslabs/aws-sdk-rust

mfelsche avatar Aug 11 '21 09:08 mfelsche

Hey @mfelsche, this project seems very interesting to me, so I would like to work on it as part of the LFX'21 Mentorship program. Thank you.

rahul799 avatar Aug 16 '21 07:08 rahul799

Nice!

Please apply via the LFX site once it appears there as a mentorship. This might take a few days: https://mentorship.lfx.linuxfoundation.org/#projects_accepting

We will handle it from there. Here is a tutorial-like guide from the LFX on how to apply: https://docs.linuxfoundation.org/lfx/mentorship/mentees/apply-to-a-project

mfelsche avatar Aug 16 '21 07:08 mfelsche

Hi @mfelsche.

  • The aws-sdk is currently available as alpha and I do not see any roadmap for the official release yet. Does the support have to be experimental right now?
  • smithy-rs (the code generation tool for the SDK) does not produce runtime-agnostic code. So are we expected to contribute this to their SDK as well? I will have to check whether they are accepting contributions to smithy-rs. I just hope this is straightforward.

dak-x avatar Aug 16 '21 07:08 dak-x

Hi @dak-x, it would be cool to not rely on the tokio runtime that is used by the Rust aws-sdk, but changing the codegen tool for the aws-sdk (smithy-rs) is not a requirement. It would be wicked cool, nonetheless 😎

mfelsche avatar Aug 16 '21 07:08 mfelsche

Also, I wouldn't worry about the SDK being experimental. This is fine!

mfelsche avatar Aug 16 '21 07:08 mfelsche

Hi @mfelsche,

I have applied to this mentorship program via LFX; this project seems interesting. I have some experience with other object storage services such as Aliyun OSS. I hope to get the opportunity to work on this project.

Thanks,

OliverShang avatar Aug 29 '21 08:08 OliverShang

This is done

mfelsche avatar Sep 21 '22 07:09 mfelsche