
Multiple Sinks

Open • pekzeki opened this issue 5 years ago • 1 comment

Hi guys,

Does the library support multiple sinks?

I am using two sinks (KinesisSink & FileSink) in the same application. Both sinks use their own checkpoint locations.

  1. Read a stream.
  2. Apply some transformations on the DataFrame.
  3. writeStream to a FileSink.
  4. writeStream to a KinesisSink.
  5. spark.streams.awaitAnyTermination()
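
For reference, the five steps above can be sketched roughly as follows in PySpark. This is a minimal sketch under stated assumptions, not the actual application: the built-in `rate` source stands in for the real input, and the helper `sink_options`, the stream name, the endpoint URL, the paths, and the `partitionKey`/`data` column layout are all illustrative. Credentials are omitted and assumed to come from the environment. Note that in Spark Structured Streaming each `start()` call launches its own independent streaming query.

```python
from typing import Dict


def sink_options(base_dir: str, sink_name: str) -> Dict[str, str]:
    """Hypothetical helper: writer options with a checkpoint directory unique to one sink."""
    return {"checkpointLocation": f"{base_dir.rstrip('/')}/checkpoints/{sink_name}"}


def run(spark, base_dir: str) -> None:
    """Start one query per sink on an existing SparkSession and block until one stops."""
    # Step 1: read a stream. The "rate" test source is a stand-in (assumption).
    events = spark.readStream.format("rate").option("rowsPerSecond", "10").load()

    # Step 2: apply some transformations. The column names are an assumed
    # layout for the Kinesis sink, not something stated in this thread.
    transformed = events.selectExpr(
        "CAST(value % 10 AS STRING) AS partitionKey",
        "CAST(value AS STRING) AS data",
    )

    # Step 3: file sink with its own checkpoint location.
    (
        transformed.writeStream
        .format("parquet")
        .option("path", f"{base_dir.rstrip('/')}/output")
        .options(**sink_options(base_dir, "file"))
        .queryName("file_sink")
        .start()
    )

    # Step 4: Kinesis sink with its own checkpoint location.
    # Stream name and endpoint are placeholders.
    (
        transformed.writeStream
        .format("kinesis")
        .outputMode("append")
        .option("streamName", "example-output-stream")
        .option("endpointUrl", "https://kinesis.us-east-1.amazonaws.com")
        .options(**sink_options(base_dir, "kinesis"))
        .queryName("kinesis_sink")
        .start()
    )

    # Step 5: block until any of the started queries terminates.
    spark.streams.awaitAnyTermination()
```

Calling `run(spark, "/tmp/app")` with an existing `SparkSession` would start both queries; the two checkpoint directories differ only in their last path component.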

However, I ran into some unexpected behavior:

  • slow processing on KinesisSink
  • jumping iterator_age on Kinesis stream metrics
  • data loss
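
Symptoms like these could be made more concrete by comparing per-query progress for the two sinks. The following is only a sketch: `summarize_progress` and `report` are hypothetical helpers, and the code assumes `lastProgress` returns a plain dict (or `None` before the first batch), as it does in PySpark 3.x.

```python
from typing import Any, Dict, Optional


def summarize_progress(progress: Optional[Dict[str, Any]]) -> Optional[Dict[str, Any]]:
    """Pick a few fields out of a streaming query progress report."""
    if progress is None:
        # No batch has completed yet for this query.
        return None
    return {
        "query": progress.get("name"),
        "batch_id": progress.get("batchId"),
        "input_rows": progress.get("numInputRows"),
        "rows_per_second": progress.get("processedRowsPerSecond"),
    }


def report(spark) -> None:
    """Print a short progress summary for every active streaming query."""
    for query in spark.streams.active:
        print(summarize_progress(query.lastProgress))
```

Calling `report(spark)` periodically while both queries run would show whether the Kinesis query processes fewer rows per second than the file query.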

Could you tell me how the library is expected to behave with multiple sinks?

Thanks.

pekzeki • Mar 12 '19 12:03

Can you elaborate on the observed behavior, in particular the data loss and the jumping iterator_age?

The library was not designed with multiple sinks in mind. We can treat this as a feature request. Please feel free to work on this functionality and open a PR.

itsvikramagr • Mar 14 '19 10:03