kinesis-sql
Multiple Sinks
Hi guys,
Does the library support multiple sinks?
I am using two sinks (KinesisSink & FileSink) in the same application. Both sinks use their own checkpoint locations.
- Read a stream.
- Apply some transformations on the DataFrame.
- writeStream to a FileSink.
- writeStream to a KinesisSink.
- spark.streams.awaitAnyTermination()
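For reference, the steps above could be sketched in PySpark roughly as follows. This is only an illustration, not the actual job: the stream names, region, and paths are placeholders, `kinesis_options` is a hypothetical helper, and the option names (`streamName`, `endpointUrl`, `startingposition`) are assumed from the kinesis-sql README. Note that in Structured Streaming each started query reads the source independently, so this job consumes the input stream twice.

```python
from typing import Dict


def kinesis_options(stream_name: str, region: str) -> Dict[str, str]:
    """Hypothetical helper: options shared by the Kinesis source and sink.

    Option names are assumed from the kinesis-sql README.
    """
    return {
        "streamName": stream_name,
        "endpointUrl": f"https://kinesis.{region}.amazonaws.com",
    }


def run() -> None:
    # Imported here so the helper above is usable without Spark installed.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("multi-sink-sketch").getOrCreate()

    # 1. Read a stream (placeholder stream name and region).
    source = (
        spark.readStream.format("kinesis")
        .options(**kinesis_options("input-stream", "us-east-1"))
        .option("startingposition", "TRIM_HORIZON")
        .load()
    )

    # 2. Apply some transformation (here: decode the binary payload).
    transformed = source.select(
        F.col("partitionKey"),
        F.col("data").cast("string").alias("data"),
    )

    # 3. First query: file sink with its own checkpoint location.
    (
        transformed.writeStream.format("parquet")
        .option("path", "/tmp/out/files")
        .option("checkpointLocation", "/tmp/chk/file-sink")
        .start()
    )

    # 4. Second query: Kinesis sink with a separate checkpoint location.
    (
        transformed.writeStream.format("kinesis")
        .options(**kinesis_options("output-stream", "us-east-1"))
        .option("checkpointLocation", "/tmp/chk/kinesis-sink")
        .start()
    )

    # 5. Block until either query stops or fails.
    spark.streams.awaitAnyTermination()
```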
However, I am seeing some odd behavior:
- slow processing on KinesisSink
- jumping iterator_age on Kinesis stream metrics
- data loss
Could you tell me how the library is expected to behave with multiple sinks?
Thanks.
Can you elaborate on the observed behavior, specifically the data loss and the jumping iterator_age?
The library was not designed with multiple sinks in mind. We can treat this as a new feature request. Please feel free to work on this functionality and open a PR.