spark-structured-streaming-examples
Spark Structured Streaming examples using Spark version 3.5.1
Spark Structured Streaming Examples
Spark Structured Streaming examples using Spark version 3.4.0
Support matrix for joins in streaming queries
Left Input | Right Input | Join Type | Example |
---|---|---|---|
Static | Static | All types | TBD |
Stream | Static | Inner | TBD |
Stream | Static | Left Outer | TBD |
Stream | Static | Right Outer | Not supported |
Stream | Static | Full Outer | Not supported |
Stream | Static | Left Semi | TBD |
Static | Stream | Inner | TBD |
Static | Stream | Left Outer | Not supported |
Static | Stream | Right Outer | TBD |
Static | Stream | Full Outer | Not supported |
Static | Stream | Left Semi | Not supported |
Stream | Stream | Inner | ..streamstream.InnerJoinApp*, ..streamstream.InnerJoinWithWatermarkingApp* |
Stream | Stream | Left Outer | ..streamstream.LeftOuterJoinWithWatermarkingApp* |
Stream | Stream | Right Outer | TBD |
Stream | Stream | Full Outer | TBD |
Stream | Stream | Left Semi | TBD |

*Base package: com.phylosoft.spark.learning.sql.streaming.operations.join
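
As an illustration of the Stream | Stream | Inner row, here is a minimal sketch of an inner join with watermarks on both sides. It is not the repository's `InnerJoinWithWatermarkingApp`. The object name, the column names, the watermark delays, and the use of the built-in `rate` source are all assumptions chosen to keep the example self-contained.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.expr

// Hypothetical sketch, not code from the repository.
object StreamStreamInnerJoinSketch {

  // Builds the joined streaming DataFrame from two synthetic rate streams.
  def buildJoin(spark: SparkSession): DataFrame = {
    // The rate source emits rows with columns (timestamp, value).
    val impressions = spark.readStream.format("rate")
      .option("rowsPerSecond", "5").load()
      .selectExpr("value AS impressionAdId", "timestamp AS impressionTime")
      .withWatermark("impressionTime", "10 seconds") // assumed lateness bound

    val clicks = spark.readStream.format("rate")
      .option("rowsPerSecond", "5").load()
      .selectExpr("value AS clickAdId", "timestamp AS clickTime")
      .withWatermark("clickTime", "20 seconds") // assumed lateness bound

    // The time-range condition lets Spark drop old join state
    // once the watermarks have moved past it.
    impressions.join(
      clicks,
      expr("""
        clickAdId = impressionAdId AND
        clickTime >= impressionTime AND
        clickTime <= impressionTime + interval 1 minute
      """),
      "inner")
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("stream-stream-inner-join-sketch")
      .master("local[2]")
      .getOrCreate()

    val query = buildJoin(spark).writeStream
      .format("console")
      .outputMode("append") // inner stream-stream joins emit in append mode
      .start()

    query.awaitTermination(30000L) // run for about 30 seconds, then stop
    query.stop()
    spark.stop()
  }
}
```

Watermarks are optional for inner stream-stream joins, but without them and a time constraint the join state can grow without bound. For outer joins they are required.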
Use cases of processing modes (trigger modes)
- Unspecified (default);
- Fixed interval micro-batches;
- One-time micro-batch (deprecated);
- Available-now micro-batch;
- Continuous with fixed checkpoint interval (experimental);
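
The modes above map to `org.apache.spark.sql.streaming.Trigger` values roughly as in the sketch below. The object name, the intervals, and the `rate` source with a console sink are assumptions for illustration. The default mode has no `Trigger` value: leaving `.trigger(...)` out runs each micro-batch as soon as the previous one finishes.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.streaming.Trigger

// Hypothetical sketch, not code from the repository.
object TriggerModesSketch {
  // Fixed interval micro-batches: start a batch every 10 seconds.
  val fixedInterval: Trigger = Trigger.ProcessingTime("10 seconds")
  // One-time micro-batch: deprecated in favour of AvailableNow.
  val oneTime: Trigger = Trigger.Once()
  // Available-now: process all data available at start, possibly in several batches, then stop.
  val availableNow: Trigger = Trigger.AvailableNow()
  // Continuous processing (experimental): the argument is the checkpoint interval.
  val continuous: Trigger = Trigger.Continuous("1 second")

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("trigger-modes-sketch")
      .master("local[2]")
      .getOrCreate()

    val query = spark.readStream.format("rate").load()
      .writeStream
      .format("console")
      .trigger(fixedInterval) // omit this line to get the default mode
      .start()

    query.awaitTermination(25000L) // long enough for a couple of batches
    query.stop()
    spark.stop()
  }
}
```

Not every source and sink supports every mode. Continuous processing in particular works only with a limited set of sources, sinks, and operations.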
Optimizations
- Tungsten execution engine;
- Catalyst query optimizer;
- Cost-based optimizer;
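
Tungsten and Catalyst apply automatically. The cost-based optimizer is off by default and relies on table and column statistics. One way to see what the optimizer did is to print the query plans, as in this sketch. The object name and the transformation are assumptions for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

// Hypothetical sketch, not code from the repository.
object OptimizerSketch {

  // A small filter-and-project pipeline over the synthetic rate source.
  def transformed(spark: SparkSession): DataFrame =
    spark.readStream.format("rate").load()
      .filter(col("value") % 2 === 0)
      .select(col("timestamp"), (col("value") * 10).as("scaled"))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("optimizer-sketch")
      .master("local[2]")
      // Enables the cost-based optimizer; it only helps when statistics exist.
      .config("spark.sql.cbo.enabled", "true")
      .getOrCreate()

    // Prints the parsed, analyzed, and optimized logical plans and the physical plan.
    transformed(spark).explain(true)

    spark.stop()
  }
}
```

For a running query, `StreamingQuery.explain()` prints the plan of the most recently executed micro-batch instead.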
Structured Sessionization
- KeyValueGroupedDataset.mapGroupsWithState;
- KeyValueGroupedDataset.flatMapGroupsWithState;
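
A minimal sketch of sessionization with `flatMapGroupsWithState` follows. The case classes, the 10-second gap, the processing-time timeout, and the synthetic `rate` input are all assumptions for illustration and are not taken from the repository. A session is kept in state while events keep arriving for a key and is emitted once the key has been idle for longer than the gap.

```scala
import java.sql.Timestamp
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.streaming.{GroupState, GroupStateTimeout, OutputMode}

// Hypothetical types for this sketch.
case class Event(userId: Long, eventTime: Timestamp)
case class SessionState(start: Long, end: Long, count: Long)
case class SessionSummary(userId: Long, durationMs: Long, events: Long)

object SessionizationSketch {
  val SessionGapMs: Long = 10000L // assumed inactivity gap that closes a session

  // Pure helper: fold new event times (epoch millis) into the previous state.
  def merge(old: Option[SessionState], times: Seq[Long]): SessionState = {
    val start = (old.map(_.start).toSeq ++ times).min
    val end = (old.map(_.end).toSeq ++ times).max
    SessionState(start, end, old.map(_.count).getOrElse(0L) + times.size)
  }

  def updateSession(
      userId: Long,
      events: Iterator[Event],
      state: GroupState[SessionState]): Iterator[SessionSummary] = {
    if (state.hasTimedOut) {
      // No events for this key within the gap: emit the session and clear its state.
      val s = state.get
      state.remove()
      Iterator(SessionSummary(userId, s.end - s.start, s.count))
    } else {
      // New events arrived: extend the session and re-arm the timeout.
      val updated = merge(state.getOption, events.map(_.eventTime.getTime).toSeq)
      state.update(updated)
      state.setTimeoutDuration(SessionGapMs)
      Iterator.empty
    }
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("sessionization-sketch").master("local[2]").getOrCreate()
    import spark.implicits._

    // Each synthetic user gets 10 consecutive events and then goes quiet.
    val events = spark.readStream.format("rate").option("rowsPerSecond", "5").load()
      .selectExpr("CAST(floor(value / 10) AS BIGINT) AS userId", "timestamp AS eventTime")
      .as[Event]

    val sessions = events
      .groupByKey(_.userId)
      .flatMapGroupsWithState[SessionState, SessionSummary](
        OutputMode.Append(), GroupStateTimeout.ProcessingTimeTimeout())(updateSession)

    val query = sessions.writeStream.format("console").outputMode("append").start()
    query.awaitTermination(60000L)
    query.stop()
    spark.stop()
  }
}
```

`mapGroupsWithState` has the same shape but returns exactly one value per invocation and is used with update output mode. `flatMapGroupsWithState` can emit zero or more rows, which suits emitting a session only when it closes.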
Links
- Structured Streaming Programming Guide;
- Stream-Stream Joins using Structured Streaming (Scala);
- Easy, Scalable, Fault-Tolerant Stream Processing with Structured Streaming in Apache Spark;
- Easy, Scalable, Fault-Tolerant Stream Processing with Structured Streaming in Apache Spark - continues;
- Deep Dive into Stateful Stream Processing in Structured Streaming;
- Monitoring Structured Streaming Applications Using Web UI;
- The Internals of Spark Structured Streaming;