spark-structured-streaming topic

List spark-structured-streaming repositories

streaming-sales-generator

Stars: 43
Forks: 20

Streaming Synthetic Sales Data Generator: Streaming sales data generator for Apache Kafka, written in Python

reddit-streaming-pipeline

Stars: 82
Forks: 9

A real-time Reddit data streaming pipeline for sentiment analysis of various subreddits

This repository includes supervised and unsupervised machine learning methods used to detect anomalies in network datasets. Decision Tree, Random Forest, Gradient Boost Tree, Naive Bayes, an...