bigdata topic
gearpump
Lightweight real-time big data streaming engine over Akka
SparkRDMA
This is an archive of the SparkRDMA project. The new repository with RDMA shuffle acceleration for Apache Spark is here: https://github.com/Nvidia/sparkucx
splash
Splash, a flexible Spark shuffle manager that supports user-defined storage backends for shuffle data storage and exchange
ECommerceRecommendSystem
Real-time big data product recommendation system. Frontend: Vue + TypeScript + ElementUI; backend: Spring + Spark
ldetool
Code generator for fast log file parsers
bigartm
Fast topic modeling platform
bigdata-file-viewer
A cross-platform (Windows, Mac, Linux) desktop application for viewing common big data binary formats such as Parquet, ORC, and Avro. Supports the local file system, HDFS, AWS S3, Azure Blob Storage, etc.
spark-movie-lens
An on-line movie recommender using Spark, Python Flask, and the MovieLens dataset