bigdata topic

List bigdata repositories

gearpump

763 Stars, 153 Forks

Lightweight real-time big data streaming engine over Akka

SparkRDMA

244 Stars, 70 Forks

This is an archive of the SparkRDMA project. The new repository with RDMA shuffle acceleration for Apache Spark is here: https://github.com/Nvidia/sparkucx

splash

125 Stars, 29 Forks

Splash, a flexible Spark shuffle manager that supports user-defined storage backends for shuffle data storage and exchange

ECommerceRecommendSystem

406 Stars, 109 Forks

Real-time big data product recommendation system. Frontend: Vue + TypeScript + ElementUI; backend: Spring + Spark

ldetool

315 Stars, 25 Forks

Code generator for fast log file parsers

TiBigData

210 Stars, 57 Forks

TiDB connectors for Flink/Hive/Presto

bigartm

662 Stars, 117 Forks

Fast topic modeling platform

bigdata_practice

244 Stars, 50 Forks

Big data analysis and visualization in practice

bigdata-file-viewer

282 Stars, 54 Forks

A cross-platform (Windows, macOS, Linux) desktop application for viewing common bigdata binary formats such as Parquet, ORC, and Avro. Supports the local file system, HDFS, AWS S3, Azure Blob Storage, etc.

spark-movie-lens

814 Stars, 400 Forks

An online movie recommender using Spark, Python Flask, and the MovieLens dataset