bigdata topic

List bigdata repositories

hudi

5.1k
Stars
2.4k
Forks
1.2k
Watchers

Upserts, Deletes And Incremental Processing on Big Data.

chunjun

3.9k
Stars
1.7k
Forks
Watchers

A data integration framework

sql-generator

3.4k
Stars
699
Forks
Watchers

🔨 Generates structured SQL statements from JSON. Built with Vue3 + TypeScript + Vite + Ant Design + MonacoEditor; a simple project (heavy on logic, light on UI) that is good for practice.

dpark

2.7k
Stars
535
Forks
Watchers

Python clone of Spark, a MapReduce-like framework in Python

volcano

3.8k
Stars
883
Forks
Watchers

A Cloud Native Batch System (Project under CNCF)

avro

2.8k
Stars
1.6k
Forks
Watchers

Apache Avro is a data serialization system.

flinkStreamSQL

2.0k
Stars
925
Forks
Watchers

Extends the real-time SQL of open-source Flink; mainly implements joins between streams and dimension tables, and supports all native Flink SQL syntax

BigDataGuide

2.5k
Stars
844
Forks
Watchers

Big data learning: learn big data from scratch, with study videos and interview materials for each stage of learning

byzer-lang

1.8k
Stars
545
Forks
Watchers

Byzer (formerly MLSQL): A low-code open-source programming language for data pipelines, analytics and AI.

spark-py-notebooks

1.6k
Stars
911
Forks
Watchers

Apache Spark & Python (PySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks