bigdata topic
hudi
Upserts, Deletes And Incremental Processing on Big Data.
chunjun
A data integration framework
sql-generator
🔨 Generate structured SQL statements from JSON. Built with Vue3 + TypeScript + Vite + Ant Design + MonacoEditor; a simple project (heavy on logic, light on UI) that is good for practice.
dpark
Python clone of Spark, a MapReduce-like framework in Python
volcano
A Cloud Native Batch System (Project under CNCF)
avro
Apache Avro is a data serialization system.
flinkStreamSQL
Extends the real-time SQL of open-source Flink; mainly implements joins between streams and dimension tables, and supports all native Flink SQL syntax
BigDataGuide
Big data learning: study big data from scratch, with learning videos and interview materials for each stage of study
byzer-lang
Byzer (formerly MLSQL): A low-code open-source programming language for data pipeline, analytics and AI.
spark-py-notebooks
Apache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks