bigdata topic
big-data-rosetta-code
Code snippets for solving common big data problems on various platforms. Inspired by Rosetta Code.
kotlin-spark-api
This project provides Kotlin bindings and several extensions for Apache Spark. We are looking to have this as a part of Apache Spark 3.x.
shifu
An end-to-end machine learning and data mining framework on Hadoop
fpart
Sort files and pack them into partitions
spark-r-notebooks
R on Apache Spark (SparkR) tutorials for Big Data analysis and Machine Learning as IPython / Jupyter notebooks
amoro
Apache Amoro (incubating) is a Lakehouse management system built on open data lake formats.
flink-tutorials
Flink Tutorial Project
WeDataSphere
WeDataSphere is a financial-grade, one-stop big data platform suite.