etl-pipeline topic
spark-kinesis-redshift
Example project for consuming an AWS Kinesis stream and saving the data to Amazon Redshift using Apache Spark
csvplus
csvplus extends the standard Go encoding/csv package with a fluent interface, lazy stream operations, indices and joins.
pyspark-example-project
Implementing best practices for PySpark ETL jobs and applications.
hamilton
A scalable general purpose micro-framework for defining dataflows. THIS REPOSITORY HAS BEEN MOVED TO www.github.com/dagworks-inc/hamilton
goodreads_etl_pipeline
An end-to-end GoodReads Data Pipeline for Building Data Lake, Data Warehouse and Analytics Platform.
Udacity-Data-Engineering-Projects
A few projects related to Data Engineering, including Data Modeling, Infrastructure setup on the cloud, Data Warehousing and Data Lake development.
setl
A simple Spark-powered ETL framework that just works 🍺
watchmen-matryoshka-doll
Watchmen Platform is a low-code data platform for data pipelines, metadata management, analysis, and quality management
incubator-streampark
Make stream processing easier! Easy-to-use streaming application development framework and operation platform.
metorikku
A simplified, lightweight ETL Framework based on Apache Spark